
Model Predictive Control:

Theory, Computation, and Design


2nd Edition
Model Predictive Control:
Theory, Computation, and Design
2nd Edition

James B. Rawlings
Department of Chemical Engineering
University of California
Santa Barbara, California, USA

David Q. Mayne
Department of Electrical and Electronic Engineering
Imperial College London
London, England

Moritz M. Diehl
Department of Microsystems Engineering and
Department of Mathematics
University of Freiburg
Freiburg, Germany

Nob Hill Publishing

Santa Barbara, California


This book was set in Lucida using LaTeX, and printed and bound by
Worzalla.

Cover design by Cheryl M. and James B. Rawlings.

Copyright © 2022 by Nob Hill Publishing, LLC

All rights reserved.

Nob Hill Publishing, LLC


Cheryl M. Rawlings, publisher
Santa Barbara, CA 93101
[email protected]
http://www.nobhillpublishing.com

No part of this book may be reproduced, in any form or by any means,
without permission in writing from the publisher.

Library of Congress Control Number: 2020942771

Printed in the United States of America.

First Edition
First Printing August 2009
Electronic Download (1st) November 2013
Electronic Download (2nd) April 2014
Electronic Download (3rd) July 2014
Electronic Download (4th) October 2014
Electronic Download (5th) February 2015
Second Edition
First Printing October 2017
Electronic Download (1st) October 2018
Electronic Download (2nd) February 2019
Paperback Edition
Third Printing October 2020
Electronic Download (3rd) October 2020
Electronic Download (4th) April 2022
To Cheryl, Josephine, and Stephanie,

for their love, encouragement, and patience.


Preface to the Second Edition

In the eight years since the publication of the first edition, the field
of model predictive control (MPC) has seen tremendous progress. First
and foremost, the algorithms and high-level software available for solv-
ing challenging nonlinear optimal control problems have advanced sig-
nificantly. For this reason, we have added a new chapter, Chapter 8,
“Numerical Optimal Control,” and coauthor, Professor Moritz M. Diehl.
This chapter gives an introduction to methods for the numerical so-
lution of the MPC optimization problem. Numerical optimal control
builds on two fields: simulation of differential equations, and numeri-
cal optimization. Simulation is often covered in undergraduate courses
and is therefore only briefly reviewed. Optimization is treated in much
more detail, covering topics such as derivative computations, Hessian
approximations, and handling inequalities. Most importantly, the chap-
ter presents some of the many ways that the specific structure of opti-
mal control problems arising in MPC can be exploited algorithmically.
We have also added a software release with the second edition of
the text. The software enables the solution of all of the examples and
exercises in the text requiring numerical calculation. The software is
based on the freely available CasADi language and a high-level set of
Octave/MATLAB functions, MPCTools, which serves as an interface to CasADi.
These tools have been tested in several MPC short courses to audiences
composed of researchers and practitioners. The software can be down-
loaded from www.chemengr.ucsb.edu/~jbraw/mpc.
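To give a flavor of what a CasADi-based solution looks like from Octave/MATLAB, the following minimal sketch sets up and solves a small nonlinear program through CasADi's nlpsol interface. It is illustrative only: it does not use MPCTools, and the objective and constraint are arbitrary stand-ins rather than examples from the text.

% Minimal CasADi sketch from Octave/MATLAB (illustrative; not MPCTools).
x = casadi.MX.sym('x', 2);                       % decision variables
f = (1 - x(1))^2 + 100*(x(2) - x(1)^2)^2;        % stand-in objective
g = x(1)^2 + x(2)^2 - 1;                         % stand-in equality constraint g(x) = 0
nlp = struct('x', x, 'f', f, 'g', g);
solver = casadi.nlpsol('solver', 'ipopt', nlp);  % NLP solver built on IPOPT
sol = solver('x0', [0.5; 0.5], 'lbg', 0, 'ubg', 0);
disp(full(sol.x))                                % optimal decision variables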
In Chapter 2, we have added sections covering the following topics:
• economic MPC
• MPC with discrete actuators
We also present a more recent form of suboptimal MPC that is prov-
ably robust as well as computationally tractable for online solution of
nonconvex MPC problems.
In Chapter 3, we have added a discussion of stochastic MPC, which
has received considerable recent research attention.
In Chapter 4, we have added a new treatment of state estimation
with persistent, bounded process and measurement disturbances. We
have also removed the discussion of particle filtering. There are two
reasons for this removal; first, we wanted to maintain a manageable
total length of the text; second, all of the available sampling strate-
gies in particle filtering come up against the “curse of dimensionality,”
which renders the state estimates inaccurate for dimension higher than
about five. The material on particle filtering remains available on the
text website.
In Chapter 6, we have added a new section for distributed MPC of
nonlinear systems.
In Chapter 7, we have added the software to compute the critical
regions in explicit MPC.
Throughout the text, we adopt the stronger KL-definition of asymp-
totic stability in place of the classical definition used in the first edition.
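(As a reminder, the KL form requires a function β ∈ KL such that the closed-loop solution satisfies |φ(k; x)| ≤ β(|x|, k) for every admissible initial state x and all times k ≥ 0; see Definition 2.11 and Appendix B.)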
The most significant notational change is to denote a sequence with
(a, b, c, . . .) instead of with {a, b, c, . . .} as in the first edition.

JBR DQM MMD


Madison, Wis., USA London, England Freiburg, Germany

Added for the second edition, third printing


The second edition, first printing was made available electronically in
October 2018. The February 2019 second (electronic only) printing
mainly corrected typographical errors. This third printing was printed
as a paperback and made available electronically in October 2020.
In this third printing, besides removing typographical and other er-
rors, Chapter 4 was revised significantly. The analysis of Moving Hori-
zon Estimation and Full Information Estimation with bounded distur-
bances has improved significantly in the last several years due to the
research efforts of several groups. We have attempted to bring the
material in Chapter 4 up to date with this current literature.
Moreover, the section in Chapter 3 on Stochastic MPC was updated,
and a new section on discrete actuators was added to Chapter 8.

JBR DQM MMD


Santa Barbara, CA, USA London, England Freiburg, Germany
Preface

Our goal in this text is to provide a comprehensive and foundational
treatment of the theory and design of model predictive control (MPC).
By now several excellent monographs emphasizing various aspects of
MPC have appeared (a list appears at the beginning of Chapter 1), and
the reader may naturally wonder what is offered here that is new and
different. By providing a comprehensive treatment of the MPC foun-
dation, we hope that this text enables researchers to learn and teach
the fundamentals of MPC without continuously searching the diverse
control research literature for omitted arguments and requisite back-
ground material. When teaching the subject, it is essential to have a
collection of exercises that enables the students to assess their level of
comprehension and mastery of the topics. To support the teaching and
learning of MPC, we have included more than 200 end-of-chapter exer-
cises. A complete solution manual (more than 300 pages) is available
for course instructors.
Chapter 1 is introductory. It is intended for graduate students in en-
gineering who have not yet had a systems course. But it serves a second
purpose for those who have already taken the first graduate systems
course. It derives all the results of the linear quadratic regulator and
optimal Kalman filter using only those arguments that extend to the
nonlinear and constrained cases to be covered in the later chapters.
Instructors may find that this tailored treatment of the introductory
systems material serves both as a review and a preview of arguments
to come in the later chapters.
Chapters 2–4 are foundational and should probably be covered in
any graduate level MPC course. Chapter 2 covers regulation to the ori-
gin for nonlinear and constrained systems. This material presents in a
unified fashion many of the major research advances in MPC that took
place during the last 20 years. It also includes more recent topics such
as regulation to an unreachable setpoint that are only now appearing in
the research literature. Chapter 3 addresses MPC design for robustness,
with a focus on MPC using tubes or bundles of trajectories in place of
the single nominal trajectory. This chapter again unifies a large body of
research literature concerned with robust MPC. Chapter 4 covers state
estimation with an emphasis on moving horizon estimation, but also
covers extended and unscented Kalman filtering, and particle filtering.
Chapters 5–7 present more specialized topics. Chapter 5 addresses
the special requirements of MPC based on output measurement instead
of state measurement. Chapter 6 discusses how to design distributed
MPC controllers for large-scale systems that are decomposed into many
smaller, interacting subsystems. Chapter 7 covers the explicit optimal
control of constrained linear systems. The choice of coverage of these
three chapters may vary depending on the instructor’s or student’s own
research interests.
Three appendices are included, again, so that the reader is not sent
off to search a large research literature for the fundamental arguments
used in the text. Appendix A covers the required mathematical back-
ground. Appendix B summarizes the results used for stability analysis
including the various types of stability and Lyapunov function theory.
Since MPC is an optimization-based controller, Appendix C covers the
relevant results from optimization theory. In order to reduce the size
and expense of the text, the three appendices are available on the web:
www.chemengr.ucsb.edu/~jbraw/mpc. Note, however, that all mate-
rial in the appendices is included in the book’s printed table of contents,
and subject and author indices. The website also includes sample ex-
ams, and homework assignments for a one-semester graduate course
in MPC. All of the examples and exercises in the text were solved with
Octave. Octave is freely available from www.octave.org.

JBR DQM
Madison, Wisconsin, USA London, England
Acknowledgments

Both authors would like to thank the Department of Chemical and Bio-
logical Engineering of the University of Wisconsin for hosting DQM’s
visits to Madison during the preparation of this monograph. Funding
from the Paul A. Elfers Professorship provided generous financial sup-
port.
JBR would like to acknowledge the graduate students with whom
he has had the privilege to work on model predictive control topics:
Rishi Amrit, Dennis Bonné, John Campbell, John Eaton, Peter Findeisen,
Rolf Findeisen, Eric Haseltine, John Jørgensen, Nabil Laachi, Scott Mead-
ows, Scott Middlebrooks, Steve Miller, Ken Muske, Brian Odelson, Mu-
rali Rajamani, Chris Rao, Brett Stewart, Kaushik Subramanian, Aswin
Venkat, and Jenny Wang. He would also like to thank many colleagues
with whom he has collaborated on this subject: Frank Allgöwer, Tom
Badgwell, Bhavik Bakshi, Don Bartusiak, Larry Biegler, Moritz Diehl,
Jim Downs, Tom Edgar, Brian Froisy, Ravi Gudi, Sten Bay Jørgensen,
Jay Lee, Fernando Lima, Wolfgang Marquardt, Gabriele Pannocchia, Joe
Qin, Harmon Ray, Pierre Scokaert, Sigurd Skogestad, Tyler Soderstrom,
Steve Wright, and Robert Young.
DQM would like to thank his colleagues at Imperial College, espe-
cially Richard Vinter and Martin Clark, for providing a stimulating and
congenial research environment. He is very grateful to Lucien Polak
and Graham Goodwin with whom he has collaborated extensively and
fruitfully over many years; he would also like to thank many other col-
leagues, especially Karl Åström, Roger Brockett, Larry Ho, Petar Koko-
tovic, and Art Krener, from whom he has learned much. He is grateful
to past students who have worked with him on model predictive con-
trol: Ioannis Chrysochoos, Wilbur Langson, Hannah Michalska, Sasa
Raković, and Warren Schroeder; Hannah Michalska and Sasa Raković, in
particular, contributed very substantially. He owes much to these past
students, now colleagues, as well as to Frank Allgöwer, Rolf Findeisen,
Eric Kerrigan, Konstantinos Kouramus, Chris Rao, Pierre Scokaert, and
Maria Seron for their collaborative research in MPC.
Both authors would especially like to thank Tom Badgwell, Bob Bird,
Eric Kerrigan, Ken Muske, Gabriele Pannocchia, and Maria Seron for
their careful and helpful reading of parts of the manuscript. John Eaton
again deserves special mention for his invaluable technical support dur-
ing the entire preparation of the manuscript.
Added for the second edition. JBR would like to acknowledge the
most recent generation of graduate students with whom he has had the
privilege to work on model predictive control research topics: Doug Al-
lan, Travis Arnold, Cuyler Bates, Luo Ji, Nishith Patel, Michael Risbeck,
and Megan Zagrobelny.
In preparing the second edition, and, in particular, the software re-
lease, the current group of graduate students far exceeded expectations
to help finish the project. Quite simply, the project could not have been
completed in a timely fashion without their generosity, enthusiasm,
professionalism, and selfless contribution. Michael Risbeck deserves
special mention for creating the MPCTools interface to CasADi, and
updating and revising the tools used to create the website to distribute
the text- and software-supporting materials. He also wrote code to cal-
culate explicit MPC control laws in Chapter 7. Nishith Patel made a
major contribution to the subject index, and Doug Allan contributed
generously to the presentation of moving horizon estimation in Chap-
ter 4.
A research leave for JBR in Fall 2016, again funded by the Paul A.
Elfers Professorship, was instrumental in freeing up time to complete
the revision of the text and further develop computational exercises.
MMD wants to especially thank Jesus Lago Garcia, Jochem De Schut-
ter, Andrea Zanelli, Dimitris Kouzoupis, Joris Gillis, Joel Andersson,
and Robin Verschueren for help with the preparation of exercises and
examples in Chapter 8; and also wants to acknowledge the following
current and former team members that contributed to research and
teaching on optimal and model predictive control at the Universities of
Leuven and Freiburg: Adrian Bürger, Hans Joachim Ferreau, Jörg Fis-
cher, Janick Frasch, Gianluca Frison, Niels Haverbeke, Greg Horn, Boris
Houska, Jonas Koenemann, Attila Kozma, Vyacheslav Kungurtsev, Gio-
vanni Licitra, Rien Quirynen, Carlo Savorgnan, Quoc Tran-Dinh, Milan
Vukov, and Mario Zanon. MMD also wants to thank Frank Allgöwer, Al-
berto Bemporad, Rolf Findeisen, Larry Biegler, Hans Georg Bock, Stephen
Boyd, Sébastien Gros, Lars Grüne, Colin Jones, John Bagterp Jørgensen,
Christian Kirches, Daniel Leineweber, Katja Mombaur, Yurii Nesterov,
Toshiyuki Ohtsuka, Goele Pipeleers, Andreas Potschka, Sebastian Sager,
Johannes P. Schlöder, Volker Schulz, Marc Steinbach, Jan Swevers, Phil-
ippe Toint, Andrea Walther, Stephen Wright, Joos Vandewalle, and Ste-
fan Vandewalle for inspiring discussions on numerical optimal control
methods and their presentation during the last 20 years.
All three authors would especially like to thank Joel Andersson and
Joris Gillis for having developed CasADi and for continuing its support,
and for having helped to improve some of the exercises in the text.
Added for the second edition, third printing. The authors would
like to acknowledge and thank Doug Allan again for his suggestions
and help with the revision of Chapter 4. Much of the new material
on Full Information Estimation and Moving Horizon Estimation is a di-
rect result of Doug’s research papers and 2020 PhD thesis on state
estimation. Koty McAllister provided expert assistance in the update
of stochastic MPC in Chapter 3. Finally, Adrian Buerger and Pratyush
Kumar provided valuable assistance on the addition of the discrete ac-
tuator numerics to Chapter 8.
Contents

1 Getting Started with Model Predictive Control 1


1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Models and Modeling . . . . . . . . . . . . . . . . . . . . . . 1
1.2.1 Linear Dynamic Models . . . . . . . . . . . . . . . . . 2
1.2.2 Input-Output Models . . . . . . . . . . . . . . . . . . 3
1.2.3 Distributed Models . . . . . . . . . . . . . . . . . . . 4
1.2.4 Discrete Time Models . . . . . . . . . . . . . . . . . . 5
1.2.5 Constraints . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.6 Deterministic and Stochastic . . . . . . . . . . . . . 9
1.3 Introductory MPC Regulator . . . . . . . . . . . . . . . . . . 11
1.3.1 Linear Quadratic Problem . . . . . . . . . . . . . . . 11
1.3.2 Optimizing Multistage Functions . . . . . . . . . . . 12
1.3.3 Dynamic Programming Solution . . . . . . . . . . . 18
1.3.4 The Infinite Horizon LQ Problem . . . . . . . . . . . 21
1.3.5 Controllability . . . . . . . . . . . . . . . . . . . . . . 23
1.3.6 Convergence of the Linear Quadratic Regulator . . 24
1.4 Introductory State Estimation . . . . . . . . . . . . . . . . . 26
1.4.1 Linear Systems and Normal Distributions . . . . . 27
1.4.2 Linear Optimal State Estimation . . . . . . . . . . . 29
1.4.3 Least Squares Estimation . . . . . . . . . . . . . . . 33
1.4.4 Moving Horizon Estimation . . . . . . . . . . . . . . 39
1.4.5 Observability . . . . . . . . . . . . . . . . . . . . . . . 41
1.4.6 Convergence of the State Estimator . . . . . . . . . 43
1.5 Tracking, Disturbances, and Zero Offset . . . . . . . . . . 46
1.5.1 Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . 46
1.5.2 Disturbances and Zero Offset . . . . . . . . . . . . . 49
1.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

2 Model Predictive Control—Regulation 89


2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
2.2 Model Predictive Control . . . . . . . . . . . . . . . . . . . . 91
2.3 Dynamic Programming Solution . . . . . . . . . . . . . . . 107
2.4 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
2.4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 112
2.4.2 Stabilizing Conditions . . . . . . . . . . . . . . . . . 114
2.4.3 Exponential Stability . . . . . . . . . . . . . . . . . . 120
2.4.4 Controllability and Observability . . . . . . . . . . . 120
2.4.5 Time-Varying Systems . . . . . . . . . . . . . . . . . 123
2.5 Examples of MPC . . . . . . . . . . . . . . . . . . . . . . . . . 131
2.5.1 The Unconstrained Linear Quadratic Regulator . . 132
2.5.2 Unconstrained Linear Periodic Systems . . . . . . . 133
2.5.3 Stable Linear Systems with Control Constraints . 135
2.5.4 Linear Systems with Control and State Constraints 136
2.5.5 Constrained Nonlinear Systems . . . . . . . . . . . 139
2.5.6 Constrained Nonlinear Time-Varying Systems . . . 141
2.6 Is a Terminal Constraint Set Xf Necessary? . . . . . . . . 144
2.7 Suboptimal MPC . . . . . . . . . . . . . . . . . . . . . . . . . 147
2.7.1 Extended State . . . . . . . . . . . . . . . . . . . . . . 150
2.7.2 Asymptotic Stability of Difference Inclusions . . . 150
2.8 Economic Model Predictive Control . . . . . . . . . . . . . 153
2.8.1 Asymptotic Average Performance . . . . . . . . . . 155
2.8.2 Dissipativity and Asymptotic Stability . . . . . . . 156
2.9 Discrete Actuators . . . . . . . . . . . . . . . . . . . . . . . . 160
2.10 Concluding Comments . . . . . . . . . . . . . . . . . . . . . 163
2.11 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
2.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

3 Robust and Stochastic Model Predictive Control 193


3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
3.1.1 Types of Uncertainty . . . . . . . . . . . . . . . . . . 193
3.1.2 Feedback Versus Open-Loop Control . . . . . . . . 195
3.1.3 Robust and Stochastic MPC . . . . . . . . . . . . . . 200
3.1.4 Tubes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
3.1.5 Difference Inclusion Description of Uncertain Sys-
tems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
3.2 Nominal (Inherent) Robustness . . . . . . . . . . . . . . . . 204
3.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 204
3.2.2 Difference Inclusion Description of Discontinu-
ous Systems . . . . . . . . . . . . . . . . . . . . . . . 206
3.2.3 When Is Nominal MPC Robust? . . . . . . . . . . . . 207
3.2.4 Robustness of Nominal MPC . . . . . . . . . . . . . 209
3.3 Min-Max Optimal Control: Dynamic Programming Solution 214
3.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 214
3.3.2 Properties of the Dynamic Programming Solution 216
3.4 Robust Min-Max MPC . . . . . . . . . . . . . . . . . . . . . . 220
3.5 Tube-Based Robust MPC . . . . . . . . . . . . . . . . . . . . 223
3.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 223
3.5.2 Outer-Bounding Tube for a Linear System with
Additive Disturbance . . . . . . . . . . . . . . . . . . 224
3.5.3 Tube-Based MPC of Linear Systems with Additive
Disturbances . . . . . . . . . . . . . . . . . . . . . . . 228
3.5.4 Improved Tube-Based MPC of Linear Systems with
Additive Disturbances . . . . . . . . . . . . . . . . . 234
3.6 Tube-Based MPC of Nonlinear Systems . . . . . . . . . . . 236
3.6.1 The Nominal Trajectory . . . . . . . . . . . . . . . . 238
3.6.2 Model Predictive Controller . . . . . . . . . . . . . . 238
3.6.3 Choosing the Nominal Constraint Sets Ū and X̄ . . 242
3.7 Stochastic MPC . . . . . . . . . . . . . . . . . . . . . . . . . . 246
3.7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 246
3.7.2 Stability of Stochastic MPC . . . . . . . . . . . . . . 248
3.7.3 Tube-based stochastic MPC . . . . . . . . . . . . . . 250
3.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
3.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262

4 State Estimation 269


4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
4.2 Full Information Estimation . . . . . . . . . . . . . . . . . . 269
4.2.1 Nominal Estimator Stability . . . . . . . . . . . . . . 279
4.2.2 Robust Estimator Stability . . . . . . . . . . . . . . . 284
4.2.3 Interlude—Linear System Review . . . . . . . . . . . 287
4.3 Moving Horizon Estimation . . . . . . . . . . . . . . . . . . 292
4.3.1 Zero Prior Weighting . . . . . . . . . . . . . . . . . . 293
4.3.2 Nonzero Prior Weighting . . . . . . . . . . . . . . . . 296
4.3.3 RGES of MHE under exponential assumptions . . . 297
4.4 Other Nonlinear State Estimators . . . . . . . . . . . . . . . 302
4.4.1 Particle Filtering . . . . . . . . . . . . . . . . . . . . . 302
4.4.2 Extended Kalman Filtering . . . . . . . . . . . . . . . 302
4.4.3 Unscented Kalman Filtering . . . . . . . . . . . . . . 304
4.4.4 EKF, UKF, and MHE Comparison . . . . . . . . . . . 306
4.5 On combining MHE and MPC . . . . . . . . . . . . . . . . . 312
4.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
4.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321

5 Output Model Predictive Control 333


5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
5.2 A Method for Output MPC . . . . . . . . . . . . . . . . . . . 335
5.3 Linear Constrained Systems: Time-Invariant Case . . . . 338
5.3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . 338
5.3.2 State Estimator . . . . . . . . . . . . . . . . . . . . . . 338
5.3.3 Controlling x̂ . . . . . . . . . . . . . . . . . . . . . . . 340
5.3.4 Output MPC . . . . . . . . . . . . . . . . . . . . . . . . 342
5.3.5 Computing the Tightened Constraints . . . . . . . 346
5.4 Linear Constrained Systems: Time-Varying Case . . . . . 347
5.5 Offset-Free MPC . . . . . . . . . . . . . . . . . . . . . . . . . 347
5.5.1 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . 349
5.5.2 Control . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
5.5.3 Convergence Analysis . . . . . . . . . . . . . . . . . 354
5.6 Nonlinear Constrained Systems . . . . . . . . . . . . . . . . 357
5.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
5.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360

6 Distributed Model Predictive Control 363


6.1 Introduction and Preliminary Results . . . . . . . . . . . . 363
6.1.1 Least Squares Solution . . . . . . . . . . . . . . . . . 364
6.1.2 Stability of Suboptimal MPC . . . . . . . . . . . . . . 369
6.2 Unconstrained Two-Player Game . . . . . . . . . . . . . . . 374
6.2.1 Centralized Control . . . . . . . . . . . . . . . . . . . 376
6.2.2 Decentralized Control . . . . . . . . . . . . . . . . . 377
6.2.3 Noncooperative Game . . . . . . . . . . . . . . . . . 378
6.2.4 Cooperative Game . . . . . . . . . . . . . . . . . . . . 386
6.2.5 Tracking Nonzero Setpoints . . . . . . . . . . . . . . 392
6.2.6 State Estimation . . . . . . . . . . . . . . . . . . . . . 399
6.3 Constrained Two-Player Game . . . . . . . . . . . . . . . . 400
6.3.1 Uncoupled Input Constraints . . . . . . . . . . . . . 402
6.3.2 Coupled Input Constraints . . . . . . . . . . . . . . 405
6.3.3 Exponential Convergence with Estimate Error . . . 406
6.3.4 Disturbance Models and Zero Offset . . . . . . . . 408
6.4 Constrained M-Player Game . . . . . . . . . . . . . . . . . . 412
6.5 Nonlinear Distributed MPC . . . . . . . . . . . . . . . . . . . 414
6.5.1 Nonconvexity . . . . . . . . . . . . . . . . . . . . . . . 415
6.5.2 Distributed Algorithm for Nonconvex Functions . 417
6.5.3 Distributed Nonlinear Cooperative Control . . . . 419
6.5.4 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . 422
6.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
6.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429

7 Explicit Control Laws for Constrained Linear Systems 445


7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
7.2 Parametric Programming . . . . . . . . . . . . . . . . . . . . 446
7.3 Parametric Quadratic Programming . . . . . . . . . . . . . 451
7.3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . 451
7.3.2 Preview . . . . . . . . . . . . . . . . . . . . . . . . . . 452
7.3.3 Optimality Condition for a Convex Program . . . . 453
7.3.4 Solution of the Parametric Quadratic Program . . 456
7.3.5 Continuity of V 0 (·) and u0 (·) . . . . . . . . . . . . 460
7.4 Constrained Linear Quadratic Control . . . . . . . . . . . . 461
7.5 Parametric Piecewise Quadratic Programming . . . . . . . 463
7.6 DP Solution of the Constrained LQ Control Problem . . . 469
7.7 Parametric Linear Programming . . . . . . . . . . . . . . . 470
7.7.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . 470
7.7.2 Minimizer u0 (x) is Unique for all x ∈ X . . . . . . 472
7.8 Constrained Linear Control . . . . . . . . . . . . . . . . . . 475
7.9 Computation . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
7.10 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
7.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478

8 Numerical Optimal Control 485


8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
8.1.1 Discrete Time Optimal Control Problem . . . . . . 486
8.1.2 Convex Versus Nonconvex Optimization . . . . . . 487
8.1.3 Simultaneous Versus Sequential Optimal Control 490
8.1.4 Continuous Time Optimal Control Problem . . . . 492
8.2 Numerical Simulation . . . . . . . . . . . . . . . . . . . . . . 495
8.2.1 Explicit Runge-Kutta Methods . . . . . . . . . . . . . 496
8.2.2 Stiff Equations and Implicit Integrators . . . . . . . 500
8.2.3 Implicit Runge-Kutta and Collocation Methods . . 501
8.2.4 Differential Algebraic Equations . . . . . . . . . . . 505
8.2.5 Integrator Adaptivity . . . . . . . . . . . . . . . . . . 507
8.3 Solving Nonlinear Equation Systems . . . . . . . . . . . . . 507
8.3.1 Linear Systems . . . . . . . . . . . . . . . . . . . . . . 507
8.3.2 Nonlinear Root-Finding Problems . . . . . . . . . . 508
8.3.3 Local Convergence of Newton-Type Methods . . . 511
8.3.4 Affine Invariance . . . . . . . . . . . . . . . . . . . . . 513
8.3.5 Globalization for Newton-Type Methods . . . . . . 513
8.4 Computing Derivatives . . . . . . . . . . . . . . . . . . . . . 514
8.4.1 Numerical Differentiation . . . . . . . . . . . . . . . 515
8.4.2 Algorithmic Differentiation . . . . . . . . . . . . . . 516
8.4.3 Implicit Function Interpretation . . . . . . . . . . . 517
8.4.4 Algorithmic Differentiation in Forward Mode . . . 520
8.4.5 Algorithmic Differentiation in Reverse Mode . . . 522
8.4.6 Differentiation of Simulation Routines . . . . . . . 525
8.4.7 Algorithmic and Symbolic Differentiation Software 527
8.4.8 CasADi for Optimization . . . . . . . . . . . . . . . . 527
8.5 Direct Optimal Control Parameterizations . . . . . . . . . 530
8.5.1 Direct Single Shooting . . . . . . . . . . . . . . . . . 532
8.5.2 Direct Multiple Shooting . . . . . . . . . . . . . . . . 534
8.5.3 Direct Transcription and Collocation Methods . . 538
8.6 Nonlinear Optimization . . . . . . . . . . . . . . . . . . . . . 542
8.6.1 Optimality Conditions and Perturbation Analysis 543
8.6.2 Nonlinear Optimization with Equalities . . . . . . . 546
8.6.3 Hessian Approximations . . . . . . . . . . . . . . . . 547
8.7 Newton-Type Optimization with Inequalities . . . . . . . 550
8.7.1 Sequential Quadratic Programming . . . . . . . . . 551
8.7.2 Nonlinear Interior Point Methods . . . . . . . . . . 552
8.7.3 Comparison of SQP and Nonlinear IP Methods . . 554
8.8 Structure in Discrete Time Optimal Control . . . . . . . . 555
8.8.1 Simultaneous Approach . . . . . . . . . . . . . . . . 556
8.8.2 Linear Quadratic Problems (LQP) . . . . . . . . . . . 558
8.8.3 LQP Solution by Riccati Recursion . . . . . . . . . . 558
8.8.4 LQP Solution by Condensing . . . . . . . . . . . . . 560
8.8.5 Sequential Approaches and Sparsity Exploitation 562
8.8.6 Differential Dynamic Programming . . . . . . . . . 564
8.8.7 Additional Constraints in Optimal Control . . . . . 566
8.9 Online Optimization Algorithms . . . . . . . . . . . . . . . 567
8.9.1 General Algorithmic Considerations . . . . . . . . . 568
8.9.2 Continuation Methods and Real-Time Iterations . 571
8.10 Discrete Actuators . . . . . . . . . . . . . . . . . . . . . . . . 574
8.11 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
8.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581

Author Index 600

Citation Index 608

Subject Index 614



A Mathematical Background 624


A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
A.2 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
A.3 Range and Nullspace of Matrices . . . . . . . . . . . . . . . 624
A.4 Linear Equations — Existence and Uniqueness . . . . . . 625
A.5 Pseudo-Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . 625
A.6 Partitioned Matrix Inversion Theorem . . . . . . . . . . . . 628
A.7 Quadratic Forms . . . . . . . . . . . . . . . . . . . . . . . . . 629
A.8 Norms in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . 631
A.9 Sets in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631
A.10 Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
A.11 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633
A.12 Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . 636
A.13 Convex Sets and Functions . . . . . . . . . . . . . . . . . . . 641
A.13.1 Convex Sets . . . . . . . . . . . . . . . . . . . . . . . . 641
A.13.2 Convex Functions . . . . . . . . . . . . . . . . . . . . 646
A.14 Differential Equations . . . . . . . . . . . . . . . . . . . . . . 648
A.15 Random Variables and the Probability Density . . . . . . 654
A.16 Multivariate Density Functions . . . . . . . . . . . . . . . . 659
A.16.1 Statistical Independence and Correlation . . . . . . 668
A.17 Conditional Probability and Bayes’s Theorem . . . . . . . 672
A.18 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 678

B Stability Theory 693


B.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 693
B.2 Stability and Asymptotic Stability . . . . . . . . . . . . . . 696
B.3 Lyapunov Stability Theory . . . . . . . . . . . . . . . . . . . 700
B.3.1 Time-Invariant Systems . . . . . . . . . . . . . . . . 701
B.3.2 Time-Varying, Constrained Systems . . . . . . . . . 707
B.3.3 Upper bounding K functions . . . . . . . . . . . . . 709
B.4 Robust Stability . . . . . . . . . . . . . . . . . . . . . . . . . 709
B.4.1 Nominal Robustness . . . . . . . . . . . . . . . . . . 709
B.4.2 Robustness . . . . . . . . . . . . . . . . . . . . . . . . 711
B.5 Control Lyapunov Functions . . . . . . . . . . . . . . . . . . 713
B.6 Input-to-State Stability . . . . . . . . . . . . . . . . . . . . . 717
B.7 Output-to-State Stability and Detectability . . . . . . . . . 719
B.8 Input/Output-to-State Stability . . . . . . . . . . . . . . . . 720
B.9 Incremental-Input/Output-to-State Stability . . . . . . . . 722
B.10 Observability . . . . . . . . . . . . . . . . . . . . . . . . . . . 722
B.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724

C Optimization 729
C.1 Dynamic Programming . . . . . . . . . . . . . . . . . . . . . 729
C.1.1 Optimal Control Problem . . . . . . . . . . . . . . . 731
C.1.2 Dynamic Programming . . . . . . . . . . . . . . . . . 733
C.2 Optimality Conditions . . . . . . . . . . . . . . . . . . . . . 737
C.2.1 Tangent and Normal Cones . . . . . . . . . . . . . . 737
C.2.2 Convex Optimization Problems . . . . . . . . . . . . 741
C.2.3 Convex Problems: Polyhedral Constraint Set . . . 743
C.2.4 Nonconvex Problems . . . . . . . . . . . . . . . . . . 745
C.2.5 Tangent and Normal Cones . . . . . . . . . . . . . . 746
C.2.6 Constraint Set Defined by Inequalities . . . . . . . 750
C.2.7 Constraint Set; Equalities and Inequalities . . . . . 753
C.3 Set-Valued Functions and Continuity of Value Function . 755
C.3.1 Outer and Inner Semicontinuity . . . . . . . . . . . 757
C.3.2 Continuity of the Value Function . . . . . . . . . . . 759
C.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 767
List of Figures

1.1 System with input u, output y, and transfer function ma-
trix G connecting them; the model is y = Gu. . . . . . . . . 3
1.2 Typical input constraint sets U for (a) continuous actua-
tors and (b) mixed continuous/discrete actuators. . . . . . 9
1.3 Output of a stochastic system versus time. . . . . . . . . . . 10
1.4 Two quadratic functions and their sum. . . . . . . . . . . . . 15
1.5 Schematic of the moving horizon estimation problem. . . . 39
1.6 MPC controller consisting of: receding horizon regulator,
state estimator, and target selector. . . . . . . . . . . . . . . 52
1.7 Schematic of the well-stirred reactor. . . . . . . . . . . . . . . 54
1.8 Three measured outputs versus time after a step change
in inlet flowrate at 10 minutes; nd = 2. . . . . . . . . . . . . 57
1.9 Two manipulated inputs versus time after a step change
in inlet flowrate at 10 minutes; nd = 2. . . . . . . . . . . . . 57
1.10 Three measured outputs versus time after a step change
in inlet flowrate at 10 minutes; nd = 3. . . . . . . . . . . . . 58
1.11 Two manipulated inputs versus time after a step change
in inlet flowrate at 10 minutes; nd = 3. . . . . . . . . . . . . 59
1.12 Plug-flow reactor. . . . . . . . . . . . . . . . . . . . . . . . . . . 60
1.13 Pendulum with applied torque. . . . . . . . . . . . . . . . . . 62
1.14 Feedback control system with output disturbance d, and
setpoint ysp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

2.1 Example of MPC. . . . . . . . . . . . . . . . . . . . . . . . . . . 101


2.2 Feasible region U2 , elliptical cost contours and ellipse
center a(x), and constrained minimizers for different
values of x. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
2.3 First element of control constraint set U3 (x) (shaded)
and control law κ3 (x) (line) versus x = (cos(θ), sin(θ)),
θ ∈ [−π , π ] on the unit circle for a nonlinear system with
terminal constraint. . . . . . . . . . . . . . . . . . . . . . . . . 106
2.4 Optimal cost V3^0(x) versus x on the unit circle. . . . . . . 107

2.5 Closed-loop economic MPC versus tracking MPC starting
at x = (−8, 8) with optimal steady state (8, 4). Both con-
trollers asymptotically stabilize the steady state. Dashed
contours show cost functions for each controller. . . . . . . 159
2.6 Closed-loop evolution under economic MPC. The rotated
cost function Ṽ^0 is a Lyapunov function for the system. . . 160
2.7 Diagram of tank/cooler system. Each cooling unit can be
either on or off, and if on, it must be between its (possibly
nonzero) minimum and maximum capacities. . . . . . . . . 163
2.8 Feasible sets XN for two values of Q̇min . Note that for
Q̇min = 9 (right-hand side), XN for N ≤ 4 are discon-
nected sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
2.9 Phase portrait for closed-loop evolution of cooler system
with Q̇min = 9. Line colors show value of discrete actuator u2 . 165
2.10 Region of attraction (shaded region) for constrained MPC
controller of Exercise 2.6. . . . . . . . . . . . . . . . . . . . . . 174
2.11 The region Xf , in which the unconstrained LQR control
law is feasible for Exercise 2.7. . . . . . . . . . . . . . . . . . . 175
2.12 The region of attraction for terminal constraint x(N) ∈
Xf and terminal penalty Vf (x) = (1/2)x ′ Πx and the es-
timate of X̄N for Exercise 2.8. . . . . . . . . . . . . . . . . . . 177
2.13 Inconsistent setpoint (xsp , usp ), unreachable stage cost
ℓ(x, u), and optimal steady states (xs , us ), and stage costs
ℓs (x, u) for constrained and unconstrained systems. . . . 181
2.14 Stage cost versus time for the case of unreachable setpoint. 182
2.15 Rotated cost-function contour ℓ̃(x, u) = 0 (circles) for
λ = 0, −8, −12. Shaded region shows feasible region where
ℓ̃(x, u) < 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185

3.1 Open-loop and feedback trajectories. . . . . . . . . . . . . . . 198


3.2 The sets XN , Rb , and Rc . . . . . . . . . . . . . . . . . . . . . . 214
3.3 Outer-bounding tube X(z, ū). . . . . . . . . . . . . . . . . . . . 228
3.4 Minimum feasible α for varying N. Note that we require
α ∈ [0, 1). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
3.5 Bounds on tightened constraint set Z̄ for varying N. Bounds
are |x1 | ≤ χ1 , |x2 | ≤ χ2 , and |u| ≤ µ. . . . . . . . . . . . . . . 233
3.6 Comparison of 100 realizations of standard and tube-
based MPC for the chemical reactor example. . . . . . . . . 244
3.7 Comparison of standard and tube-based MPC with an ag-
gressive model predictive controller. . . . . . . . . . . . . . . 245
3.8 Concentration versus time for the ancillary model predic-
tive controller with sample time ∆ = 12 (left) and ∆ = 8
(right). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
3.9 Observed probability εtest of constraint violation for i =
10. Distribution is based on 500 trials for each value of
ε. Dashed line shows the outcome predicted by formula
(3.25), i.e., εtest = ε. . . . . . . . . . . . . . . . . . . . . . . . . . 255
3.10 Closed-loop robust MPC state evolution with uniformly
distributed |w| ≤ 0.1 from four different x0 . . . . . . . . . . 263

4.1 Smoothing update. . . . . . . . . . . . . . . . . . . . . . . . . . 299


4.2 Comparison of filtering and smoothing updates for the
batch reactor system. Second column shows absolute es-
timate error. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
4.3 Evolution of the state (solid line) and EKF state estimate
(dashed line). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
4.4 Evolution of the state (solid line) and UKF state estimate
(dashed line). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
4.5 Evolution of the state (solid line) and MHE state estimate
(dashed line). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
4.6 Perturbed trajectories terminating in Xf . . . . . . . . . . . . 315
4.7 Closed-loop performance of combined nonlinear MHE/MPC
with no disturbances. First column shows system states,
and second column shows estimation error. Dashed line
shows concentration setpoint. Vertical lines indicate times
of setpoint changes. . . . . . . . . . . . . . . . . . . . . . . . . 317
4.8 Closed-loop performance of combined nonlinear MHE/MPC
for varying disturbance size. The system is controlled be-
tween two steady states. . . . . . . . . . . . . . . . . . . . . . . 318

5.1 State estimator tube. The solid line x̂(t) is the center of
the tube, and the dashed line is a sample trajectory of x(t). 336
5.2 The system with disturbance. The state estimate lies in
the inner tube, and the state lies in the outer tube. . . . . . 337
6.1 Convex step from (u1^p, u2^p) to (u1^(p+1), u2^(p+1)). . . . . . . . . . 380
6.2 Ten iterations of noncooperative steady-state calculation. . 397
6.3 Ten iterations of cooperative steady-state calculation. . . . 397
6.4 Ten iterations of noncooperative steady-state calculation;
reversed pairing. . . . . . . . . . . . . . . . . . . . . . . . . . . 398
6.5 Ten iterations of cooperative steady-state calculation; re-
versed pairing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
6.6 Cooperative control stuck on the boundary of U under
coupled constraints . . . . . . . . . . . . . . . . . . . . . . . . 406
6.7 Cost contours for a two-player, nonconvex game. . . . . . . 416
6.8 Nonconvex function optimized with the distributed gra-
dient algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
6.9 Closed-loop state and control evolution with (x1 (0), x2 (0)) =
(3, −3). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
6.10 Contours of V (x(0), u1 , u2 ) for N = 1. . . . . . . . . . . . . . 425
6.11 Optimizing a quadratic function in one set of variables at
a time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434
6.12 Constrained optimality conditions and the normal cone. . 438

7.1 The sets Z, X, and U(x). . . . . . . . . . . . . . . . . . . . . . 448


7.2 Parametric linear program. . . . . . . . . . . . . . . . . . . . . 448
7.3 Unconstrained parametric quadratic program. . . . . . . . . 449
7.4 Parametric quadratic program. . . . . . . . . . . . . . . . . . . 449
7.5 Polar cone. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
7.6 Regions Rx , x ∈ X for a second-order example. . . . . . . . 462
7.7 Solution to a parametric LP. . . . . . . . . . . . . . . . . . . . 473
7.8 Solution times for explicit and implicit MPC for N = 20. . . 480

8.1 Feasible set and reduced objective ψ(u(0)) of the non-
linear MPC Example 8.1. . . . . . . . . . . . . . . . . . . . . . 490
8.2 Performance of different integration methods. . . . . . . . . 499
8.3 Polynomial approximation x̃1(t) and true trajectory x1 (t)
of the first state and its derivative. . . . . . . . . . . . . . . . 504
8.4 Performance of implicit integration methods on a stiff ODE. 506
8.5 Newton-type iterations for solution of R(z) = 0 from Ex-
ample 8.5. Left: exact Newton method. Right: constant
Jacobian approximation. . . . . . . . . . . . . . . . . . . . . . 510
8.6 Convergence of different sequences as a function of k. . . 512
8.7 Relaxed and binary feasible solution for Example 8.17. . . 578
8.8 A hanging chain at rest. See Exercise 8.6(b). . . . . . . . . . 585
8.9 Direct single shooting solution for (8.65) without path con-
straints. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587
8.10 Open-loop simulation for (8.65) using collocation. . . . . . . 590
8.11 Gauss-Newton iterations for the direct multiple-shooting
method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 592

A.1 The four fundamental subspaces of matrix A . . . . . . . . 626


A.2 Matrix A maps into R(A). . . . . . . . . . . . . . . . . . . . . . 627
A.3 Pseudo-inverse of A maps into R(A′ ). . . . . . . . . . . . . . 627
A.4 Subgradient. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 640
A.5 Separating hyperplane. . . . . . . . . . . . . . . . . . . . . . . 642
A.6 Polar cone. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645
A.7 A convex function. . . . . . . . . . . . . . . . . . . . . . . . . . 646
A.8 Normal distribution. . . . . . . . . . . . . . . . . . . . . . . . . 658
A.9 Multivariate normal in two dimensions. . . . . . . . . . . . . 660
A.10 The geometry of quadratic form x ′ Ax = b. . . . . . . . . . . 661
A.11 A nearly singular normal density in two dimensions. . . . . 665
A.12 The region X(c) for y = max(x1 , x2 ) ≤ c. . . . . . . . . . . . 667
A.13 A joint density function for the two uncorrelated random
variables. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 670
A.14 The probability distribution and inverse distribution for
random variable ξ. . . . . . . . . . . . . . . . . . . . . . . . . . 687

B.1 Stability of the origin. . . . . . . . . . . . . . . . . . . . . . . . 697


B.2 An attractive but unstable origin. . . . . . . . . . . . . . . . . 698

C.1 Routing problem. . . . . . . . . . . . . . . . . . . . . . . . . . . 730


C.2 Approximation of the set U . . . . . . . . . . . . . . . . . . . . 738
C.3 Tangent cones. . . . . . . . . . . . . . . . . . . . . . . . . . . . 738
C.4 Normal at u. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 739
C.5 Condition of optimality. . . . . . . . . . . . . . . . . . . . . . . 745
C.6 Tangent and normal cones. . . . . . . . . . . . . . . . . . . . . 747
C.7 Condition of optimality. . . . . . . . . . . . . . . . . . . . . . . 749
C.8 FU (u) ≠ TU (u). . . . . . . . . . . . . . . . . . . . . . . . . . . 751
C.9 Graph of set-valued function U(·). . . . . . . . . . . . . . . . 756
C.10 Graphs of discontinuous set-valued functions. . . . . . . . . 757
C.11 Outer and inner semicontinuity of U(·). . . . . . . . . . . . . 758
C.12 Subgradient of f (·). . . . . . . . . . . . . . . . . . . . . . . . . 762
List of Examples and Statements

1.1 Example: Sum of quadratic functions . . . . . . . . . . . . . 15


1.2 Lemma: Hautus lemma for controllability . . . . . . . . . . . 24
1.3 Lemma: LQR convergence . . . . . . . . . . . . . . . . . . . . . 24
1.4 Lemma: Hautus lemma for observability . . . . . . . . . . . 42
1.5 Lemma: Convergence of estimator cost . . . . . . . . . . . . 43
1.6 Lemma: Estimator convergence . . . . . . . . . . . . . . . . . 44
1.7 Assumption: Target feasibility and uniqueness . . . . . . . 48
1.8 Lemma: Detectability of the augmented system . . . . . . . 50
1.9 Corollary: Dimension of the disturbance . . . . . . . . . . . 50
1.10 Lemma: Offset-free control . . . . . . . . . . . . . . . . . . . . 52
1.11 Example: More measured outputs than inputs and zero
offset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
1.12 Lemma: Hautus lemma for stabilizability . . . . . . . . . . . 68
1.13 Lemma: Hautus lemma for detectability . . . . . . . . . . . . 72
1.14 Lemma: Stabilizable systems and feasible targets . . . . . . 82

2.1 Proposition: Continuity of system solution . . . . . . . . . . 94


2.2 Assumption: Continuity of system and cost . . . . . . . . . 97
2.3 Assumption: Properties of constraint sets . . . . . . . . . . 98
2.4 Proposition: Existence of solution to optimal control problem 98
2.5 Example: Linear quadratic MPC . . . . . . . . . . . . . . . . . 99
2.6 Example: Closer inspection of linear quadratic MPC . . . . 101
2.7 Theorem: Continuity of value function and control law . . 104
2.8 Example: Discontinuous MPC control law . . . . . . . . . . . 105
2.9 Definition: Positive and control invariant sets . . . . . . . . 109
2.10 Proposition: Existence of solutions to DP recursion . . . . . 110
2.11 Definition: Asymptotically stable and GAS . . . . . . . . . . 112
2.12 Definition: Lyapunov function . . . . . . . . . . . . . . . . . . 113
2.13 Theorem: Lyapunov stability theorem . . . . . . . . . . . . . 113
2.14 Assumption: Basic stability assumption . . . . . . . . . . . . 114
2.15 Proposition: The value function VN0 (·) is locally bounded . 115
2.16 Proposition: Extension of upper bound to XN . . . . . . . . 115
2.17 Assumption: Weak controllability . . . . . . . . . . . . . . . . 116
2.18 Proposition: Monotonicity of the value function . . . . . . . 118
2.19 Theorem: Asymptotic stability of the origin . . . . . . . . . 119
2.20 Definition: Exponential stability . . . . . . . . . . . . . . . . . 120
2.21 Theorem: Lyapunov function and exponential stability . . 120
2.22 Definition: Input/output-to-state stable (IOSS) . . . . . . . . 121
2.23 Assumption: Modified basic stability assumption . . . . . . 121
2.24 Theorem: Asymptotic stability with stage cost ℓ(y, u) . . . 122
2.25 Assumption: Continuity of system and cost; time-varying
case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
2.26 Assumption: Properties of constraint sets; time-varying case 124
2.27 Definition: Sequential positive invariance and sequential
control invariance . . . . . . . . . . . . . . . . . . . . . . . . . . 125
2.28 Proposition: Continuous system solution; time-varying case 125
2.29 Proposition: Existence of solution to optimal control prob-
lem; time-varying case . . . . . . . . . . . . . . . . . . . . . . . 125
2.30 Definition: Asymptotically stable and GAS for time-varying
systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
2.31 Definition: Lyapunov function: time-varying, constrained
case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
2.32 Theorem: Lyapunov theorem for asymptotic stability (time-
varying, constrained) . . . . . . . . . . . . . . . . . . . . . . . . 126
2.33 Assumption: Basic stability assumption; time-varying case 127
2.34 Proposition: Optimal cost decrease; time-varying case . . . 127
2.35 Proposition: MPC cost is less than terminal cost . . . . . . . 127
2.36 Proposition: Optimal value function properties; time-varying
case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
2.37 Assumption: Uniform weak controllability . . . . . . . . . . 128
2.38 Proposition: Conditions for uniform weak controllability . 128
2.39 Theorem: Asymptotic stability of the origin: time-varying
MPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
2.40 Lemma: Entering the terminal region . . . . . . . . . . . . . . 146
2.41 Theorem: MPC stability; no terminal constraint . . . . . . . 146
2.42 Proposition: Admissible warm start in Xf . . . . . . . . . . . 149
2.43 Algorithm: Suboptimal MPC . . . . . . . . . . . . . . . . . . . 149
2.44 Proposition: Linking warm start and state . . . . . . . . . . . 150
2.45 Definition: Asymptotic stability (difference inclusion) . . . 150
2.46 Definition: Lyapunov function (difference inclusion) . . . . 151
2.47 Proposition: Asymptotic stability (difference inclusion) . . 151
2.48 Theorem: Asymptotic stability of suboptimal MPC . . . . . 151
2.49 Assumption: Continuity of system and cost . . . . . . . . . 154
2.50 Assumption: Properties of constraint sets . . . . . . . . . . 154
2.51 Assumption: Cost lower bound . . . . . . . . . . . . . . . . . 154
2.52 Proposition: Asymptotic average performance . . . . . . . . 155
2.53 Definition: Dissipativity . . . . . . . . . . . . . . . . . . . . . . 156
2.54 Assumption: Continuity at the steady state . . . . . . . . . . 157
2.55 Assumption: Strict dissipativity . . . . . . . . . . . . . . . . . 157
2.56 Theorem: Asymptotic stability of economic MPC . . . . . . 157
2.57 Example: Economic MPC versus tracking MPC . . . . . . . . 158
2.58 Example: MPC with mixed continuous/discrete actuators . 162
2.59 Theorem: Lyapunov theorem for asymptotic stability . . . 177
2.60 Proposition: Convergence of state under IOSS . . . . . . . . 178
2.61 Lemma: An equality for quadratic functions . . . . . . . . . 178
2.62 Lemma: Evolution in a compact set . . . . . . . . . . . . . . . 179

3.1 Definition: Robust global asymptotic stability . . . . . . . . 207


3.2 Theorem: Lyapunov function and RGAS . . . . . . . . . . . . 208
3.3 Theorem: Robust global asymptotic stability and regular-
ization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
3.4 Proposition: Bound for continuous functions . . . . . . . . . 211
3.5 Proposition: Robustness of nominal MPC . . . . . . . . . . . 214
3.6 Definition: Robust control invariance . . . . . . . . . . . . . 217
3.7 Definition: Robust positive invariance . . . . . . . . . . . . . 217
3.8 Assumption: Basic stability assumption; robust case . . . . 218
3.9 Theorem: Recursive feasibility of control policies . . . . . . 218
3.10 Definition: Set algebra and Hausdorff distance . . . . . . . . 224
3.11 Definition: Robust asymptotic stability of a set . . . . . . . 230
3.12 Proposition: Robust asymptotic stability of tube-based
MPC for linear systems . . . . . . . . . . . . . . . . . . . . . . 230
3.13 Example: Calculation of tightened constraints . . . . . . . . 231
3.14 Proposition: Recursive feasibility of tube-based MPC . . . . 235
3.15 Proposition: Robust exponential stability of improved tube-
based MPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
3.16 Proposition: Implicit satisfaction of terminal constraint . . 239
3.17 Proposition: Properties of the value function . . . . . . . . . 240
3.18 Proposition: Neighborhoods of the uncertain system . . . . 241
3.19 Proposition: Robust positive invariance of tube-based MPC
for nonlinear systems . . . . . . . . . . . . . . . . . . . . . . . 241
3.20 Example: Robust control of an exothermic reaction . . . . . 243
3.21 Assumption: Stabilizing conditions, stochastic MPC: Ver-
sion 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
3.22 Assumption: Stabilizing conditions, stochastic MPC: Ver-
sion 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
3.23 Proposition: Expected cost bound . . . . . . . . . . . . . . . . 250
3.24 Assumption: Robust terminal set condition . . . . . . . . . 253
3.25 Example: Constraint tightening via sampling . . . . . . . . . 254

4.1 Definition: State Estimator . . . . . . . . . . . . . . . . . . . . 271


4.2 Definition: Robustly globally asymptotically stable esti-
mation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
4.3 Proposition: RGAS plus convergent disturbances imply
convergent estimates . . . . . . . . . . . . . . . . . . . . . . . . 273
4.4 Example: The Kalman filter of a linear system is RGAS . . . 273
4.5 Definition: i-IOSS . . . . . . . . . . . . . . . . . . . . . . . . . . 275
4.6 Proposition: RGAS estimator implies i-IOSS . . . . . . . . . . 276
4.7 Definition: i-IOSS Lyapunov function . . . . . . . . . . . . . . 277
4.8 Theorem: i-IOSS and Lyapunov function equivalence . . . . 277
4.9 Definition: Incremental Stabilizability with respect to stage
cost L(·) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
4.10 Assumption: Continuity . . . . . . . . . . . . . . . . . . . . . . 277
4.11 Assumption: Positive-definite stage cost . . . . . . . . . . . . 278
4.12 Assumption: Stabilizability . . . . . . . . . . . . . . . . . . . . 278
4.13 Assumption: Detectability . . . . . . . . . . . . . . . . . . . . 278
4.14 Definition: Q-function for estimation . . . . . . . . . . . . . 283
4.15 Theorem: Q-function theorem for global asymptotic stability 283
4.16 Theorem: Stability of full information estimation . . . . . . 283
4.17 Assumption: Stage cost under disturbances . . . . . . . . . 284
4.18 Assumption: Stabilizability under disturbances . . . . . . . 284
4.19 Definition: Exponentially i-IOSS . . . . . . . . . . . . . . . . . 285
4.20 Definition: Robustly globally exponentially stable estimation 285
4.21 Proposition: Equivalent definition of RGES . . . . . . . . . . 286
4.22 Assumption: Power-law bounds for stage costs . . . . . . . 286
4.23 Assumption: Exponential stabilizability . . . . . . . . . . . . 286
4.24 Assumption: Exponential detectability . . . . . . . . . . . . . 286
4.25 Theorem: Robust stability of full information estimation . 287
4.26 Lemma: Duality of controllability and observability . . . . . 291
4.27 Theorem: Riccati iteration and regulator stability . . . . . . 291
4.28 Definition: Observability . . . . . . . . . . . . . . . . . . . . . 293
4.29 Definition: Final-state observability . . . . . . . . . . . . . . . 294
4.30 Definition: Globally K-continuous . . . . . . . . . . . . . . . 294
4.31 Proposition: Observable and global K-continuous imply FSO 294
4.32 Definition: RGAS estimation (observable case) . . . . . . . . 295
4.33 Theorem: MHE is RGAS (observable case) . . . . . . . . . . . 295

4.34 Definition: Full information arrival cost . . . . . . . . . . . . 297


4.35 Lemma: MHE and FIE equivalence . . . . . . . . . . . . . . . . 297
4.36 Assumption: MHE prior weighting bounds . . . . . . . . . . 297
4.37 Theorem: MHE is RGES . . . . . . . . . . . . . . . . . . . . . . 298
4.38 Example: Filtering and smoothing updates . . . . . . . . . . 300
4.39 Example: EKF, UKF, and MHE performance comparison . . 306
4.40 Definition: i-UIOSS . . . . . . . . . . . . . . . . . . . . . . . . . 312
4.41 Assumption: Bounded estimate error . . . . . . . . . . . . . 313
4.42 Definition: Robust positive invariance . . . . . . . . . . . . . 313
4.43 Definition: Robust asymptotic stability . . . . . . . . . . . . 314
4.44 Definition: ISS Lyapunov function . . . . . . . . . . . . . . . . 314
4.45 Proposition: ISS Lyapunov stability theorem . . . . . . . . . 314
4.46 Theorem: Combined MHE/MPC is RAS . . . . . . . . . . . . . 316
4.47 Example: Combined MHE/MPC . . . . . . . . . . . . . . . . . . 317

5.1 Definition: Positive invariance; robust positive invariance . 339


5.2 Proposition: Proximity of state and state estimate . . . . . 339
5.3 Proposition: Proximity of state estimate and nominal state 341
5.4 Assumption: Constraint bounds . . . . . . . . . . . . . . . . . 342
5.5 Algorithm: Robust control algorithm (linear constrained
systems) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
5.6 Proposition: Exponential stability of output MPC . . . . . . 345
5.7 Algorithm: Robust control algorithm (offset-free MPC) . . . 353

6.1 Algorithm: Suboptimal MPC (simplified) . . . . . . . . . . . . 369


6.2 Definition: Lyapunov stability . . . . . . . . . . . . . . . . . . 370
6.3 Definition: Uniform Lyapunov stability . . . . . . . . . . . . 371
6.4 Definition: Exponential stability . . . . . . . . . . . . . . . . . 371
6.5 Lemma: Exponential stability of suboptimal MPC . . . . . . 372
6.6 Lemma: Global asymptotic stability and exponential con-
vergence with mixed powers of norm . . . . . . . . . . . . . 373
6.7 Lemma: Converse theorem for exponential stability . . . . 374
6.8 Assumption: Unconstrained two-player game . . . . . . . . 380
6.9 Example: Nash equilibrium is unstable . . . . . . . . . . . . . 383
6.10 Example: Nash equilibrium is stable but closed loop is
unstable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
6.11 Example: Nash equilibrium is stable and the closed loop
is stable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
6.12 Example: Stability and offset in the distributed target cal-
culation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395

6.13 Assumption: Constrained two-player game . . . . . . . . . . 400


6.14 Lemma: Global asymptotic stability and exponential con-
vergence of perturbed system . . . . . . . . . . . . . . . . . . 408
6.15 Assumption: Disturbance models . . . . . . . . . . . . . . . . 409
6.16 Lemma: Detectability of distributed disturbance model . . 409
6.17 Assumption: Constrained M-player game . . . . . . . . . . . 413
6.18 Lemma: Distributed gradient algorithm properties . . . . . 418
6.19 Assumption: Basic stability assumption (distributed) . . . . 420
6.20 Proposition: Terminal constraint satisfaction . . . . . . . . 421
6.21 Theorem: Asymptotic stability . . . . . . . . . . . . . . . . . . 423
6.22 Example: Nonlinear distributed control . . . . . . . . . . . . 423
6.23 Lemma: Local detectability . . . . . . . . . . . . . . . . . . . . 437

7.1 Definition: Polytopic (polyhedral) partition . . . . . . . . . . 450


7.2 Definition: Piecewise affine function . . . . . . . . . . . . . . 450
7.3 Assumption: Strict convexity . . . . . . . . . . . . . . . . . . . 451
7.4 Definition: Polar cone . . . . . . . . . . . . . . . . . . . . . . . 453
7.5 Proposition: Farkas’s lemma . . . . . . . . . . . . . . . . . . . 453
7.6 Proposition: Optimality conditions for convex set . . . . . . 453
7.7 Proposition: Optimality conditions in terms of polar cone 455
7.8 Proposition: Optimality conditions for linear inequalities . 455
7.9 Proposition: Solution of P(w), w ∈ Rx0 . . . . . . . . . . . . 457
7.10 Proposition: Piecewise quadratic (affine) cost (solution) . . 458
7.11 Example: Parametric QP . . . . . . . . . . . . . . . . . . . . . . 458
7.12 Example: Explicit optimal control . . . . . . . . . . . . . . . . 459
7.13 Proposition: Continuity of cost and solution . . . . . . . . . 461
7.14 Assumption: Continuous, piecewise quadratic function . . 464
7.15 Definition: Active polytope (polyhedron) . . . . . . . . . . . 465
7.16 Proposition: Solving P using Pi . . . . . . . . . . . . . . . . . 465
7.17 Proposition: Optimality of u0x (w) in Rx . . . . . . . . . . . . 468
7.18 Proposition: Piecewise quadratic (affine) solution . . . . . . 468
7.19 Proposition: Optimality conditions for parametric LP . . . 472
7.20 Proposition: Solution of P . . . . . . . . . . . . . . . . . . . . . 475
7.21 Proposition: Piecewise affine cost and solution . . . . . . . 475

8.1 Example: Nonlinear MPC . . . . . . . . . . . . . . . . . . . . . 489


8.2 Example: Sequential approach . . . . . . . . . . . . . . . . . . 492
8.3 Example: Integration methods of different order . . . . . . 498
8.4 Example: Implicit integrators for a stiff ODE system . . . . 505
8.5 Example: Finding a fifth root with Newton-type iterations . 510

8.6 Example: Convergence rates . . . . . . . . . . . . . . . . . . . 511


8.7 Theorem: Local contraction for Newton-type methods . . . 512
8.8 Corollary: Convergence of exact Newton’s method . . . . . 513
8.9 Example: Function evaluation via elementary operations . 516
8.10 Example: Implicit function representation . . . . . . . . . . 518
8.11 Example: Forward algorithmic differentiation . . . . . . . . 520
8.12 Example: Algorithmic differentiation in reverse mode . . . 522
8.13 Example: Sequential optimal control using CasADi from
Octave . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528
8.14 Theorem: KKT conditions . . . . . . . . . . . . . . . . . . . . . 543
8.15 Theorem: Strong second-order sufficient conditions for
optimality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
8.16 Theorem: Tangential predictor by quadratic program . . . 545
8.17 Example: MPC with discrete actuator . . . . . . . . . . . . . . 577

A.1 Theorem: Schur decomposition . . . . . . . . . . . . . . . . . 629


A.2 Theorem: Real Schur decomposition . . . . . . . . . . . . . . 630
A.3 Theorem: Bolzano-Weierstrass . . . . . . . . . . . . . . . . . . 632
A.4 Proposition: Convergence of monotone sequences . . . . . 633
A.5 Proposition: Uniform continuity . . . . . . . . . . . . . . . . . 634
A.6 Proposition: Compactness of continuous functions of com-
pact sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 635
A.7 Proposition: Weierstrass . . . . . . . . . . . . . . . . . . . . . 636
A.8 Proposition: Derivative and partial derivative . . . . . . . . 637
A.9 Proposition: Continuous partial derivatives . . . . . . . . . . 638
A.10 Proposition: Chain rule . . . . . . . . . . . . . . . . . . . . . . 638
A.11 Proposition: Mean value theorem for vector functions . . . 638
A.12 Definition: Convex set . . . . . . . . . . . . . . . . . . . . . . . 641
A.13 Theorem: Caratheodory . . . . . . . . . . . . . . . . . . . . . . 641
A.14 Theorem: Separation of convex sets . . . . . . . . . . . . . . 642
A.15 Theorem: Separation of convex set from zero . . . . . . . . 643
A.16 Corollary: Existence of separating hyperplane . . . . . . . . 643
A.17 Definition: Support hyperplane . . . . . . . . . . . . . . . . . 644
A.18 Theorem: Convex set and halfspaces . . . . . . . . . . . . . . 644
A.19 Definition: Convex cone . . . . . . . . . . . . . . . . . . . . . . 644
A.20 Definition: Polar cone . . . . . . . . . . . . . . . . . . . . . . . 644
A.21 Definition: Cone generator . . . . . . . . . . . . . . . . . . . . 645
A.22 Proposition: Cone and polar cone generator . . . . . . . . . 645
A.23 Theorem: Convexity implies continuity . . . . . . . . . . . . 647
A.24 Theorem: Differentiability and convexity . . . . . . . . . . . 647

A.25 Theorem: Second derivative and convexity . . . . . . . . . . 647


A.26 Definition: Level set . . . . . . . . . . . . . . . . . . . . . . . . 648
A.27 Definition: Sublevel set . . . . . . . . . . . . . . . . . . . . . . 648
A.28 Definition: Support function . . . . . . . . . . . . . . . . . . . 648
A.29 Proposition: Set membership and support function . . . . . 648
A.30 Proposition: Lipschitz continuity of support function . . . 648
A.31 Theorem: Existence of solution to differential equations . 651
A.32 Theorem: Maximal interval of existence . . . . . . . . . . . . 651
A.33 Theorem: Continuity of solution to differential equation . 651
A.34 Theorem: Bellman-Gronwall . . . . . . . . . . . . . . . . . . . 651
A.35 Theorem: Existence of solutions to forced systems . . . . . 653
A.36 Example: Fourier transform of the normal density. . . . . . 659
A.37 Definition: Density of a singular normal . . . . . . . . . . . . 662
A.38 Example: Marginal normal density . . . . . . . . . . . . . . . 663
A.39 Example: Nonlinear transformation . . . . . . . . . . . . . . . 666
A.40 Example: Maximum of two random variables . . . . . . . . . 667
A.41 Example: Independent implies uncorrelated . . . . . . . . . 668
A.42 Example: Does uncorrelated imply independent? . . . . . . 669
A.43 Example: Independent and uncorrelated are equivalent
for normals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671
A.44 Example: Conditional normal density . . . . . . . . . . . . . 674
A.45 Example: More normal conditional densities . . . . . . . . . 675

B.1 Definition: Equilibrium point . . . . . . . . . . . . . . . . . . . 694


B.2 Definition: Positive invariant set . . . . . . . . . . . . . . . . . 694
B.3 Definition: K, K∞ , KL, and PD functions . . . . . . . . . . 695
B.4 Definition: Local stability . . . . . . . . . . . . . . . . . . . . . 696
B.5 Definition: Global attraction . . . . . . . . . . . . . . . . . . . 697
B.6 Definition: Global asymptotic stability . . . . . . . . . . . . . 697
B.7 Definition: Various forms of stability . . . . . . . . . . . . . . 698
B.8 Definition: Global asymptotic stability (KL version) . . . . . 699
B.9 Proposition: Connection of classical and KL global asymp-
totic stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 699
B.10 Definition: Various forms of stability (constrained) . . . . . 699
B.11 Definition: Asymptotic stability (constrained, KL version) . 700
B.12 Definition: Lyapunov function (unconstrained and con-
strained) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 701
B.13 Theorem: Lyapunov function and GAS (classical definition) 702
B.14 Lemma: From PD to K∞ function (Jiang and Wang (2002)) 703

B.15 Theorem: Lyapunov function and global asymptotic sta-


bility (KL definition) . . . . . . . . . . . . . . . . . . . . . . . . 703
B.16 Proposition: Improving convergence (Sontag (1998b)) . . . 705
B.17 Theorem: Converse theorem for global asymptotic stability 705
B.18 Theorem: Lyapunov function for asymptotic stability (con-
strained) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 706
B.19 Theorem: Lyapunov function for exponential stability . . . 706
B.20 Lemma: Lyapunov function for linear systems . . . . . . . . 707
B.21 Definition: Sequential positive invariance . . . . . . . . . . . 707
B.22 Definition: Asymptotic stability (time-varying, constrained) 707
B.23 Definition: Lyapunov function: time-varying, constrained
case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 708
B.24 Theorem: Lyapunov theorem for asymptotic stability (time-
varying, constrained) . . . . . . . . . . . . . . . . . . . . . . . . 708
B.25 Proposition: Global K function overbound . . . . . . . . . . 709
B.26 Definition: Nominal robust global asymptotic stability . . . 710
B.27 Theorem: Nominal robust global asymptotic stability and
Lyapunov function . . . . . . . . . . . . . . . . . . . . . . . . . 710
B.28 Definition: Positive invariance with disturbances . . . . . . 711
B.29 Definition: Local stability (disturbances) . . . . . . . . . . . . 712
B.30 Definition: Global attraction (disturbances) . . . . . . . . . . 712
B.31 Definition: GAS (disturbances) . . . . . . . . . . . . . . . . . . 712
B.32 Definition: Lyapunov function (disturbances) . . . . . . . . . 712
B.33 Theorem: Lyapunov function for global asymptotic sta-
bility (disturbances) . . . . . . . . . . . . . . . . . . . . . . . . 713
B.34 Definition: Global control Lyapunov function (CLF) . . . . . 714
B.35 Definition: Global stabilizability . . . . . . . . . . . . . . . . . 714
B.36 Definition: Positive invariance (disturbance and control) . . 715
B.37 Definition: CLF (disturbance and control) . . . . . . . . . . . 715
B.38 Definition: Control invariance (constrained) . . . . . . . . . 715
B.39 Definition: CLF (constrained) . . . . . . . . . . . . . . . . . . . 716
B.40 Definition: Control invariance (disturbances, constrained) 716
B.41 Definition: CLF (disturbances, constrained) . . . . . . . . . . 716
B.42 Definition: Input-to-state stable (ISS) . . . . . . . . . . . . . . 717
B.43 Definition: ISS-Lyapunov function . . . . . . . . . . . . . . . . 718
B.44 Lemma: ISS-Lyapunov function implies ISS . . . . . . . . . . 718
B.45 Definition: ISS (constrained) . . . . . . . . . . . . . . . . . . . 718
B.46 Definition: ISS-Lyapunov function (constrained) . . . . . . . 718
B.47 Lemma: ISS-Lyapunov function implies ISS (constrained) . 719
B.48 Definition: Output-to-state stable (OSS) . . . . . . . . . . . . 720

B.49 Definition: OSS-Lyapunov function . . . . . . . . . . . . . . . 720


B.50 Theorem: OSS and OSS-Lyapunov function . . . . . . . . . . 720
B.51 Definition: Input/output-to-state stable (IOSS) . . . . . . . . 721
B.52 Definition: IOSS-Lyapunov function . . . . . . . . . . . . . . . 721
B.53 Theorem: Modified IOSS-Lyapunov function . . . . . . . . . 721
B.54 Conjecture: IOSS and IOSS-Lyapunov function . . . . . . . . 722
B.55 Definition: Incremental input/output-to-state stable . . . . 722
B.56 Definition: Observability . . . . . . . . . . . . . . . . . . . . . 722
B.57 Assumption: Lipschitz continuity of model . . . . . . . . . . 723
B.58 Lemma: Lipschitz continuity and state difference bound . 723
B.59 Theorem: Observability and convergence of state . . . . . . 723

C.1 Lemma: Principle of optimality . . . . . . . . . . . . . . . . . 734


C.2 Theorem: Optimal value function and control law from DP 734
C.3 Example: DP applied to linear quadratic regulator . . . . . . 736
C.4 Definition: Tangent vector . . . . . . . . . . . . . . . . . . . . 739
C.5 Proposition: Tangent vectors are closed cone . . . . . . . . 739
C.6 Definition: Regular normal . . . . . . . . . . . . . . . . . . . . 739
C.7 Proposition: Relation of normal and tangent cones . . . . . 740
C.8 Proposition: Global optimality for convex problems . . . . 741
C.9 Proposition: Optimality conditions—normal cone . . . . . . 742
C.10 Proposition: Optimality conditions—tangent cone . . . . . 743
C.11 Proposition: Representation of tangent and normal cones . 743
C.12 Proposition: Optimality conditions—linear inequalities . . 744
C.13 Corollary: Optimality conditions—linear inequalities . . . . 744
C.14 Proposition: Necessary condition for nonconvex problem . 746
C.15 Definition: General normal . . . . . . . . . . . . . . . . . . . . 748
C.16 Definition: General tangent . . . . . . . . . . . . . . . . . . . . 748
C.17 Proposition: Set of regular tangents is closed convex cone 748
C.18 Definition: Regular set . . . . . . . . . . . . . . . . . . . . . . . 749
C.19 Proposition: Conditions for regular set . . . . . . . . . . . . 749
C.20 Proposition: Quasiregular set . . . . . . . . . . . . . . . . . . 751
C.21 Proposition: Optimality conditions nonconvex problem . . 752
C.22 Proposition: Fritz-John necessary conditions . . . . . . . . . 753
C.23 Definition: Outer semicontinuous function . . . . . . . . . . 757
C.24 Definition: Inner semicontinuous function . . . . . . . . . . 758
C.25 Definition: Continuous function . . . . . . . . . . . . . . . . . 758
C.26 Theorem: Equivalent conditions for outer and inner semi-
continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 759
C.27 Proposition: Outer semicontinuity and closed graph . . . . 759

C.28 Theorem: Minimum theorem . . . . . . . . . . . . . . . . . . . 760


C.29 Theorem: Lipschitz continuity of the value function, con-
stant U . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 761
C.30 Definition: Subgradient of convex function . . . . . . . . . . 762
C.31 Theorem: Clarke et al. (1998) . . . . . . . . . . . . . . . . . . 762
C.32 Corollary: A bound on d(u, U(x ′ )) for u ∈ U (x) . . . . . . 763
C.33 Theorem: Continuity of U(·) . . . . . . . . . . . . . . . . . . . 765
C.34 Theorem: Continuity of the value function . . . . . . . . . . 765
C.35 Theorem: Lipschitz continuity of the value function—U (x) 766
Notation

Mathematical notation
∃ there exists
∈ is an element of
∀ for all
=⇒ ⇐= implies; is implied by
⇏ ⇍ does not imply; is not implied by
a := b a is defined to be equal to b.
a =: b b is defined to be equal to a.
≈ approximately equal
V (·) function V
V :A→B V is a function mapping set A into set B
x ↦ V (x) function V maps variable x to value V (x)
x+ value of x at next sample time (discrete time system)
ẋ time derivative of x (continuous time system)
fx partial derivative of f (x) with respect to x
∇ nabla or del operator
δ unit impulse or delta function
|x| absolute value of scalar; norm of vector (two-norm unless
stated otherwise); induced norm of matrix
x sequence of vector-valued variable x, (x(0), x(1), . . .)
∥x∥ sup norm over a sequence, supi≥0 |x(i)|
∥x∥a:b max a≤i≤b |x(i)|
tr(A) trace of matrix A
det(A) determinant of matrix A
eig(A) set of eigenvalues of matrix A
ρ(A) spectral radius of matrix A, max i |λi | for λi ∈ eig(A)
A−1 inverse of matrix A
A† pseudo-inverse of matrix A
A′ transpose of matrix A
inf infimum or greatest lower bound
min minimum
sup supremum or least upper bound
max maximum


arg argument or solution of an optimization


s.t. subject to
I integers
I≥0 nonnegative integers
In:m integers in the interval [n, m]
R real numbers
R≥0 nonnegative real numbers
Rn real-valued n-vectors
Rm×n real-valued m × n matrices
C complex numbers
B ball in Rn of unit radius
x ∼ px random variable x has probability density px
E(x) expectation of random variable x
var(x) variance of random variable x
cov(x, y) covariance of random variables x and y
N(m, P ) normal distribution (mean m, covariance P ), x ∼ N(m, P )
n(x, m, P ) normal probability density, px (x) = n(x, m, P )
∅ the empty set
aff(A) affine hull of set A
int(A) interior of set A
co(A) convex hull of the set A
A closure of set A
leva V sublevel set of function V , {x | V (x) ≤ a}
f ◦g composition of functions f and g, f ◦ g (s) := f (g(s))
a⊕b maximum of scalars a and b, Chapter 4
⊕_{i=1}^n ai a1 ⊕ a2 ⊕ · · · ⊕ an , Chapter 4
A⊕B set addition of sets A and B, Chapters 3 and 5
A⊖B set subtraction of set B from set A
A\B elements of set A not in set B
A∪B union of sets A and B
A∩B intersection of sets A and B
A⊆B set A is a subset of set B
A⊇B set A is a superset of set B
A⊂B set A is a proper (or strict) subset of set B
A⊃B set A is a proper (or strict) superset of set B
d(a, B) Distance between element a and set B
dH (A, B) Hausdorff distance between sets A and B
x ↘ y (x ↗ y) x converges to y from above (below)
sat(x) saturation, sat(x) = x if |x| ≤ 1, −1 if x < −1, 1 if x > 1

Symbols
A, B, C system matrices, discrete time, x + = Ax + Bu, y = Cx
Ac , Bc system matrices, continuous time, ẋ = Ac x + Bc u
Aij state transition matrix for player i to player j
Ai state transition matrix for player i
ALi estimate error transition matrix Ai − Li Ci
Bd input disturbance matrix
Bij input matrix of player i for player j’s inputs
Bi input matrix of player i
Cij output matrix of player i for player j’s interaction states
Ci output matrix of player i
Cd output disturbance matrix
C controllability matrix
C∗ polar cone of cone C
d integrating disturbance
E, F constraint matrices, F x + Eu ≤ e
f,h system functions, discrete time, x + = f (x, u), y = h(x)
fc (x, u) system function, continuous time, ẋ = fc (x, u)
F (x, u) difference inclusion, x + ∈ F (x, u), F is set valued
G input noise-shaping matrix
Gij steady-state gain of player i to player j
H controlled variable matrix
I(x, u) index set of constraints active at (x, u)
0
I (x) index set of constraints active at (x, u0 (x))
k sample time
K optimal controller gain
ℓ(x, u) stage cost
ℓN (x, u) final stage cost
L optimal estimator gain
m input dimension
M cross-term penalty matrix x ′ Mu
M number of players, Chapter 6
M class of admissible input policies, µ ∈ M
n state dimension
N horizon length
O observability matrix, Chapters 1 and 4
O compact robust control invariant set containing the origin,
Chapter 3
p output dimension

p optimization iterate, Chapter 6


pξ probability density of random variable ξ
ps (x) sampled probability density, ps (x) = Σi wi δ(x − xi )
P covariance matrix in the estimator
Pf terminal penalty matrix
P polytopic partition, Chapter 3
P polytopic partition, Chapter 7
PN (x) MPC optimization problem; horizon N and initial state x
q importance function in importance sampling
Q state penalty matrix
r controlled variable, r = Hy
R input penalty matrix
s number of samples in a sampled probability density
S input rate of change penalty matrix
S(x, u) index set of active polytopes at (x, u)
S0 (x) index set of active polytopes at (x, u0 (x))
t time
T current time in estimation problem
u input (manipulated variable) vector
ũ+ warm start for input sequence
u+ improved input sequence
UN (x) control constraint set
U input constraint set
v output disturbance, Chapters 1 and 4
v nominal control input, Chapters 3 and 5
VN (x, u) MPC objective function
VN0 (x) MPC optimal value function
VT (χ, ω) Full information state estimation objective function at time T
with initial state χ and disturbance sequence ω
V̂T (χ, ω) MHE objective function at time T with initial state χ and distur-
bance sequence ω
Vf (x) terminal penalty
VN (z) nominal control input constraint set
V output disturbance constraint set
w disturbance to the state evolution
wi weights in a sampled probability density, Chapter 4
wi convex weight for player i, Chapter 6
wi normalized weights in a sampled probability density
W class of admissible disturbance sequences, w ∈ W

W state disturbance constraint set


x state vector
xi sample values in a sampled probability density
xij state interaction vector from player i to player j
x(0) mean of initial state density
X(k; x, µ) state tube at time k with initial state x and control policy µ
Xj set of feasible states for optimal control problem at stage j
X state constraint set
Xf terminal region
y output (measurement) vector
Y output constraint set
z nominal state, Chapters 3 and 5
ZT (x) full information arrival cost
ẐT (x) MHE arrival cost
Z̃T (x) MHE smoothing arrival cost
Z system constraint region, (x, u) ∈ Z
Zf terminal constraint region, (x, u) ∈ Zf
ZN (x, u) constraint set for state and input sequence

Greek letters

ΓT (χ) MHE prior weighting on state at time T


∆ sample time
κ control law
κj control law at stage j
κf control law applied in terminal region Xf
µi (x) control law at stage i
µ(x) control policy or sequence of control laws
ν output disturbance decision variable in estimation problem
Π cost-to-go matrix in regulator, Chapter 1
Π covariance matrix in the estimator, Chapter 5
ρi objective function weight for player i
Σi Solution to Lyapunov equation for player i
φ(k; x, u) state at time k given initial state x and input sequence u
φ(k; x, i, u) state at time k given state at time i is x and input sequence u
φ(k; x, u, w) state at time k given initial state is x, input sequence is u, and
disturbance sequence is w
χ state decision variable in estimation problem
ω state disturbance decision variable in estimation problem

Subscripts, superscripts, and accents


x̂ estimate
x̂− estimate before measurement
x̃ estimate error
xs steady state
xi subsystem i in a decomposed large-scale system
xsp setpoint
V0 optimal
V uc unconstrained
V sp unreachable setpoint
Acronyms

AD algorithmic (or automatic) differentiation


AS asymptotically stable
BFGS Broyden-Fletcher-Goldfarb-Shanno
CIA combinatorial integral approximation
CLF control Lyapunov function
DAE differential algebraic equation
DARE discrete algebraic Riccati equation
DDP differential dynamic programming
DP dynamic programming
END external numerical differentiation
FIE full information estimation
FLOP floating point operation
FSO final-state observable
GAS globally asymptotically stable
GES globally exponentially stable
GL Gauss-Legendre
GPC generalized predictive control
EKF extended Kalman filter
i-IOSS incrementally input/output-to-state stable
IND internal numerical differentiation
i-OSS incrementally output-to-state stable
IOSS input/output-to-state stable
IP interior point
ISS input-to-state stable
i-UIOSS incrementally uniformly input/output-to-state stable
KF Kalman filter
KKT Karush-Kuhn-Tucker
LAR linear absolute regulator
LICQ linear independence constraint qualification
LP linear program
LQ linear quadratic
LQG linear quadratic Gaussian
LQP linear quadratic problem
LQR linear quadratic regulator


MHE moving horizon estimation


MILP mixed-integer linear program
MINLP mixed-integer nonlinear program
MIQP mixed-integer quadratic program
NLP nonlinear program
MPC model predictive control
OCP optimal control problem
ODE ordinary differential equation
OSS output-to-state stable
PID proportional-integral-derivative
QP quadratic program
RGA relative gain array
RAS robustly asymptotically stable
RGAS robustly globally asymptotically stable
RGES robustly globally exponentially stable
RHC receding horizon control
RK Runge-Kutta
SQP sequential quadratic programming
SVD singular-value decomposition
UKF unscented Kalman filter
1 Getting Started with Model Predictive Control

1.1 Introduction
The main purpose of this chapter is to provide a compact and acces-
sible overview of the essential elements of model predictive control
(MPC). We introduce deterministic and stochastic models, regulation,
state estimation, dynamic programming (DP), tracking, disturbances,
and some important performance properties such as closed-loop sta-
bility and zero offset to disturbances. The reader with background in
MPC and linear systems theory may wish to skim this chapter briefly
and proceed to Chapter 2. Other introductory texts covering the ba-
sics of MPC include Maciejowski (2002); Camacho and Bordons (2004);
Rossiter (2004); Goodwin, Serón, and De Doná (2005); Kwon (2005);
Wang (2009).

1.2 Models and Modeling


Model predictive control has its roots in optimal control. The basic
concept of MPC is to use a dynamic model to forecast system behavior,
and optimize the forecast to produce the best decision—the control
move at the current time. Models are therefore central to every form of
MPC. Because the optimal control move depends on the initial state of
the dynamic system, a second basic concept in MPC is to use the past
record of measurements to determine the most likely initial state of the
system. The state estimation problem is to examine the record of past
data, and reconcile these measurements with the model to determine
the most likely value of the state at the current time. Both the regulation
problem, in which a model forecast is used to produce the optimal
control action, and the estimation problem, in which the past record


of measurements is used to produce an optimal state estimate, involve


dynamic models and optimization.
We first discuss the dynamic models used in this text. We start with
the familiar differential equation models
dx/dt = f (x, u, t)
y = h(x, u, t)
x(t0 ) = x0

in which x ∈ Rn is the state, u ∈ Rm is the input, y ∈ Rp is the


output, and t ∈ R is time. We use Rn to denote the set of real-valued
n-vectors. The initial condition specifies the value of the state x at
time t = t0 , and we seek a solution to the differential equation for time
greater than t0 , t ∈ R≥t0 . Often we define the initial time to be zero,
with a corresponding initial condition, in which case t ∈ R≥0 .

1.2.1 Linear Dynamic Models

Time-varying model. The most general linear state space model is


the time-varying model
dx/dt = A(t)x + B(t)u
y = C(t)x + D(t)u
x(0) = x0

in which A(t) ∈ Rn×n is the state transition matrix, B(t) ∈ Rn×m is


the input matrix, C(t) ∈ Rp×n is the output matrix, and D(t) ∈ Rp×m
allows a direct coupling between u and y. In many applications D = 0.
Time-invariant model. If A, B, C, and D are time invariant, the linear
model reduces to
dx/dt = Ax + Bu
y = Cx + Du        (1.1)
x(0) = x0

One of the main motivations for using linear models to approximate


physical systems is the ease of solution and analysis of linear models.
Equation (1.1) can be solved to yield
x(t) = eAt x0 + ∫_0^t eA(t−τ) Bu(τ) dτ        (1.2)

Figure 1.1: System with input u, output y, and transfer function matrix G connecting them; the model is y = Gu.

in which eAt ∈ Rn×n is the matrix exponential.1 Notice the solution


is a convolution integral of the entire u(t) behavior weighted by the
matrix exponential of At. We will see later that the eigenvalues of A
determine whether the past u(t) has more effect or less effect on the
current x(t) as time increases.
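As a concrete check on (1.2), the following short Python sketch (an added
illustration, not part of the original text) evaluates the matrix exponential
term and approximates the convolution integral by trapezoidal quadrature for a
constant input; the matrices and numbers are arbitrary choices.

    import numpy as np
    from scipy.linalg import expm

    # Arbitrary illustrative data: a stable two-state system with constant input.
    A = np.array([[-1.0, 0.5], [0.0, -2.0]])
    B = np.array([[0.0], [1.0]])
    x0 = np.array([1.0, -1.0])

    def u(t):
        return np.array([1.0])          # constant input u(t) = 1

    def x_of_t(t, ngrid=2000):
        # x(t) = e^{At} x0 + integral from 0 to t of e^{A(t-tau)} B u(tau) dtau
        taus = np.linspace(0.0, t, ngrid)
        integrand = np.stack([expm(A * (t - tau)) @ B @ u(tau) for tau in taus])
        return expm(A * t) @ x0 + np.trapz(integrand, taus, axis=0)

    print(x_of_t(1.0))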

1.2.2 Input-Output Models

If we know little about the internal structure of a system, it may be


convenient to take another approach in which we suppress the state
variable, and focus attention only on the manipulatable inputs and mea-
surable outputs. As shown in Figure 1.1, we consider the system to be
the connection between u and y. In this viewpoint, we usually perform
system identification experiments in which we manipulate u and mea-
sure y, and develop simple linear models for G. To take advantage of
the usual block diagram manipulation of simple series and feedback
connections, it is convenient to consider the Laplace transform of the
signals rather than the time functions
y(s) := ∫_0^∞ e−st y(t) dt

in which s ∈ C is the complex-valued Laplace transform variable, in con-


trast to t, which is the real-valued time variable. The symbol := means
“equal by definition” or “is defined by.” The transfer function matrix
is then identified from the data, and the block diagram represents the
1 We can define the exponential of matrix X in terms of its Taylor series
eX := (1/0!)I + (1/1!)X + (1/2!)X^2 + (1/3!)X^3 + · · ·
This series converges for all X.

following mathematical relationship between input and output

y(s) = G(s)u(s)

G(s) ∈ Cp×m is the transfer function matrix. Notice the state does
not appear in this input-output description. If we are obtaining G(s)
instead from a state space model, then G(s) = C(sI − A)−1 B + D, and
we assume x(0) = 0 as the system initial condition.
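The conversion between the state space and transfer function forms is routine
to carry out numerically. The following Python sketch (illustrative only; the
matrices are arbitrary) computes the polynomial coefficients of G(s) with
scipy.signal.ss2tf and checks the formula G(s) = C(sI − A)−1 B + D at one value
of s.

    import numpy as np
    from scipy import signal

    # Arbitrary illustrative single-input single-output system.
    A = np.array([[0.0, 1.0], [-2.0, -3.0]])
    B = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])
    D = np.array([[0.0]])

    # Numerator and denominator polynomial coefficients of G(s).
    num, den = signal.ss2tf(A, B, C, D)
    print(num, den)

    # Direct evaluation of C (sI - A)^{-1} B + D at s = 1 for comparison.
    s = 1.0
    G = C @ np.linalg.inv(s * np.eye(2) - A) @ B + D
    print(G)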

1.2.3 Distributed Models

Distributed models arise whenever we consider systems that are not


spatially uniform. Consider, for example, a multicomponent, chemi-
cal mixture undergoing convection and chemical reaction. The micro-
scopic mass balance for species A is

∂cA/∂t + ∇ · (cA vA ) − RA = 0
in which cA is the molar concentration of species A, vA is the velocity
of species A, and RA is the production rate of species A due to chemical
reaction, in which

∇ := δx ∂/∂x + δy ∂/∂y + δz ∂/∂z

and the δx,y,z are the respective unit vectors in the (x, y, z) spatial
coordinates.
We also should note that the distribution does not have to be “spa-
tial.” Consider a particle size distribution f (r , t) in which f (r , t)dr
represents the number of particles of size r to r + dr in a particle reac-
tor at time t. The reactor volume is considered well mixed and spatially
homogeneous. If the particles nucleate at zero size with nucleation rate
B(t) and grow with growth rate, G(t), the evolution of the particle size
distribution is given by

∂f/∂t = −G ∂f/∂r
f (r , t) = B/G        r = 0,  t ≥ 0
f (r , t) = f0 (r )        r ≥ 0,  t = 0

Again we have partial differential equation descriptions even though


the particle reactor is well mixed and spatially uniform.

1.2.4 Discrete Time Models

Discrete time models are often convenient if the system of interest is


sampled at discrete times. If the sampling rate is chosen appropriately,
the behavior between the samples can be safely ignored and the model
describes exclusively the behavior at the sample times. The finite di-
mensional, linear, time-invariant, discrete time model is

x(k + 1) = Ax(k) + Bu(k)


y(k) = Cx(k) + Du(k) (1.3)
x(0) = x0

in which k ∈ I≥0 is a nonnegative integer denoting the sample number,


which is connected to time by t = k∆ in which ∆ is the sample time.
We use I to denote the set of integers and I≥0 to denote the set of non-
negative integers. The linear discrete time model is a linear difference
equation.
It is sometimes convenient to write the time index with a subscript

xk+1 = Axk + Buk


yk = Cxk + Duk
x0 given

but we avoid this notation in this text. To reduce the notational com-
plexity we usually express (1.3) as

x + = Ax + Bu
y = Cx + Du
x(0) = x0

in which the superscript + means the state at the next sample time.
The linear discrete time model is convenient for presenting the ideas
and concepts of MPC in the simplest possible mathematical setting.
Because the model is linear, analytical solutions are readily derived.
The solution to (1.3) is
x(k) = A^k x0 + Σ_{j=0}^{k−1} A^{k−j−1} Bu(j)        (1.4)

Notice that a convolution sum corresponds to the convolution integral


of (1.2) and powers of A correspond to the matrix exponential. Be-
cause (1.4) involves only multiplication and addition, it is convenient
to program for computation.
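The following Python sketch (an added illustration with arbitrary data, not
from the text) iterates the difference equation and confirms that it
reproduces the closed-form solution (1.4).

    import numpy as np

    # Arbitrary illustrative data.
    A = np.array([[0.9, 0.2], [0.0, 0.8]])
    B = np.array([[0.0], [1.0]])
    x0 = np.array([1.0, 0.0])
    u = [np.array([1.0]), np.array([0.5]), np.array([-1.0])]   # u(0), u(1), u(2)

    # Iterate x(k+1) = A x(k) + B u(k).
    x = x0
    for uk in u:
        x = A @ x + B @ uk

    # Closed-form solution (1.4).
    k = len(u)
    x_formula = np.linalg.matrix_power(A, k) @ x0 + sum(
        np.linalg.matrix_power(A, k - j - 1) @ B @ u[j] for j in range(k)
    )
    print(x, x_formula)   # the two results agree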

The discrete time analog of the continuous time input-output model


is obtained by defining the Z-transform of the signals

y(z) := Σ_{k=0}^∞ z^k y(k)

The discrete transfer function matrix G(z) then represents the discrete
input-output model
y(z) = G(z)u(z)
and G(z) ∈ Cp×m is the transfer function matrix. Notice the state does
not appear in this input-output description. We make only passing
reference to transfer function models in this text.

1.2.5 Constraints

The manipulated inputs (valve positions, voltages, torques, etc.) to


most physical systems are bounded. We include these constraints by
linear inequalities
Eu(k) ≤ e k ∈ I≥0
in which

E = [I; −I]        e = [ū; −u̲]

are chosen to describe simple bounds such as

u̲ ≤ u(k) ≤ ū        k ∈ I≥0

We sometimes wish to impose constraints on states or outputs for rea-


sons of safety, operability, product quality, etc. These can be stated
as
F x(k) ≤ f k ∈ I≥0
Practitioners find it convenient in some applications to limit the rate of
change of the input, u(k) − u(k − 1). To maintain the state space form
of the model, we may augment the state as
" #
x(k)
x
e (k) =
u(k − 1)

and the augmented system model becomes


x̃+ = Ãx̃ + B̃u
y = C̃x̃

in which

Ã = [A 0; 0 0]        B̃ = [B; I]        C̃ = [C 0]

A rate of change constraint such as

∆̲ ≤ u(k) − u(k − 1) ≤ ∆̄        k ∈ I≥0

is then stated as

F x̃(k) + Eu(k) ≤ e        F = [0 −I; 0 I]        E = [I; −I]        e = [∆̄; −∆̲]
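A minimal numpy sketch of this augmentation follows (an added illustration;
the system matrices, dimensions, and rate bounds below are placeholders, not
values from the text).

    import numpy as np

    n, m = 2, 1                                   # placeholder dimensions
    A, B, C = np.eye(n), np.ones((n, m)), np.ones((1, n))   # placeholder system
    rate_lb, rate_ub = -0.1, 0.1                  # placeholder rate-of-change bounds

    # Augmented model with xtilde(k) = (x(k), u(k-1)).
    Atil = np.block([[A, np.zeros((n, m))], [np.zeros((m, n)), np.zeros((m, m))]])
    Btil = np.vstack([B, np.eye(m)])
    Ctil = np.hstack([C, np.zeros((1, m))])

    # Rate constraint rate_lb <= u(k) - u(k-1) <= rate_ub as F xtilde + E u <= e.
    F = np.block([[np.zeros((m, n)), -np.eye(m)], [np.zeros((m, n)), np.eye(m)]])
    E = np.vstack([np.eye(m), -np.eye(m)])
    e = np.concatenate([rate_ub * np.ones(m), -rate_lb * np.ones(m)])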

To simplify analysis, it pays to maintain linear constraints when us-


ing linear dynamic models. So if we want to consider fairly general
constraints for a linear system, we choose the form

F x(k) + Eu(k) ≤ e k ∈ I≥0

which subsumes all the forms listed previously.


When we consider nonlinear systems, analysis of the controller is
not significantly simplified by maintaining linear inequalities, and we
generalize the constraints to set membership

x(k) ∈ X u(k) ∈ U k ∈ I≥0

or, more generally

(x(k), u(k)) ∈ Z k ∈ I≥0

We should bear in mind one general distinction between input con-


straints, and output or state constraints. The input constraints often
represent physical limits. In these cases, if the controller does not
respect the input constraints, the physical system enforces them. In
contrast, the output or state constraints are usually desirables. They
may not be achievable depending on the disturbances affecting the sys-
tem. It is often the function of an MPC controller to determine in real
time that the output or state constraints are not achievable, and relax
them in some satisfactory manner. As we discuss in Chapter 2, these
considerations lead implementers of MPC often to set up the optimiza-
tion problem using hard constraints for the input constraints and some
form of soft constraints for the output or state constraints.

Soft state or output constraints. A simple formulation for soft state


or output constraints is presented next. Consider a set of hard input
and state constraints such as those described previously

Eu(k) ≤ e F x(k) ≤ f k ∈ I≥0

To soften state constraints one introduces slack variables, ε(k), which


are considered decision variables, like the manipulated inputs. One
then relaxes the state constraints via

F x(k) ≤ f + ε(k) k ∈ I≥0

and adds the new “input” constraint

ε(k) ≥ 0 k ∈ I≥0

Consider the augmented input to be ũ(k) = (u(k), ε(k)); the soft state
constraint formulation is then a set of mixed input-state constraints

F̃x(k) + Ẽũ(k) ≤ ẽ        k ≥ 0

with

F̃ = [0; 0; F]        Ẽ = [E 0; 0 −I; 0 −I]        ũ = [u; ε]        ẽ = [e; 0; f]
As we discuss subsequently, one then formulates a stage-cost penalty
that weights how much one cares about the state x, the input u and
the violation of the hard state constraint, which is given by ε. The hard
state constraint has been replaced by a mixed state-input constraint.
The benefit of this reformulation is that the state constraint cannot
cause an infeasibility in the control problem because it can be relaxed
by choosing ε; large values of ε may be undesirable as measured by the
stage-cost function, but they are not infeasible.
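A small numpy sketch of this soft constraint construction (an added
illustration; the box constraints and dimensions are arbitrary choices)
assembles F̃, Ẽ, and ẽ from given E, e, F, and f.

    import numpy as np

    # Arbitrary illustrative hard constraints |u| <= 1 and |x| <= 1.
    n, m = 2, 1
    E, e = np.vstack([np.eye(m), -np.eye(m)]), np.ones(2 * m)
    F, f = np.vstack([np.eye(n), -np.eye(n)]), np.ones(2 * n)
    q, p = E.shape[0], F.shape[0]

    # Mixed constraint Ftil x + Etil utilde <= etil with utilde = (u, eps).
    Ftil = np.vstack([np.zeros((q, n)), np.zeros((p, n)), F])
    Etil = np.block([
        [E,                np.zeros((q, p))],    # hard input constraints
        [np.zeros((p, m)), -np.eye(p)],          # eps >= 0
        [np.zeros((p, m)), -np.eye(p)],          # F x - eps <= f
    ])
    etil = np.concatenate([e, np.zeros(p), f])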
Discrete actuators and integrality constraints. In many industrial
applications, a subset of the actuators or decision variables may be in-
teger valued or discrete. A common case arises when the process has
banks of similar units such as furnaces, heaters, chillers, compressors,
etc., operating in parallel. In this kind of process, part of the control
problem is to decide how many and which of these discrete units should
be on or off during process operation to meet the setpoint or reject a
disturbance. Discrete decisions also arise in many scheduling prob-
lems. In chemical production scheduling, for example, the discrete de-
cisions can be whether or not to produce a certain chemical in a certain
reactor during the production schedule. Since these decisions are often
made repeatedly as new measurement information becomes available,
these (re)scheduling problems are also feedback control problems.

Figure 1.2: Typical input constraint sets U for (a) continuous actuators and (b) mixed continuous/discrete actuators. The origin (circle) represents the steady-state operating point.
To define discrete-valued actuators, one may add constraints like

ui (k) ∈ {0, 1} i ∈ ID , k ∈ I≥0

in which the set ID ⊂ {1, 2, . . . , m} represents the indices of the actu-


ators that are discrete, which are binary (on/off) decisions in the case
illustrated above. Alternatively, one may use the general set member-
ship constraint u(k) ∈ U, and employ the set U to define the discrete
actuators as shown in Figure 1.2. In the remainder of this introduc-
tory chapter we focus exclusively on continuous actuators, but return
to discrete actuators in later chapters.

1.2.6 Deterministic and Stochastic

If one examines measurements coming from any complex, physical pro-


cess, fluctuations in the data as depicted in Figure 1.3 are invariably
present. For applications at small length scales, the fluctuations may
be caused by the random behavior of small numbers of molecules. This
type of application is becoming increasingly prevalent as scientists and
engineers study applications in nanotechnology. This type of system
also arises in life science applications when modeling the interactions
of a few virus particles or protein molecules with living cells. In these
applications there is no deterministic simulation model; the only sys-
tem model available is stochastic.
Figure 1.3: Output of a stochastic system versus time.

Linear time-invariant models. In mainstream, classical process con-


trol problems, we are usually concerned with modeling, monitoring and
controlling macroscopic systems, i.e., we are not considering systems
composed of small numbers of molecules. So one may naturally ask
(many do) what is the motivation for stochastic models in this arena?
The motivation for stochastic models is to account for the unmodeled
effects of the environment (disturbances) on the system under study. If
we examine the measurement from any process control system of inter-
est, no matter how “macroscopic,” we are confronted with the physical
reality that the measurement still looks a lot like Figure 1.3. If it is im-
portant to model the observed measurement fluctuations, we turn to
stochastic models.
Some of the observed fluctuation in the data is assignable to the
measurement device. This source of fluctuation is known as measure-
ment “noise.” Some of the observed fluctuation in the data is assignable
to unmodeled disturbances from the environment affecting the state of
the system. The simplest stochastic model for representing these two
possible sources of disturbances is a linear model with added random

variables

x + = Ax + Bu + Gw
y = Cx + Du + v

with initial condition x(0) = x0 . The variable w ∈ Rg is the random


variable acting on the state transition, v ∈ Rp is a random variable act-
ing on the measured output, and x0 is a random variable specifying the
initial state. The random variable v is used to model the measurement
noise and w models the process disturbance. The matrix G ∈ Rn×g
allows further refinement of the modeling between the source of the
disturbance and its effect on the state. Often G is chosen to be the
identity matrix with g = n.
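To see the kind of output such a model generates, the following Python sketch
(an added illustration; the scalar system and noise levels are arbitrary and
are not the system behind Figure 1.3) simulates the stochastic model and
records the measured output.

    import numpy as np

    rng = np.random.default_rng(0)
    A, B, C, G = np.array([[0.95]]), np.array([[1.0]]), np.array([[1.0]]), np.eye(1)
    x, u = np.zeros(1), np.array([0.5])

    y = []
    for k in range(200):
        w = rng.normal(0.0, 0.1, size=1)      # process disturbance
        v = rng.normal(0.0, 0.5, size=1)      # measurement noise
        y.append((C @ x + v).item())
        x = A @ x + B @ u + G @ w
    print(y[:5])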

1.3 Introductory MPC Regulator


1.3.1 Linear Quadratic Problem

We start by designing a controller to take the state of a deterministic,


linear system to the origin. If the setpoint is not the origin, or we wish
to track a time-varying setpoint trajectory, we will subsequently make
modifications of the zero setpoint problem to account for that. The
system model is

x + = Ax + Bu
y = Cx (1.5)

In this first problem, we assume that the state is measured, or C = I. We


will handle the output measurement problem with state estimation in
the next section. Using the model we can predict how the state evolves
given any set of inputs we are considering. Consider N time steps into
the future and collect the input sequence into u

u = (u(0), u(1), . . . , u(N − 1))

Constraints on the u sequence (i.e., valve saturations, etc.) are covered


extensively in Chapter 2. The constraints are the main feature that
distinguishes MPC from the standard linear quadratic (LQ) control.
We first define an objective function V (·) to measure the deviation
of the trajectory of x(k), u(k) from zero by summing the weighted
squares
V (x(0), u) = (1/2) Σ_{k=0}^{N−1} ( x(k)′ Qx(k) + u(k)′ Ru(k) ) + (1/2) x(N)′ Pf x(N)

subject to
x + = Ax + Bu
The objective function depends on the input sequence and state se-
quence. The initial state is available from the measurement. The re-
mainder of the state trajectory, x(k), k = 1, . . . , N, is determined by the
model and the input sequence u. So we show the objective function’s
explicit dependence on the input sequence and initial state. The tuning
parameters in the controller are the matrices Q and R. We allow the
final state penalty to have a different weighting matrix, Pf , for general-
ity. Large values of Q in comparison to R reflect the designer’s intent
to drive the state to the origin quickly at the expense of large control
action. Penalizing the control action through large values of R relative
to Q is the way to reduce the control action and slow down the rate at
which the state approaches the origin. Choosing appropriate values of
Q and R (i.e., tuning) is not always obvious, and this difficulty is one of
the challenges faced by industrial practitioners of LQ control. Notice
that MPC inherits this tuning challenge.
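The objective is straightforward to evaluate for any candidate input sequence
by rolling the model forward, as in the following Python sketch (an added
illustration; the system, weighting matrices, and inputs are arbitrary).

    import numpy as np

    def lq_cost(A, B, Q, R, Pf, x0, u_seq):
        # V(x(0), u) = (1/2) sum of x'Qx + u'Ru plus (1/2) x(N)' Pf x(N)
        x, cost = x0, 0.0
        for uk in u_seq:
            cost += 0.5 * (x @ Q @ x + uk @ R @ uk)
            x = A @ x + B @ uk
        return cost + 0.5 * (x @ Pf @ x)

    A, B = np.array([[1.0, 1.0], [0.0, 1.0]]), np.array([[0.0], [1.0]])
    Q, R, Pf = np.eye(2), np.eye(1), np.eye(2)
    x0 = np.array([1.0, 0.0])
    u_seq = [np.array([-0.5]), np.array([0.0]), np.array([0.25])]
    print(lq_cost(A, B, Q, R, Pf, x0, u_seq))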
We then formulate the following optimal LQ control problem

min_u V (x(0), u)        (1.6)

The Q, Pf , and R matrices often are chosen to be diagonal, but we do


not assume that here. We assume, however, that Q, Pf , and R are real
and symmetric; Q and Pf are positive semidefinite; and R is positive
definite. These assumptions guarantee that the solution to the optimal
control problem exists and is unique.

1.3.2 Optimizing Multistage Functions

We next provide a brief introduction to methods for solving multistage


optimization problems like (1.6). Consider the set of variables w, x, y,
and z, and the following function to be optimized

f (w, x) + g(x, y) + h(y, z)

Notice that the objective function has a special structure in which each
stage’s cost function in the sum depends only on adjacent variable
pairs. For the first version of this problem, we consider w to be a
fixed parameter, and we would like to solve the problem

min_{x,y,z} f (w, x) + g(x, y) + h(y, z)        w fixed

One option is to optimize simultaneously over all three decision vari-


ables. Because of the objective function’s special structure, however,
we can obtain the solution by optimizing a sequence of three single-
variable problems defined as follows
 
 
min_x ( f (w, x) + min_y ( g(x, y) + min_z h(y, z) ) )

We solve the inner problem over z first, and denote the optimal value
and solution as follows

h^0(y) = min_z h(y, z)        z^0(y) = arg min_z h(y, z)

Notice that the optimal z and value function for this problem are both
expressed as a function of the y variable. We then move to the next
optimization problem and solve for the y variable

min_y g(x, y) + h^0(y)

and denote the solution and value function as

g^0(x) = min_y g(x, y) + h^0(y)        y^0(x) = arg min_y g(x, y) + h^0(y)

The optimal solution for y is a function of x, the remaining variable to


be optimized. The third and final optimization is

min_x f (w, x) + g^0(x)

with solution and value function


f^0(w) = min_x f (w, x) + g^0(x)        x^0(w) = arg min_x f (w, x) + g^0(x)

We summarize the recursion with the following annotated equation

min_x ( f (w, x) + min_y ( g(x, y) + min_z h(y, z) ) )

in which the innermost minimization over z yields h^0(y) and z^0(y),
the middle minimization over y yields g^0(x) and y^0(x), and the outer
minimization over x yields f^0(w) and x^0(w).

If we are mainly interested in the first variable x, then the function


x 0 (w) is of primary interest and we have obtained this function quite
efficiently. This nested solution approach is an example of a class of

techniques known as dynamic programming (DP). DP was developed


by Bellman (Bellman, 1957; Bellman and Dreyfus, 1962) as an efficient
means for solving these kinds of multistage optimization problems.
Bertsekas (1987) provides an overview of DP.
The version of the method we just used is called backward DP be-
cause we find the variables in reverse order: first z, then y, and finally
x. Notice we find the optimal solutions as functions of the variables to
be optimized at the next stage. If we wish to find the other variables
y and z as a function of the known parameter w, then we nest the
optimal solutions found by the backward DP recursion
ỹ^0(w) = y^0(x^0(w))        z̃^0(w) = z^0(ỹ^0(w)) = z^0(y^0(x^0(w)))

As we see shortly, backward DP is the method of choice for the regulator


problem.
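The nested structure is easy to verify numerically. The following Python
sketch (an added illustration; the three stage functions and the grid are
arbitrary) performs the backward DP recursion over a coarse grid and confirms
that it matches brute-force minimization over all three variables.

    import numpy as np

    grid = np.linspace(-2.0, 2.0, 41)
    f = lambda w, x: (x - w) ** 2
    g = lambda x, y: (y - 0.5 * x) ** 2 + 0.1 * y ** 2
    h = lambda y, z: (z - y) ** 2 + 0.2 * z ** 2
    w = 1.0

    # Backward DP: eliminate z, then y, then x.
    h0 = np.array([min(h(y, z) for z in grid) for y in grid])
    g0 = np.array([min(g(x, grid[j]) + h0[j] for j in range(grid.size)) for x in grid])
    f0 = min(f(w, grid[i]) + g0[i] for i in range(grid.size))

    # Brute-force joint minimization for comparison.
    joint = min(f(w, x) + g(x, y) + h(y, z) for x in grid for y in grid for z in grid)
    print(f0, joint)   # agree up to roundoff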
In the state estimation problem to be considered later in this chap-
ter, w becomes a variable to be optimized, and z plays the role of a
parameter. We wish to solve the problem

min_{w,x,y} f (w, x) + g(x, y) + h(y, z)        z fixed

We can still break the problem into three smaller nested problems, but
the order is reversed
min_y ( h(y, z) + min_x ( g(x, y) + min_w f (w, x) ) )        (1.7)

in which the innermost minimization over w yields f^0(x) and w^0(x),
the middle minimization over x yields g^0(y) and x^0(y), and the outer
minimization over y yields h^0(z) and y^0(z).

This form is called forward DP because we find the variables in the


order given: first w, then x, and finally y. The optimal value functions
and optimal solutions at each of the three stages are shown in (1.7).
This version is preferable if we are primarily interested in finding the
final variable y as a function of the parameter z. As before, if we need
the other optimized variables x and w as a function of the parameter z,
we must insert the optimal functions found by the forward DP recursion
x̃^0(z) = x^0(y^0(z))        w̃^0(z) = w^0(x̃^0(z)) = w^0(x^0(y^0(z)))

For the reader interested in trying some exercises to reinforce the con-
cepts of DP, Exercise 1.15 considers finding the function w̃^0(z) with
backward DP instead of forward DP as we just did here. Exercise C.1
discusses showing that the nested optimizations indeed give the same
answer as simultaneous optimization over all decision variables.

Figure 1.4: Level sets of two quadratic functions V1 (x) = (1/4), V2 (x) = (1/4), and their sum; V (x) = V1 (x) + V2 (x) = 2.
Finally, if we optimize over all four variables, including the one con-
sidered as a fixed parameter in the two versions of DP we used, then
we have two equivalent ways to express the value of the complete op-
timization
min_{w,x,y,z} f (w, x) + g(x, y) + h(y, z) = min_w f^0(w) = min_z h^0(z)

The result in the next example proves useful in combining quadratic


functions to solve the LQ problem.

Example 1.1: Sum of quadratic functions


Consider the two quadratic functions given by

V1 (x) = (1/2)(x − a)′ A(x − a) V2 (x) = (1/2)(x − b)′ B(x − b)



in which A, B > 0 are positive definite matrices and a and b are n-


vectors locating the minimum of each function. Figure 1.4 displays the
ellipses defined by the level sets V1 (x) = 1/4 and V2 (x) = 1/4 for the
following data
" # " # " # " #
1.25 0.75 −1 1.5 −0.5 1
A= a= B= b=
0.75 1.25 0 −0.5 1.5 1

(a) Show that the sum V (x) = V1 (x) + V2 (x) is also quadratic

V (x) = (1/2)((x − v)′ H(x − v) + d)

in which

H =A+B v = H −1 (Aa + Bb)


d = −(Aa + Bb)′ H −1 (Aa + Bb) + a′ Aa + b′ Bb

and verify the three ellipses given in Figure 1.4.

(b) Consider a generalization useful in the discussion of the upcom-


ing regulation and state estimation problems. Let

V1 (x) = (1/2)(x−a)′ A(x−a) V2 (x) = (1/2)(Cx−b)′ B(Cx−b)

Derive the formulas for H, v, d for this case.

(c) Use the matrix inversion lemma (see Exercise 1.12) and show that
V (x) of part (b) can be expressed also in an inverse form, which
is useful in state estimation problems
V (x) = (1/2)((x − v)′ H̃−1 (x − v) + d)
H̃ = A−1 − A−1 C ′ (CA−1 C ′ + B −1 )−1 CA−1
v = a + A−1 C ′ (CA−1 C ′ + B −1 )−1 (b − Ca)
d = (b − Ca)′ (CA−1 C ′ + B −1 )−1 (b − Ca)

Solution
(a) The sum of two quadratics is also quadratic, so we parameterize
the sum as

V (x) = (1/2)((x − v)′ H(x − v) + d)

and solve for v, H, and d. Comparing the expansion of the quadrat-


ics of the right- and left-hand sides gives

x ′ Hx−2x ′ Hv+v ′ Hv+d = x ′ (A+B)x−2x ′ (Aa+Bb)+a′ Aa+b′ Bb

Equating terms at each order gives

H =A+B
v = H −1 (Aa + Bb)
d = −v ′ Hv + a′ Aa + b′ Bb
= −(Aa + Bb)′ H −1 (Aa + Bb) + a′ Aa + b′ Bb

Notice that H is positive definite since A and B are positive defi-


nite. Substituting the values of a, A, b, and B gives
" # " #
2.75 0.25 −0.1
H= v= d = 3.2
0.25 2.75 0.1

The level set V (x) = 2 is also plotted in Figure 1.4.
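The part (a) formulas are easy to check numerically; the following
Python sketch (an added illustration, not part of the original example)
reproduces H, v, and d from the data above.

    import numpy as np

    A, a = np.array([[1.25, 0.75], [0.75, 1.25]]), np.array([-1.0, 0.0])
    B, b = np.array([[1.5, -0.5], [-0.5, 1.5]]), np.array([1.0, 1.0])

    H = A + B
    v = np.linalg.solve(H, A @ a + B @ b)
    d = -(A @ a + B @ b) @ v + a @ A @ a + b @ B @ b
    print(H, v, d)   # reproduces v = (-0.1, 0.1) and d = 3.2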

(b) Expanding and comparing terms as before, we obtain

H = A + C ′ BC
v = H −1 (Aa + C ′ Bb)
d = −(Aa + C ′ Bb)′ H −1 (Aa + C ′ Bb) + a′ Aa + b′ Bb (1.8)

Notice that H is positive definite since A is positive definite and


C ′ BC is positive semidefinite for any C.

(c) Define x̄ = x − a and b̄ = b − Ca, and express the problem as

V (x) = (1/2)x̄ ′ Ax̄ + (1/2)(C(x̄ + a) − b)′ B(C(x̄ + a) − b)
      = (1/2)x̄ ′ Ax̄ + (1/2)(Cx̄ − b̄)′ B(Cx̄ − b̄)

Apply the solution of part (b) to obtain

V (x) = (1/2)((x̄ − v̄)′ H(x̄ − v̄) + d)

H = A + C ′ BC        v̄ = H −1 C ′ B b̄
d = (b − Ca)′ (B − BCH −1 C ′ B)(b − Ca)

From the matrix inversion lemma, use (1.54) on H and (1.55) on


v to obtain
H −1 = H̃ = A−1 − A−1 C ′ (CA−1 C ′ + B −1 )−1 CA−1
v̄ = A−1 C ′ (CA−1 C ′ + B −1 )−1 b̄
d = (b − Ca)′ (CA−1 C ′ + B −1 )−1 (b − Ca)

The function V (x) is then given by


V (x) = (1/2)((x − v)′ H̃−1 (x − v) + d)

with v = a + A−1 C ′ (CA−1 C ′ + B −1 )−1 (b − Ca). □

1.3.3 Dynamic Programming Solution

After this brief introduction to DP, we apply it to solve the LQ con-


trol problem. We first rewrite (1.6) in the following form to see the
structure clearly
V (x(0), u) = Σ_{k=0}^{N−1} ℓ(x(k), u(k)) + ℓN (x(N))        s.t.  x + = Ax + Bu

in which the stage cost ℓ(x, u) = (1/2)(x ′ Qx + u′ Ru), k = 0, . . . , N − 1


and the terminal stage cost ℓN (x) = (1/2)x ′ Pf x. Since x(0) is known,
we choose backward DP as the convenient method to solve this prob-
lem. We first rearrange the overall objective function so we can opti-
mize over input u(N − 1) and state x(N)

min_{u(0),x(1),...,u(N−2),x(N−1)}  ℓ(x(0), u(0)) + ℓ(x(1), u(1)) + · · · +
        min_{u(N−1),x(N)}  ℓ(x(N − 1), u(N − 1)) + ℓN (x(N))

subject to

x(k + 1) = Ax(k) + Bu(k)        k = 0, . . . , N − 1

The problem to be solved at the last stage is

min_{u(N−1),x(N)}  ℓ(x(N − 1), u(N − 1)) + ℓN (x(N))        (1.9)

subject to
x(N) = Ax(N − 1) + Bu(N − 1)

in which x(N − 1) appears in this stage as a parameter. We denote the


optimal cost by V^0_{N−1}(x(N − 1)) and the optimal decision variables by
u^0_{N−1}(x(N − 1)) and x^0_N(x(N − 1)). The optimal cost and decisions at
the last stage are parameterized by the state at the previous stage as
we expect in backward DP. We next solve this optimization. First we
substitute the state equation for x(N) and combine the two quadratic
terms using (1.8)

ℓ(x(N − 1), u(N − 1)) + ℓN (x(N))


 
= (1/2) |x(N − 1)|2Q + |u(N − 1)|2R + |Ax(N − 1) + Bu(N − 1)|2Pf
 
= (1/2) |x(N − 1)|2Q + |(u(N − 1) − v)|2H + d

in which

H = R + B ′ Pf B
v = −(B ′ Pf B + R)−1 B ′ Pf A x(N − 1)
 
d = x(N − 1)′ A′ Pf A − A′ Pf B(B ′ Pf B + R)−1 B ′ Pf A x(N − 1)

Given this form of the cost function, we see by inspection that the opti-
mal input for u(N − 1) is v, so the optimal control law at stage N − 1 is
a linear function of the state x(N − 1). Then using the model equation,
the optimal final state is also a linear function of state x(N − 1). The
optimal cost is (1/2)(|x(N − 1)|2Q + d), which makes the optimal cost
a quadratic function of x(N − 1). Summarizing, for all x

u⁰N−1 (x) = K(N − 1) x
x⁰N (x) = (A + BK(N − 1)) x
V⁰N−1 (x) = (1/2)x ′ Π(N − 1) x

with the definitions

K(N − 1) := −(B ′ Pf B + R)−1 B ′ Pf A


Π(N − 1) := Q + A′ Pf A − A′ Pf B(B ′ Pf B + R)−1 B ′ Pf A
The function V⁰N−1 (x) defines the optimal cost to go from state x for the
last stage under the optimal control law u⁰N−1 (x). Having this function
allows us to move to the next stage of the DP recursion. For the next
stage we solve the optimization

min_{u(N−2),x(N−1)}  ℓ(x(N − 2), u(N − 2)) + V⁰N−1 (x(N − 1))

subject to
x(N − 1) = Ax(N − 2) + Bu(N − 2)

Notice that this problem is identical in structure to the stage we just


solved, (1.9), and we can write out the solution by simply renaming
variables

u⁰N−2 (x) = K(N − 2) x
x⁰N−1 (x) = (A + BK(N − 2)) x
V⁰N−2 (x) = (1/2)x ′ Π(N − 2) x
K(N − 2) := −(B ′ Π(N − 1)B + R)−1 B ′ Π(N − 1)A
Π(N − 2) := Q + A′ Π(N − 1)A − A′ Π(N − 1)B(B ′ Π(N − 1)B + R)−1 B ′ Π(N − 1)A

The recursion from Π(N −1) to Π(N −2) is known as a backward Riccati
iteration. To summarize, the backward Riccati iteration is defined as
follows

Π(k − 1) = Q + A′ Π(k)A − A′ Π(k)B(B ′ Π(k)B + R)−1 B ′ Π(k)A
                                        k = N, N − 1, . . . , 1    (1.10)

with terminal condition


Π(N) = Pf (1.11)

The terminal condition replaces the typical initial condition because


the iteration is running backward. The optimal control policy at each
stage is
u0k (x) = K(k)x k = N − 1, N − 2, . . . , 0 (1.12)

The optimal gain at time k is computed from the Riccati matrix at time
k+1
K(k) = −(B ′ Π(k + 1)B + R)−1 B ′ Π(k + 1)A        k = N − 1, N − 2, . . . , 0    (1.13)
and the optimal cost to go from time k to time N is

Vk0 (x) = (1/2)x ′ Π(k)x k = N, N − 1, . . . , 0 (1.14)
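The backward Riccati iteration is straightforward to implement. The following
is a minimal sketch in Python/NumPy (an illustration, not the book's code; the
function name and interface are assumptions) of (1.10)-(1.13).

    import numpy as np

    def backward_riccati(A, B, Q, R, Pf, N):
        """Return gains K(0),...,K(N-1) and cost matrices Pi(0),...,Pi(N)."""
        Pi = [None] * (N + 1)
        K = [None] * N
        Pi[N] = Pf                                 # terminal condition (1.11)
        for k in range(N, 0, -1):                  # k = N, N-1, ..., 1
            S = B.T @ Pi[k] @ B + R
            K[k - 1] = -np.linalg.solve(S, B.T @ Pi[k] @ A)               # (1.13)
            Pi[k - 1] = Q + A.T @ Pi[k] @ A + A.T @ Pi[k] @ B @ K[k - 1]  # (1.10)
        return K, Pi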



1.3.4 The Infinite Horizon LQ Problem

Let us motivate the infinite horizon problem by showing a weakness of


the finite horizon problem. Kalman (1960b, p.113) pointed out in his
classic 1960 paper that optimality does not ensure stability.

In the engineering literature it is often assumed (tacitly and


incorrectly) that a system with optimal control law (6.8) is
necessarily stable.

Assume that we use as our control law the first feedback gain of the
finite horizon problem, K(0)

u(k) = K(0)x(k)

Then the stability of the closed-loop system is determined by the eigen-


values of A+BK(0). We now construct an example that shows choosing
Q > 0, R > 0, and N ≥ 1 does not ensure stability. In fact, we can find
reasonable values of these parameters such that the controller desta-
bilizes a stable system.2 Let
" # " #
4/3 −2/3 1
A= B= C = [−2/3 1]
1 0 0

This system is chosen so that G(z) has a zero at z = 3/2, i.e., an unsta-
ble zero. We now construct an LQ controller that inverts this zero and
hence produces an unstable system. We would like to choose Q = C ′ C
so that y itself is penalized, but that Q is only semidefinite. We add a
small positive definite piece to C ′ C so that Q is positive definite, and
choose a small positive R penalty (to encourage the controller to mis-
behave), and N = 5
" #
′ 4/9 + .001 −2/3
Q = C C + 0.001I = R = 0.001
−2/3 1.001

We now iterate the Riccati equation four times starting from Π = Pf =


Q and compute K(0) for N = 5; then we compute the eigenvalues of
A + BK(0) and achieve3

eig(A + BK5 (0)) = {1.307, 0.001}
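This calculation is easy to reproduce numerically. The following sketch (an
illustration in Python/NumPy; Octave or MATLAB works equally well) iterates
the Riccati equation four times from Pf = Q and checks the closed-loop
eigenvalues for N = 5.

    import numpy as np

    A = np.array([[4/3, -2/3], [1.0, 0.0]])
    B = np.array([[1.0], [0.0]])
    C = np.array([[-2/3, 1.0]])
    Q = C.T @ C + 0.001 * np.eye(2)
    R = np.array([[0.001]])

    Pi = Q.copy()                       # Pi(N) = Pf = Q
    for _ in range(4):                  # four Riccati iterations for N = 5
        S = B.T @ Pi @ B + R
        Pi = Q + A.T @ Pi @ A - A.T @ Pi @ B @ np.linalg.solve(S, B.T @ Pi @ A)
    K0 = -np.linalg.solve(B.T @ Pi @ B + R, B.T @ Pi @ A)
    print(np.linalg.eigvals(A + B @ K0))    # approximately 1.307 and 0.001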


2 In Chapter 2, we present several controller design methods that prevent this kind

of instability.
3 Please check this answer with Octave or MATLAB.

Using this controller the closed-loop system evolution is x(k) = (A +


BK5 (0))k x0 . Since an eigenvalue of A + BK5 (0) is greater than unity,
x(k) → ∞ as k → ∞. In other words the closed-loop system is unstable.

If we continue to iterate the Riccati equation, which corresponds to


increasing the horizon in the controller, we obtain for N = 7

eig(A + BK7 (0)) = {0.989, 0.001}

and the controller is stabilizing. If we continue iterating the Riccati


equation, we converge to the following steady-state closed-loop eigen-
values
eig(A + BK∞ (0)) = {0.664, 0.001}

This controller corresponds to an infinite horizon control law. Notice


that it is stabilizing and has a reasonable stability margin. Nominal
stability is a guaranteed property of infinite horizon controllers as we
prove in the next section.
With this motivation, we are led to consider directly the infinite hori-
zon case

V (x(0), u) = (1/2) Σ_{k=0}^{∞} ( x(k)′ Qx(k) + u(k)′ Ru(k) )        (1.15)

in which x(k) is the solution at time k of x + = Ax+Bu if the initial state


is x(0) and the input sequence is u. If we are interested in a continuous
process (i.e., no final time), then the natural cost function is an infinite
horizon cost. If we were truly interested in a batch process (i.e., the
process does stop at k = N), then stability is not a relevant property,
and we naturally would use the finite horizon LQ controller and the
time-varying controller, u(k) = K(k)x(k), k = 0, 1, . . . , N.
In considering the infinite horizon problem, we first restrict atten-
tion to systems for which there exist input sequences that give bounded
cost. Consider the case A = I and B = 0, for example. Regardless of the
choice of input sequence, (1.15) is unbounded for x(0) ≠ 0. It seems
clear that we are not going to stabilize an unstable system (A = I) with-
out any input (B = 0). This is an example of an uncontrollable system.
In order to state the sharpest results on stabilization, we require the
concepts of controllability, stabilizability, observability, and detectabil-
ity. We shall define these concepts subsequently.

1.3.5 Controllability

A system is controllable if, for any pair of states x, z in the state space,
z can be reached in finite time from x (or x controlled to z) (Sontag,
1998, p.83). A linear discrete time system x + = Ax + Bu is therefore
controllable if there exists a finite time N and a sequence of inputs

(u(0), u(1), . . . u(N − 1))

that can transfer the system from any x to any z in which


 
z = A^N x + [B  AB  · · ·  A^{N−1} B] [u(N − 1); u(N − 2); . . . ; u(0)]

We can simplify this condition by noting that the matrix powers Ak


for k ≥ n are expressible as linear combinations of the powers 0 to
n − 1. This result is a consequence of the Cayley-Hamilton theorem
(Horn and Johnson, 1985, pp. 86–87). Therefore the range of the matrix
[B  AB  · · ·  A^{N−1} B] for N ≥ n is the same as [B  AB  · · ·  A^{n−1} B].
In other words, for an unconstrained linear system, if we cannot reach
z in n moves, we cannot reach z in any number of moves. The ques-
tion of controllability of a linear time-invariant system is therefore a
question of existence of solutions to linear equations for an arbitrary
right-hand side
 
[B  AB  · · ·  A^{n−1} B] [u(n − 1); u(n − 2); . . . ; u(0)] = z − A^n x

The matrix appearing in this equation is known as the controllability


matrix C

C = [B  AB  · · ·  A^{n−1} B]        (1.16)
From the fundamental theorem of linear algebra, we know a solution
exists for all right-hand sides if and only if the rows of the n × nm
controllability matrix are linearly independent.4 Therefore, the system
(A, B) is controllable if and only if

rank(C) = n
4 See Section A.4 of Appendix A or (Strang, 1980, pp.87–88) for a review of this result.

The following result for checking controllability also proves useful (Hau-
tus, 1972).

Lemma 1.2 (Hautus lemma for controllability). A system is controllable


if and only if
rank [λI − A   B] = n        for all λ ∈ C        (1.17)

in which C is the set of complex numbers.

Notice that the first n columns of the matrix in (1.17) are linearly
independent if λ is not an eigenvalue of A, so (1.17) is equivalent to
checking the rank at just the eigenvalues of A
rank [λI − A   B] = n        for all λ ∈ eig(A)
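Both tests are easy to apply numerically. Below is a small sketch (in
Python/NumPy; the function names are illustrative assumptions, not from the
book) that checks rank(C) = n via (1.16) and, alternatively, the Hautus
condition at the eigenvalues of A.

    import numpy as np

    def is_controllable(A, B, tol=1e-9):
        n = A.shape[0]
        ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
        return np.linalg.matrix_rank(ctrb, tol) == n

    def is_controllable_hautus(A, B, tol=1e-9):
        n = A.shape[0]
        return all(
            np.linalg.matrix_rank(np.hstack([lam * np.eye(n) - A, B]), tol) == n
            for lam in np.linalg.eigvals(A))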

1.3.6 Convergence of the Linear Quadratic Regulator

We now show that the infinite horizon regulator asymptotically stabi-


lizes the origin for the closed-loop system. Define the infinite horizon
objective function

V (x, u) = (1/2) Σ_{k=0}^{∞} ( x(k)′ Qx(k) + u(k)′ Ru(k) )

subject to

x + = Ax + Bu
x(0) = x

with Q, R > 0. If (A, B) is controllable, the solution to the optimization


problem
min V (x, u)
u

exists and is unique for all x. We denote the optimal input sequence by
u⁰(x), and the first input in this optimal sequence by u⁰(0; x). The feedback
control law κ∞ (·) for this infinite horizon case is then defined as u = κ∞ (x)
in which κ∞ (x) = u⁰(0; x). As stated in the following lemma,
this infinite horizon linear quadratic regulator (LQR) is stabilizing.

Lemma 1.3 (LQR convergence). For (A, B) controllable, the infinite hori-
zon LQR with Q, R > 0 gives a convergent closed-loop system

x + = Ax + Bκ∞ (x)

Proof. The cost of the infinite horizon objective is bounded above for
all x(0) because (A, B) is controllable. Controllability implies that there
exists a sequence of n inputs (u(0), u(1), . . . , u(n − 1)) that transfers
the state from any x(0) to x(n) = 0. A zero control sequence after
k = n for (u(n + 1), u(n + 2), . . .) generates zero cost for all terms
in V after k = n, and the objective function for this infinite control
sequence is therefore finite. The cost function is strictly convex in u
because R > 0 so the solution to the optimization is unique.
If we consider the sequence of costs to go along the closed-loop
trajectory, we have

Vk+1 = Vk − (1/2)( x(k)′ Qx(k) + u(k)′ Ru(k) )

in which Vk = V 0 (x(k)) is the cost at time k for state value x(k)


and u(k) = u0 (x(k)) is the optimal control for state x(k). The cost
along the closed-loop trajectory is nonincreasing and bounded below
(by zero). Therefore, the sequence (Vk ) converges and

x(k)′ Qx(k) → 0 u(k)′ Ru(k) → 0 as k → ∞

Since Q, R > 0, we have

x(k) → 0 u(k) → 0 as k → ∞

and closed-loop convergence is established. ■

In fact we know more. From the previous sections, we know the


optimal solution is found by iterating the Riccati equation, and the
optimal infinite horizon control law and optimal cost are given by

u0 (x) = Kx V 0 (x) = (1/2)x ′ Πx

in which

K = −(B ′ ΠB + R)−1 B ′ ΠA
Π = Q + A′ ΠA − A′ ΠB(B ′ ΠB + R)−1 B ′ ΠA (1.18)

Proving Lemma 1.3 has shown also that for (A, B) controllable and Q,
R > 0, a positive definite solution to the discrete algebraic Riccati equa-
tion (DARE), (1.18), exists and the eigenvalues of (A+BK) are asymptot-
ically stable for the K corresponding to this solution (Bertsekas, 1987,
pp.58–64).
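When a numerical solution is needed, the steady-state Riccati equation (1.18)
can be solved directly. A minimal sketch (an illustration assuming SciPy is
available, not a prescription from the book):

    import numpy as np
    from scipy.linalg import solve_discrete_are

    def lqr_gain(A, B, Q, R):
        Pi = solve_discrete_are(A, B, Q, R)     # solves the DARE (1.18)
        K = -np.linalg.solve(B.T @ Pi @ B + R, B.T @ Pi @ A)
        return K, Pi

For the example above, eig(A + BK) computed this way is approximately
{0.664, 0.001}, matching the converged Riccati iteration.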
This basic approach to establishing regulator stability will be gener-
alized in Chapter 2 to handle constrained and nonlinear systems, so it

is helpful for the new student to first become familiar with these ideas
in the unconstrained, linear setting. For linear systems, asymptotic
convergence is equivalent to asymptotic stability, and we delay the dis-
cussion of stability until Chapter 2. In Chapter 2 the optimal cost is
shown to be a Lyapunov function for the closed-loop system. We also
can strengthen the stability for linear systems from asymptotic stability
to exponential stability based on the form of the Lyapunov function.
The LQR convergence result in Lemma 1.3 is the simplest to estab-
lish, but we can enlarge the class of systems and penalties for which
closed-loop stability is guaranteed. The system restriction can be weak-
ened from controllability to stabilizability, which is discussed in Exer-
cises 1.19 and 1.20. The restriction on the allowable state penalty Q
can be weakened from Q > 0 to Q ≥ 0 and (A, Q) detectable, which
is also discussed in Exercise 1.20. The restriction R > 0 is retained to
ensure uniqueness of the control law. In applications, if one cares little
about the cost of the control, then R is chosen to be small, but positive
definite.

1.4 Introductory State Estimation


The next topic is state estimation. In most applications, the variables
that are conveniently or economically measurable (y) are a small sub-
set of the variables required to model the system (x). Moreover, the
measurement is corrupted with sensor noise and the state evolution
is corrupted with process noise. Determining a good state estimate
for use in the regulator in the face of a noisy and incomplete output
measurement is a challenging task. That is the challenge of state esti-
mation.
To fully appreciate the fundamentals of state estimation, we must
address the fluctuations in the data. Probability theory has proven it-
self as the most successful and versatile approach to modeling these
fluctuations. In this section we introduce the probability fundamentals
necessary to develop an optimal state estimator in the simplest possi-
ble setting: a linear discrete time model subject to normally distributed
process and measurement noise. This optimal state estimator is known
as the Kalman filter (Kalman, 1960a). In Chapter 4 we revisit the state
estimation problem in a much wider setting, and consider nonlinear
models and constraints on the system that preclude an analytical solu-
tion such as the Kalman filter. The probability theory presented here
is also preparation for understanding that chapter.

1.4.1 Linear Systems and Normal Distributions

This section summarizes the probability and random variable results


required for deriving a linear optimal estimator such as the Kalman fil-
ter. We assume that the reader is familiar with the concepts of a random
variable, probability density and distribution, the multivariate normal
distribution, mean and variance, statistical independence, and condi-
tional probability. Readers unfamiliar with these terms should study
the material in Appendix A before reading this and the next sections.
In the following discussion let x, y, and z be vectors of random
variables. We use the notation
x ∼ N(m, P )
px (x) = n(x, m, P )
to denote random variable x is normally distributed with mean m and
covariance (or simply variance) P , in which
 
n(x, m, P ) = (1/((2π)^{n/2} (det P )^{1/2})) exp( −(1/2)(x − m)′ P −1 (x − m) )        (1.19)
and det P denotes the determinant of matrix P . Note that if x ∈ Rn ,
then m ∈ Rn and P ∈ Rn×n is a positive definite matrix. We require
three main results. The simplest version can be stated as follows.
Joint independent normals. If x and y are normally distributed and
(statistically) independent5
x ∼ N(mx , Px ) y ∼ N(my , Py )
then their joint density is given by
px,y (x, y) = n(x, mx , Px ) n(y, my , Py )
" # " # " #!
x mx Px 0
∼N , (1.20)
y my 0 Py
Note that, depending on convenience, we use both (x, y) and the
vector [x; y] to denote the pair of random variables.

Linear transformation of a normal. If x is normally distributed with


mean m and variance P , and y is a linear transformation of x,
y = Ax, then y is distributed with mean Am and variance AP A′
x ∼ N(m, P ) y = Ax y ∼ N(Am, AP A′ ) (1.21)
5 We may emphasize that two vectors of random variables are independent using sta-

tistically independent to distinguish this concept from linear independence of vectors.



Conditional of a joint normal. If x and y are jointly normally distributed


as " # " #" #!
x mx Px Pxy
∼N
y my Pyx Py

then the conditional density of x given y is also normal

px|y (x|y) = n(x, m, P ) (1.22)

in which the mean is

m = mx + Pxy Py−1 (y − my )

and the covariance is

P = Px − Pxy Py−1 Pyx

Note that the conditional mean m is itself a random variable because


it depends on the random variable y.
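As a small numerical illustration of (1.22) (a sketch, assuming NumPy; the
function and argument names are not from the book), the conditional mean and
covariance are computed directly from the joint mean and covariance blocks:

    import numpy as np

    def conditional_normal(mx, my, Px, Pxy, Py, yobs):
        G = Pxy @ np.linalg.inv(Py)
        m = mx + G @ (yobs - my)        # conditional mean
        P = Px - G @ Pxy.T              # conditional covariance (Pyx = Pxy')
        return m, P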
To derive the optimal estimator, we actually require these three
main results conditioned on additional random variables. The anal-
ogous results are the following.

Joint independent normals. If px|z (x|z) is normal, and y is statisti-


cally independent of x and z and normally distributed

px|z (x|z) = n(x, mx , Px )


y ∼ N(my , Py ) y independent of x and z

then the conditional joint density of (x, y) given z is

px,y|z (x, y|z) = n(x, mx , Px ) n(y, my , Py )

px,y|z ( [x; y] | z ) = n( [x; y], [mx ; my ], [Px  0; 0  Py ] )        (1.23)

Linear transformation of a normal.

px|z (x|z) = n(x, m, P ) y = Ax


py|z (y|z) = n(y, Am, AP A′ ) (1.24)

Conditional of a joint normal. If x and y are jointly normally distributed


as " # ! " # " # " #!
x x mx Px Pxy
px,y|z z =n , ,
y y my Pyx Py
then the conditional density of x given y, z is also normal

px|y,z (x|y, z) = n(x, m, P ) (1.25)

in which

m = mx + Pxy Py−1 (y − my )
P = Px − Pxy Py−1 Pyx

1.4.2 Linear Optimal State Estimation

We start by assuming the initial state x(0) is normally distributed with


some mean and covariance

x(0) ∼ N(x̄(0), Q(0))

In applications, we often do not know x̄(0) or Q(0). In such cases we
often set x̄(0) = 0 and choose a large value for Q(0) to indicate our
lack of prior knowledge. The choice of a large variance prior forces the
upcoming y(k) measurements to determine the state estimate x̂(k).
Combining the measurement. We obtain noisy measurement y(0)
satisfying
y(0) = Cx(0) + v(0)
in which v(0) ∼ N(0, R) is the measurement noise. If the measurement
process is quite noisy, then R is large. If the measurements are highly
accurate, then R is small. We choose a zero mean for v because all
of the deterministic effects with nonzero mean are considered part
of the model, and the measurement noise reflects what is left after
all these other effects have been considered. Given the measurement
y(0), we want to obtain the conditional density px(0)|y(0) (x(0)|y(0)).
This conditional density describes the change in our knowledge about
x(0) after we obtain measurement y(0). This step is the essence of
state estimation. To derive this conditional density, first consider the
pair of variables (x(0), y(0)) given as
" # " #" #
x(0) I 0 x(0)
=
y(0) C I v(0)

We assume that the noise v(0) is statistically independent of x(0),


and use the independent joint normal result (1.20) to express the joint
density of (x(0), v(0))
" # " # " #!
x(0) x(0) Q(0) 0
∼N ,
v(0) 0 0 R

From the previous equation, the pair (x(0), y(0)) is a linear transfor-
mation of the pair (x(0), v(0)). Therefore, using the linear transfor-
mation of normal result (1.21), and the density of (x(0), v(0)) gives
the density of (x(0), y(0))
" # " # " #!
x(0) x(0) Q(0) Q(0)C ′
∼N ,
y(0) Cx(0) CQ(0) CQ(0)C ′ + R

Given this joint density, we then use the conditional of a joint normal
result (1.22) to obtain

px(0)|y(0) (x(0)|y(0)) = n(x(0), m, P )

in which

m = x̄(0) + L(0)( y(0) − C x̄(0) )
L(0) = Q(0)C ′ (CQ(0)C ′ + R)−1
P = Q(0) − Q(0)C ′ (CQ(0)C ′ + R)−1 CQ(0)

We see that the conditional density px(0)|y(0) is normal. The optimal


state estimate is the value of x(0) that maximizes this conditional den-
sity. For a normal, that is the mean, and we choose x̂(0) = m. We
also denote the variance in this conditional after measurement y(0)
by P (0) = P with P given in the previous equation. The change in
variance after measurement (Q(0) to P (0)) quantifies the information
increase by obtaining measurement y(0). The variance after measure-
ment, P (0), is always less than or equal to Q(0), which implies that we
can only gain information by measurement; but the information gain
may be small if the measurement device is poor and the measurement
noise variance R is large.
Forecasting the state evolution. Next we consider the state evolution
from k = 0 to k = 1, which satisfies
" #
h i x(0)
x(1) = A I
w(0)

in which w(0) ∼ N(0, Q) is the process noise. If the state is subjected to


large disturbances, then Q is large, and if the disturbances are small, Q
is small. Again we choose zero mean for w because the nonzero-mean
disturbances should have been accounted for in the system model. We
next calculate the conditional density px(1)|y(0) . Now we require the
conditional version of the joint density (x(0), w(0)). We assume that
the process noise w(0) is statistically independent of both x(0) and
v(0), hence it is also independent of y(0), which is a linear combination
of x(0) and v(0). Therefore we use (1.23) to obtain
" # " # " #!
x(0) x̂(0) P (0) 0
∼N ,
w(0) 0 0 Q

We then use the conditional version of the linear transformation of a


normal (1.24) to obtain

px(1)|y(0) (x(1)|y(0)) = n(x(1), x̂ − (1), P − (1))

in which the mean and variance are

x̂ − (1) = Ax̂(0) P − (1) = AP (0)A′ + Q

We see that forecasting forward one time step may increase or decrease
the conditional variance of the state. If the eigenvalues of A are less
than unity, for example, the term AP (0)A′ may be smaller than P (0),
but the process noise Q adds a positive contribution. If the system is
unstable, AP (0)A′ may be larger than P (0), and then the conditional
variance definitely increases upon forecasting. See also Exercise 1.27
for further discussion of this point.
Given that px(1)|y(0) is also a normal, we are situated to add mea-
surement y(1) and continue the process of adding measurements fol-
lowed by forecasting forward one time step until we have processed
all the available data. Because this process is recursive, the storage re-
quirements are small. We need to store only the current state estimate
and variance, and can discard the measurements as they are processed.
The required online calculation is minor. These features make the op-
timal linear estimator an ideal candidate for rapid online application.
We next summarize the state estimation recursion.

Summary. Denote the measurement trajectory by



y(k) := ( y(0), y(1), . . . , y(k) )

At time k the conditional density with data y(k − 1) is normal

px(k)|y(k−1) (x(k)|y(k − 1)) = n(x(k), x̂ − (k), P − (k))

and we denote the mean and variance with a superscript minus to in-
dicate these are the statistics before measurement y(k). At k = 0, the
recursion starts with x̂ − (0) = x(0) and P − (0) = Q(0) as discussed
previously. We obtain measurement y(k) which satisfies
" # " #" #
x(k) I 0 x(k)
=
y(k) C I v(k)

The density of (x(k), v(k)) follows from (1.23) since measurement


noise v(k) is independent of x(k) and y(k − 1)
" # " # " #!
x(k) x̂ − (k) P − (k) 0
∼N ,
v(k) 0 0 R

Equation (1.24) then gives the joint density


" # " # " #!
x(k) x̂ − (k) P − (k) P − (k)C ′
∼N − ,
y(k) C x̂ (k) CP − (k) CP − (k)C ′ + R

We note ( y(k − 1), y(k) ) = y(k), and using the conditional density
result (1.25) gives

px(k)|y(k) (x(k)|y(k)) = n (x(k), x̂(k), P (k))

in which

x̂(k) = x̂⁻(k) + L(k)( y(k) − C x̂⁻(k) )
L(k) = P⁻(k)C ′ (CP⁻(k)C ′ + R)−1
P (k) = P⁻(k) − P⁻(k)C ′ (CP⁻(k)C ′ + R)−1 CP⁻(k)

We forecast from k to k + 1 using the model


" #
h i x(k)
x(k + 1) = A I
w(k)

Because w(k) is independent of x(k) and y(k), the joint density of


(x(k), w(k)) follows from a second use of (1.23)
" # " # " #!
x(k) x̂(k) P (k) 0
∼N ,
w(k) 0 0 Q

and a second use of the linear transformation result (1.24) gives

px(k+1)|y(k) (x(k + 1)|y(k)) = n(x(k + 1), x̂ − (k + 1), P − (k + 1))

in which

x̂ − (k + 1) = Ax̂(k)
P − (k + 1) = AP (k)A′ + Q

and the recursion is complete.
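The recursion maps directly into code. A minimal sketch (assuming NumPy;
names are illustrative, not from the book) of one measurement/forecast cycle:

    import numpy as np

    def kalman_step(xhat_minus, P_minus, y, A, C, Q, R):
        """Add measurement y(k), then forecast to time k+1."""
        S = C @ P_minus @ C.T + R
        L = P_minus @ C.T @ np.linalg.inv(S)            # filter gain L(k)
        xhat = xhat_minus + L @ (y - C @ xhat_minus)    # estimate after y(k)
        P = P_minus - L @ C @ P_minus                   # P(k)
        xhat_minus_next = A @ xhat                      # forecast mean
        P_minus_next = A @ P @ A.T + Q                  # forecast covariance
        return xhat, P, xhat_minus_next, P_minus_next

The recursion is started with the prior mean and covariance, x̂⁻(0) = x̄(0)
and P⁻(0) = Q(0), as discussed above.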

1.4.3 Least Squares Estimation

We next consider the state estimation problem as a deterministic op-


timization problem rather than an exercise in maximizing conditional
density. This viewpoint proves valuable in Chapter 4 when we wish to
add constraints to the state estimator. Consider a time horizon with
measurements y(k), k = 0, 1, . . . , T . We consider the prior information
to be our best initial guess of the initial state x(0), denoted x̄(0), and
weighting matrices P − (0), Q, and R for the initial state, process distur-
bance, and measurement disturbance. A reasonably flexible choice for
objective function is


VT (x(T )) = (1/2)( |x(0) − x̄(0)|²_{(P⁻(0))⁻¹}
        + Σ_{k=0}^{T−1} |x(k + 1) − Ax(k)|²_{Q⁻¹} + Σ_{k=0}^{T} |y(k) − Cx(k)|²_{R⁻¹} )        (1.26)

in which x(T ) := (x(0), x(1), . . . , x(T )). We claim and then show that
the following (deterministic) least squares optimization problem pro-
duces the same result as the conditional density function maximization
of the Kalman filter
min_{x(T)} VT (x(T ))        (1.27)

Game plan. Using forward DP, we can decompose and solve recur-
sively the least squares state estimation problem. To see clearly how
the procedure works, first we write out the terms in the state estimation
least squares problem (1.27)


1 2 2 2
min |x(0) − x(0)|(P − (0))−1 + y(0) − Cx(0) R −1 +|x(1) − Ax(0)|Q−1
x(0),...,x(T ) 2
2 2
+ y(1) − Cx(1) R −1 + |x(2) − Ax(1)|Q−1 + · · · +

2 2
|x(T ) − Ax(T − 1)|Q−1 + y(T ) − Cx(T ) R −1 (1.28)

We decompose this T -stage optimization problem with forward DP.


First we combine the prior and the measurement y(0) into the quad-
ratic function V0 (x(0)) as shown in the following equation

min_{x(T),...,x(1)} min_{x(0)} (1/2)( |x(0) − x̄(0)|²_{(P⁻(0))⁻¹} + |y(0) − Cx(0)|²_{R⁻¹} + |x(1) − Ax(0)|²_{Q⁻¹}
        + |y(1) − Cx(1)|²_{R⁻¹} + |x(2) − Ax(1)|²_{Q⁻¹} + · · ·
        + |x(T ) − Ax(T − 1)|²_{Q⁻¹} + |y(T ) − Cx(T )|²_{R⁻¹} )

in which the prior and measurement terms are the ones combined into V0 (x(0)),
and the inner minimization over x(0) (including the term |x(1) − Ax(0)|²_{Q⁻¹})
defines the arrival cost V1⁻ (x(1)).

Then we optimize over the first state, x(0). This produces the arrival
cost for the first stage, V1− (x(1)), which we will show is also quadratic

V1⁻ (x(1)) = (1/2)( |x(1) − x̂⁻(1)|²_{(P⁻(1))⁻¹} + d(0) )

Next we combine the arrival cost of the first stage with the next mea-
surement y(1) to obtain V1 (x(1))

min_{x(T),...,x(2)} min_{x(1)} (1/2)( |x(1) − x̂⁻(1)|²_{(P⁻(1))⁻¹} + d(0) + |y(1) − Cx(1)|²_{R⁻¹} + |x(2) − Ax(1)|²_{Q⁻¹}
        + |y(2) − Cx(2)|²_{R⁻¹} + |x(3) − Ax(2)|²_{Q⁻¹} + · · ·
        + |x(T ) − Ax(T − 1)|²_{Q⁻¹} + |y(T ) − Cx(T )|²_{R⁻¹} )        (1.29)

in which the arrival cost and measurement terms are the ones combined into
V1 (x(1)), and the inner minimization over x(1) (including the term
|x(2) − Ax(1)|²_{Q⁻¹}) defines the arrival cost V2⁻ (x(2)).

We optimize over the second state, x(1), which defines arrival cost for
the first two stages, V2− (x(2)). We continue in this fashion until we
have optimized finally over x(T ) and have solved (1.28). Now that we
have in mind an overall game plan for solving the problem, we look at
each step in detail and develop the recursion formulas of forward DP.

Combine prior and measurement. Combining the prior and mea-


surement defines V0
 
V0 (x(0)) = (1/2)( |x(0) − x̄(0)|²_{(P⁻(0))⁻¹} + |y(0) − Cx(0)|²_{R⁻¹} )        (1.30)

in which the first term is the prior term and the second is the measurement term.

which can be expressed also as



V0 (x(0)) = (1/2)( |x(0) − x̄(0)|²_{(P⁻(0))⁻¹} + |(y(0) − C x̄(0)) − C(x(0) − x̄(0))|²_{R⁻¹} )

Using the third form in Example 1.1 we can combine these two terms
into a single quadratic function
V0 (x(0)) = (1/2)( (x(0) − x̄(0) − v)′ H̃ −1 (x(0) − x̄(0) − v) + d(0) )

in which

v = P⁻(0)C ′ (CP⁻(0)C ′ + R)−1 ( y(0) − C x̄(0) )
H̃ = P⁻(0) − P⁻(0)C ′ (CP⁻(0)C ′ + R)−1 CP⁻(0)
d(0) = |y(0) − C x̄(0)|²_{(CP⁻(0)C ′ + R)⁻¹}

If we define

P (0) = P − (0) − P − (0)C ′ (CP − (0)C ′ + R)−1 CP − (0)


L(0) = P − (0)C ′ (CP − (0)C ′ + R)−1

and define the state estimate x̂(0) as follows

x̂(0) = x̄(0) + v
     = x̄(0) + L(0)( y(0) − C x̄(0) )

then we have the following compact expression for the function V0 .

V0 (x(0)) = (1/2)( |x(0) − x̂(0)|²_{P(0)⁻¹} + d(0) )

State evolution and arrival cost. Now we add the next term in (1.28)
to the function V0 (·) and denote the sum as V (·)

V (x(0), x(1)) = V0 (x(0)) + (1/2)|x(1) − Ax(0)|²_{Q⁻¹}
              = (1/2)( |x(0) − x̂(0)|²_{P(0)⁻¹} + |x(1) − Ax(0)|²_{Q⁻¹} + d(0) )

Again using the third form in Example 1.1, we can add the two quadrat-
ics to obtain
V (x(0), x(1)) = (1/2)( |x(0) − v|²_{H̃⁻¹} + d )

in which

v = x̂(0) + P (0)A′ (AP (0)A′ + Q)−1 (x(1) − Ax̂(0))
d = (x(1) − Ax̂(0))′ (AP (0)A′ + Q)−1 (x(1) − Ax̂(0)) + d(0)
H̃ = P (0) − P (0)A′ (AP (0)A′ + Q)−1 AP (0)
This form is convenient for optimization over the first decision variable
x(0); by inspection the solution is x(0) = v and the cost is (1/2)d. We
define the arrival cost to be the result of this optimization
V1⁻ (x(1)) = min_{x(0)} V (x(0), x(1))

and we have that


V1⁻ (x(1)) = (1/2)( |x(1) − x̂⁻(1)|²_{(P⁻(1))⁻¹} + d(0) )
with
x̂ − (1) = Ax̂(0)
P − (1) = AP (0)A′ + Q
Combine arrival cost and measurement. We now combine the ar-
rival cost and measurement for the next stage of the optimization to
obtain
V1 (x(1)) = V1⁻ (x(1)) + (1/2)|y(1) − Cx(1)|²_{R⁻¹}
          = (1/2)( |x(1) − x̂⁻(1)|²_{(P⁻(1))⁻¹} + |y(1) − Cx(1)|²_{R⁻¹} + d(0) )

in which the first term plays the role of the prior and the second that of the
measurement.
We can see that this equation has exactly the same form as (1.30) of the previ-
ous step, and, by simply changing the variable names, we have that
P (1) = P − (1) − P − (1)C ′ (CP − (1)C ′ + R)−1 CP − (1)
L(1) = P − (1)C ′ (CP − (1)C ′ + R)−1
x̂(1) = x̂ − (1) + L(1)(y(1) − C x̂ − (1))
d(1) = d(0) + |y(1) − C x̂ − (1)|²_{(CP − (1)C ′ + R)⁻¹}

and the cost function V1 is defined as


V1 (x(1)) = (1/2)( |x(1) − x̂(1)|²_{P(1)⁻¹} + d(1) )

Recursion and termination. The recursion can be summarized by


two steps. Adding the measurement at time k produces

P (k) = P − (k) − P − (k)C ′ (CP − (k)C ′ + R)−1 CP − (k)


L(k) = P − (k)C ′ (CP − (k)C ′ + R)−1
x̂(k) = x̂ − (k) + L(k)(y(k) − C x̂ − (k))
d(k) = d(k − 1) + |y(k) − C x̂ − (k)|²_{(CP − (k)C ′ + R)⁻¹}

Propagating the model to time k + 1 produces

x̂ − (k + 1) = Ax̂(k)
P − (k + 1) = AP (k)A′ + Q

and the recursion starts with the prior information x̂ − (0) = x̄(0) and
P − (0). The arrival cost, Vk− , and arrival cost plus measurement, Vk , for
each stage are given by
Vk⁻ (x(k)) = (1/2)( |x(k) − x̂ − (k)|²_{(P − (k))⁻¹} + d(k − 1) )

Vk (x(k)) = (1/2)( |x(k) − x̂(k)|²_{(P(k))⁻¹} + d(k) )

The process terminates with the final measurement y(T ), at which


point we have recursively solved the original problem (1.28).
We see by inspection that the recursion formulas given by forward
DP of (1.28) are the same as those found by calculating the conditional
density function in Section 1.4.2. Moreover, the conditional densities
before and after measurement are closely related to the least squares
value functions as shown below

p(x(k)|y(k − 1)) = (1/((2π)^{n/2} (det P − (k))^{1/2})) exp( −(Vk⁻ (x(k)) − (1/2)d(k − 1)) )

p(x(k)|y(k)) = (1/((2π)^{n/2} (det P (k))^{1/2})) exp( −(Vk (x(k)) − (1/2)d(k)) )        (1.31)

The discovery (and rediscovery) of the close connection between re-


cursive least squares and optimal statistical estimation has not always
been greeted happily by researchers:

The recursive least squares approach was actually inspired


by probabilistic results that automatically produce an equa-
tion of evolution for the estimate (the conditional mean).
In fact, much of the recent least squares work did nothing
more than rederive the probabilistic results (perhaps in an
attempt to understand them). As a result, much of the least
squares work contributes very little to estimation theory.
—Jazwinski (1970, pp.152–153)

In contrast with this view, we find both approaches valuable in the


subsequent development. The probabilistic approach, which views the
state estimator as maximizing conditional density of the state given
measurement, offers the most insight. It provides a rigorous basis for
comparing different estimators based on the variance of their estimate
error. It also specifies what information is required to define an op-
timal estimator, with variances Q and R of primary importance. In
the probabilistic framework, these parameters should be found from
modeling and data. The main deficiency in the least squares viewpoint
is that the objective function, although reasonable, is ad hoc and not
justified. The choice of weighting matrices Q and R is arbitrary. Practi-
tioners generally choose these parameters based on a tradeoff between
the competing goals of speed of estimator response and insensitivity
to measurement noise. But a careful statement of this tradeoff often
just leads back to the probabilistic viewpoint in which the process dis-
turbance and measurement disturbance are modeled as normal distri-
butions. If we restrict attention to unconstrained linear systems, the
probabilistic viewpoint is clearly superior.
Approaching state estimation with the perspective of least squares
pays off, however, when the models are significantly more complex. It
is generally intractable to find and maximize the conditional density of
the state given measurements for complex, nonlinear and constrained
models. Although the state estimation problem can be stated in the
language of probability, it cannot be solved with current methods. But
reasonable objective functions can be chosen for even complex, nonlin-
ear and constrained models. Moreover, knowing which least squares
problems correspond to which statistically optimal estimation prob-
lems for the simple linear case, provides the engineer with valuable in-
sight in choosing useful objective functions for nonlinear estimation.
We explore these more complex and realistic estimation problems in
Chapter 4. The perspective of least squares also leads to succinct ar-
guments for establishing estimator stability, which we take up shortly.

Figure 1.5: Schematic of the moving horizon estimation problem. (The figure
sketches the time axis from 0 to T ; full information estimation uses all data
y(0), . . . , y(T ), while the moving horizon window covers only the states
x(T − N), . . . , x(T ) and measurements y(T − N), . . . , y(T ).)

First we consider situations in which it is advantageous to use moving


horizon estimation.

1.4.4 Moving Horizon Estimation

When using nonlinear models or considering constraints on the esti-


mates, we cannot calculate the conditional density recursively in closed
form as we did in Kalman filtering. Similarly, we cannot solve recur-
sively the least squares problem. If we use least squares we must opti-
mize all the states in the trajectory x(T ) simultaneously to obtain the
state estimates. This optimization problem becomes computationally
intractable as T increases. Moving horizon estimation (MHE) removes
this difficulty by considering only the most recent N measurements and
finds only the most recent N values of the state trajectory as sketched
in Figure 1.5. The states to be estimated are xN (T ) = (x(T − N), . . . ,
x(T )) given measurements yN (T ) = (y(T − N), . . . , y(T )). The data
have been broken into two sections with (y(T − N − 1), yN (T )) = y(T ).
We assume here that T ≥ N − 1 to ignore the initial period in which the
estimation window fills with measurements and assume that the win-
dow is always full.
The simplest form of MHE is the following least squares problem

min_{xN (T )} V̂T (xN (T ))        (1.32)

in which the objective function is

V̂T (xN (T )) = (1/2)( Σ_{k=T−N}^{T−1} |x(k + 1) − Ax(k)|²_{Q⁻¹} + Σ_{k=T−N}^{T} |y(k) − Cx(k)|²_{R⁻¹} )        (1.33)

We use the circumflex (hat) to indicate this is the MHE cost function
considering data sequence from T − N to T rather than the full infor-
mation or least squares cost considering the data from 0 to T .
MHE in terms of least squares. Notice that from our previous DP
recursion in (1.29), we can write the full least squares problem as

VT (xN (T )) = VT−−N (x(T − N)) +
        (1/2)( Σ_{k=T−N}^{T−1} |x(k + 1) − Ax(k)|²_{Q⁻¹} + Σ_{k=T−N}^{T} |y(k) − Cx(k)|²_{R⁻¹} )

in which VT−−N (·) is the arrival cost at time T − N. Comparing these


two objective functions, it is clear that the simplest form of MHE is
equivalent to setting up a full least squares problem, but then setting
the arrival cost function VT−−N (·) to zero.
MHE in terms of conditional density. Because we have established
the close connection between least squares and conditional density in
(1.31), we can write the full least squares problem also as an equivalent
conditional density maximization

max_{x(T )} px(T )|yN (T ) (x(T )|yN (T ))

with prior density

px(T −N)|y(T −N−1) (x|y(T − N − 1)) = c exp(−VT−−N (x)) (1.34)

in which the constant c can be found from (1.19) if desired, but its
value does not change the solution to the optimization. We can see
from (1.34) that setting VT−−N (·) to zero in the simplest form of MHE is
equivalent to giving infinite variance to the conditional density of x(T −
N)|y(T − N − 1). This means we are using no information about the
state x(T −N) and completely discounting the previous measurements
y(T − N − 1).

To provide a more flexible MHE problem, we therefore introduce a


penalty on the first state to account for the neglected data y(T − N − 1)

V̂T (xN (T )) = ΓT −N (x(T − N)) +
        (1/2)( Σ_{k=T−N}^{T−1} |x(k + 1) − Ax(k)|²_{Q⁻¹} + Σ_{k=T−N}^{T} |y(k) − Cx(k)|²_{R⁻¹} )

For the linear Gaussian case, we can account for the neglected data
exactly with no approximation by setting Γ equal to the arrival cost, or,
equivalently, the negative logarithm of the conditional density of the
state given the prior measurements. Indeed, there is no need to use
MHE for the linear Gaussian problem at all because we can solve the
full problem recursively. When addressing nonlinear and constrained
problems in Chapter 4, however, we must approximate the conditional
density of the state given the prior measurements in MHE to obtain a
computationally tractable and high-quality estimator.
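For the simplest form of MHE above (zero prior penalty), the window problem
(1.32)-(1.33) is just a finite least squares problem in the stacked window
states. A minimal sketch (assuming NumPy and SciPy; the interface is an
illustrative assumption, not the book's code) is

    import numpy as np
    from scipy.optimize import least_squares

    def mhe_window(A, C, Q, R, y_win, x_guess):
        """y_win: list of N+1 measurements; x_guess: (N+1, n) initial trajectory."""
        n = A.shape[0]
        Qih = np.linalg.cholesky(np.linalg.inv(Q)).T   # square root of Q^{-1}
        Rih = np.linalg.cholesky(np.linalg.inv(R)).T   # square root of R^{-1}

        def residuals(z):
            x = z.reshape(-1, n)
            r = [Rih @ (y - C @ xk) for y, xk in zip(y_win, x)]
            r += [Qih @ (x[k + 1] - A @ x[k]) for k in range(len(x) - 1)]
            return np.concatenate(r)

        sol = least_squares(residuals, x_guess.reshape(-1))
        return sol.x.reshape(-1, n)        # estimates of x(T−N), ..., x(T)

A nonzero prior penalty ΓT −N (·) would simply contribute one additional
residual on the first window state.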

1.4.5 Observability

We next explore the convergence properties of the state estimators.


For this we require the concept of system observability. The basic idea
of observability is that any two distinct states can be distinguished by
applying some input and observing the two system outputs over some
finite time interval (Sontag, 1998, p.262–263). We discuss this general
definition in more detail when treating nonlinear systems in Chapter
4, but observability for linear systems is much simpler. First of all, the
applied input is irrelevant and we can set it to zero. Therefore consider
the linear time-invariant system (A, C) with zero input

x(k + 1) = Ax(k)
y(k) = Cx(k)

The system is observable if there exists a finite N, such that for every

x(0), N measurements y(0), y(1), . . . , y(N − 1) distinguish uniquely
the initial state x(0). Similarly to the case of controllability, if we can-
not determine the initial state using n measurements, we cannot de-
termine it using N > n measurements. Therefore we can develop a
convenient test for observability as follows. For n measurements, the

system model gives


   
[y(0); y(1); . . . ; y(n − 1)] = [C; CA; . . . ; CA^{n−1}] x(0)        (1.35)

The question of observability is therefore a question of uniqueness of


solutions to these linear equations. The matrix appearing in this equa-
tion is known as the observability matrix O
 
O = [C; CA; . . . ; CA^{n−1}]        (1.36)

From the fundamental theorem of linear algebra, we know the solution


to (1.35) is unique if and only if the columns of the np × n observability
matrix are linearly independent.6 Therefore, we have that the system
(A, C) is observable if and only if

rank(O) = n

The following result for checking observability also proves useful (Hau-
tus, 1972).

Lemma 1.4 (Hautus lemma for observability). A system is observable if


and only if " #
λI − A
rank =n for all λ ∈ C (1.37)
C

in which C is the set of complex numbers.

Notice that the first n rows of the matrix in (1.37) are linearly inde-
pendent if λ ∉ eig(A), so (1.37) is equivalent to checking the rank at
just the eigenvalues of A
" #
λI − A
rank =n for all λ ∈ eig(A)
C

6 See Section A.4 of Appendix A or (Strang, 1980, pp.87–88) for a review of this result.

1.4.6 Convergence of the State Estimator

Next we consider the question of convergence of the estimates of sev-


eral of the estimators we have considered. The simplest convergence
question to ask is the following. Given an initial estimate error, and
zero state and measurement noises, does the state estimate converge to
the state as time increases and more measurements become available?
If the answer to this question is yes, we say the estimates converge;
sometimes we say the estimator converges. As with the regulator, op-
timality of an estimator does not ensure its stability. Consider the case
A = I, C = 0. The optimal estimate is x̂(k) = x̄(0), which does not
converge to the true state unless we have luckily chosen x̄(0) = x(0).7
Obviously the lack of stability is caused by our choosing an unobserv-
able (undetectable) system.
We treat first the Kalman filtering or full least squares problem. Re-
call that this estimator optimizes over the entire state trajectory x(T ) :=

(x(0), . . . , x(T )) based on all measurements y(T ) := (y(0), . . . , y(T )).
In order to establish convergence, the following result on the optimal
estimator cost function proves useful.
Lemma 1.5 (Convergence of estimator cost). Given noise-free measure-

ments y(T ) = (Cx(0), CAx(0), . . . , CA^T x(0)), the optimal estimator
cost VT0 (y(T )) converges as T → ∞.
Proof. Denote the optimal state sequence at time T given measurement
y(T ) by
(x̂(0|T ), x̂(1|T ), . . . , x̂(T |T ))
We wish to compare the optimal costs at time T and T − 1. Therefore,
consider using the first T − 1 elements of the solution at time T as
decision variables in the state estimation problem at time T − 1. The
cost for those decision variables at time T − 1 is given by
 
V⁰T − (1/2)( |x̂(T |T ) − Ax̂(T − 1|T )|²_{Q⁻¹} + |y(T ) − C x̂(T |T )|²_{R⁻¹} )
In other words, we have the full cost at time T and we deduct the cost
of the last stage, which is not present at T − 1. Now this choice of
decision variables is not necessarily optimal at time T − 1, so we have
the inequality
 
1 2
VT0−1 ≤ VT0 − |x̂(T |T ) − Ax̂(T − 1|T )|2Q−1 + y(T ) − C x̂(T |T ) R−1
2
7 If we could count on that kind of luck, we would have no need for state estimation.

Because the quadratic terms are nonnegative, the sequence of opti-


mal estimator costs is nondecreasing with increasing T . We can es-
tablish that the optimal cost is bounded above as follows: at any time

T we can choose the decision variables to be (x(0), Ax(0), . . . , A^T x(0)),
which achieves cost |x(0) − x̄(0)|²_{(P⁻(0))⁻¹} independent of T . The opti-
mal cost sequence is nondecreasing and bounded above and, therefore,
converges. ■

The optimal estimator cost converges regardless of system observ-


ability. But if we want the optimal estimate to converge to the state, we
have to restrict the system further. The following lemma provides an
example of what is required.

Lemma 1.6 (Estimator convergence). For (A, C) observable, Q, R > 0,



and noise-free measurements y(T ) = (Cx(0), CAx(0), . . . , CA^T x(0)),
the optimal linear state estimate converges to the state

x̂(T ) → x(T ) as T → ∞

Proof. To compress the notation somewhat, let ŵT (j) = x̂(T + j +


1|T + n − 1) − Ax̂(T + j|T + n − 1). Using the optimal solution at time
T + n − 1 as decision variables at time T − 1 allows us to write the
following inequality

V⁰T−1 ≤ V⁰T+n−1 −
        (1/2)( Σ_{j=−1}^{n−2} |ŵT (j)|²_{Q⁻¹} + Σ_{j=0}^{n−1} |y(T + j) − C x̂(T + j|T + n − 1)|²_{R⁻¹} )

Because the sequence of optimal costs converges with increasing T ,


and Q−1 , R −1 > 0, we have established that for increasing T

ŵT (j) → 0 j = −1, . . . , n − 2


y(T + j) − C x̂(T + j|T + n − 1) → 0 j = 0, . . . , n − 1 (1.38)

From the system model we have the following relationship between the
last n stages in the optimization problem at time T + n − 1 with data

y(T + n − 1)
   
[x̂(T |T + n − 1); x̂(T + 1|T + n − 1); . . . ; x̂(T + n − 1|T + n − 1)]
    = [I; A; . . . ; A^{n−1}] x̂(T |T + n − 1)
      + [0        0        · · ·  0;
         I        0        · · ·  0;
         ⋮                 ⋱      ⋮;
         A^{n−2}  A^{n−3}  · · ·  I] [ŵT (0); ŵT (1); . . . ; ŵT (n − 2)]        (1.39)

We note the measurements satisfy


 
[y(T ); y(T + 1); . . . ; y(T + n − 1)] = O x(T )

Multiplying (1.39) by C and subtracting gives

 
[y(T ) − C x̂(T |T + n − 1); y(T + 1) − C x̂(T + 1|T + n − 1); . . . ;
 y(T + n − 1) − C x̂(T + n − 1|T + n − 1)]
    = O( x(T ) − x̂(T |T + n − 1) )
      − [0         0         · · ·  0;
         C         0         · · ·  0;
         ⋮                   ⋱      ⋮;
         CA^{n−2}  CA^{n−3}  · · ·  C] [ŵT (0); ŵT (1); . . . ; ŵT (n − 2)]

Applying (1.38) to this equation, we conclude O(x(T ) − x̂(T |T + n −


1)) → 0 with increasing T . Because the observability matrix has inde-
pendent columns, we conclude x(T ) − x̂(T |T + n − 1) → 0 as T → ∞.
Thus we conclude that the smoothed estimate x̂(T |T +n−1) converges
to the state x(T ). Because the ŵT (j) terms go to zero with increasing
T , the last line of (1.39) gives x̂(T +n−1|T +n−1) → An−1 x̂(T |T +n−1)
as T → ∞. From the system model An−1 x(T ) = x(T +n−1) and, there-
fore, after replacing T + n − 1 by T , we have

x̂(T |T ) → x(T ) as T → ∞

and asymptotic convergence of the estimator is established. ■



This convergence result also covers MHE with prior weighting set to
the exact arrival cost because that is equivalent to Kalman filtering and
full least squares. The simplest form of MHE, which discounts prior
data completely, is also a convergent estimator, however, as discussed
in Exercise 1.28.
The estimator convergence result in Lemma 1.6 is the simplest to
establish, but, as in the case of the LQ regulator, we can enlarge the
class of systems and weighting matrices (variances) for which estimator
convergence is guaranteed. The system restriction can be weakened
from observability to detectability, which is discussed in Exercises 1.31
and 1.32. The restriction on the process disturbance weight (variance)
Q can be weakened from Q > 0 to Q ≥ 0 and (A, Q) stabilizable, which
is discussed in Exercise 1.33. The restriction R > 0 remains to ensure
uniqueness of the estimator.

1.5 Tracking, Disturbances, and Zero Offset


In the last section of this chapter we show briefly how to use the MPC
regulator and MHE estimator to handle different kinds of control prob-
lems, including setpoint tracking and rejecting nonzero disturbances.

1.5.1 Tracking

It is a standard objective in applications to use a feedback controller


to move the measured outputs of a system to a specified and constant
setpoint. This problem is known as setpoint tracking. In Chapter 5
we consider the case in which the system is nonlinear and constrained,
but for simplicity here we consider the linear unconstrained system
in which ysp is an arbitrary constant. In the regulation problem of
Section 1.3 we assumed that the goal was to take the state of the system
to the origin. Such a regulator can be used to treat the setpoint tracking
problem with a coordinate transformation. Denote the desired output
setpoint as ysp . Denote a steady state of the system model as (xs , us ).
From (1.5), the steady state satisfies
" #
h i x
s
I−A −B =0
us

For unconstrained systems, we also impose the requirement that the


steady state satisfies Cxs = ysp for the tracking problem, giving the

set of equations " #" # " #


I−A −B xs 0
= (1.40)
C 0 us ysp
If this set of equations has a solution, we can then define deviation
variables

x̃(k) = x(k) − xs
ũ(k) = u(k) − us

that satisfy the dynamic model

x̃(k + 1) = x(k + 1) − xs
         = Ax(k) + Bu(k) − (Axs + Bus )
x̃(k + 1) = Ax̃(k) + Bũ(k)

so that the deviation variables satisfy the same model equation as the
original variables. The zero regulation problem applied to the system in
deviation variables finds ũ(k) that takes x̃(k) to zero, or, equivalently,
which takes x(k) to xs , so that at steady state, Cx(k) = Cxs = ysp ,
which is the goal of the setpoint tracking problem. After solving the
regulation problem in deviation variables, the input applied to the sys-
tem is u(k) = ũ(k) + us .
We next discuss when we can solve (1.40). We also note that for con-
strained systems, we must impose the constraints on the steady state
(xs , us ). The matrix in (1.40) is a (n + p) × (n + m) matrix. For (1.40)
to have a solution for all ysp , it is sufficient that the rows of the ma-
trix are linearly independent. That requires p ≤ m: we require at least
as many inputs as outputs with setpoints. But it is not uncommon in
applications to have many more measured outputs than manipulated
inputs. To handle these more general situations, we choose a matrix
H and denote a new variable r = Hy as a selection of linear combi-
nations of the measured outputs. The variable r ∈ Rnc is known as
the controlled variable. For cases in which p > m, we choose some set
of outputs nc ≤ m, as controlled variables, and assign setpoints to r ,
denoted rsp .
We also wish to treat systems with more inputs than outputs, m > p.
For these cases, the solution to (1.40) may exist for some choice of H
and rsp , but cannot be unique. If we wish to obtain a unique steady
state, then we also must provide desired values for the steady inputs,
usp . To handle constrained systems, we simply impose the constraints
on (xs , us ).
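For the simplest unconstrained case with as many controlled variables as
inputs, the square system (1.40) can be solved directly. A minimal sketch
(assuming NumPy; names are illustrative, not from the book):

    import numpy as np

    def steady_state_target(A, B, C, ysp):
        n, m = B.shape
        p = C.shape[0]                      # assumes p = m here
        M = np.block([[np.eye(n) - A, -B],
                      [C, np.zeros((p, m))]])
        rhs = np.concatenate([np.zeros(n), np.atleast_1d(ysp)])
        z = np.linalg.solve(M, rhs)         # requires M square and invertible
        return z[:n], z[n:]                 # xs, us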

Steady-state target problem. Our candidate optimization problem is


therefore

min_{xs ,us} (1/2)( |us − usp |²_{Rs} + |Cxs − ysp |²_{Qs} )        (1.41a)

subject to

[I − A   −B; HC   0] [xs ; us ] = [0; rsp ]        (1.41b)
Eus ≤ e        (1.41c)
F Cxs ≤ f        (1.41d)

We make the following assumptions.

Assumption 1.7 (Target feasibility and uniqueness).


(a) The target problem is feasible for the controlled variable setpoints
of interest rsp .

(b) The steady-state input penalty Rs is positive definite.

Assumption 1.7 (a) ensures that the solution (xs , us ) exists, and
Assumption 1.7 (b) ensures that the solution is unique. If one chooses
nc = 0, then no controlled variables are required to be at setpoint, and
the problem is feasible for any (usp , ysp ) because (xs , us ) = (0, 0) is a
feasible point. Exercises 1.56 and 1.57 explore the connection between
feasibility of the equality constraints and the number of controlled vari-
ables relative to the number of inputs and outputs. One restriction is
that the number of controlled variables chosen to be offset free must
be less than or equal to the number of manipulated variables and the
number of measurements, nc ≤ m and nc ≤ p.

Dynamic regulation problem. Given the steady-state solution, we de-


fine the following multistage objective function

V (x̃(0), ũ) = (1/2) Σ_{k=0}^{N−1} ( |x̃(k)|²_Q + |ũ(k)|²_R )        s.t. x̃ + = Ax̃ + Bũ

in which x̃(0) = x̂(k) − xs , i.e., the initial condition for the regula-
tion problem comes from the state estimate shifted by the steady-state
xs . The regulator solves the following dynamic, zero-state regulation
problem

min_{ũ} V (x̃(0), ũ)

subject to

Eũ ≤ e − Eus
F C x̃ ≤ f − F Cxs

in which the constraints also are shifted by the steady state (xs , us ).
The optimal cost and solution are V⁰(x̃(0)) and ũ⁰(x̃(0)). The mov-
ing horizon control law uses the first move of this optimal sequence,
ũ⁰(x̃(0)) = ũ⁰(0; x̃(0)), so the controller output is u(k) = ũ⁰(x̃(0)) +
us .
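For the unconstrained case, the regulation problem in deviation variables is
solved by the infinite horizon LQR gain of Section 1.3, and the resulting
tracking controller is just a shifted state feedback. A one-line sketch (an
illustration, assuming K, xs , and us have been computed as above):

    def tracking_control(xhat, K, xs, us):
        return us + K @ (xhat - xs)       # u(k) = us + K(x̂(k) − xs)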

1.5.2 Disturbances and Zero Offset

Another common objective in applications is to use a feedback con-


troller to compensate for an unmeasured disturbance to the system
with the input so the disturbance’s effect on the controlled variable
is mitigated. This problem is known as disturbance rejection. We may
wish to design a feedback controller that compensates for nonzero dis-
turbances such that the selected controlled variables asymptotically ap-
proach their setpoints without offset. This property is known as zero
offset. In this section we show a simple method for constructing an
MPC controller to achieve zero offset.
In Chapter 5, we address the full problem. Here we must be content
to limit our objective. We will ensure that if the system is stabilized in
the presence of the disturbance, then there is zero offset. But we will
not attempt to construct the controller that ensures stabilization over
an interesting class of disturbances. That topic is treated in Chapter 5.
This more limited objective is similar to what one achieves when us-
ing the integral mode in proportional-integral-derivative (PID) control
of an unconstrained system: either there is zero steady offset, or the
system trajectory is unbounded. In a constrained system, the state-
ment is amended to: either there is zero steady offset, or the system
trajectory is unbounded, or the system constraints are active at steady
state. In both constrained and unconstrained systems, the zero-offset
property precludes one undesirable possibility: the system settles at
an unconstrained steady state, and the steady state displays offset in
the controlled variables.
A simple method to compensate for an unmeasured disturbance is
to (i) model the disturbance, (ii) use the measurements and model to
estimate the disturbance, and (iii) find the inputs that minimize the
effect of the disturbance on the controlled variables. The choice of

disturbance model is motivated by the zero-offset goal. To achieve


offset-free performance we augment the system state with an integrat-
ing disturbance d driven by a white noise wd

d+ = d + wd (1.42)

This choice is motivated by the works of Davison and Smith (1971,


1974); Qiu and Davison (1993) and the Internal Model Principle of Fran-
cis and Wonham (1976). To remove offset, one designs a control sys-
tem that can remove asymptotically constant, nonzero disturbances
(Davison and Smith, 1971), (Kwakernaak and Sivan, 1972, p.278). To
accomplish this end, the original system is augmented with a replicate
of the constant, nonzero disturbance model, (1.42). Thus the states of
the original system are moved onto the manifold that cancels the effect
of the disturbance on the controlled variables. The augmented system
model used for the state estimator is given by
" #+ " #" # " #
x A Bd x B
= + u+w (1.43a)
d 0 I d 0
" #
h i x
y = C Cd +v (1.43b)
d

and we are free to choose how the integrating disturbance affects the
states and measured outputs through the choice of Bd and Cd . The only
restriction is that the augmented system is detectable. That restriction
can be easily checked using the following result.

Lemma 1.8 (Detectability of the augmented system). The augmented


system (1.43) is detectable if and only if the unaugmented system (A, C)
is detectable, and the following condition holds
" #
I − A −Bd
rank = n + nd (1.44)
C Cd

Corollary 1.9 (Dimension of the disturbance). The maximal dimension


of the disturbance d in (1.43) such that the augmented system is de-
tectable is equal to the number of measurements, that is

nd ≤ p

A pair of matrices (Bd , Cd ) such that (1.44) is satisfied always exists.
In fact, since (A, C) is detectable, the submatrix [I − A; C] ∈ R^{(p+n)×n}
has rank n. Thus, we can choose any nd ≤ p columns in R^{p+n} independent
of the columns of [I − A; C] for [−Bd ; Cd ].
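Checking (1.44) for a candidate disturbance model is a one-line rank
computation. A small sketch (assuming NumPy; the function name is an
illustrative assumption):

    import numpy as np

    def augmented_rank_ok(A, C, Bd, Cd, tol=1e-9):
        n, nd = A.shape[0], Bd.shape[1]
        M = np.block([[np.eye(n) - A, -Bd],
                      [C, Cd]])
        return np.linalg.matrix_rank(M, tol) == n + nd

For example, the output disturbance model Bd = 0, Cd = I with nd = p
satisfies (1.44) whenever I − A is nonsingular, i.e., A has no eigenvalue at
unity.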
The state and the additional integrating disturbance are estimated
from the plant measurement using a Kalman filter designed for the
augmented system. The variances of the stochastic disturbances w
and v may be treated as adjustable parameters or found from input-
output measurements (Odelson, Rajamani, and Rawlings, 2006). The
estimator provides x̂(k) and d̂(k) at each time k. The best forecast of
the steady-state disturbance using (1.42) is simply

d̂s = d̂(k)

The steady-state target problem is therefore modified to account for


the nonzero disturbance d̂s
 
min_{xs ,us} (1/2)( |us − usp |²_{Rs} + |Cxs + Cd d̂s − ysp |²_{Qs} )        (1.45a)

subject to

[I − A   −B; HC   0] [xs ; us ] = [Bd d̂s ; rsp − HCd d̂s ]        (1.45b)
Eus ≤ e        (1.45c)
F Cxs ≤ f − F Cd d̂s        (1.45d)

Comparing (1.41) to (1.45), we see the disturbance model affects the


steady-state target determination in four ways.

1. The output target is modified in (1.45a) to account for the effect of the disturbance on the measured output (y_sp → y_sp − C_d d̂_s).

2. The output constraint in (1.45d) is similarly modified (f → f − F C_d d̂_s).

3. The system steady-state relation in (1.45b) is modified to account for the effect of the disturbance on the state evolution (0 → B_d d̂_s).

4. The controlled variable target in (1.45b) is modified to account for the effect of the disturbance on the controlled variable (r_sp → r_sp − H C_d d̂_s).
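To make the target calculation concrete, here is a minimal Octave/MATLAB sketch of the equality-constrained part of (1.45) for the square case (n_c = m) in which the inequality constraints (1.45c)-(1.45d) are inactive, so that (1.45b) determines (x_s, u_s) directly; all matrices are supplied by the reader and the function name target is arbitrary.

% Minimal sketch: steady-state target from (1.45b) with inactive inequality
% constraints and nc = m, solving
%   [I-A, -B; H*C, 0] [xs; us] = [Bd*dhat; rsp - H*Cd*dhat]
target = @(A, B, C, H, Bd, Cd, dhat, rsp) ...
    [eye(size(A,1)) - A, -B; H*C, zeros(size(H,1), size(B,2))] \ ...
    [Bd*dhat; rsp - H*Cd*dhat];
% zs = target(A, B, C, H, Bd, Cd, dhat, rsp);  xs = zs(1:n);  us = zs(n+1:end);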

Given the steady-state target, the same dynamic regulation problem as


presented in the tracking section, Section 1.5, is used for the regulator.

[Figure 1.6 is a block diagram: the target selector (tuning (Q_s, R_s)) receives (y_sp, u_sp, r_sp) and the estimates (x̂, d̂) and passes (x_s, u_s) to the receding horizon regulator (deviation model x̃^+ = A x̃ + B ũ, tuning (Q, R)), which applies u to the plant with output y; the estimator updates

\begin{bmatrix} \hat{x} \\ \hat{d} \end{bmatrix}^{+} = \begin{bmatrix} A & B_d \\ 0 & I \end{bmatrix}\begin{bmatrix} \hat{x} \\ \hat{d} \end{bmatrix} + \begin{bmatrix} B \\ 0 \end{bmatrix} u + \begin{bmatrix} \tilde{L}_x \\ \tilde{L}_d \end{bmatrix}\left( y - \begin{bmatrix} C & C_d \end{bmatrix}\begin{bmatrix} \hat{x} \\ \hat{d} \end{bmatrix} \right) ]

Figure 1.6: MPC controller consisting of: receding horizon regulator, state estimator, and target selector; for simplicity we show the steady-state Kalman predictor form of the state estimator where x̂ := x̂(k | k−1) and L̃_x := A L_x + B_d L_d and L̃_d := L_d.

In other words, the regulator is based on the deterministic system (A,


B) in which the current state is x̂(k) − xs and the goal is to take the
system to the origin.
The following lemma summarizes the offset-free control property
of the combined control system.
Lemma 1.10 (Offset-free control). Consider a system controlled by the
MPC algorithm as shown in Figure 1.6. The target problem (1.45) is
assumed feasible. Augment the system model with a number of inte-
grating disturbances equal to the number of measurements (nd = p);
choose any B_d ∈ R^{n×p}, C_d ∈ R^{p×p} such that

\operatorname{rank} \begin{bmatrix} I - A & -B_d \\ C & C_d \end{bmatrix} = n + p

If the plant output y(k) goes to steady state ys , the closed-loop system is
stable, and constraints are not active at steady state, then there is zero
offset in the controlled variables, that is

Hys = rsp

The proof of this lemma is given in Pannocchia and Rawlings (2003).


It may seem surprising that the number of integrating disturbances

must be equal to the number of measurements used for feedback rather


than the number of controlled variables to guarantee offset-free con-
trol. To gain insight into the reason, consider the disturbance part
(bottom half) of the Kalman filter equations shown in Figure 1.6

\hat{d}^{+} = \hat{d} + L_d \left( y - \begin{bmatrix} C & C_d \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{d} \end{bmatrix} \right)

Because of the integrator, the disturbance estimate cannot converge until

L_d \left( y - \begin{bmatrix} C & C_d \end{bmatrix} \begin{bmatrix} \hat{x} \\ \hat{d} \end{bmatrix} \right) = 0

But notice this condition merely restricts the output prediction error
to lie in the nullspace of the matrix Ld , which is an nd × p matrix. If
we choose nd = nc < p, then the number of columns of Ld is greater
than the number of rows and Ld has a nonzero nullspace.8 In general,
we require the output prediction error to be zero to achieve zero offset
independently of the regulator tuning. For Ld to have only the zero
vector in its nullspace, we require nd ≥ p. Since we also know nd ≤ p
from Corollary 1.9, we conclude nd = p.
Notice also that Lemma 1.10 does not require that the plant output
be generated by the model. The theorem applies regardless of what
generates the plant output. If the plant is identical to the system plus
disturbance model assumed in the estimator, then the conclusion can
be strengthened. In the nominal case without measurement or process
noise (w = 0, v = 0), for a set of plant initial states, the closed-loop sys-
tem converges to a steady state and the feasible steady-state target is
achieved leading to zero offset in the controlled variables. Characteriz-
ing the set of initial states in the region of convergence, and stabilizing
the system when the plant and the model differ, are treated in Chap-
ters 3 and 5. We conclude the chapter with a nonlinear example that
demonstrates the use of Lemma 1.10.

Example 1.11: More measured outputs than inputs and zero offset
We consider a well-stirred chemical reactor depicted in Figure 1.7, as
in Pannocchia and Rawlings (2003). An irreversible, first-order reac-
tion A → B occurs in the liquid phase and the reactor temperature is

8 This is another consequence of the fundamental theorem of linear algebra. The


result is depicted in Figure A.1.

[Figure 1.7 is a schematic: feed F_0, T_0, c_0; coolant temperature T_c; reactor radius r, contents at T, c; outlet flow F.]

Figure 1.7: Schematic of the well-stirred reactor.

regulated with external cooling. Mass and energy balances lead to the
following nonlinear state space model
 
\frac{dc}{dt} = \frac{F_0 (c_0 - c)}{\pi r^2 h} - k_0 \exp\left(-\frac{E}{RT}\right) c

\frac{dT}{dt} = \frac{F_0 (T_0 - T)}{\pi r^2 h} + \frac{-\Delta H}{\rho C_p}\, k_0 \exp\left(-\frac{E}{RT}\right) c + \frac{2U}{r \rho C_p}(T_c - T)

\frac{dh}{dt} = \frac{F_0 - F}{\pi r^2}
The controlled variables are h, the level of the tank, and c, the molar
concentration of species A. The additional state variable is T , the re-
actor temperature; while the manipulated variables are Tc , the coolant
liquid temperature, and F , the outlet flowrate. Moreover, it is assumed
that the inlet flowrate acts as an unmeasured disturbance. The model
parameters in nominal conditions are reported in Table 1.1. The open-
loop stable steady-state operating conditions are the following

c_s = 0.878 kmol/m³    T_s = 324.5 K    h_s = 0.659 m
T_{cs} = 300 K    F_s = 0.1 m³/min

Using a sampling time of 1 min, a linearized discrete state space model


is obtained and, assuming that all the states are measured, the state
space variables are
   
x = \begin{bmatrix} c - c_s \\ T - T_s \\ h - h_s \end{bmatrix} \qquad u = \begin{bmatrix} T_c - T_{cs} \\ F - F_s \end{bmatrix} \qquad y = \begin{bmatrix} c - c_s \\ T - T_s \\ h - h_s \end{bmatrix} \qquad p = F_0 - F_{0s}

Parameter   Nominal value   Units
F_0         0.1             m³/min
T_0         350             K
c_0         1               kmol/m³
r           0.219           m
k_0         7.2 × 10¹⁰      min⁻¹
E/R         8750            K
U           54.94           kJ/(min·m²·K)
ρ           1000            kg/m³
C_p         0.239           kJ/(kg·K)
ΔH          −5 × 10⁴        kJ/kmol

Table 1.1: Parameters of the well-stirred reactor.

The corresponding linear model is

x(k + 1) = Ax(k) + Bu(k) + Bp p


y(k) = Cx(k)

in which

A = \begin{bmatrix} 0.2681 & -0.00338 & -0.00728 \\ 9.703 & 0.3279 & -25.44 \\ 0 & 0 & 1 \end{bmatrix} \qquad C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

B = \begin{bmatrix} -0.00537 & 0.1655 \\ 1.297 & 97.91 \\ 0 & -6.637 \end{bmatrix} \qquad B_p = \begin{bmatrix} -0.1175 \\ 69.74 \\ 6.637 \end{bmatrix}
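Because parts (a) and (c) below call for simulating the nonlinear plant, a minimal Octave/MATLAB sketch of the right-hand side of the nonlinear model, using the nominal parameters of Table 1.1, is included here; the function name cstr_rhs is arbitrary.

% Minimal sketch: nonlinear CSTR right-hand side with x = [c; T; h], u = [Tc; F],
% and inlet flowrate F0 as the disturbance; parameters from Table 1.1.
function dxdt = cstr_rhs(x, u, F0)
  T0 = 350; c0 = 1; r = 0.219; k0 = 7.2e10; EoR = 8750;
  U = 54.94; rho = 1000; Cp = 0.239; dH = -5e4;
  c = x(1); T = x(2); h = x(3); Tc = u(1); F = u(2);
  k = k0*exp(-EoR/T);
  dxdt = [ F0*(c0 - c)/(pi*r^2*h) - k*c;
           F0*(T0 - T)/(pi*r^2*h) + (-dH/(rho*Cp))*k*c + 2*U/(r*rho*Cp)*(Tc - T);
           (F0 - F)/(pi*r^2) ];
end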

(a) Since we have two inputs, Tc and F , we try to remove offset in


two controlled variables, c and h. Model the disturbance with two
integrating output disturbances on the two controlled variables.
Assume that the covariances of the state noises are zero except
for the two integrating states. Assume that the covariances of the
three measurements’ noises are also zero.
Notice that although there are only two controlled variables, this
choice of two integrating disturbances does not follow the pre-
scription of Lemma 1.10 for zero offset.
Simulate the response of the controlled system after a 10% in-
crease in the inlet flowrate F0 at time t = 10 min. Use the nonlin-

ear differential equations for the plant model. Do you have steady
offset in any of the outputs? Which ones?

(b) Follow the prescription of Lemma 1.10 and choose a disturbance


model with three integrating modes. Can you choose three inte-
grating output disturbances for this plant? If so, prove it. If not,
state why not.

(c) Again choose a disturbance model with three integrating modes;


choose two integrating output disturbances on the two controlled
variables. Choose one integrating input disturbance on the outlet
flowrate F . Is the augmented system detectable?
Simulate again the response of the controlled system after a 10%
increase in the inlet flowrate F0 at time t = 10 min. Again use the
nonlinear differential equations for the plant model. Do you have
steady offset in any of the outputs? Which ones?
Compare and contrast the closed-loop performance for the design
with two integrating disturbances and the design with three inte-
grating disturbances. Which control system do you recommend
and why?

Solution
(a) Integrating disturbances are added to the two controlled variables
(first and third outputs) by choosing
 
C_d = \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \qquad B_d = 0
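A minimal Octave/MATLAB sketch of the corresponding augmented estimator design follows; A and C are the linear model matrices given above, and, because a Riccati iteration with exactly zero noise covariances is ill conditioned, small regularizing covariances are used here in place of the exact zeros stated in the problem.

% Minimal sketch: steady-state Kalman gain for the augmented system with nd = 2.
Bd = zeros(3,2);  Cd = [1 0; 0 0; 0 1];
Aaug = [A, Bd; zeros(2,3), eye(2)];
Caug = [C, Cd];
Qw = blkdiag(1e-6*eye(3), eye(2));   % noise mainly on the integrating states
Rv = 1e-6*eye(3);                    % small measurement noise for a well-posed filter
P  = eye(5);
for i = 1:5000                       % iterate the estimator Riccati equation
  P = Aaug*P*Aaug' + Qw - Aaug*P*Caug'*((Caug*P*Caug' + Rv)\(Caug*P*Aaug'));
end
L = P*Caug'/(Caug*P*Caug' + Rv);     % filter gain; Lx = L(1:3,:), Ld = L(4:5,:)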

The results with two integrating disturbances are shown in Fig-


ures 1.8 and 1.9. Notice that despite adding integrating distur-
bances to the two controlled variables, c and h, both of these con-
trolled variables as well as the third output, T , all display nonzero
offset at steady state.

(b) A third integrating disturbance is added to the second output


giving

C_d = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \qquad B_d = 0

[Figure 1.8 plots c (kmol/m³), T (K), and h (m) versus time (min).]

Figure 1.8: Three measured outputs versus time after a step change in inlet flowrate at 10 minutes; n_d = 2.

[Figure 1.9 plots T_c (K) and F (m³/min) versus time (min).]

Figure 1.9: Two manipulated inputs versus time after a step change in inlet flowrate at 10 minutes; n_d = 2.

[Figure 1.10 plots c (kmol/m³), T (K), and h (m) versus time (min).]

Figure 1.10: Three measured outputs versus time after a step change in inlet flowrate at 10 minutes; n_d = 3.

The augmented system is not detectable with this disturbance model. The rank of \begin{bmatrix} I-A & -B_d \\ C & C_d \end{bmatrix} is only 5 instead of 6. The problem here is that the system level is itself an integrator, and we cannot distinguish h from the integrating disturbance added to h.

(c) Next we try three integrating disturbances: two added to the two
controlled variables, and one added to the second manipulated
variable
   
C_d = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \qquad B_d = \begin{bmatrix} 0 & 0 & 0.1655 \\ 0 & 0 & 97.91 \\ 0 & 0 & -6.637 \end{bmatrix}

The augmented system is detectable for this disturbance model.


The results for this choice of three integrating disturbances are
shown in Figures 1.10 and 1.11. Notice that we have zero offset in
the two controlled variables, c and h, and have successfully forced
the steady-state effect of the inlet flowrate disturbance entirely
into the second output, T .
Notice also that the dynamic behavior of all three outputs is supe-
rior to that achieved with the model using two integrating distur-
bances. The true disturbance, which is a step at the inlet flowrate,

[Figure 1.11 plots T_c (K) and F (m³/min) versus time (min).]

Figure 1.11: Two manipulated inputs versus time after a step change in inlet flowrate at 10 minutes; n_d = 3.

is better represented by including the integrator in the outlet


flowrate. With a more accurate disturbance model, better over-
all control is achieved. The controller uses smaller manipulated
variable action and also achieves better output variable behavior.
An added bonus is that steady offset is removed in the maximum
possible number of outputs. □

Further notation

G          transfer function matrix
m          mean of normally distributed random variable
T          reactor temperature
ũ          input deviation variable
x, y, z    spatial coordinates for a distributed system
x̃          state deviation variable

1.6 Exercises

Exercise 1.1: State space form for chemical reaction model


Consider the following chemical reaction kinetics for a two-step series reaction
A \xrightarrow{k_1} B \qquad B \xrightarrow{k_2} C

We wish to follow the reaction in a constant volume, well-mixed, batch reactor. As


taught in the undergraduate chemical engineering curriculum, we proceed by writing
material balances for the three species giving
\frac{dc_A}{dt} = -r_1 \qquad \frac{dc_B}{dt} = r_1 - r_2 \qquad \frac{dc_C}{dt} = r_2
in which c_j is the concentration of species j, and r_1 and r_2 are the rates (mol/(time·vol)) at which the two reactions occur. We then assume some rate law for the reaction kinetics, such as

r_1 = k_1 c_A \qquad r_2 = k_2 c_B
We substitute the rate laws into the material balances and specify the starting concen-
trations to produce three differential equations for the three species concentrations.

(a) Write the linear state space model for the deterministic series chemical reaction
model. Assume we can measure the component A concentration. What are x,
y, A, B, C, and D for this model?

(b) Simulate this model with initial conditions and parameters given by

c_{A0} = 1 \qquad c_{B0} = c_{C0} = 0 \qquad k_1 = 2 \qquad k_2 = 1
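A minimal Octave/MATLAB sketch for part (b), using the matrix exponential to evaluate the solution of the linear model on a grid of times, is shown below; the plotting details are incidental.

% Minimal sketch: simulate the series reaction A -> B -> C with x = [cA; cB; cC].
k1 = 2;  k2 = 1;
A  = [-k1 0 0; k1 -k2 0; 0 k2 0];
x0 = [1; 0; 0];
t  = linspace(0, 5, 101);
x  = zeros(3, numel(t));
for i = 1:numel(t)
  x(:,i) = expm(A*t(i))*x0;          % exact solution of dx/dt = A x
end
plot(t, x);  xlabel('time');  legend('c_A', 'c_B', 'c_C');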

Exercise 1.2: Distributed systems and time delay


We assume familiarity with the transfer function of a time delay from an undergraduate
systems course
y(s) = e^{-\theta s} u(s)
Let’s see the connection between the delay and the distributed systems, which give rise
to it. A simple physical example of a time delay is the delay caused by transport in a
flowing system. Consider plug flow in a tube depicted in Figure 1.12.
(a) Write down the equation of change for moles of component j for an arbitrary
volume element and show that
\frac{\partial c_j}{\partial t} = -\nabla \cdot (c_j v_j) + R_j

[Figure 1.12 is a schematic: plug flow with velocity v from z = 0 to z = L; c_j(0, t) = u(t), c_j(L, t) = y(t).]

Figure 1.12: Plug-flow reactor.



in which cj is the molar concentration of component j, vj is the velocity of


component j, and Rj is the production rate of component j due to chemical
reaction.9
Plug flow means the fluid velocity of all components is purely in the z direction,
and is independent of r and θ and, we assume here, z
v_j = v \delta_z

(b) Assuming plug flow and neglecting chemical reaction in the tube, show that the
equation of change reduces to
\frac{\partial c_j}{\partial t} = -v \frac{\partial c_j}{\partial z}    (1.46)
This equation is known as a hyperbolic, first-order partial differential equation.
Assume the boundary and initial conditions are
c_j(z, t) = u(t) \qquad z = 0, \quad t \geq 0    (1.47)
c_j(z, t) = c_{j0}(z) \qquad 0 \leq z \leq L, \quad t = 0    (1.48)
In other words, we are using the feed concentration as the manipulated variable,
u(t), and the tube starts out with some initial concentration profile of compo-
nent j, cj0 (z).

(c) Show that the solution to (1.46) with these boundary conditions is
c_j(z, t) = \begin{cases} u(t - z/v) & vt > z \\ c_{j0}(z - vt) & vt < z \end{cases}    (1.49)

(d) If the reactor starts out empty of component j, show that the transfer function
between the outlet concentration, y = cj (L, t), and the inlet concentration, cj (0,
t) = u(t), is a time delay. What is the value of θ?

Exercise 1.3: Pendulum in state space


Consider the pendulum suspended at the end of a rigid link depicted in Figure 1.13. Let
r and θ denote the polar coordinates of the center of the pendulum, and let p = r δr be
the position vector of the pendulum, in which δr and δθ are the unit vectors in polar
coordinates. We wish to determine a state space description of the system. We are
able to apply a torque T to the pendulum as our manipulated variable. The pendulum
has mass m, the only other external force acting on the pendulum is gravity, and we
neglect friction. The link provides force −tδr necessary to maintain the pendulum at
distance r = R from the axis of rotation, and we measure this force t.
(a) Provide expressions for the four partial derivatives for changes in the unit vec-
tors with r and θ
\frac{\partial \delta_r}{\partial r} \qquad \frac{\partial \delta_r}{\partial \theta} \qquad \frac{\partial \delta_\theta}{\partial r} \qquad \frac{\partial \delta_\theta}{\partial \theta}

(b) Use the chain rule to find the velocity of the pendulum in terms of the time
derivatives of r and θ. Do not simplify yet by assuming r is constant. We want
the general result.
9 You will need the Gauss divergence theorem and 3D Leibniz formula to go from a mass balance on a volume element to the equation of continuity.

[Figure 1.13 is a schematic: pendulum of mass m at angle θ.]

Figure 1.13: Pendulum with applied torque.

(c) Differentiate again to show that the acceleration of the pendulum is


\ddot{p} = (\ddot{r} - r\dot{\theta}^2)\delta_r + (r\ddot{\theta} + 2\dot{r}\dot{\theta})\delta_\theta

(d) Use a momentum balance on the pendulum mass (you may assume it is a point mass) to determine both the force exerted by the link

t = mR\dot{\theta}^2 + mg\cos\theta

and an equation for the acceleration of the pendulum due to gravity and the applied torque

mR\ddot{\theta} - T/R + mg\sin\theta = 0

(e) Define a state vector and give a state space description of your system. What is
the physical significance of your state. Assume you measure the force exerted
by the link.
One answer is

\frac{dx_1}{dt} = x_2
\frac{dx_2}{dt} = -(g/R)\sin x_1 + u
y = mRx_2^2 + mg\cos x_1

in which u = T/(mR^2)

Exercise 1.4: Time to Laplace domain


Take the Laplace transform of the following set of differential equations and find the
transfer function, G(s), connecting u(s) and y(s), y = Gu
\frac{dx}{dt} = Ax + Bu
y = Cx + Du    (1.50)
For x ∈ Rn , y ∈ Rp , and u ∈ Rm , what is the dimension of the G matrix? What
happens to the initial condition, x(0) = x0 ?

Exercise 1.5: Converting between continuous and discrete time models


Given a prescribed u(t), derive and check the solution to (1.50). Given a prescribed
u(k) sequence, what is the solution to the discrete time model
x(k+1) = \tilde{A}x(k) + \tilde{B}u(k)
y(k) = \tilde{C}x(k) + \tilde{D}u(k)

(a) Compute \tilde{A}, \tilde{B}, \tilde{C}, and \tilde{D} so that the two solutions agree at the sample times for a zero-order hold input, i.e., y(k) = y(t_k) for u(t) = u(k), t ∈ (t_k, t_{k+1}) in which t_k = k∆ for sample time ∆.

(b) Is your result valid for A singular? If not, how can you find \tilde{A}, \tilde{B}, \tilde{C}, and \tilde{D} for this case?
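One standard way to carry out the conversion of part (a), which also covers the singular-A case of part (b), is the matrix exponential of an augmented matrix; a minimal Octave/MATLAB sketch (with A, B, and sample time Delta supplied by the reader) is

% Minimal sketch: zero-order-hold discretization via an augmented matrix exponential.
n = size(A,1);  m = size(B,2);
M = expm([A, B; zeros(m, n+m)]*Delta);
Atil = M(1:n, 1:n);                  % discrete time A
Btil = M(1:n, n+1:n+m);              % discrete time B
% Ctil = C and Dtil = D are unchanged by sampling.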

Exercise 1.6: Continuous to discrete time conversion for nonlinear models


Consider the autonomous nonlinear differential equation model
\frac{dx}{dt} = f(x, u) \qquad x(0) = x_0    (1.51)
Given a zero-order hold on the input, let s(t, u, x0 ), 0 ≤ t ≤ ∆, be the solution to (1.51)
given initial condition x0 at time t = 0, and constant input u is applied for t in the
interval 0 ≤ t ≤ ∆. Consider also the nonlinear discrete time model
x(k + 1) = F (x(k), u(k))
(a) What is the relationship between F and s so that the solution of the discrete
time model agrees at the sample times with the continuous time model with a
zero-order hold?

(b) Assume f is linear and apply this result to check the result of Exercise 1.5.

Exercise 1.7: Commuting functions of a matrix


Although matrix multiplication does not commute in general
AB ≠ BA
multiplication of functions of the same matrix do commute. You may have used the
following fact in Exercise 1.5
A−1 exp(At) = exp(At)A−1 (1.52)
(a) Prove that (1.52) is true assuming A has distinct eigenvalues and can therefore
be represented as
 
A = Q\Lambda Q^{-1} \qquad \Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}

in which Λ is a diagonal matrix containing the eigenvalues of A, and Q is the matrix of eigenvectors such that

Aq_i = \lambda_i q_i, \quad i = 1, \ldots, n

in which q_i is the ith column of matrix Q.

(b) Prove the more general relationship


f (A)g(A) = g(A)f (A) (1.53)
in which f and g are any functions definable by Taylor series.

(c) Prove that (1.53) is true without assuming the eigenvalues are distinct.
Hint: use the Taylor series defining the functions and apply the Cayley-Hamilton
theorem (Horn and Johnson, 1985, pp. 86–87).

Exercise 1.8: Finite difference formula and approximating the exponential


Instead of computing the exact conversion of a continuous time to a discrete time
system as in Exercise 1.5, assume instead one simply approximates the time derivative
with a first-order finite difference formula
\frac{dx}{dt} \approx \frac{x(t_{k+1}) - x(t_k)}{\Delta}

with step size equal to the sample time, ∆. For this approximation of the continuous time system, compute \tilde{A} and \tilde{B} so that the discrete time system agrees with the approximate continuous time system at the sample times. Comparing these answers to the exact solution, what approximation of e^{A\Delta} results from the finite difference approximation? When is this a good approximation of e^{A\Delta}?

Exercise 1.9: Mapping eigenvalues of continuous time systems to discrete


time systems
Consider the continuous time differential equation and discrete time difference equa-
tion
\frac{dx}{dt} = Ax
x^+ = \tilde{A}x

and the transformation

\tilde{A} = e^{A\Delta}
Consider the scalar A case.
(a) What A represents an integrator in continuous time? What is the corresponding \tilde{A} value for the integrator in discrete time?

(b) What A give purely oscillatory solutions? What are the corresponding \tilde{A}?

(c) For what A is the solution of the ODE stable? Unstable? What are the corresponding \tilde{A}?

(d) Sketch and label these A and \tilde{A} regions in two complex-plane diagrams.

Exercise 1.10: State space realization


Define a state vector and realize the following models as state space models by hand.
One should do a few by hand to understand what the Octave or MATLAB calls are doing.
Answer the following questions. What is the connection between the poles of G and the
state space description? For what kinds of G(s) does one obtain a nonzero D matrix?
What is the order and gain of these systems? Is there a connection between order and
the numbers of inputs and outputs?

(a) G(s) = \dfrac{1}{2s + 1}

(b) G(s) = \dfrac{1}{(2s + 1)(3s + 1)}

(c) G(s) = \dfrac{2s + 1}{3s + 1}

(d) y(k + 1) = y(k) + 2u(k)

(e) y(k + 1) = a_1 y(k) + a_2 y(k - 1) + b_1 u(k) + b_2 u(k - 1)

Exercise 1.11: Minimal realization


Find minimal realizations of the state space models you found by hand in Exercise 1.10.
Use Octave or MATLAB for computing minimal realizations. Were any of your hand
realizations nonminimal?

Exercise 1.12: Partitioned matrix inversion lemma


Let matrix Z be partitioned into

Z = \begin{bmatrix} B & C \\ D & E \end{bmatrix}

and assume Z^{-1}, B^{-1} and E^{-1} exist.

(a) Perform row elimination and show that

Z^{-1} = \begin{bmatrix} B^{-1} + B^{-1}C(E - DB^{-1}C)^{-1}DB^{-1} & -B^{-1}C(E - DB^{-1}C)^{-1} \\ -(E - DB^{-1}C)^{-1}DB^{-1} & (E - DB^{-1}C)^{-1} \end{bmatrix}

Note that this result is still valid if E is singular.

(b) Perform column elimination and show that

Z^{-1} = \begin{bmatrix} (B - CE^{-1}D)^{-1} & -(B - CE^{-1}D)^{-1}CE^{-1} \\ -E^{-1}D(B - CE^{-1}D)^{-1} & E^{-1} + E^{-1}D(B - CE^{-1}D)^{-1}CE^{-1} \end{bmatrix}

Note that this result is still valid if B is singular.

(c) A host of other useful control-related inversion formulas follow from these results. Equate the (1,1) or (2,2) entries of Z^{-1} and derive the identity

(A + BCD)^{-1} = A^{-1} - A^{-1}B(DA^{-1}B + C^{-1})^{-1}DA^{-1}    (1.54)

A useful special case of this result is

(I + X^{-1})^{-1} = I - (I + X)^{-1}

(d) Equate the (1,2) or (2,1) entries of Z^{-1} and derive the identity

(A + BCD)^{-1}BC = A^{-1}B(DA^{-1}B + C^{-1})^{-1}    (1.55)

Equations (1.54) and (1.55) prove especially useful in rearranging formulas in least squares estimation.

Exercise 1.13: Perturbation to an asymptotically stable linear system


Given the system
x + = Ax + Bu

If A is an asymptotically stable matrix, prove that if u(k) → 0, then x(k) → 0.

Exercise 1.14: Exponential stability of a perturbed linear system


Given the system
x + = Ax + Bu

If A is an asymptotically stable matrix, prove that if u(k) decreases exponentially to


zero, then x(k) decreases exponentially to zero.

Exercise 1.15: Are we going forward or backward today?


In the chapter we derived the solution to
\min_{w,x,y} f(w, x) + g(x, y) + h(y, z)

in which z is a fixed parameter using forward dynamic programming (DP)

y^0(z)
\tilde{x}^0(z) = x^0(y^0(z))
\tilde{w}^0(z) = w^0(x^0(y^0(z)))
(a) Solve for optimal w as a function of z using backward DP.

(b) Is forward or backward DP more efficient if you want optimal w as a function


of z?

Exercise 1.16: Method of Lagrange multipliers


Consider the objective function V (x) = (1/2)x ′ Hx + h′ x and optimization problem
\min_x V(x)    (1.56)

subject to
Dx = d
in which H > 0, x ∈ Rn , d ∈ Rm , m < n, i.e., fewer constraints than decisions. Rather
than partially solving for x using the constraint and eliminating it, we make use of the
method of Lagrange multipliers for treating the equality constraints (Fletcher, 1987;
Nocedal and Wright, 2006).
In the method of Lagrange multipliers, we augment the objective function with the
constraints to form the Lagrangian function, L
L(x, λ) = (1/2)x ′ Hx + h′ x − λ′ (Dx − d)
in which λ ∈ Rm is the vector of Lagrange multipliers. The necessary and sufficient
conditions for a global minimizer are that the partial derivatives of L with respect to x
and λ vanish (Nocedal and Wright, 2006, p. 451), (Fletcher, 1987, p.198,236).

(a) Show that the necessary and sufficient conditions are equivalent to the matrix equation

\begin{bmatrix} H & -D' \\ -D & 0 \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = -\begin{bmatrix} h \\ d \end{bmatrix}    (1.57)
The solution to (1.57) then provides the solution to the original problem (1.56).

(b) We note one other important feature of the Lagrange multipliers, their relation-
ship to the optimal cost of the purely quadratic case. For h = 0, the cost is given
by
V 0 = (1/2)(x 0 )′ Hx 0
Show that this can also be expressed in terms of λ0 by the following
V 0 = (1/2)d′ λ0

Exercise 1.17: Minimizing a constrained, quadratic function


Consider optimizing the positive definite quadratic function subject to a linear con-
straint
min(1/2)x ′ Hx s.t. Ax = b
x
Using the method of Lagrange multipliers presented in Exercise 1.16, show that the
optimal solution, multiplier, and cost are given by
x 0 = H −1 A′ (AH −1 A′ )−1 b
λ0 = (AH −1 A′ )−1 b
V 0 = (1/2)b′ (AH −1 A′ )−1 b

Exercise 1.18: Minimizing a partitioned quadratic function


Consider the partitioned constrained minimization

\min_{x_1, x_2} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}' \begin{bmatrix} H_1 & 0 \\ 0 & H_2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}

subject to

\begin{bmatrix} D & I \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = d
The solution to this optimization is required in two different forms, depending on
whether one is solving an estimation or regulation problem. Show that the solution
can be expressed in the following two forms if both H1 and H2 are full rank.
• Regulator form

V^0(d) = d'(H_2 - H_2 D(D'H_2 D + H_1)^{-1}D'H_2)d
x_1^0(d) = \tilde{K}d \qquad \tilde{K} = (D'H_2 D + H_1)^{-1}D'H_2
x_2^0(d) = (I - D\tilde{K})d

• Estimator form

V^0(d) = d'(DH_1^{-1}D' + H_2^{-1})^{-1}d
x_1^0(d) = \tilde{L}d \qquad \tilde{L} = H_1^{-1}D'(DH_1^{-1}D' + H_2^{-1})^{-1}
x_2^0(d) = (I - D\tilde{L})d

Exercise 1.19: Stabilizability and controllability canonical forms


Consider the partitioned system

\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^+ = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} B_1 \\ 0 \end{bmatrix} u

with (A11 , B1 ) controllable. This form is known as controllability canonical form.

(a) Show that the system is not controllable by checking the rank of the controlla-
bility matrix.

(b) Show that the modes x1 can be controlled from any x1 (0) to any x1 (n) with a se-
quence of inputs u(0), . . . , u(n−1), but the modes x2 cannot be controlled from
any x2 (0) to any x2 (n). The states x2 are termed the uncontrollable modes.

(c) If A22 is stable the system is termed stabilizable. Although not all modes can be
controlled, the uncontrollable modes are stable and decay to steady state.
The following lemma gives an equivalent condition for stabilizability.
Lemma 1.12 (Hautus lemma for stabilizability). A system is stabilizable if and only if

\operatorname{rank} \begin{bmatrix} \lambda I - A & B \end{bmatrix} = n \quad \text{for all } |\lambda| \geq 1

Prove this lemma using Lemma 1.2 as the condition for controllability.

Exercise 1.20: Regulator stability, stabilizable systems, and semidefinite


state penalty
(a) Show that the infinite horizon LQR is stabilizing for (A, B) stabilizable with R,
Q > 0.

(b) Show that the infinite horizon LQR is stabilizing for (A, B) stabilizable and R > 0,
Q ≥ 0, and (A, Q) detectable. Discuss what happens to the controller’s stabiliz-
ing property if Q is not positive semidefinite or (A, Q) is not detectable.

Exercise 1.21: Time-varying linear quadratic problem


Consider the time-varying version of the LQ problem solved in the chapter. The system
model is
x(k + 1) = A(k)x(k) + B(k)u(k)
The objective function also contains time-varying penalties
 
\min_u V(x(0), u) = \frac{1}{2}\left[ \sum_{k=0}^{N-1} \left( x(k)'Q(k)x(k) + u(k)'R(k)u(k) \right) + x(N)'Q(N)x(N) \right]

subject to the model. Notice the penalty on the final state is now simply Q(N) instead
of Pf .
Apply the DP argument to this problem and determine the optimal input sequence
and cost. Can this problem also be solved in closed form like the time-invariant case?

Exercise 1.22: Steady-state Riccati equation


Generate a random A and B for a system model for whatever n(≥ 3) and m(≥ 3) you
wish. Choose a positive semidefinite Q and positive definite R of the appropriate sizes.
(a) Iterate the DARE by hand with Octave or MATLAB until Π stops changing. Save
this result. Now call the MATLAB or Octave function to solve the steady-state
DARE. Do the solutions agree? Where in the complex plane are the eigenvalues
of A + BK? Increase the size of Q relative to R. Where do the eigenvalues move?

(b) Repeat for a singular A matrix. What happens to the two solution techniques?

(c) Repeat for an unstable A matrix.
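A minimal Octave/MATLAB sketch of the hand iteration requested in part (a) follows; A, B, Q, and R are whatever matrices the reader has generated.

% Minimal sketch: iterate the discrete algebraic Riccati equation to convergence.
Pi = Q;
for i = 1:10000
  Pinew = Q + A'*Pi*A - A'*Pi*B*((B'*Pi*B + R)\(B'*Pi*A));
  if norm(Pinew - Pi, 'fro') < 1e-12, break, end
  Pi = Pinew;
end
K = -(B'*Pi*B + R)\(B'*Pi*A);        % optimal feedback gain, u = K x
eig(A + B*K)                         % closed-loop eigenvalues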

Exercise 1.23: Positive definite Riccati iteration


If Π(k), Q, R > 0 in (1.10), show that Π(k − 1) > 0.
Hint: apply (1.54) to the term (B ′ Π(k)B + R)−1 .

Exercise 1.24: Existence and uniqueness of the solution to constrained least


squares
Consider the least squares problem subject to linear constraint
min(1/2)x ′ Qx subject to Ax = b
x

in which x ∈ R^n, b ∈ R^p, Q ∈ R^{n×n}, Q ≥ 0, A ∈ R^{p×n}. Show that this problem has a solution for every b and the solution is unique if and only if

\operatorname{rank}(A) = p \qquad \operatorname{rank} \begin{bmatrix} Q \\ A \end{bmatrix} = n

Exercise 1.25: Rate-of-change penalty


Consider the generalized LQR problem with the cross term between x(k) and u(k)

V(x(0), u) = \frac{1}{2}\sum_{k=0}^{N-1} \left( x(k)'Qx(k) + u(k)'Ru(k) + 2x(k)'Mu(k) \right) + (1/2)x(N)'P_f x(N)

(a) Solve this problem with backward DP and write out the Riccati iteration and
feedback gain.

(b) Control engineers often wish to tune a regulator by penalizing the rate of change
of the input rather than the absolute size of the input. Consider the additional
positive definite penalty matrix S and the modified objective function

V(x(0), u) = \frac{1}{2}\sum_{k=0}^{N-1} \left( x(k)'Qx(k) + u(k)'Ru(k) + \Delta u(k)'S\Delta u(k) \right) + (1/2)x(N)'P_f x(N)
in which ∆u(k) = u(k) − u(k − 1). Show that you can augment the state to include u(k − 1) via

\tilde{x}(k) = \begin{bmatrix} x(k) \\ u(k-1) \end{bmatrix}

and reduce this new problem to the standard LQR with the cross term. What are \tilde{A}, \tilde{B}, \tilde{Q}, \tilde{R}, and \tilde{M} for the augmented problem (Rao and Rawlings, 1999)?

Exercise 1.26: Existence, uniqueness and stability with the cross term
Consider the linear quadratic problem with system
x + = Ax + Bu (1.58)
and infinite horizon cost function

V(x(0), u) = (1/2)\sum_{k=0}^{\infty} \left( x(k)'Qx(k) + u(k)'Ru(k) \right)
The existence, uniqueness and stability conditions for this problem are: (A, B) stabi-
lizable, Q ≥ 0, (A, Q) detectable, and R > 0. Consider the modified objective function
with the cross term
V = (1/2)\sum_{k=0}^{\infty} \left( x(k)'Qx(k) + u(k)'Ru(k) + 2x(k)'Mu(k) \right)    (1.59)
(a) Consider reparameterizing the input as
v(k) = u(k) + T x(k) (1.60)
Choose T such that the cost function in x and v does not have a cross term,
and express the existence, uniqueness and stability conditions for the trans-
formed system. Goodwin and Sin (1984, p.251) discuss this procedure in the
state estimation problem with nonzero covariance between state and output
measurement noises.

(b) Translate and simplify these to obtain the existence, uniqueness and stability
conditions for the original system with cross term.

Exercise 1.27: Forecasting and variance increase or decrease


Given positive definite initial state variance P (0) and process disturbance variance Q,
the variance after forecasting one sample time was shown to be
P^-(1) = AP(0)A' + Q
(a) If A is stable, is it true that AP (0)A′ < P (0)? If so, prove it. If not, provide a
counterexample.

(b) If A is unstable, is it true that AP (0)A′ > P (0)? If so, prove it. If not, provide a
counterexample.

(c) If the magnitudes of all the eigenvalues of A are unstable, is it true that AP (0)A′ >
P (0)? If so, prove it. If not, provide a counterexample.

Exercise 1.28: Convergence of MHE with zero prior weighting


Show that the simplest form of MHE defined in (1.32) and (1.33) is also a convergent
estimator for an observable system. What restrictions on the horizon length N do you
require for this result to hold?
Hint: you can solve the MHE optimization problem by inspection when there is no
prior weighting of the data.

Exercise 1.29: Symmetry in regulation and estimation


In this exercise we display the symmetry of the backward DP recursion for regulation, and the forward DP recursion for estimation. In the regulation problem we solve at stage k

\min_{x,u} \; \ell(z, u) + V_k^0(x) \quad \text{s.t.} \quad x = Az + Bu

In backward DP, x is the state at the current stage and z is the state at the previous stage. The stage cost and cost to go are given by

\ell(z, u) = (1/2)(z'Qz + u'Ru) \qquad V_k^0(x) = (1/2)x'\Pi(k)x

and the optimal cost is V_{k-1}^0(z) since z is the state at the previous stage.

In estimation we solve at stage k

\min_{x,w} \; \ell(z, w) + V_k^0(x) \quad \text{s.t.} \quad z = Ax + w

In forward DP, x is the state at the current stage, z is the state at the next stage. The stage cost and arrival cost are given by

\ell(z, w) = (1/2)\left( \left|y(k+1) - Cz\right|^2_{R^{-1}} + w'Q^{-1}w \right) \qquad V_k^0(x) = (1/2)\left|x - \hat{x}(k)\right|^2_{P(k)^{-1}}

and we wish to find V_{k+1}^0(z) in the estimation problem.
(a) In the estimation problem, take the z term outside the optimization and solve

\min_{x,w} \; \frac{1}{2}\left( w'Q^{-1}w + (x - \hat{x}(k))'P(k)^{-1}(x - \hat{x}(k)) \right) \quad \text{s.t.} \quad z = Ax + w

using the inverse form in Exercise 1.18, and show that the optimal cost is given by

V^0(z) = (1/2)(z - A\hat{x}(k))'(P^-(k+1))^{-1}(z - A\hat{x}(k))
P^-(k+1) = AP(k)A' + Q

Add the z term to this cost using the third part of Example 1.1 and show that

V_{k+1}^0(z) = (1/2)(z - \hat{x}(k+1))'P^{-1}(k+1)(z - \hat{x}(k+1))
P(k+1) = P^-(k+1) - P^-(k+1)C'(CP^-(k+1)C' + R)^{-1}CP^-(k+1)
\hat{x}(k+1) = A\hat{x}(k) + L(k+1)(y(k+1) - CA\hat{x}(k))
L(k+1) = P^-(k+1)C'(CP^-(k+1)C' + R)^{-1}

(b) In the regulator problem, take the z term outside the optimization and solve the remaining two-term problem using the regulator form of Exercise 1.18. Then add the z term and show that

V_{k-1}^0(z) = (1/2)z'\Pi(k-1)z
\Pi(k-1) = Q + A'\Pi(k)A - A'\Pi(k)B(B'\Pi(k)B + R)^{-1}B'\Pi(k)A
u^0(z) = K(k-1)z
x^0(z) = (A + BK(k-1))z
K(k-1) = -(B'\Pi(k)B + R)^{-1}B'\Pi(k)A

This symmetry can be developed further if we pose an output tracking problem rather
than zero state regulation problem in the regulator.

Exercise 1.30: Symmetry in the Riccati iteration


Show that the covariance before measurement P^-(k+1) in estimation satisfies an identical iteration to the cost to go Π(k−1) in regulation under the change of variables P^- → Π, A → A', C → B'.

Exercise 1.31: Detectability and observability canonical forms


Consider the partitioned system

\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}^+ = \begin{bmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}

y = \begin{bmatrix} C_1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}

with (A11 , C1 ) observable. This form is known as observability canonical form.

(a) Show that the system is not observable by checking the rank of the observability
matrix.

(b) Show that the modes x1 can be uniquely determined from a sequence of mea-
surements, but the modes x2 cannot be uniquely determined from the measure-
ments. The states x2 are termed the unobservable modes.

(c) If A22 is stable the system is termed detectable. Although not all modes can be
observed, the unobservable modes are stable and decay to steady state.
The following lemma gives an equivalent condition for detectability.
Lemma 1.13 (Hautus lemma for detectability). A system is detectable if and only if

\operatorname{rank} \begin{bmatrix} \lambda I - A \\ C \end{bmatrix} = n \quad \text{for all } |\lambda| \geq 1

Prove this lemma using Lemma 1.4 as the condition for observability.

Exercise 1.32: Estimator stability and detectable systems


Show that the least squares estimator given in (1.27) is stable for (A, C) detectable with
Q > 0.

Exercise 1.33: Estimator stability and semidefinite state noise penalty


We wish to show that the least squares estimator is stable for (A, C) detectable and
Q ≥ 0, (A, Q) stabilizable.

(a) Because Q^{-1} is not defined in this problem, the objective function defined in (1.26) requires modification. Show that the objective function with semidefinite Q ≥ 0 can be converted into the following form

V(x(0), w(T)) = \frac{1}{2}\left( \left|x(0) - \overline{x}(0)\right|^2_{(P^-(0))^{-1}} + \sum_{k=0}^{T-1} |w(k)|^2_{\tilde{Q}^{-1}} + \sum_{k=0}^{T} \left|y(k) - Cx(k)\right|^2_{R^{-1}} \right)

in which

x^+ = Ax + Gw \qquad \tilde{Q} > 0

Find expressions for \tilde{Q} and G in terms of the original semidefinite Q. How are the dimensions of \tilde{Q} and G related to the rank of Q?

(b) What is the probabilistic interpretation of the state estimation problem with
semidefinite Q?

(c) Show that (A, Q) stabilizable implies (A, G) stabilizable in the converted form.

(d) Show that this estimator is stable for (A, C) detectable and (A, G) stabilizable with \tilde{Q}, R > 0.

(e) Discuss what happens to the estimator's stability if Q is not positive semidefinite or (A, Q) is not stabilizable.

Exercise 1.34: Calculating mean and variance from data


We are sampling a real-valued scalar random variable x(k) ∈ R at time k. Assume
the random variable comes from a distribution with mean x and variance P , and the
samples at different times are statistically independent.
A colleague has suggested the following formulas for estimating the mean and variance from N samples

\hat{x}_N = \frac{1}{N}\sum_{j=1}^{N} x(j) \qquad \hat{P}_N = \frac{1}{N}\sum_{j=1}^{N} \left(x(j) - \hat{x}_N\right)^2

(a) Prove that the estimate of the mean is unbiased for all N, i.e., show that for all
N
E(x̂N ) = x

(b) Prove that the estimate of the variance is not unbiased for any N, i.e., show that
for all N
E(P̂N ) ≠ P

(c) Using the result above, provide an alternative formula for the variance estimate
that is unbiased for all N. How large does N have to be before these two estimates
of P are within 1%?

Exercise 1.35: Expected sum of squares


Given that a random variable x has mean m and covariance P , show that the expected
sum of squares is given by the formula (Selby, 1973, p.138)

E(x ′ Qx) = m′ Qm + tr(QP )

The trace of a square matrix A, written tr(A), is defined to be the sum of the diagonal elements

\operatorname{tr}(A) := \sum_i A_{ii}
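A quick numerical check of this identity can be made by sampling; the following Octave/MATLAB sketch uses an arbitrary small example, and the particular m, Q, and P chosen here are illustrative only.

% Minimal sketch: Monte Carlo check of E(x'Qx) = m'Qm + tr(QP).
n = 3;  m = [1; 2; 3];  Q = diag([1 2 3]);  P = diag([0.5 1 2]);
X = m + chol(P, 'lower')*randn(n, 1e5);     % samples with mean m, covariance P
mean(sum(X.*(Q*X), 1))                      % sample average of x'Qx
m'*Q*m + trace(Q*P)                         % right-hand side of the identity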

Exercise 1.36: Normal distribution


Given a normal distribution with scalar parameters m and σ

p_\xi(x) = \sqrt{\frac{1}{2\pi\sigma^2}} \exp\left[ -\frac{1}{2}\left( \frac{x - m}{\sigma} \right)^2 \right]    (1.61)
By direct calculation, show that
(a)
E(ξ) = m
var(ξ) = σ 2

(b) Show that the mean and the maximum likelihood are equal for the normal dis-
tribution. Draw a sketch of this result. The maximum likelihood estimate, x̂, is
defined as
x̂ := arg max pξ (x)
x
in which arg returns the solution to the optimization problem.

Exercise 1.37: Conditional densities are positive definite


We show in Example A.44 that if ξ and η are jointly normally distributed as

\begin{bmatrix} \xi \\ \eta \end{bmatrix} \sim N(m, P) = N\left( \begin{bmatrix} m_x \\ m_y \end{bmatrix}, \begin{bmatrix} P_x & P_{xy} \\ P_{yx} & P_y \end{bmatrix} \right)

then the conditional density of ξ given η is also normal

(\xi|\eta) \sim N(m_{x|y}, P_{x|y})

in which the conditional mean is

m_{x|y} = m_x + P_{xy}P_y^{-1}(y - m_y)

and the conditional covariance is

P_{x|y} = P_x - P_{xy}P_y^{-1}P_{yx}

Given that the joint density is well defined, prove the marginal densities and the condi-
tional densities also are well defined, i.e., given P > 0, prove Px > 0, Py > 0, Px|y > 0,
Py|x > 0.

Exercise 1.38: Expectation and covariance under linear transformations


Consider the random variable x ∈ Rn with density px and mean and covariance
E(x) = mx cov(x) = Px
Consider the random variable y ∈ Rp defined by the linear transformation
y = Cx
(a) Show that the mean and covariance for y are given by
E(y) = Cmx cov(y) = CPx C ′
Does this result hold for all C? If yes, prove it; if no, provide a counterexample.

(b) Apply this result to solve Exercise A.35.



Exercise 1.39: Normal distributions under linear transformations


Given the normally distributed random variable, ξ ∈ Rn , consider the random variable,
η ∈ Rn , obtained by the linear transformation

η = Aξ

in which A is a nonsingular matrix. Using the result on transforming probability densi-


ties, show that if ξ ∼ N(m, P ), then η ∼ N(Am, AP A′ ). This result basically says that
linear transformations of normal random variables are normal.

Exercise 1.40: More on normals and linear transformations


Consider a normally distributed random variable x ∈ Rn , x ∼ N(mx , Px ). You showed
in Exercise 1.39 for C ∈ Rn×n invertible, that the random variable y defined by the
linear transformation y = Cx is also normal and is distributed as

y ∼ N(Cmx , CPx C ′ )

Does this result hold for all C? If yes, prove it; if no, provide a counterexample.

Exercise 1.41: Signal processing in the good old days—recursive least squares
Imagine we are sent back in time to 1960 and the only computers available have ex-
tremely small memories. Say we have a large amount of data coming from a process
and we want to compute the least squares estimate of model parameters from these
data. Our immediate challenge is that we cannot load all of these data into memory to
make the standard least squares calculation.
Alternatively, go 150 years further back in time and consider the situation from
Gauss’s perspective,
It occasionally happens that after we have completed all parts of an ex-
tended calculation on a sequence of observations, we learn of a new ob-
servation that we would like to include. In many cases we will not want to
have to redo the entire elimination but instead to find the modifications
due to the new observation in the most reliable values of the unknowns
and in their weights.
C.F. Gauss, 1823
G.W. Stewart Translation, 1995, p. 191.
Given the linear model

y_i = X_i'\theta

in which scalar y_i is the measurement at sample i, X_i' is the independent model variable (row vector, 1 × p) at sample i, and θ is the parameter vector (p × 1) to be estimated from these data. Given the weighted least squares objective and n measurements, we wish to compute the usual estimate

\hat{\theta} = (X'X)^{-1}X'y    (1.62)

in which

y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \qquad X = \begin{bmatrix} X_1' \\ \vdots \\ X_n' \end{bmatrix}

We do not wish to store the large matrices X(n × p) and y(n × 1) required for this
calculation. Because we are planning to process the data one at a time, we first modify
our usual least squares problem to deal with small n. For example, we wish to estimate

the parameters when n < p and the inverse in (1.62) does not exist. In such cases, we
may choose to regularize the problem by modifying the objective function as follows
\Phi(\theta) = (\theta - \overline{\theta})'P_0^{-1}(\theta - \overline{\theta}) + \sum_{i=1}^{n}(y_i - X_i'\theta)^2

in which \overline{\theta} and P_0 are chosen by the user. In Bayesian estimation, we call \overline{\theta} and P_0 the prior information, and often assume that the prior density of θ (without measurements) is normal

\theta \sim N(\overline{\theta}, P_0)

The solution to this modified least squares estimation problem is

\hat{\theta} = \overline{\theta} + (X'X + P_0^{-1})^{-1}X'(y - X\overline{\theta})    (1.63)
Devise a means to recursively estimate θ so that:
1. We never store more than one measurement at a time in memory.
2. After processing all the measurements, we obtain the same least squares esti-
mate given in (1.63).
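One way to meet these two requirements (offered here as a sketch, not the unique answer) is the standard recursive least squares update; in the Octave/MATLAB fragment below, thetabar and P0 denote the prior pair \overline{\theta} and P_0 from the text, and the data arrive one (X_i, y_i) pair at a time.

% Minimal sketch: one recursive least squares update with prior (thetabar, P0);
% Xi is a 1 x p row vector, yi a scalar.  Applying rls_update to the samples one
% at a time reproduces the batch estimate (1.63) without storing X and y.
function [theta, P] = rls_update(theta, P, Xi, yi)
  K     = P*Xi' / (Xi*P*Xi' + 1);
  theta = theta + K*(yi - Xi*theta);
  P     = P - K*Xi*P;
end
% Usage: theta = thetabar;  P = P0;
%        [theta, P] = rls_update(theta, P, Xi, yi);   % repeat for each sample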

Exercise 1.42: Least squares parameter estimation and Bayesian estimation


Consider a model linear in the parameters
y = Xθ + e (1.64)
in which y ∈ Rp is a vector of measurements, θ ∈ Rm
is a vector of parameters,
X ∈ Rp×m is a matrix of known constants, and e ∈ Rp is a random variable modeling
the measurement error. The standard parameter estimation problem is to find the best
estimate of θ given the measurements y corrupted with measurement error e, which
we assume is distributed as
e ∼ N(0, R)
(a) Consider the case in which the errors in the measurements are independently and identically distributed with variance σ², R = σ²I. For this case, the classic least squares problem and solution are

\min_\theta \left|y - X\theta\right|^2 \qquad \hat{\theta} = (X'X)^{-1}X'y

Consider the measurements to be sampled from (1.64) with true parameter value θ_0. Show that using the least squares formula, the parameter estimate is distributed as

\hat{\theta} \sim N(\theta_0, P_{\hat{\theta}}) \qquad P_{\hat{\theta}} = \sigma^2(X'X)^{-1}

(b) Now consider again the model of (1.64) and a Bayesian estimation problem. As-
sume a prior distribution for the random variable θ
θ ∼ N(θ, P )
Compute the conditional density of θ given measurement y, show that this
density is normal, and find its mean and covariance
pθ|y (θ|y) = n(θ, m, P )
Show that Bayesian estimation and least squares estimation give the same result
in the limit of an infinite variance prior. In other words, if the covariance of the
prior is large compared to the covariance of the measurement error, show that
m ≈ (X ′ X)−1 X ′ y P ≈ Pθ̂

(c) What (weighted) least squares minimization problem is solved for the general
measurement error covariance
e ∼ N(0, R)
Derive the least squares estimate formula for this case.

(d) Again consider the measurements to be sampled from (1.64) with true param-
eter value θ0 . Show that the weighted least squares formula gives parameter
estimates that are distributed as
θ̂ ∼ N(θ0 , Pθ̂ )
and find Pθ̂ for this case.

(e) Show again that Bayesian estimation and least squares estimation give the same
result in the limit of an infinite variance prior.

Exercise 1.43: Least squares and minimum variance estimation


Consider again the model linear in the parameters and the least squares estimator from
Exercise 1.42
y = X\theta + e \qquad e \sim N(0, R)

\hat{\theta} = \left( X'R^{-1}X \right)^{-1} X'R^{-1}y

Show that the covariance of the least squares estimator is the smallest covariance of
all linear unbiased estimators.

Exercise 1.44: Two stages are not better than one


We often can decompose an estimation problem into stages. Consider the following
case in which we wish to estimate x from measurements of z, but we have the model
between x and an intermediate variable, y, and the model between y and z
y = Ax + e1 cov(e1 ) = Q1
z = By + e2 cov(e2 ) = Q2
(a) Write down the optimal least squares problem to solve for ŷ given the z mea-
surements and the second model. Given ŷ, write down the optimal least squares
problem for x̂ in terms of ŷ. Combine these two results together and write the
resulting estimate of x̂ given measurements of z. Call this the two-stage estimate
of x.

(b) Combine the two models together into a single model and show that the rela-
tionship between z and x is
z = BAx + e3 cov(e3 ) = Q3
Express Q3 in terms of Q1 , Q2 and the models A, B. What is the optimal least
squares estimate of x̂ given measurements of z and the one-stage model? Call
this the one-stage estimate of x.

(c) Are the one-stage and two-stage estimates of x the same? If yes, prove it. If
no, provide a counterexample. Do you have to make any assumptions about the
models A, B?

Exercise 1.45: Time-varying Kalman filter


Derive formulas for the conditional densities of x(k)|y(k − 1) and x(k)|y(k) for the
time-varying linear system
x(k + 1) = A(k)x(k) + G(k)w(k)
y(k) = C(k)x(k) + v(k)
in which the initial state, state noise and measurement noise are independently dis-
tributed as
x(0) ∼ N(x 0 , Q0 ) w(k) ∼ N(0, Q) v(k) ∼ N(0, R)

Exercise 1.46: More on conditional densities


In deriving the discrete time Kalman filter, we have px|y (x(k)|y(k)) and we wish to
calculate recursively px|y (x(k + 1)|y(k + 1)) after we collect the output measurement
at time k + 1. It is straightforward to calculate px,y|y (x(k + 1), y(k + 1)|y(k)) from
our established results on normal densities and knowledge of px|y (x(k)|y(k)), but
we still need to establish a formula for pushing the y(k + 1) to the other side of the
conditional density bar. Consider the following statement as a possible lemma to aid
in this operation.
p_{a|b,c}(a|b,c) = \frac{p_{a,b|c}(a,b|c)}{p_{b|c}(b|c)}
If this statement is true, prove it. If it is false, give a counterexample.

Exercise 1.47: Other useful conditional densities


Using the definitions of marginal and conditional density, establish the following useful
conditional density relations
1. p_{A|B}(a|b) = \int p_{A|B,C}(a|b,c)\, p_{C|B}(c|b)\, dc

2. p_{A|B,C}(a|b,c) = p_{C|A,B}(c|a,b)\, \dfrac{p_{A|B}(a|b)}{p_{C|B}(c|b)}

Exercise 1.48: Optimal filtering and deterministic least squares



Given the data sequence y(0), . . . , y(k) and the system model
x + = Ax + w
y = Cx + v
(a) Write down a least squares problem whose solution would provide a good state
estimate for x(k) in this situation. What probabilistic interpretation can you
assign to the estimate calculated from this least squares problem?

(b) Now consider the nonlinear model


x + = f (x) + w
y = g(x) + v
What is the corresponding nonlinear least squares problem for estimating x(k)
in this situation? What probabilistic interpretation, if any, can you assign to this
estimate in the nonlinear model context?

(c) What is the motivation for changing from these least squares estimators to the
moving horizon estimators we discussed in the chapter?

Exercise 1.49: A nonlinear transformation and conditional density


Consider the following relationship between the random variable y, and x and v

y = f (x) + v

The author of a famous textbook wants us to believe that

py|x (y|x) = pv (y − f (x))

Derive this result and state what additional assumptions on the random variables x
and v are required for this result to be correct.

Exercise 1.50: Some smoothing


One of the problems with asking you to derive the Kalman filter is that the derivation
is in so many textbooks that it is difficult to tell if you are thinking independently.
So here’s a variation on the theme that should help you evaluate your level of under-
standing of these ideas. Let’s calculate a smoothed rather than filtered estimate and
covariance. Here’s the problem.
We have the usual setup with a prior on x(0)

x(0) ∼ N(x(0), Q0 )

and we receive data from the following system

x(k + 1) = Ax(k) + w(k)


y(k) = Cx(k) + v(k)

in which the random variables w(k) and v(k) are independent, identically distributed
normals, w(k) ∼ N(0, Q), v(k) ∼ N(0, R).
(a) Calculate the standard density for the filtering problem, px(0)|y(0) (x(0)|y(0)).

(b) Now calculate the density for the smoothing problem

px(0)|y(0),y(1) (x(0)|y(0), y(1))

that is, not the usual px(1)|y(0),y(1) (x(1)|y(0), y(1)).

Exercise 1.51: Alive on arrival


The following two optimization problems are helpful in understanding the arrival cost
decomposition in state estimation.
(a) Let V(x, y, z) be a positive, strictly convex function consisting of the sum of two functions, one of which depends on both x and y, and the other of which depends on y and z

V(x, y, z) = g(x, y) + h(y, z) \qquad V : \mathbb{R}^m \times \mathbb{R}^n \times \mathbb{R}^p \to \mathbb{R}_{\geq 0}

Consider the optimization problem

P1 : \min_{x,y,z} V(x, y, z)

The arrival cost decomposes this three-variable optimization problem into two, smaller dimensional optimization problems. Define the "arrival cost" \tilde{g} for this problem as the solution to the following single-variable optimization problem

\tilde{g}(y) = \min_x g(x, y)

and define optimization problem P2 as follows

P2 : \min_{y,z} \tilde{g}(y) + h(y, z)

Let (x', y', z') denote the solution to P1 and (x^0, y^0, z^0) denote the solution to P2, in which

x^0 = \arg\min_x g(x, y^0)

Prove that the two solutions are equal

(x', y', z') = (x^0, y^0, z^0)

(b) Repeat the previous part for the following optimization problems

V(x, y, z) = g(x) + h(y, z)

Here the y variables do not appear in g but restrict the x variables through a linear constraint. The two optimization problems are

P1 : \min_{x,y,z} V(x, y, z) \quad \text{subject to} \quad Ex = y

P2 : \min_{y,z} \tilde{g}(y) + h(y, z)

in which

\tilde{g}(y) = \min_x g(x) \quad \text{subject to} \quad Ex = y

Exercise 1.52: On-time arrival


Consider the deterministic, full information state estimation optimization problem

\min_{x(0),w,v} \; \frac{1}{2}\left( \left|x(0) - \overline{x}(0)\right|^2_{(P^-(0))^{-1}} + \sum_{i=0}^{T-1} \left( |w(i)|^2_{Q^{-1}} + |v(i)|^2_{R^{-1}} \right) \right)    (1.65)

subject to

x^+ = Ax + w
y = Cx + v    (1.66)

in which the sequence of measurements y(T) are known values. Notice we assume the noise-shaping matrix, G, is an identity matrix here. See Exercise 1.53 for the general case. Using the result of the first part of Exercise 1.51, show that this problem is equivalent to the following problem

\min_{x(T-N),w,v} \; V_{T-N}^-(x(T-N)) + \frac{1}{2}\sum_{i=T-N}^{T-1} \left( |w(i)|^2_{Q^{-1}} + |v(i)|^2_{R^{-1}} \right)

subject to (1.66). The arrival cost is defined as

V_N^-(a) := \min_{x(0),w,v} \; \frac{1}{2}\left( \left|x(0) - \overline{x}(0)\right|^2_{(P^-(0))^{-1}} + \sum_{i=0}^{N-1} \left( |w(i)|^2_{Q^{-1}} + |v(i)|^2_{R^{-1}} \right) \right)

subject to (1.66) and x(N) = a. Notice that any value of N, 0 ≤ N ≤ T , can be used to
split the cost function using the arrival cost.

Exercise 1.53: Arrival cost with noise-shaping matrix G


Consider the deterministic, full information state estimation optimization problem

\min_{x(0),w,v} \; \frac{1}{2}\left( \left|x(0) - \overline{x}(0)\right|^2_{(P^-(0))^{-1}} + \sum_{i=0}^{T-1} \left( |w(i)|^2_{Q^{-1}} + |v(i)|^2_{R^{-1}} \right) \right)

subject to

x^+ = Ax + Gw
y = Cx + v    (1.67)
in which the sequence of measurements y are known values. Using the result of the second part of Exercise 1.51, show that this problem also is equivalent to the following problem

\min_{x(T-N),w,v} \; V_{T-N}^-(x(T-N)) + \frac{1}{2}\sum_{i=T-N}^{T-1} \left( |w(i)|^2_{Q^{-1}} + |v(i)|^2_{R^{-1}} \right)

subject to (1.67). The arrival cost is defined for all k ≥ 0 and a ∈ R^n by

V_k^-(a) := \min_{x(0),w,v} \; \frac{1}{2}\left( \left|x(0) - \overline{x}(0)\right|^2_{(P^-(0))^{-1}} + \sum_{i=0}^{k-1} \left( |w(i)|^2_{Q^{-1}} + |v(i)|^2_{R^{-1}} \right) \right)

subject to x(k) = a and the model (1.67). Notice that any value of N, 0 ≤ N ≤ T, can be used to split the cost function using the arrival cost.

Exercise 1.54: Where is the steady state?


Consider the two-input, two-output system

A = \begin{bmatrix} 0.5 & 0 & 0 & 0 \\ 0 & 0.6 & 0 & 0 \\ 0 & 0 & 0.5 & 0 \\ 0 & 0 & 0 & 0.6 \end{bmatrix} \qquad B = \begin{bmatrix} 0.5 & 0 \\ 0.4 & 0 \\ 0.25 & 0 \\ 0 & 0.6 \end{bmatrix} \qquad C = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}

(a) The output setpoint is y_sp = [1  −1]' and the input setpoint is u_sp = [0  0]'. Calculate the target triple (x_s, u_s, y_s). Is the output setpoint feasible, i.e., does y_s = y_sp?

(b) Assume only input one u1 is available for control. Is the output setpoint feasible?
What is the target in this case using Qs = I?

(c) Assume both inputs are available for control but only the first output has a
setpoint, y1t = 1. What is the solution to the target problem for Rs = I?

Exercise 1.55: Detectability of integrating disturbance models


(a) Prove Lemma 1.8; the augmented system is detectable if and only if the system (A, C) is detectable and

\operatorname{rank} \begin{bmatrix} I - A & -B_d \\ C & C_d \end{bmatrix} = n + n_d

(b) Prove Corollary 1.9; the augmented system is detectable only if nd ≤ p.



Exercise 1.56: Unconstrained tracking problem


(a) For an unconstrained system, show that the following condition is sufficient for feasibility of the target problem for any r_sp.

\operatorname{rank} \begin{bmatrix} I - A & -B \\ HC & 0 \end{bmatrix} = n + n_c    (1.68)

(b) Show that (1.68) implies that the number of controlled variables without offset
is less than or equal to the number of manipulated variables and the number of
measurements, nc ≤ m and nc ≤ p.

(c) Show that (1.68) implies the rows of H are independent.

(d) Does (1.68) imply that the rows of C are independent? If so, prove it; if not,
provide a counterexample.

(e) By choosing H, how can one satisfy (1.68) if one has installed redundant sensors
so several rows of C are identical?

Exercise 1.57: Unconstrained tracking problem for stabilizable systems


If we restrict attention to stabilizable systems, the sufficient condition of Exercise 1.56
becomes a necessary and sufficient condition. Prove the following lemma.

Lemma 1.14 (Stabilizable systems and feasible targets). Consider an unconstrained, stabilizable system (A, B). The target is feasible for any r_sp if and only if

\operatorname{rank} \begin{bmatrix} I - A & -B \\ HC & 0 \end{bmatrix} = n + n_c

Exercise 1.58: Existence and uniqueness of the unconstrained target


Assume a system having p controlled variables z = Hx, with setpoints r_sp, and m manipulated variables u, with setpoints u_sp. Consider the steady-state target problem

\min_{x,u} (1/2)(u - u_{sp})'R(u - u_{sp}) \qquad R > 0

subject to

\begin{bmatrix} I - A & -B \\ H & 0 \end{bmatrix} \begin{bmatrix} x \\ u \end{bmatrix} = \begin{bmatrix} 0 \\ r_{sp} \end{bmatrix}

Show that the steady-state solution (x, u) exists for any (r_sp, u_sp) and is unique if

\operatorname{rank} \begin{bmatrix} I - A & -B \\ H & 0 \end{bmatrix} = n + p \qquad \operatorname{rank} \begin{bmatrix} I - A \\ H \end{bmatrix} = n

Exercise 1.59: Choose a sample time


Consider the unstable continuous time system

\frac{dx}{dt} = Ax + Bu \qquad y = Cx

in which

A = \begin{bmatrix} -0.281 & 0.935 & 0.035 & 0.008 \\ 0.047 & -0.116 & 0.053 & 0.383 \\ 0.679 & 0.519 & 0.030 & 0.067 \\ 0.679 & 0.831 & 0.671 & -0.083 \end{bmatrix} \qquad B = \begin{bmatrix} 0.687 \\ 0.589 \\ 0.930 \\ 0.846 \end{bmatrix} \qquad C = I

Consider regulator tuning parameters and constraints

Q = \operatorname{diag}(1, 2, 1, 2) \qquad R = 1 \qquad N = 10 \qquad |x| \leq \begin{bmatrix} 1 \\ 2 \\ 1 \\ 3 \end{bmatrix}

(a) Compute the eigenvalues of A. Choose a sample time of ∆ = 0.04 and simulate the MPC regulator response given x(0) = [−0.9  −1.8  0.7  2]' until t = 20. Use an ODE solver to simulate the continuous time plant response. Plot all states and the input versus time.
Now add an input disturbance to the regulator so the control applied to the plant
is ud instead of u in which

ud (k) = (1 + 0.1w1 )u(k) + 0.1w2

and w1 and w2 are zero-mean, normally distributed random variables with unit
variance. Simulate the regulator’s performance given this disturbance. Plot all
states and ud (k) versus time.

(b) Repeat the simulations with and without disturbance for ∆ = 0.4 and ∆ = 2.

(c) Compare the simulations for the different sample times. What happens if the
sample time is too large? Choose an appropriate sample time for this system
and justify your choice.
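For the simulations in parts (a) and (b), the closed loop can be organized as in the skeleton below; mpc_control is only a placeholder name for whatever constrained LQ regulator the reader implements, and the fragment is shown to illustrate applying the disturbed input u_d to the continuous time plant over each sample period.

% Minimal sketch: closed-loop simulation skeleton with the input disturbance.
Delta = 0.04;  Tfinal = 20;  Nsim = round(Tfinal/Delta);
x = [-0.9; -1.8; 0.7; 2];
for k = 1:Nsim
  u  = mpc_control(x);                        % placeholder for the MPC regulator
  ud = (1 + 0.1*randn)*u + 0.1*randn;         % disturbed input applied to the plant
  [tode, xode] = ode45(@(t, xx) A*xx + B*ud, [0 Delta], x);
  x = xode(end, :)';
end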

Exercise 1.60: Disturbance models and offset


Consider the following two-input, three-output plant discussed in Example 1.11

x^+ = Ax + Bu + B_p p
y = Cx

in which

A = \begin{bmatrix} 0.2681 & -0.00338 & -0.00728 \\ 9.703 & 0.3279 & -25.44 \\ 0 & 0 & 1 \end{bmatrix} \qquad C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

B = \begin{bmatrix} -0.00537 & 0.1655 \\ 1.297 & 97.91 \\ 0 & -6.637 \end{bmatrix} \qquad B_p = \begin{bmatrix} -0.1175 \\ 69.74 \\ 6.637 \end{bmatrix}

The input disturbance p results from a reactor inlet flowrate disturbance.

(a) Since there are two inputs, choose two outputs in which to remove steady-state
offset. Build an output disturbance model with two integrators. Is your augmented
model detectable? (A numerical detectability check is sketched after this exercise.)

Figure 1.14: Feedback control system with output disturbance d, and setpoint ysp .

(b) Implement your controller using p = 0.01 as a step disturbance at k = 0. Do you


remove offset in your chosen outputs? Do you remove offset in any outputs?

(c) Can you find any two-integrator disturbance model that removes offset in two
outputs? If so, which disturbance model do you use? If not, why not?
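The detectability question in part (a) can be checked numerically with the rank condition of Lemma 1.8 together with a PBH test on (A, C). The sketch below tests one particular choice, integrating output disturbances on the first two outputs; it is an illustration, not the required answer.

% Exercise 1.60(a): detectability check for an output disturbance model.
A  = [0.2681 -0.00338 -0.00728;
      9.703   0.3279  -25.44;
      0       0        1];
C  = eye(3);
nd = 2;
Bd = zeros(3, nd);                  % output (not input) disturbance model
Cd = [1 0; 0 1; 0 0];               % integrators added to outputs 1 and 2
n  = size(A,1);

lam = eig(A);  pbh = true;          % PBH detectability test of (A, C)
for i = 1:n
    if abs(lam(i)) >= 1
        pbh = pbh && (rank([lam(i)*eye(n) - A; C]) == n);
    end
end
augmented_rank_ok = (rank([eye(n)-A, -Bd; C, Cd]) == n + nd);
detectable = pbh && augmented_rank_ok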

Exercise 1.61: MPC, PID, and time delay


Consider the following first-order system with time delay shown in Figure 1.14

\[
g(s) = \frac{k}{\tau s + 1} e^{-\theta s}, \qquad k = 1, \ \tau = 1, \ \theta = 5
\]
Consider a unit step change in setpoint ysp , at t = 0.
(a) Choose a reasonable sample time, ∆, and disturbance model, and simulate an
offset-free discrete time MPC controller for this setpoint change. List all of your
chosen parameters.

(b) Choose PID tuning parameters to achieve “good performance” for this system.
List your PID tuning parameters. Compare the performances of the two con-
trollers.

Exercise 1.62: CSTR heat-transfer coefficient


Your mission is to design the controller for the nonlinear CSTR model given in Ex-
ample 1.11. We wish to use a linear controller and estimator with three integrating
disturbances to remove offset in two controlled variables: temperature and level; use
the nonlinear CSTR model as the plant.
(a) You are particularly concerned about disturbances to the heat-transfer rate (pa-
rameter U) for this reactor. If changes to U are the primary disturbance, what dis-
turbance model do you recommend and what covariances do you recommend for
the three disturbances so that the disturbance state accounting for heat transfer
is used primarily to explain the output error in the state estimator? First do a
simulation with no measurement noise to test your estimator design. In the sim-
ulation let the reactor’s heat-transfer coefficient decrease (and increase) by 20%
at 10 minutes to test your control system design. Comment on the performance
of the control system.

(b) Now let’s add some measurement noise to all three sensors. So we all work on
the same problem, choose the variance of the measurement error Rv to be

\[
R_v = 10^{-3}\, \operatorname{diag}(c_s^2, T_s^2, h_s^2)
\]

in which (cs , Ts , hs ) are the nominal steady states of the three measurements.
Is the performance from the previous part assuming no measurement noise ac-
ceptable? How do you adjust your estimator from the previous part to obtain
good performance? Rerun the simulation with measurement noise and your ad-
justed state estimator. Comment on the change in the performance of your new
design that accounts for the measurement noise.

(c) Recall that the offset lemma 1.10 is an either-or proposition, i.e., either the con-
troller removes steady offset in the controlled variables or the system is closed-
loop unstable. From closed-loop simulation, approximate the range of plant
U values for which the controller is stabilizing (with zero measurement noise).
From a stabilization perspective, which disturbance is worse, an increase or de-
crease in the plant’s heat-transfer coefficient?

Exercise 1.63: System identification of the nonlinear CSTR


In many practical applications, it may not be convenient to express system dynamics
from first principles. Hence, identifying a suitable model from data is a critical step
in the design of an MPC controller. Your final mission is to obtain a 2-input, 3-output
process model for the nonlinear CSTR given in Example 1.11 using the System Identi-
fication Toolbox in MATLAB. Relevant functions are provided.
(a) Begin first by creating a dataset for identification. Generate a pseudo-random,
binary signal (PRBS) for the inputs using idinput. Ensure you have generated
uncorrelated signals for each input. Think about the amplitude of the PRBS to
use when collecting data from a nonlinear process keeping in mind that large per-
turbations may lead to undesirable phenomena such as reactor ignition. Inject
these generated input sequences into the nonlinear plant of Example 1.11 and
simulate the system by solving the nonlinear ODEs. Add measurement noise
to the simulation so that you have a realistic dataset for the ID and plot the
input-output data.

(b) Use the data to identify a third-order linear state space model by calling iddata
and ssest. Compare the step tests of your identified model with those from
the linear model used in Example 1.11. Which is more accurate compared to the
true plant simulation? (A minimal identification script is sketched after this exercise.)

(c) Using the code for Example 1.11 as a starting point, replace the linear model in
the MPC controller with your identified model and recalculate Figures 1.10 and
1.11 from the example. Is your control system robust enough to obtain good
closed-loop control of the nonlinear plant using your linear model identified
from data in the MPC controller? Do you maintain zero offset in the controlled
variables?
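A minimal identification script for parts (a) and (b) is sketched below. The toolbox calls (idinput, iddata, ssest) are the ones named in the exercise, but check their documentation for the exact options; the function cstr_sim standing for the nonlinear CSTR simulation of Example 1.11, the PRBS amplitude, and the noise level are placeholders to be replaced by your own choices.

% Exercise 1.63(a)-(b): a minimal identification sketch.
Ts = 1;  N = 1000;  nu = 2;  ny = 3;
u = idinput([N nu], 'prbs', [0 0.1], [-1 1]);   % two shifted PRBS channels
u = 0.05*u;                          % scale perturbation amplitude (placeholder)
y = cstr_sim(u, Ts);                 % placeholder: simulate the nonlinear ODEs
y = y + 1e-3*randn(N, ny);           % add measurement noise (illustrative level)

dat   = iddata(y, u, Ts);            % package the input-output data
model = ssest(dat, 3);               % identify a third-order state space model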
Bibliography

R. E. Bellman. Dynamic Programming. Princeton University Press, Princeton,


New Jersey, 1957.

R. E. Bellman and S. E. Dreyfus. Applied Dynamic Programming. Princeton


University Press, Princeton, New Jersey, 1962.

D. P. Bertsekas. Dynamic Programming. Prentice-Hall, Inc., Englewood Cliffs,


New Jersey, 1987.

E. F. Camacho and C. Bordons. Model Predictive Control. Springer-Verlag, Lon-


don, second edition, 2004.

E. J. Davison and H. W. Smith. Pole assignment in linear time-invariant multi-


variable systems with constant disturbances. Automatica, 7:489–498, 1971.

E. J. Davison and H. W. Smith. A note on the design of industrial regulators:


Integral feedback and feedforward controllers. Automatica, 10:329–332,
1974.

R. Fletcher. Practical Methods of Optimization. John Wiley & Sons, New York,
1987.

B. A. Francis and W. M. Wonham. The internal model principle of control


theory. Automatica, 12:457–465, 1976.

G. C. Goodwin and K. S. Sin. Adaptive Filtering Prediction and Control. Prentice-


Hall, Englewood Cliffs, New Jersey, 1984.

G. C. Goodwin, M. M. Serón, and J. A. De Doná. Constrained control and esti-


mation: an optimization approach. Springer, New York, 2005.

M. L. J. Hautus. Controllability and stabilizability of sampled systems. IEEE


Trans. Auto. Cont., 17(4):528–531, August 1972.

R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press,


1985.

A. H. Jazwinski. Stochastic Processes and Filtering Theory. Academic Press,


New York, 1970.

R. E. Kalman. A new approach to linear filtering and prediction problems.


Trans. ASME, J. Basic Engineering, pages 35–45, March 1960a.


R. E. Kalman. Contributions to the theory of optimal control. Bull. Soc. Math.


Mex., 5:102–119, 1960b.

H. Kwakernaak and R. Sivan. Linear Optimal Control Systems. John Wiley and
Sons, New York, 1972.

W. H. Kwon. Receding horizon control: model predictive control for state models.
Springer-Verlag, London, 2005.

J. M. Maciejowski. Predictive Control with Constraints. Prentice-Hall, Harlow,


UK, 2002.

J. Nocedal and S. J. Wright. Numerical Optimization. Springer, New York,


second edition, 2006.

B. J. Odelson, M. R. Rajamani, and J. B. Rawlings. A new autocovariance least-


squares method for estimating noise covariances. Automatica, 42(2):303–
308, February 2006.

G. Pannocchia and J. B. Rawlings. Disturbance models for offset-free MPC


control. AIChE J., 49(2):426–437, 2003.

L. Qiu and E. J. Davison. Performance limitations of non-minimum phase sys-


tems in the servomechanism problem. Automatica, 29(2):337–349, 1993.

C. V. Rao and J. B. Rawlings. Steady states and constraints in model predictive


control. AIChE J., 45(6):1266–1278, 1999.

J. A. Rossiter. Model-based predictive control: a practical approach. CRC Press


LLC, Boca Raton, FL, 2004.

S. M. Selby. CRC Standard Mathematical Tables. CRC Press, twenty-first edition,


1973.

E. D. Sontag. Mathematical Control Theory. Springer-Verlag, New York, second


edition, 1998.

G. Strang. Linear Algebra and its Applications. Academic Press, New York,
second edition, 1980.

L. Wang. Model Predictive Control System Design and Implementation Using


Matlab. Springer, New York, 2009.
2 Model Predictive Control—Regulation

2.1 Introduction
In Chapter 1 we investigated a special, but useful, form of model pre-
dictive control (MPC); an important feature of this form of MPC is that, if
the terminal cost is chosen to be the value function of the infinite horizon
unconstrained optimal control problem, there exists a set of initial
states for which MPC is actually optimal for the infinite horizon con-
strained optimal control problem and therefore inherits its associated
advantages. Just as there are many methods other than infinite horizon
linear quadratic control for stabilizing linear systems, there are alterna-
tive forms of MPC that can stabilize linear and even nonlinear systems.
We explore these alternatives in the remainder of this chapter. But first
we place MPC in a more general setting to facilitate comparison with
other control methods.
MPC is, as we have seen earlier, a form of control in which the control
action is obtained by solving online, at each sampling instant, a finite
horizon optimal control problem in which the initial state is the current
state of the plant. Optimization yields a finite control sequence, and
the first control action in this sequence is applied to the plant. MPC
differs, therefore, from conventional control in which the control law
is precomputed offline. But this is not an essential difference; MPC
implicitly implements a control law that can, in principle, be computed
offline as we shall soon see. Specifically, if the current state of the
system being controlled is x, MPC obtains, by solving an open-loop
optimal control problem for this initial state, a specific control action
u to apply to the plant.
Dynamic programming (DP) may be used to solve a feedback version
of the same optimal control problem, however, yielding a receding hori-
zon control law κ(·). The important fact is that if x is the current state,


the optimal control u obtained by MPC (by solving an open-loop opti-


mal control problem) satisfies u = κ(x); MPC computes the value κ(x)
of the optimal receding horizon control law for the current state x,
while DP yields the control law κ(·) that can be used for any state. DP
would appear to be preferable since it provides a control law that can
be implemented simply (as a look-up table). However, obtaining a DP
solution is difficult, if not impossible, for most optimal control prob-
lems if the state dimension is reasonably high — unless the system is
linear, the cost quadratic and there are no control or state constraints.
The great advantage of MPC is that open-loop optimal control prob-
lems often can be solved rapidly enough, using standard mathematical
programming algorithms, to permit the use of MPC even though the
system being controlled is nonlinear, and hard constraints on states
and controls must be satisfied. Thus MPC permits the application of
a DP solution, even though explicit determination of the optimal con-
trol law is intractable. MPC is an effective implementation of the DP
solution.

In this chapter we study MPC for the case when the state is known.
This case is particularly important, even though it rarely arises in prac-
tice, because important properties, such as stability and performance,
may be relatively easily established. The relative simplicity of this case
arises from the fact that if the state is known and if there are no dis-
turbances or model error, the problem is deterministic, i.e., there is no
uncertainty making feedback unnecessary in principle. As we pointed
out previously, for deterministic systems the MPC action for a given
state is identical to the receding horizon control law, determined using
DP, and evaluated at the given state. When the state is not known, it has
to be estimated and state estimation error, together with model error
and disturbances, makes the system uncertain in that future trajecto-
ries cannot be precisely predicted. The simple connection between MPC
and the DP solution is lost because there does not exist an open-loop
optimal control problem whose solution yields a control action that is
the same as that obtained by the DP solution. A practical consequence
is that special techniques are required to ensure robustness against
these various forms of uncertainty. So the results of this chapter hold
when there is no uncertainty. We prove, in particular, that the optimal
control problem that defines the model predictive control can always
be solved if the initial optimal control problem can be solved (recursive
feasibility), and that the optimal cost can always be reduced allowing
us to prove asymptotic or exponential stability of the target state. We

refer to stability in the absence of uncertainty as nominal or inherent


stability.
When uncertainty is present, however, neither of these two asser-
tions is necessarily true; uncertainty may cause the state to wander
outside the region where the optimal control problem can be solved
and may lead to instability. Procedures for overcoming the problems
arising from uncertainty are presented in Chapters 3 and 5. In most
of the control algorithms presented in this chapter, the decrease in the
optimal cost, on which the proof of stability is founded, is based on
the assumption that the next state is exactly as predicted and that the
global solution to the optimal control problem can be computed. In the
suboptimal control algorithm presented in Chapter 6, where global op-
timality is not required, the decrease in the optimal cost is still based on
the assumption that the current state is exactly the state as predicted
at the previous time.

2.2 Model Predictive Control


As discussed briefly in Chapter 1, most nonlinear system descriptions
derived from physical arguments are continuous time models in the
form of nonlinear differential equations
\[
\frac{dx}{dt} = f(x, u)
\]
For this class of systems, the control law with the best closed-loop
properties is the solution to the following infinite horizon, constrained
optimal control problem. The cost is defined to be
\[
V_\infty(x, u(\cdot)) = \int_0^{\infty} \ell(x(t), u(t))\, dt
\]

in which x(t) and u(t) satisfy ẋ = f (x, u). The optimal control prob-
lem P∞ (x) is defined by

\[
\min_{u(\cdot)}\; V_\infty(x, u(\cdot))
\]

subject to

ẋ = f (x, u) x(0) = x0
(x(t), u(t)) ∈ Z for all t ∈ R≥0

If ℓ(·) is positive definite, the goal of the regulator is to steer the state
of the system to the origin.

We denote the solution to this problem (when it exists) by u0∞ (·; x) and the
resultant optimal value function by V∞0 (x). The closed-loop system under this
optimal control law evolves as
\[
\frac{dx(t)}{dt} = f(x(t), u_\infty^0(t; x))
\]
If f (·), ℓ(·) and Vf (·) satisfy certain differentiability and growth assumptions,
and if the class of admissible controls is sufficiently rich, then a solution to
P∞ (x) exists for all x and satisfies
\[
\dot{V}_\infty^0(x) = -\ell(x, u_\infty^0(0; x))
\]
Using this and upper and lower bounds on V∞0 (·) enables global asymptotic
stability of the origin to be established.
Although the control law u0∞ (0; ·) provides excellent closed-loop
properties, there are several impediments to its use. A feedback, rather
than an open-loop, solution of the optimal control problem is desirable
because of uncertainty; solution of the optimal control problem P∞ (x)
yields the optimal control sequence u0∞ (0; x) for the state x but does
not provide a control law. Dynamic programming may, in principle, be
employed, but is generally impractical if the state dimension and the
horizon are not small.
If we turn instead to an MPC approach in which we generate on-
line only the value of optimal control sequence u0∞ (·; x) for the cur-
rently measured value of x, rather than for all x, the problem remains
formidable for the following reasons. First, we are optimizing a time
function, u(·), and functions are infinite dimensional. Secondly, the
time interval of interest, [0, ∞), is a semi-infinite interval, which poses
other numerical challenges. Finally, the cost function V (x, u(·)) is usu-
ally not a convex function of u(·), which presents significant optimiza-
tion difficulties, especially in an online setting. Even proving existence
of the optimal control in this general setting is a challenge. However,
see Pannocchia, Rawlings, Mayne, and Mancuso (2015) in which it is
shown how an infinite horizon optimal control problem may be solved online if
the system is linear, the cost quadratic and the control but not the state
is constrained.
Our task in this chapter may therefore be viewed as restricting the
system and control parameterization to make problem P∞ (x) more eas-
ily computable. We show how to pose various problems for which we
can establish existence of the optimal solution and asymptotic closed-
loop stability of the resulting controller. For these problems, we almost

always replace the continuous time differential equation with a discrete


time difference equation. We often replace the semi-infinite time inter-
val with a finite time interval and append a terminal region so that we
can approximate the cost to go for the semi-infinite interval once the
system enters the terminal region. Although the solution of problem
P∞ (x) in its full generality is out of reach with today’s computational
methods, its value lies in distinguishing what is desirable in the control
problem formulation and what is achievable with available computing
technology.
We develop here MPC for the control of constrained nonlinear time-
invariant systems. The nonlinear system is described by the nonlinear
difference equation

x + = f (x, u) f :X×U→X (2.1)

in which x ∈ X ⊆ Rn is the current state, u ∈ U ⊆ Rm , is the cur-


rent control, (sets X and U are assumed closed), and x + the successor
state; x + = f (x, u) is the discrete time analog of the continuous time
differential equation ẋ = f (x, u). The function f (·) is assumed to be
continuous and to satisfy f (0, 0) = 0; (0, 0) is the desired equilibrium
pair. The subsequent analysis is easily extended to the case when the
desired equilibrium pair is (xs , us ) satisfying xs = f (xs , us ).
We introduce here some notation that we employ in the sequel. The
set I denotes the set of integers, I≥0 := {0, 1, 2, . . .} and, for any two
integers m and n satisfying m ≤ n, Im:n := {m, m + 1, . . . , n}. We refer
to the pair (x, i) as an event; an event (x, i) denotes that the state at
time i is x. We use u to denote the possibly infinite control sequence
(u(k))k∈I≥0 = (u(0), u(1), u(2), . . .). In the context of MPC, u fre-
quently denotes the finite sequence uI0:N−1 = (u(0), u(1), . . . , u(N − 1))
in which N is the control horizon. For any integer j ∈ I≥0 , we sometimes

employ uj to denote the finite sequence u(0), u(1), . . . , u(j − 1) . Sim-
ilarly x denotes the possibly infinite state sequence (x(0), x(1), x(2), . . .)

and xj the finite sequence x(0), x(1), . . . , x(j) . When no confusion
can arise we often employ, for simplicity in notation, u in place of uN
and x in place of xN . Also for simplicity in notation, u, when used
in algebraic expressions, denotes the column vector (u(0)′ , u(1)′ , . . . ,
u(N − 1)′ )′ ; similarly x in algebraic expressions denotes the column
vector (x(0)′ , x(1)′ , . . . , x(N)′ )′ .
The solution of (2.1) at time k, if the initial state at time zero is x
and the control sequence is u, is denoted by φ(k; x, u); the solution at
time k depends only on u(0), u(1), . . . , u(k − 1). Similarly, the solution

of the system (2.1) at time k, if the initial state at time i is x and the
control sequence is u, is denoted by φ(k; (x, i), u). Because the system
is time invariant, the solution does not depend on the initial time; if
the initial state is x at time i, the solution at time j ≥ i is φ(j − i; x, u).
Thus the solution at time k if the initial event is (x, i) is identical to
the solution at time k − i if the initial event is (x, 0). For each k, the
function (x, u) , φ(k; x, u) is continuous as we show next.

Proposition 2.1 (Continuity of system solution). Suppose the function


f (·) is continuous. Then, for each integer k ∈ I, the function (x, u) ,
φ(k; x, u) is continuous.

Proof.
Since φ(1; x, u(0)) = f (x, u(0)), the function (x, u(0)) , φ(1; x,
u(0)) is continuous. Suppose the function (x, uj−1 ) , φ(j; x, uj−1 )
is continuous and consider the function (x, uj ) , φ(j + 1; x, uj ). Since

φ(j + 1; x, uj ) = f (φ(j; x, uj−1 ), u(j))

in which f (·) and φ(j; · ) are continuous and since φ(j + 1; · ) is the
composition of two continuous functions f (·) and φ(j; · ), it follows
that φ(j + 1; · ) is continuous. By induction φ(k; · ) is continuous for
any positive integer k. ■

The system (2.1) is subject to hard constraints which may take the
form
(x(k), u(k)) ∈ Z for all k ∈ I≥0 (2.2)

in which Z ⊆ X × U is generally polyhedral, i.e., Z = {(x, u) | F x + Eu ≤


e} for some F , E, e. For example, many problems have a rate constraint
|u(k) − u(k − 1)| ≤ c on the control. This constraint may equivalently
be expressed as |u(k) − z(k)| ≤ c in which z is an extra state satisfying
z+ = u so that z(k) = u(k − 1). The constraint (x, u) ∈ Z implies the
control constraint is possibly state-dependent, i.e., (x, u) ∈ Z implies
that
u ∈ U(x) := {u ∈ U | (x, u) ∈ Z}

It also implies that the state must satisfy the constraint

x ∈ {x ∈ X | U(x) ≠ ∅}

If there are no mixed constraints, then Z = X × U so the system con-


straints become x(k) ∈ X and u(k) ∈ U.
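As a small worked illustration of the augmentation just described, specialized (as an assumption) to a linear system x+ = Ax + Bu with generic A and B, the rate constraint |u(k) − u(k − 1)| ≤ c becomes a mixed state-control constraint on the augmented state:
\[
\tilde{x} := \begin{bmatrix} x \\ z \end{bmatrix}, \qquad
\tilde{x}^+ = \begin{bmatrix} A & 0 \\ 0 & 0 \end{bmatrix} \tilde{x}
            + \begin{bmatrix} B \\ I \end{bmatrix} u, \qquad
Z = \{ (\tilde{x}, u) \mid |u - z| \le c \}
\]
so that z(k) = u(k − 1) and the rate constraint is enforced as (x̃(k), u(k)) ∈ Z for all k.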

We assume in this chapter that the state x is known; if the state


x is estimated, uncertainty (state estimation error) is introduced and
robust MPC, discussed in Chapter 3, is required.
The next ingredient of the optimal control problem is the cost func-
tion. Practical considerations normally require that the cost be defined
over a finite horizon N to ensure the resultant optimal control prob-
lem can be solved sufficiently rapidly to permit effective control. We
consider initially the regulation problem in which the target state is the
origin. If x is the current state and i the current time, then the optimal
control problem may be posed as minimizing a cost defined over the
interval from time i to time N +i. The optimal control problem PN (x, i)
at event (x, i) is the problem of minimizing the cost

\[
\sum_{k=i}^{i+N-1} \ell(x(k), u(k)) + V_f(x(N+i))
\]

with respect to the sequences x := (x(i), x(i + 1), . . . , x(i + N)) and
u := (u(i), u(i + 1), . . . , u(i + N − 1)) subject to the constraints that x
and u satisfy the difference equation (2.1), the initial condition x(i) =
x, and the state and control constraints (2.2). We assume that ℓ(·)
is continuous and that ℓ(0, 0) = 0. The optimal control and state se-
quences, obtained by solving PN (x, i), are functions of the initial event
(x, i)
 
u0 (x, i) = (u0 (i; (x, i)), u0 (i + 1; (x, i)), . . . , u0 (i + N − 1; (x, i)))
x0 (x, i) = (x 0 (i; (x, i)), x 0 (i + 1; (x, i)), . . . , x 0 (i + N; (x, i)))

with x 0 (i; (x, i)) = x. In MPC, the first control action u0 (i; (x, i)) in the
optimal control sequence u0 (x, i) is applied to the plant, i.e., u(i) =
u0 (i; (x, i)). Because the system x + = f (x, u), the stage cost ℓ(·), and
the terminal cost Vf (·) are all time invariant, however, the solution of
PN (x, i), for any time i ∈ I≥0 , is identical to the solution of PN (x, 0) so
that

u0 (x, i) = u0 (x, 0)
x0 (x, i) = x0 (x, 0)

In particular, u0 (i; (x, i)) = u0 (0; (x, 0)), i.e., the control u0 (i; (x, i))
applied to the plant is equal to u0 (0; (x, 0)), the first element in the
sequence u0 (x, 0). Hence we may as well merely consider problem

PN (x, 0) which, since the initial time is irrelevant, we call PN (x). Sim-
ilarly, for simplicity in notation, we replace u0 (x, 0) and x0 (x, 0) by,
respectively, u0 (x) and x0 (x).
The optimal control problem PN (x) may then be expressed as min-
imization of
\[
\sum_{k=0}^{N-1} \ell(x(k), u(k)) + V_f(x(N))
\]
with respect to the decision variables (x, u) subject to the constraints
that the state and control sequences x and u satisfy the difference
equation (2.1), the initial condition x(0) = x, and the state, control
constraints (2.2). Here u denotes the control sequence u(0), u(1), . . . ,

u(N − 1) and x the state sequence (x(0), x(1), . . . , x(N)). Retaining
the state sequence in the set of decision variables is discussed in Chap-
ters 6 and 8. For the purpose of analysis, however, it is preferable to
constrain the state sequence x a priori to be a solution of x + = f (x, u)
enabling us to express the problem in the equivalent form of mini-
mizing, with respect to the decision variable u, a cost that is purely
a function of the initial state x and the control sequence u. This for-
mulation is possible since the state sequence x may be expressed, via
the difference equation x + = f (x, u), as a function of (x, u). The cost
becomes VN (x, u) defined by
\[
V_N(x, u) := \sum_{k=0}^{N-1} \ell(x(k), u(k)) + V_f(x(N)) \tag{2.3}
\]

in which x(k) := φ(k; x, u) for all k ∈ I0:N . Similarly the constraints


(2.2), together with an additional terminal constraint
x(N) ∈ Xf ⊆ X
impose an implicit constraint on the control sequence of the form
u ∈ UN (x) (2.4)
The control constraint set UN (x) is the set of control sequences u :=
(u(0), u(1), . . . , u(N − 1)) satisfying the state and control constraints.
It is therefore defined by
UN (x) := {u | (x, u) ∈ ZN }    (2.5)

in which the set ZN ⊂ X × U^N is defined by

ZN := {(x, u) | (φ(k; x, u), u(k)) ∈ Z, ∀k ∈ I0:N−1 , φ(N; x, u) ∈ Xf }    (2.6)

The optimal control problem PN (x) is, therefore,
\[
P_N(x): \quad V_N^0(x) := \min_{u}\, \{ V_N(x, u) \mid u \in U_N(x) \} \tag{2.7}
\]

Problem PN (x) is a parametric optimization problem in which the


decision variable is u, and both the cost and the constraint set depend
on the parameter x. The set ZN is the set of admissible (x, u), i.e., the
set of (x, u) for which the constraints of PN (x) are satisfied. Let XN
be the set of states in X for which PN (x) has a solution

XN := {x ∈ X | UN (x) ≠ ∅} (2.8)

It follows from (2.5) and (2.8) that

XN = {x ∈ X | ∃u ∈ UN such that (x, u) ∈ ZN }

which is the orthogonal projection of ZN ⊂ X × UN onto X. The domain


of VN0 (·), i.e., the set of states in X for which PN (x) has a solution, is
XN .
Not every optimization problem has a solution. For example, the
problem min{x | x ∈ (0, 1)} does not have a solution; inf{x | x ∈ (0,
1)} = 0 but x = 0 does not lie in the constraint set (0, 1). By Weier-
strass’s theorem, however, an optimization problem does have a so-
lution if the cost is continuous (in the decision variable) and the con-
straint set compact (see Proposition A.7). This is the case for our prob-
lem as shown subsequently in Proposition 2.4. We assume, without
further comment, that the following two standing conditions are satis-
fied in the sequel.
Assumption 2.2 (Continuity of system and cost). The functions f :
Z → X, ℓ : Z → R≥0 and Vf : Xf → R≥0 are continuous, f (0, 0) = 0,
ℓ(0, 0) = 0 and Vf (0) = 0.
In by far the majority of applications the set of controls U is bounded.
Nevertheless, it is of theoretical interest to consider the case when U
is not bounded; e.g., when the optimal control problem has no con-
straints on the control. To analyze this case we employ an implicit
control constraint set Ū^c_N(x) defined as follows. Choose c ≥ 0 and
define
Ū^c_N(x) := {u ∈ UN (x) | VN (x, u) ≤ c}
We also define the feasible set X̄^c_N for the optimal control problem with
no constraints on the control by
X̄^c_N := {x ∈ X | Ū^c_N(x) ≠ ∅}

Assumption 2.3 (Properties of constraint sets). The set Z is closed and


the set Xf ⊆ X is compact. Each set contains the origin. If U is bounded
(hence compact), the set U(x) is compact for all x ∈ X. If U is un-
bounded, the function u , VN (x, u) is coercive, i.e., VN (x, u) → ∞ as
|u| → ∞ for all x ∈ X).
It is implicitly assumed that the desired equilibrium pair is (x̄, ū) =
(0, 0) because the first problem we tackle is regulation to the origin.
Proposition 2.4 (Existence of solution to optimal control problem). Sup-
pose Assumptions 2.2 and 2.3 hold. Then
(a) The function VN (·) is continuous in ZN .
(b) For each x ∈ XN (for each x ∈ X̄^c_N, each c ∈ R>0 ), the control
constraint set UN (x) (Ū^c_N(x)) is compact.

(c) For each x ∈ XN (for each x ∈ X̄^c_N, each c ∈ R>0 ) a solution to PN (x)
exists.
Proof.
(a) That (x, u) , VN (x, u) is continuous follows from continuity of
ℓ(·) and Vf (·) in Assumption 2.2, and the continuity of (x, u) ,
φ(j; x, u) for each j ∈ I0:N−1 , established in Proposition 2.1.

(b) The set UN (x) is defined by a finite set of inequalities each of which
has the form η(x, u) ≤ 0 in which η(·) is continuous. It follows that
UN (x) is closed. If U is bounded, so is UN (x), and UN (x) is therefore
compact for all x ∈ XN .
If instead U is unbounded, the set UfcN := {u | VN (x, u) ≤ c} for c ∈ R>0
is closed for all c and x because VN (·) is continuous; ŪcN (x) is the
intersection of this set with UN (x), just shown to be closed. So ŪcN (x)
is the intersection of closed sets and is closed. To prove ŪcN (x) is
bounded for all c, suppose the contrary: there exists a c such that
ŪcN (x) is unbounded. Then there exists a sequence (ui )i∈I≥0 in ŪcN (x)
such that ui → ∞ as i → ∞. Because VN (·) is coercive, VN (x, ui ) → ∞
as i → ∞, a contradiction. Hence ŪcN (x) is closed and bounded and,
hence, compact.

(c) Since VN (x, ·) is continuous and UN (x) (Ū^c_N(x)) is compact, it fol-
lows from Weierstrass’s theorem (Proposition A.7) that a solution to PN (x)
exists for each x ∈ XN (X̄^c_N). ■

Although the function (x, u) , VN (x, u) is continuous, the function


x , VN0 (x) is not necessarily continuous; we discuss this possibility

and its implications later. For each x ∈ XN , the solution of PN (x) is

\[
u^0(x) = \arg\min_{u}\, \{ V_N(x, u) \mid u \in U_N(x) \}
\]
If u0 (x) = (u0 (0; x), u0 (1; x), . . . , u0 (N − 1; x)) is unique for each x ∈
XN , then u0 : X → UN is a function; otherwise it is a set-valued func-
tion.1 In MPC, the control applied to the plant is the first element
u0 (0; x) of the optimal control sequence. At the next sampling instant,
the procedure is repeated for the successor state. Although MPC com-
putes u0 (x) only for specific values of the state x, it could, in principle,
be used to compute u0 (x) and, hence, u0 (0; x) for every x for which
PN (x) is feasible, yielding the MPC control law κN (·) defined by

κN (x) := u0 (0; x), x ∈ XN

MPC does not require determination of the control law κN (·), a task that
is usually intractable when constraints or nonlinearities are present and
the state dimension is large; it is this fact that makes MPC so useful.
If, at a given state x, the solution of PN (x) is not unique, then
κN (·) = u0 (0; · ) is set valued and the model predictive controller se-
lects one element from the set κN (x).

Example 2.5: Linear quadratic MPC


Suppose the system is described by

x + = f (x, u) := x + u

with initial state x. The stage cost and terminal cost are

ℓ(x, u) := (1/2)(x 2 + u2 ) Vf (x) := (1/2)x 2

The control constraint is


u ∈ [−1, 1]
and there are no state or terminal constraints. Suppose the horizon is
N = 2. Under the first approach, the decision variables are u and x,
and the optimal control problem is minimization of

\[
V_N(x(0), x(1), x(2), u(0), u(1)) = (1/2)\big( x(0)^2 + x(1)^2 + x(2)^2 + u(0)^2 + u(1)^2 \big)
\]
1 A set-valued function φ(·) is a function whose value φ(x) for each x in its domain
is a set.

with respect to (x(0), x(1), x(2)), and (u(0), u(1)) subject to the fol-
lowing constraints

x(0) = x x(1) = x(0) + u(0) x(2) = x(1) + u(1)


u(0) ∈ [−1, 1] u(1) ∈ [−1, 1]

The constraint u ∈ [−1, 1] is equivalent to two inequality constraints,


u ≤ 1 and −u ≤ 1. The first three constraints are equality constraints
enforcing satisfaction of the difference equation.
In the second approach, the decision variable is merely u because
the first three constraints are automatically enforced by requiring x to
be a solution of the difference equation. Hence, the optimal control
problem becomes minimization with respect to u = (u(0), u(1)) of

\[
\begin{aligned}
V_N(x, u) &= (1/2)\big( x^2 + (x + u(0))^2 + (x + u(0) + u(1))^2 + u(0)^2 + u(1)^2 \big) \\
          &= (3/2)x^2 + \begin{bmatrix} 2x & x \end{bmatrix} u + (1/2)\, u' H u
\end{aligned}
\]
in which
\[
H = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}
\]

subject to the constraint u ∈ UN (x) where

UN (x) = {u | |u(k)| ≤ 1 k = 0, 1}

Because there are no state or terminal constraints, the set UN (x) =


UN for this example does not depend on the parameter x; often it
does. Both optimal control problems are quadratic programs.2 The
solution for x = 10 is u0 (0; 10) = u0 (1; 10) = −1 so the optimal state
trajectory is x 0 (0; 10) = 10, x 0 (1; 10) = 9 and x 0 (2; 10) = 8. The value
VN0 (10) = 123.5. By solving PN (x) for every x ∈ [−10, 10], the optimal
control law κN (·) on this set can be determined, and is shown in Figure
2.1(a). The implicit MPC control law is time invariant since the system
being controlled, the cost, and the constraints are all time invariant.
For our example, the controlled system (the system with MPC) satisfies
the difference equation

x + = x + κN (x) κN (x) = −sat(3x/5)


2 A quadratic program is an optimization problem in which the cost is quadratic and
the constraint set is polyhedral, i.e., defined by linear inequalities.

Figure 2.1: Example of MPC. (a) Implicit MPC control law κN (x) versus x. (b) Trajectories x(k) and u(k) of the controlled system.

and the state and control trajectories for an initial state of x = 10 are
shown in Figure 2.1(b). It turns out that the origin is exponentially sta-
ble for this simple case; often, however, the terminal cost and terminal
constraint set have to be carefully chosen to ensure stability. □
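The second (condensed) formulation of this example is easily reproduced numerically. The following MATLAB sketch, which assumes quadprog from the Optimization Toolbox, solves P2(x) for x = 10 and then scans x to recover the implicit control law of Figure 2.1(a); it is an illustration of the example, not part of it.

% Example 2.5: solving P_2(x) in condensed form as a small QP.
H  = [3 1; 1 2];                 % Hessian from the text
lb = [-1; -1];  ub = [1; 1];     % control constraints u(k) in [-1, 1]

x  = 10;
u0 = quadprog(H, [2; 1]*x, [], [], [], [], lb, ub)   % expect u0 = (-1, -1)

% Scanning x recovers the implicit law kappa_N(x) = -sat(3x/5).
xs  = linspace(-2, 2, 41);  kap = zeros(size(xs));
for i = 1:numel(xs)
    u = quadprog(H, [2; 1]*xs(i), [], [], [], [], lb, ub);
    kap(i) = u(1);               % first element of the optimal sequence
end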

Example 2.6: Closer inspection of linear quadratic MPC


We revisit the MPC problem discussed in Example 2.5. The objective
function is
VN (x, u) = (1/2)u′ Hu + c(x)′ u + d(x)

where c(x)′ = [2 1]x and d(x) = (3/2)x 2 . The objective function may
be written in the form

VN (x, u) = (1/2)(u − a(x))′ H(u − a(x)) + e(x)

Expanding the second form shows the two forms are equal if
" #
−1 3
a(x) = −H c(x) = K1 x K1 = −(1/5)
1

and
e(x) + (1/2)a(x)′ Ha(x) = d(x)

Since H is positive definite, a(x) is the unconstrained minimizer of the


objective function; indeed ∇u VN (x, a(x)) = 0 since

∇u VN (x, u) = Hu + c(x)

Figure 2.2: Feasible region U2 , elliptical cost contours and ellipse center a(x), and constrained minimizers for different values of x.

The locus of a(x) for x ≥ 0 is shown in Figure 2.2. Clearly the uncon-
strained minimizer a(x) = K1 x is equal to the constrained minimizer
u0 (x) for all x such that a(x) ∈ U2 where U2 is the unit square illus-
trated in Figure 2.2; since a(x) = K1 x, a(x) ∈ U2 for all x ∈ X1 = [0,
xc1 ] where xc1 = 5/3. For x > xc1 , the unconstrained minimizer lies
outside U2 as shown in Figure 2.2 for x = 2.25, x = 3 and x = 5.
For such x, the constrained minimizer u0 (x) is a point that lies on the
intersection of a level set of the objective function (which is an ellipse)
and the boundary of U2 . For x ∈ [xc1 , xc2 ), u0 (x) lies on the left face
of the box U2 and for x ≥ xc2 = 3, u0 (x) remains at (−1, −1), the
bottom left vertex of U2 .
When u0 (x) lies on the left face of U2 , the gradient ∇u VN (x, u0 (x))
of the objective function is normal to the left face of U2 , i.e., the level
set of VN0 (·) passing through u0 (x) is tangential to the left face of U2 .
The outward normal to U2 at a point on the left face is −e1 = (−1, 0)

so that at u = u0 (x)

∇u V (x, u0 (x)) + λ(−e1 ) = 0

for some λ > 0; this is a standard condition of optimality. Since u =


[−1 v]′ for some v ∈ [−1, 1] and since ∇u V (x, u) = H(u − a(x)) =
Hu + c(x), the condition of optimality is
" #" # " # " # " #
3 1 −1 2 λ 0
+ x− =
1 2 v 1 0 0

or

− 3 + v + 2x − λ = 0
− 1 + 2v + x = 0

which, when solved, yields v = (1/2)−(1/2)x and λ = −(5/2)+(3/2)x.


Hence
" # " #
−1 0
u0 (x) = b2 + K2 x b2 = K2 =
(1/2) −(1/2)

for all x ∈ X2 = [xc1 , xc2 ] where xc2 = 3 since u0 (x) ∈ U2 for all x in
this range. For all x ∈ X3 = [xc2 , ∞), u0 (x) = (−1, −1)′ . Summarizing

x ∈ [0, (5/3)] =⇒ u0 (x) = K1 x


x ∈ [(5/3), 3] =⇒ u0 (x) = K2 x + b2
x ∈ [3, ∞) =⇒ u0 (x) = b3

in which
" # " # " # " #
−(3/5) 0 −1 −1
K1 = K2 = b2 = b3 =
−(1/5) −(1/2) (1/2) −1

The optimal control for x ≤ 0 may be obtained by symmetry; u0 (−x) =


−u0 (x) for all x ≥ 0 so that

x ∈ [−(5/3), 0] =⇒ u0 (x) = K1 x
x ∈ [−3, −(5/3)] =⇒ u0 (x) = K2 x − b2
x ∈ (−∞, −3] =⇒ u0 (x) = −b3

It is easily checked that u0 (·) is continuous and satisfies the constraint


for all x ∈ R. The MPC control law κN (·) is the first component of u0 (·)

and, therefore, is defined by

κN (x) = 1,        x ∈ (−∞, −(5/3)]
κN (x) = −(3/5)x,  x ∈ [−(5/3), (5/3)]
κN (x) = −1,       x ∈ [(5/3), ∞)

i.e., κN (x) = −sat(3x/5) which is the saturating control law depicted in


Figure 2.1(a). The control law is piecewise affine and the value function
piecewise quadratic. The structure of the solution to constrained linear
quadratic optimal control problems is explored more fully in Chapter 7.
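The piecewise affine expressions derived above can be checked against a numerical solution of the QP. The following MATLAB sketch (again assuming quadprog from the Optimization Toolbox) compares the two on a grid of nonnegative x; it is illustrative only.

% Example 2.6: check the explicit solution u0(x) against the QP solution.
H  = [3 1; 1 2];
K1 = -(1/5)*[3; 1];  K2 = [0; -1/2];  b2 = [-1; 1/2];  b3 = [-1; -1];
uexpl = @(x) (x <= 5/3)*(K1*x) + (x > 5/3 & x <= 3)*(K2*x + b2) + (x > 3)*b3;

err = 0;
for x = linspace(0, 5, 26)
    uqp = quadprog(H, [2; 1]*x, [], [], [], [], [-1; -1], [1; 1]);
    err = max(err, norm(uqp - uexpl(x)));
end
disp(err)                   % at the solver tolerance if the formulas agree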

As we show in Chapter 3, continuity of the value function is desir-


able. Unfortunately, this is not true in general; the major difficulty is in
establishing that the set-valued function UN (·) has certain continuity
properties. Continuity of the value function VN0 (·) and of the implicit
control law κN (·) may be established for a few important cases, how-
ever, as is shown by the next result, which assumes satisfaction of our
standing assumptions: 2.2 and 2.3 so that the cost function VN (·) is
continuous in (x, u).

Theorem 2.7 (Continuity of value function and control law). Suppose


that Assumptions 2.2 and 2.3 (U bounded) hold.
(a) Suppose that there are no state constraints so that Z = X×U in which
X = Xf = Rn . Then the value function VN0 : XN → R is continuous and
XN = Rn .

(b) Suppose f (·) is linear (x + = Ax + Bu) and that the state-control


constraint set Z is polyhedral.3 Then the value function VN0 : XN → R is
continuous.

(c) If, in addition, the solution u0 (x) of PN (x) is unique at each x ∈ XN ,


then the implicit MPC control law κN (·) is continuous.

The proof of this theorem is given in Section C.3 of Appendix C.


The following example, due to Meadows, Henson, Eaton, and Rawlings
(1995), shows that there exist nonlinear examples where the value func-
tion and implicit control law are not continuous.

3 A set Z is polyhedral if it may be defined as a set of linear inequalities, i.e., if it may
be expressed in the form Z = {z | Mz ≤ m}.

Example 2.8: Discontinuous MPC control law


Consider the nonlinear system defined by

\[
x_1^+ = x_1 + u \qquad x_2^+ = x_2 + u^3
\]

The control horizon is N = 3 and the cost function V3 (·) is defined by


\[
V_3(x, u) := \sum_{k=0}^{2} \ell(x(k), u(k))
\]

and the stage cost ℓ(·) is defined by

\[
\ell(x, u) := |x|^2 + u^2
\]

The constraint sets are X = R2 , U = R, and Xf := {0}, i.e., there are no


state and control constraints, and the terminal state must satisfy the
constraint x(3) = 0. Hence, although there are three control actions,
u(0), u(1), and u(2), two must be employed to satisfy the terminal
constraint, leaving only one degree of freedom. Choosing u(0) to be
the free decision variable automatically constrains u(1) and u(2) to be
functions of the initial state x and the first control action u(0). Solving
the equation

\[
\begin{aligned}
x_1(3) &= x_1 + u(0) + u(1) + u(2) = 0 \\
x_2(3) &= x_2 + u(0)^3 + u(1)^3 + u(2)^3 = 0
\end{aligned}
\]

for u(1) and u(2) yields


\[
u(1) = -x_1/2 - u(0)/2 \pm \sqrt{b} \qquad u(2) = -x_1/2 - u(0)/2 \mp \sqrt{b}
\]
in which
\[
b = \frac{3u(0)^3 - 3u(0)^2 x_1 - 3u(0)x_1^2 - x_1^3 + 4x_2}{12(u(0) + x_1)}
\]

Clearly a real solution exists only if b is positive, i.e., if both the numer-
ator and denominator in the expression for b have the same sign. The
optimal control problem P3 (x) is defined by

\[
V_3^0(x) = \min_{u} \{ V_3(x, u) \mid \phi(3; x, u) = 0 \}
\]

Figure 2.3: First element of control constraint set U3 (x) (shaded) and control law κ3 (x) (line) versus x = (cos(θ), sin(θ)), θ ∈ [−π , π ] on the unit circle for a nonlinear system with terminal constraint.

and the implicit MPC control law is κ3 (·) where κ3 (x) = u0 (0; x), the
first element in the minimizing sequence u0 (x). It can be shown, using
analysis presented later in this chapter, that the origin is asymptotically
stable for the controlled system x + = f (x, κN (x)). That this control
law is necessarily discontinuous may be shown as follows. If the control
is strictly positive, any trajectory originating in the first quadrant (x1 ,
x2 > 0) moves away from the origin. If the control is strictly negative,
any control originating in the third quadrant (x1 , x2 < 0) also moves
away from the origin. But the control cannot be zero at any nonzero
point lying in the domain of attraction. If it were, this point would be a
fixed point for the controlled system, contradicting the fact that it lies
in the domain of attraction.
In fact, both the value function V30 (·) and the MPC control law κ3 (·)
are discontinuous. Figures 2.3 and 2.4 show how U3 (x), κ3 (x), and
V30 (x) vary as x = (cos(θ), sin(θ)) ranges over the unit circle. A further
conclusion that can be drawn from this example is that it is possible

Figure 2.4: Optimal cost V30 (x) versus x = (cos(θ), sin(θ)), θ ∈ [−π , π ] on the unit circle; the discontinuity in V30 is caused by the discontinuity in U3 as θ crosses the dashed line in Figure 2.3.

for the MPC control law to be discontinuous at points where the value
function is continuous. □
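The behavior shown in Figures 2.3 and 2.4 can be reproduced by brute force: for each x on the unit circle, eliminate u(1) and u(2) with the terminal constraint as above and grid the remaining degree of freedom u(0). The following MATLAB sketch is illustrative only; the grid ranges and resolutions are arbitrary choices.

% Example 2.8: brute-force evaluation of V_3^0 and kappa_3 on the unit circle.
f   = @(x,u) [x(1) + u; x(2) + u^3];
ell = @(x,u) x(1)^2 + x(2)^2 + u^2;
th  = linspace(-pi, pi, 361);
V   = inf(size(th));  kap = nan(size(th));
for i = 1:numel(th)
    x = [cos(th(i)); sin(th(i))];
    for u0 = linspace(-2, 2, 2001)
        b = (3*u0^3 - 3*u0^2*x(1) - 3*u0*x(1)^2 - x(1)^3 + 4*x(2)) / ...
            (12*(u0 + x(1)));
        if ~isfinite(b) || b < 0, continue, end
        for s = [1 -1]                       % the two square-root branches
            u1 = -x(1)/2 - u0/2 + s*sqrt(b);
            u2 = -x(1)/2 - u0/2 - s*sqrt(b);
            x1 = f(x, u0);  x2 = f(x1, u1);  % x(3) = 0 by construction
            c  = ell(x, u0) + ell(x1, u1) + ell(x2, u2);
            if c < V(i), V(i) = c;  kap(i) = u0; end
        end
    end
end
% Plotting V and kap against th reproduces Figures 2.4 and 2.3, respectively.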

2.3 Dynamic Programming Solution


We examine next the DP solution of the optimal control problem PN (x),
not because it provides a practical procedure but because of the insight
it provides. DP can rarely be used for constrained and/or nonlinear
control problems unless the state dimension n is small. MPC is best
regarded as a practical means of implementing the DP solution; for a
given state x it provides VN0 (x) and κN (x), the value, respectively, of
the value function and control law at a point x. DP, on the other hand,
yields the value function VN0 (·) and the implicit MPC control law κN (·).
The optimal control problem PN (x) is defined, as before, by (2.7)
with the cost function VN (·) defined by (2.3) and the constraints by
(2.4). DP yields an optimal policy µ0 = (µ_0^0(·), µ_1^0(·), . . . , µ_{N−1}^0(·)), i.e.,
a sequence of control laws µi : Xi → U, i = 0, 1, . . . , N − 1. The domain
Xi of each control law will be defined later. The optimal controlled

system is time varying and satisfies


x + = f (x, µi0 (x)), i = 0, 1, . . . , N − 1
in contrast with the system using MPC, which is time invariant and
satisfies
x + = f (x, κN (x)), i = 0, 1, . . . , N − 1
with κN (·) = µ00 (·). The optimal control law at time i is µi0 (·), but reced-
ing horizon control (RHC) uses the time-invariant control law κN (·) =
µ0 (·) obtained by assuming that at each time t, the terminal time or
horizon is t + N so that the horizon t + N recedes as t increases. One
consequence is that the time-invariant control law κN (·) is not opti-
mal for the problem of controlling x + = f (x, u) over the fixed interval
[0, T ] in such a way as to minimize VN and satisfy the constraints.
For all j ∈ I0:N−1 , let Vj (x, u), Uj (x), Zj , Pj (x) (and Vj0 (x)) be
defined, respectively, by (2.3), (2.5), (2.6), and (2.7), with N replaced by
j. As shown in Section C.1 of Appendix C, DP solves not only PN (x)
for all x ∈ XN , the domain of VN0 (·), but also Pj (x) for all x ∈ Xj , the
domain of Vj0 (·), all j ∈ I0:N−1 . The DP equations are, for all x ∈ Xj

\[
V_j^0(x) = \min_{u \in U(x)} \{ \ell(x, u) + V_{j-1}^0(f(x, u)) \mid f(x, u) \in X_{j-1} \} \tag{2.9}
\]
\[
\kappa_j(x) = \arg\min_{u \in U(x)} \{ \ell(x, u) + V_{j-1}^0(f(x, u)) \mid f(x, u) \in X_{j-1} \} \tag{2.10}
\]

with
Xj = {x ∈ X | ∃u ∈ U(x) such that f (x, u) ∈ Xj−1 } (2.11)
for j = 1, 2, . . . , N (j is time to go), with terminal conditions
V00 (x) = Vf (x) ∀x ∈ X0 X0 = Xf
For each j, Vj0 (x)
is the optimal cost for problem Pj (x) if the current
state is x, current time is N − j, and the terminal time is N; Xj is the
domain of Vj0 (x) and is also the set of states in X that can be steered
to the terminal set Xf in j steps by an admissible control sequence,
i.e., a control sequence that satisfies the control, state, and terminal
constraints. Hence, for each j
Xj = {x ∈ X | Uj (x) ≠ ∅}
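For low state dimension, the recursion (2.9)-(2.11) can be carried out numerically on a grid. The following MATLAB sketch does so for a scalar example; the system, cost, sets, and grid resolutions are illustrative choices only, and the gridding is exactly what becomes intractable as the state dimension grows.

% Backward DP recursion (2.9)-(2.10) on a grid for x+ = x + u (scalar).
f   = @(x,u) x + u;
ell = @(x,u) 0.5*(x.^2 + u.^2);
Vf  = @(x) 0.5*x.^2;
xg  = linspace(-3, 3, 301);            % state grid; X and Xf taken as [-3, 3]
ug  = linspace(-1, 1, 201);            % input grid; U = [-1, 1]
N   = 10;
V   = Vf(xg);                          % V_0^0 = V_f on X_0 = X_f
kappa = zeros(N, numel(xg));
for j = 1:N                            % j = time to go
    Vnew = inf(size(xg));
    for i = 1:numel(xg)
        xp   = f(xg(i), ug);
        ok   = (xp >= xg(1)) & (xp <= xg(end));          % f(x,u) in X_{j-1}
        cost = ell(xg(i), ug) + interp1(xg, V, min(max(xp, xg(1)), xg(end)));
        cost(~ok) = inf;
        [Vnew(i), imin] = min(cost);
        kappa(j, i)     = ug(imin);
    end
    V = Vnew;                          % V_j^0 on the grid
end
% V now approximates V_N^0 and kappa(N, :) the receding horizon law kappa_N.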
DP yields much more than an optimal control sequence for a given
initial state; it yields an optimal feedback policy µ0 or sequence of con-
trol laws where
µ0 := (µ0 (·), µ1 (·), . . . , µN−1 (·)) = (κN (·), κN−1 (·), . . . , κ1 (·))

At event (x, i), i.e., at state x at time i, the time to go is N − i and the
optimal control is
µi0 (x) = κN−i (x)
i.e., µi0 (·) is the optimal control law at time i. Consider an initial event
(x, 0), i.e., state x at time zero. If the terminal time (horizon) is N, the
optimal control for (x, 0) is κN (x). The successor state, at time 1, is

x + = f (x, κN (x))

At event (x + , 1), the time to go to the terminal set Xf is N − 1 and the


optimal control is κN−1 (x + ) = κN−1 (f (x, κN (x))). For a given initial
event (x, 0), the optimal policy generates the optimal state and control
trajectories x0 (x) and u0 (x) that satisfy the difference equations

x(0) = x u(0) = κN (x) = µ0 (x) (2.12)


x(i + 1) = f (x(i), u(i)) u(i) = κN−i (x(i)) = µi (x(i)) (2.13)

for i = 0, 1, . . . , N − 1. These state and control trajectories are iden-


tical to those obtained, as in MPC, by solving PN (x) directly for the
particular initial event (x, 0) using a mathematical programming algo-
rithm. Dynamic programming, however, provides a solution for any
event (x, i) such that i ∈ I0:N−1 and x ∈ Xi .
Optimal control, in the classic sense of determining a control that
minimizes a cost over the interval [0, N] (so that the cost for k > N
is irrelevant), is generally time varying (at event (x, i), i ∈ I0:N , the
optimal control is µi (x) = κN−i (x)). Under fairly general conditions,
µi (·) → κ∞ (·) as N → ∞ where κ∞ (·) is the stationary infinite horizon
optimal control law. MPC and RHC, on the other hand, employ the
time-invariant control κN (x) for all i ∈ I≥0 . Thus the state and control
trajectories xmpc (x) and umpc (x) generated by MPC for an initial event
(x, 0) satisfy the difference equations

x(0) = x u(0) = κN (x)


x(i + 1) = f (x(i), u(i)) u(i) = κN (x(i))

and can be seen to differ in general from x0 (x) and u0 (x), which satisfy
(2.12) and (2.13).
Before leaving this section, we obtain some properties of the solu-
tion to each partial problem Pj (x). For this, we require a few definitions
(Blanchini and Miani, 2008).

Definition 2.9 (Positive and control invariant sets).



(a) A set X ⊆ Rn is positive invariant for x + = f (x) if x ∈ X implies


f (x) ∈ X.

(b) A set X ⊆ Rn is control invariant for x + = f (x, u), u ∈ U, if, for all
x ∈ X, there exists a u ∈ U such that f (x, u) ∈ X.

We recall from our standing assumptions 2.2 and 2.3 that f (·), ℓ(·)
and Vf (·) are continuous, that X and Xf are closed, U is compact and
that each of these sets contains the origin.

Proposition 2.10 (Existence of solutions to DP recursion). Suppose As-


sumptions 2.2 and 2.3 (U bounded) hold. Then
(a) For all j ∈ I≥0 , the cost function Vj (·) is continuous in Zj , and, for
each x ∈ Xj , the control constraint set Uj (x) is compact and a solution
u0 (x) ∈ Uj (x) to Pj (x) exists.

(b) If X0 := Xf is control invariant for x + = f (x, u), u ∈ U(x) and


0 ∈ Xf , then, for each j ∈ I≥0 , the set Xj is also control invariant,
Xj ⊇ Xj−1 , and 0 ∈ Xj . In addition, the sets Xj and Xj−1 are positive
invariant for x + = f (x, κj (x)) for j ∈ I≥1 .

(c) For all j ∈ I≥0 , the set Xj is closed.

(d) If, in addition, Xf is compact and the function f −1 (·)4 is bounded on


bounded sets (f −1 (S) is bounded for every bounded set S ⊂ Rn ), then,
for all j ∈ I≥0 , Xj is compact.

Proof.
(a) This proof is almost identical to the proof of Proposition 2.4.

(b) By assumption, X0 = Xf ⊆ X is control invariant. By (2.11)

X1 = {x ∈ X | ∃u ∈ U(x) such that f (x, u) ∈ X0 }

Since X0 is control invariant for x + = f (x, u), u ∈ U, for every x ∈ X0


there exist a u ∈ U such that f (x, u) ∈ X0 so that x ∈ X1 . Hence
X1 ⊇ X0 . Since for every x ∈ X1 , there exists a u ∈ U such that f (x,
u) ∈ X0 ⊆ X1 , it follows that X1 is control invariant for x + = f (x, u),
u ∈ U(x). If for some integer j ∈ I≥0 , Xj−1 is control invariant for
x + = f (x, u), it follows by similar reasoning that Xj ⊇ Xj−1 and
that Xj is control invariant. By induction Xj is control invariant and
Xj ⊇ Xj−1 for all j > 0. Hence 0 ∈ Xj for all j ∈ I≥0 . That Xj
is positive invariant for x + = f (x, κj (x)) follows from (2.10), which
4 For any S ⊂ Rn , f −1 (S) := {z ∈ Z | f (z) ∈ S}

shows that κj (·) steers every x ∈ Xj into Xj−1 ⊆ Xj . Since Xj−1 ⊆ Xj ,


κj (·) also steers every x ∈ Xj−1 into Xj−1 , so Xj−1 is positive invariant
under control law κj (·) as well.

(c) By Assumption 2.3, X0 = Xf is closed. Suppose, for some j ∈ I≥1 ,


that Xj−1 is closed. Then Zj := {(x, u) ∈ Z | f (x, u) ∈ Xj−1 } is closed
since f (·) is continuous. To prove that Xj is closed, take any sequence
(xi )i∈I≥0 in Xj that converges to, say, x̄. For each i, select a ui ∈ U(xi )
such that zi = (xi , ui ) ∈ Zj ; this is possible since xi ∈ Xj implies
xi ∈ {X | Uj (x) ≠ ∅}. Since Uj (x) ⊆ U and U is bounded, by the
Bolzano-Weierstrass theorem there exists a subsequence, indexed by I,
such that ui → ū (and xi → x̄) as i → ∞, i ∈ I. The sequence (xi , ui ) ∈
Zj , i ∈ I converges, and, since Zj is closed, (x̄, ū) ∈ Zj . Therefore
f (x̄, ū) ∈ Xj−1 and x̄ ∈ Xj so that Xj is closed. By induction Xj is
closed for all j ∈ I≥0 .

(d) Since Xf and U are bounded, so is Z1 ⊂ f −1 (Xf ) := {(x, u) ∈ Z |


f (x, u) ∈ Xf } and its projection X1 onto Rn . Assume then, for some
j ∈ I≥0 that Zj−1 is bounded; its projection Xj−1 is also bounded.
Consequently, Zj ⊂ f −1 (Xj−1 ) is also bounded and so is its projection
Xj . By induction, Xj is bounded, and hence, compact, for all j ∈ I≥0 .

Part (d) of Proposition 2.10 requires that the function f −1 (·) is


bounded on bounded sets. This is a mild requirement if f (·) is the dis-
crete time version of a continuous system as is almost always the case
in process control. If the continuous time system satisfies ẋ = fc (x, u)
and if the sample time is ∆, then
\[
f(x, u) = x + \int_0^{\Delta} f_c(x(s; x), u)\, ds
\]

in which x(s; x) is the solution of ẋ = fc (x, u) at time s if x(0) = x


and u is the constant input in the interval [0, ∆]. It is easily shown that
f −1 (·) is bounded on bounded sets if U is bounded and either f (x,
u) = Ax + Bu and A is nonsingular, or fc (x, u) is Lipschitz in x (see
Exercise 2.2).
The fact that XN is positive invariant for x + = f (x, κN (x)) can also
be established by observing that XN is the set of states x in X for which
there exists a u that is feasible for PN (x), i.e., for which there exists
a control u satisfying the control, state and terminal constraints. It is
shown in the next section that for every x ∈ XN , there exists a feasi-
ble control sequence ũ for PN (x + ) (x + = f (x, κN (x)) is the successor

state) provided that Xf is control invariant, i.e., XN is positive invariant


for x + = f (x, κN (x)) if Xf is control invariant. An important practical
consequence is that if PN (x(0)) can be solved for the initial state x(0),
then PN (x(i)) can be solved for any subsequent state x(i) of the con-
trolled system x + = f (x, κN (x)), a property that is sometimes called
recursive feasibility. Uncertainty, in the form of additive disturbances,
model error or state estimation error, may destroy this important prop-
erty; techniques to restore this property when uncertainty is present
are discussed in Chapter 3.

2.4 Stability
2.4.1 Introduction

The classical definition of stability was employed in the first edition of


this text. This states the origin in Rn is globally asymptotically stable
(GAS) for x + = f (x) if the origin is locally stable and if the origin is
globally attractive. The origin is locally stable if, for all ε > 0, there
exists a δ > 0 such that |x| < δ implies |φ(k; x)| < ε for all k ∈ I≥0
(small perturbations of the initial state from the origin cause subse-
quent perturbations to be small). The origin is globally attractive for
x + = f (x) if φ(k; x) → 0 as k → ∞ for all x ∈ Rn . This defini-
tion of stability has been widely used and is equivalent to the recently
defined stronger definition given below if f (·) is continuous but has
some disadvantages; there exist examples of systems that are asymp-
totically stable (AS) in the classical sense in which small perturbations
in the initial state from its initial value, not the origin, can cause sub-
sequent perturbations to be arbitrarily large. Hence we employ in this
section, as discussed more fully in Appendix B, a stronger definition of
asymptotic stability that avoids this undesirable behavior.
To establish stability we make use of Lyapunov theorems that are
defined in terms of the function classes K, K∞ and KL. A function
belongs to class K if it is continuous, zero at zero, and strictly increas-
ing; a function belongs to class K∞ if it is in class K and unbounded;
a function β(·) belongs to class KL if it is continuous and if, for each
k ≥ 0, β(·, k) is a class K function and for each s ≥ 0, β(s, ·) is nonin-
creasing and β(s, i) converges to zero as i → ∞. We can now state the
stronger definition of stability.

Definition 2.11 (Asymptotically stable and GAS). Suppose X is positive


invariant for x + = f (x). The origin is AS for x + = f (x) in X if there

exists a KL function β(·) such that, for each x ∈ X

|φ(i; x)| ≤ β(|x| , i) ∀i ∈ I≥0

If X = Rn , the origin is GAS for x + = f (x).

The set X is called a region of attraction. Energy in a passive elec-


trical or mechanical system provides a useful analogy to Lyapunov sta-
bility theory. In a lumped mechanical system, the total stored energy,
the sum of the potential and kinetic energy, is dissipated by friction
and decays to zero at which point the dynamic system is in equilib-
rium. Lyapunov theory follows a similar path; if a real-valued function
(a Lyapunov function) can be found that is positive and decreasing if
the state is not the origin, then the state converges to the origin.

Definition 2.12 (Lyapunov function). Suppose that X is positive invari-


ant for x + = f (x). A function V : Rn → R≥0 is said to be a Lyapunov
function in X for x + = f (x) if there exist functions α1 , α2 ∈ K∞ and a
continuous, positive definite function α3 such that for any x ∈ X

V (x) ≥ α1 (|x|) (2.14)


V (x) ≤ α2 (|x|) (2.15)
V (f (x)) − V (x) ≤ −α3 (|x|) (2.16)

We now employ the following stability theorem.

Theorem 2.13 (Lyapunov stability theorem). Suppose X ⊂ Rn is positive


invariant for x + = f (x). If there exists a Lyapunov function in X for the
system x + = f (x), then the origin is asymptotically stable in X for x + =
f (x). If X = Rn , then the origin is globally asymptotically stable. If
αi (|x|) = ci |x|a , a, ci ∈ R>0 , i = 1, 2, 3, then the origin is exponentially
stable.

A standard approach to establish stability is to employ the value


function of an infinite horizon optimal control problem as a Lyapunov
function. This suggests the use of VN0 (·), the value function for the fi-
nite horizon optimal control problem whose solution yields the model
predictive controller, as a Lyapunov function. It is simple to show, un-
der mild assumptions on ℓ(·), that VN0 (·) has property (2.14) for all
x ∈ XN . The value function V∞0 (·) for infinite horizon optimal control problems does satisfy, under mild conditions, V∞0 (f (x, κ∞ (x))) = V∞0 (x) − ℓ(x, κ∞ (x)), thereby ensuring satisfaction of property (2.16).
Since, as is often pointed out, optimality does not imply stability, this

property does not usually hold when the horizon is finite. One of the main tasks of this chapter is to show that if the ingredients Vf (·), ℓ(·), and Xf of the finite horizon optimal control problem are chosen appropriately, then VN0 (f (x, κN (x))) ≤ VN0 (x) − ℓ(x, κN (x)) for all x in XN , enabling property (2.16) to be obtained. Property (2.15), an upper bound on the value function, is more difficult to establish, but we also show that appropriate ingredients that ensure satisfaction of property (2.16) also ensure satisfaction of property (2.15).
We now address a point that we have glossed over. The solution
to an optimization problem is not necessarily unique. Thus u0 (x) and
κN (x) may be set valued; any point in the set u0 (x) is a solution of
PN (x). Similarly x0 (x) is set valued. Uniqueness may be obtained by
choosing that element in the set u0 (x) that has least norm; and if the
minimum-norm solution is not unique, applying an arbitrary selection
map in the set of minimum-norm solutions. To avoid expressions such
as “let u be any element of the minimizing set u0 (x),” we shall, in
the sequel, use u0 (x) to denote any sequence in the set of minimizing
sequences and use κN (x) to denote u0 (0; x), the first element of this
sequence.

2.4.2 Stabilizing Conditions

To show that the value function VN0 (·) is a valid Lyapunov function
for the closed-loop system x + = f (x, κN (x)) we have to show that
it satisfies (2.14), (2.15), and (2.16). We show below that VN0 (·) is a
valid Lyapunov function if, in addition to Assumptions 2.2 and 2.3, the
following assumption is satisfied.
Assumption 2.14 (Basic stability assumption). Vf (·), Xf and ℓ(·) have
the following properties:
(a) For all x ∈ Xf , there exists a u (such that (x, u) ∈ Z) satisfying

f (x, u) ∈ Xf
Vf (f (x, u)) − Vf (x) ≤ −ℓ(x, u)

(b) There exist K∞ functions α1 (·) and αf (·) satisfying

ℓ(x, u) ≥ α1 (|x|) ∀x ∈ XN , ∀u such that (x, u) ∈ Z


Vf (x) ≤ αf (|x|) ∀x ∈ Xf

We now show that VN0 (·) is a Lyapunov function satisfying (2.14),


(2.15), and (2.16) if Assumptions 2.2, 2.3, and 2.14 hold.
Lower bound for VN0 (·). The lower-bound property (2.14) is easily
obtained. Since VN0 (x) ≥ ℓ(x, κN (x)) for all x ∈ XN , the lower bound
(2.14) follows from Assumption 2.14(b) in which it is assumed that
there exists a K∞ function α1 (·) such that ℓ(x, u) ≥ α1 (|x|) for all
x ∈ XN , for all u such that (x, u) ∈ Z. This assumption is satisfied by
the usual choice ℓ(x, u) = (1/2)(x ′ Qx + u′ Ru) with Q and R positive
definite. Condition (2.14) is satisfied.
Upper bound for VN0 (·). If Xf contains the origin in its interior, the
upper bound property (2.15) can be established as follows. We show
below in Proposition 2.18 that, under Assumption 2.14, Vj0 (x) ≤ Vf (x)
for all x ∈ Xf , all j ∈ I≥0 . Also, under the same Assumption, there
exists a K∞ function αf (·) such that Vf (x) ≤ αf (|x|) for all x ∈ Xf .
It follows that VN0 (·) has the same upper bound αf (|x|) in Xf . We now
have to show that this bound on VN0 (·) in Xf can be extended to a similar
bound on VN0 (·) in XN . We do this through two propositions. The first
proposition proves that the value function VN0 (·) is locally bounded.

Proposition 2.15 (The value function VN0 (·) is locally bounded). Sup-
pose Assumptions 2.2 and 2.3 (U bounded) hold. Then VN0 (·) is locally
bounded on XN .

Proof. Let X be an arbitrary compact subset of XN . The function VN :


Rn ×RNm → R≥0 is continuous and therefore has an upper bound on the
compact set X × UN . Since UN (x) ⊂ UN for all x ∈ XN , VN0 : XN → R≥0
has the same upper bound on X. Since X is arbitrary, VN0 (·) is locally
bounded on XN . ■

The second proposition shows the upper bound of VN0 (·) in Xf im-
plies the existence of a similar upper bound in the larger set XN .

Proposition 2.16 (Extension of upper bound to XN ). Suppose Assump-


tions 2.2 and 2.3 (U bounded) hold and that Xf ⊆ X is control invariant
for x + = f (x, u), u ∈ U(x) and contains the origin in its interior. Sup-
pose also that there exists a K∞ function α(·) such that Vf (x) ≤ α(|x|)
for all x ∈ Xf . Then there exists a K∞ function α2 (·) such that

VN0 (x) ≤ α2 (|x|) ∀x ∈ XN



Proof. We have that 0 ≤ VN0 (x) ≤ Vf (x) ≤ α(|x|) for x ∈ Xf which


contains a neighborhood of zero (see also Proposition 2.18). Therefore
VN0 (·) is continuous at zero. The set XN is closed, and VN0 (·) is locally
bounded on XN . Therefore Proposition B.25 of Appendix B applies, and
the result is established. ■

In situations where Xf does not have an interior, such as when Xf =


{0}, we cannot establish an upper bound for VN0 (·) and resort to the
following assumption.

Assumption 2.17 (Weak controllability). There exists a K∞ function


α(·) such that
VN0 (x) ≤ α(|x|) ∀x ∈ XN

Assumption 2.17 is weaker than a controllability assumption. It


confines attention to those states that can be steered to Xf in N steps
and merely requires that the cost of doing so is not excessive.

Descent property for VN0 (·). Let x be any state in XN at time zero.
Then
VN0 (x) = VN (x, u0 (x))

in which
 
u0 (x) = (u0 (0; x), u0 (1; x), . . . , u0 (N − 1; x))

is any minimizing control sequence. The resultant optimal state se-


quence is  
x0 (x) = (x 0 (0; x), x 0 (1; x), . . . , x 0 (N; x))

in which x 0 (0; x) = x and x 0 (1; x) = x + . The successor state to x


at time zero is x + = f (x, κN (x)) = x 0 (1; x) at time 1 where κN (x) =
u0 (0; x), and
VN0 (x + ) = VN (x + , u0 (x + ))

in which
 
u0 (x + ) = (u0 (0; x + ), u0 (1; x + ), . . . , u0 (N − 1; x + ))

It is difficult to compare VN0 (x) and VN0 (x + ) directly, but

VN0 (x + ) = VN (x + , u0 (x + )) ≤ VN (x + , ũ)

where ũ is any feasible control sequence for PN (x + ), i.e., any control sequence in UN (x + ). To facilitate comparison of VN (x + , ũ) with VN0 (x) = VN (x, u0 (x)), we choose

ũ = (u0 (1; x), . . . , u0 (N − 1; x), u)

in which u ∈ U still has to be chosen. Comparing ũ with u0 (x) shows that x̃, the state sequence due to control sequence ũ, is

x̃ = (x 0 (1; x), x 0 (2; x), . . . , x 0 (N; x), f (x 0 (N; x), u))

in which x 0 (1; x) = x + = f (x, κN (x)). Because x0 (x) coincides with x̃ and u0 (x) coincides with ũ for i = 1, 2, . . . , N − 1 (but not for i = N), a simple calculation yields

VN (x + , ũ) = ∑_{j=1}^{N−1} ℓ(x 0 (j; x), u0 (j; x)) + ℓ(x 0 (N; x), u) + Vf (f (x 0 (N; x), u))

But

VN0 (x) = VN (x, u0 (x)) = ℓ(x, κN (x)) + ∑_{j=1}^{N−1} ℓ(x 0 (j; x), u0 (j; x)) + Vf (x 0 (N; x))

so that

∑_{j=1}^{N−1} ℓ(x 0 (j; x), u0 (j; x)) = VN0 (x) − ℓ(x, κN (x)) − Vf (x 0 (N; x))

Hence

VN0 (x + ) ≤ VN (x + , ũ) = VN0 (x) − ℓ(x, κN (x)) − Vf (x 0 (N; x)) + ℓ(x 0 (N; x), u) + Vf (f (x 0 (N; x), u))

It follows that

VN0 (f (x, κN (x))) ≤ VN0 (x) − ℓ(x, κN (x)) (2.17)

for all x ∈ XN if the function Vf (·) and the set Xf have the property
that, for all x ∈ Xf , there exists a u ∈ U such that

(x, u) ∈ Z, Vf (f (x, u)) ≤ Vf (x) − ℓ(x, u), and f (x, u) ∈ Xf (2.18)



But this condition is satisfied by the stabilizing condition, Assumption


2.14. Since ℓ(x, κN (x)) ≥ α1 (|x|) for all x ∈ XN , VN0 (·) has the desired
descent property (2.16).
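As a brief numerical aside (a sketch only, with assumed example data and SciPy; it is not part of the argument above), the descent inequality (2.17) can be observed for an unconstrained linear quadratic problem in which the terminal penalty is built from a stabilizing gain Kf so that Assumption 2.14(a) holds.

# Compute VN0 by backward dynamic programming for an unconstrained LQ example and
# check the descent inequality (2.17) at a sample state. All data are made up.
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

A = np.array([[1.0, 0.2], [0.0, 1.0]])
B = np.array([[0.0], [0.2]])
Q, R, N = np.eye(2), np.eye(1), 5

# A stabilizing (not necessarily optimal) terminal gain Kf and terminal penalty Pf
# satisfying AKf' Pf AKf + Q + Kf' R Kf = Pf, so Assumption 2.14(a) holds with u = Kf x.
Pd = solve_discrete_are(A, B, Q, 10 * R)
Kf = -np.linalg.solve(B.T @ Pd @ B + 10 * R, B.T @ Pd @ A)
AKf = A + B @ Kf
Pf = solve_discrete_lyapunov(AKf.T, Q + Kf.T @ R @ Kf)

# Backward DP: after the loop, VN0(x) = (1/2) x' P x and kappa_N(x) = K x.
P = Pf.copy()
for _ in range(N):
    K = -np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)
    P = Q + K.T @ R @ K + (A + B @ K).T @ P @ (A + B @ K)

x = np.array([3.0, -1.0])
u = K @ x
xplus = A @ x + B @ u
VN0 = lambda z: 0.5 * z @ P @ z
ell = 0.5 * (x @ Q @ x + u @ R @ u)
print(VN0(xplus) <= VN0(x) - ell + 1e-9)     # True: the descent inequality (2.17)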
To complete the proof that the value function satisfies (2.14), (2.15),
and (2.16), we have to prove the assertion, made in obtaining the upper
bound for VN0 (·), that Vj0 (x) ≤ Vf (x) for all x ∈ Xf , all j ∈ I≥0 . This
assertion follows from the monotonicity property of the value function
VN0 (·). This interesting result was first obtained for the unconstrained
linear quadratic optimal control problem.
Proposition 2.18 (Monotonicity of the value function). Suppose that Assumptions 2.2, 2.3 (U bounded), and 2.14 hold. Then

Vj+1 0 (x) ≤ Vj0 (x) ∀x ∈ Xj , ∀j ∈ I≥0

and

Vj0 (x) ≤ Vf (x) ∀x ∈ Xf , ∀j ∈ I≥0

Proof. From the DP recursion (2.9)

V10 (x) = min u∈U(x) {ℓ(x, u) + V00 (f (x, u)) | f (x, u) ∈ X0 }

But V00 (·) := Vf (·) and X0 := Xf . Also, by Assumption 2.14

min u∈U(x) {ℓ(x, u) + Vf (f (x, u)) | f (x, u) ∈ Xf } ≤ Vf (x) ∀x ∈ Xf

so that

V10 (x) ≤ V00 (x) ∀x ∈ X0 = Xf

Next, suppose that for some j ≥ 1

Vj0 (x) ≤ Vj−1 0 (x) ∀x ∈ Xj−1

Then, using the DP equation (2.9)

Vj+1 0 (x) − Vj0 (x) = ℓ(x, κj+1 (x)) + Vj0 (f (x, κj+1 (x))) − ℓ(x, κj (x)) − Vj−1 0 (f (x, κj (x))) ∀x ∈ Xj ⊆ Xj+1

Since κj (x) may not be optimal for Pj+1 (x) for all x ∈ Xj ⊆ Xj+1 , we have

Vj+1 0 (x) − Vj0 (x) ≤ ℓ(x, κj (x)) + Vj0 (f (x, κj (x))) − ℓ(x, κj (x)) − Vj−1 0 (f (x, κj (x))) ∀x ∈ Xj

Also, from (2.11), x ∈ Xj implies f (x, κj (x)) ∈ Xj−1 so that, by assumption, Vj0 (f (x, κj (x))) ≤ Vj−1 0 (f (x, κj (x))) for all x ∈ Xj . Hence

Vj+1 0 (x) ≤ Vj0 (x) ∀x ∈ Xj

By induction

Vj+1 0 (x) ≤ Vj0 (x) ∀x ∈ Xj , ∀j ∈ I≥0

Since the set sequence (Xj )j∈I≥0 has the nested property Xj ⊂ Xj+1 for all j ∈ I≥0 , it follows that Vj0 (x) ≤ Vf (x) for all x ∈ Xf , all j ∈ I≥0 . ■

The monotonicity property of Proposition 2.18 also holds even if U(x)


is not compact provided that the minimizer in the DP recursion always
exists; this is the case for the linear-quadratic problem.
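For the linear quadratic problem just mentioned, the monotonicity can be seen directly on the Riccati iterates: with Vj0 (x) = (1/2)x ′ Pj x, Proposition 2.18 gives Pj+1 ≤ Pj in the positive semidefinite ordering. The following sketch (assumed example data, SciPy) checks this ordering numerically.

# Check the monotonicity of the finite horizon LQ value functions: with a terminal
# penalty Pf satisfying Assumption 2.14, the Riccati iterates P_j are nonincreasing.
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

A = np.array([[1.0, 0.2], [0.0, 1.0]])
B = np.array([[0.0], [0.2]])
Q, R = np.eye(2), np.eye(1)

Pd = solve_discrete_are(A, B, Q, 10 * R)                      # detuned LQR gives a gain Kf
Kf = -np.linalg.solve(B.T @ Pd @ B + 10 * R, B.T @ Pd @ A)
Pf = solve_discrete_lyapunov((A + B @ Kf).T, Q + Kf.T @ R @ Kf)

P = [Pf]
for _ in range(10):
    K = -np.linalg.solve(B.T @ P[-1] @ B + R, B.T @ P[-1] @ A)
    P.append(Q + K.T @ R @ K + (A + B @ K).T @ P[-1] @ (A + B @ K))

mono = all(np.all(np.linalg.eigvalsh(P[j] - P[j + 1]) >= -1e-9) for j in range(10))
print(mono)    # True: Vj+1 0 (x) <= Vj0 (x) for all x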
The monotonicity property can also be used to establish the (previously established) descent property of VN0 (·) by noting that

VN0 (x) = ℓ(x, κN (x)) + VN−1 0 (f (x, κN (x)))
= ℓ(x, κN (x)) + VN0 (f (x, κN (x))) + [VN−1 0 (f (x, κN (x))) − VN0 (f (x, κN (x)))]

so that using the monotonicity property

VN0 (f (x, κN (x))) = VN0 (x) − ℓ(x, κN (x)) + [VN0 (f (x, κN (x))) − VN−1 0 (f (x, κN (x)))]
≤ VN0 (x) − ℓ(x, κN (x)) ∀x ∈ XN

which is the desired descent property.


Since inequalities (2.14), (2.15), and (2.16) are all satisfied we have
proved (for U bounded)

Theorem 2.19 (Asymptotic stability of the origin). Suppose Assumptions 2.2, 2.3, 2.14, and 2.17 are satisfied. Then

(a) There exist K∞ functions α1 (·) and α2 (·) such that for all x ∈ XN (X̄Nc , for each c ∈ R>0 )

α1 (|x|) ≤ VN0 (x) ≤ α2 (|x|)
VN0 (f (x, κN (x))) − VN0 (x) ≤ −α1 (|x|)

(b) The origin is asymptotically stable in XN (X̄Nc , for each c ∈ R>0 ) for x + = f (x, κN (x)).

For the proof with U unbounded, note that the lower bound and descent property remain satisfied as before. For the upper bound, if Xf contains the origin in its interior, we have that, since Vf (·) is continuous, for each c > 0 there exists 0 < τ ≤ c, such that levτ Vf contains a neighborhood of the origin and is a subset of both Xf and X̄Nc . One can then show that VN0 (·) ≤ Vf (·) for each N ≥ 0 on this sublevel set, and therefore VN0 (·) is continuous at the origin so that again Proposition B.25 applies, and Assumption 2.17 is satisfied on X̄Nc for each c ∈ R>0 .
As discussed above, Assumption 2.17 is immediate if the origin lies
in the interior of Xf . In other cases, e.g., when the stabilizing ingredi-
ent is the terminal equality constraint x(N) = 0 (Xf = {0}), Assump-
tion 2.17 is taken directly. See Proposition 2.38 for some additional
circumstances in which Assumption 2.17 is satisfied.

2.4.3 Exponential Stability

Exponential stability is defined as follows.


Definition 2.20 (Exponential stability). Suppose X ⊆ Rn is positive in-
variant for x + = f (x). The origin is exponentially stable for x + = f (x)
in X if there exist c ∈ R>0 and γ ∈ (0, 1) such that
|φ(i; x)| ≤ c |x| γ i
for all x ∈ X, all i ∈ I≥0 .
Theorem 2.21 (Lyapunov function and exponential stability). Suppose
X ⊂ Rn is positive invariant for x + = f (x). If there exists a Lyapunov
function in X for the system x + = f (x) with αi (·) = ci |·|a in which a,
ci ∈ R>0 , i = 1, 2, 3, then the origin is exponentially stable for x + = f (x)
in X.
The proof of this result is left as an exercise.

2.4.4 Controllability and Observability

We have not yet made any assumptions on controllability (stabilizabil-


ity) or observability (detectability) of the system (2.1) being controlled,
which may be puzzling since such assumptions are commonly required
in optimal control to, for example, establish existence of a solution to
the optimal control problem. The reasons for this omission are that
such assumptions are implicitly required, at least locally, for the basic
stability Assumption 2.14, and that we restrict attention to XN , the set
of states that can be steered to Xf in N steps satisfying all constraints.

Stage cost ℓ(·) not positive definite. In the previous stability anal-
ysis we assume that the function (x, u) ↦ ℓ(x, u) is positive definite;
more precisely, we assume that there exists a K∞ function α1 (·) such
that ℓ(x, u) ≥ α1 (|x|) for all (x, u). Often we assume that ℓ(·) is
quadratic, satisfying ℓ(x, u) = (1/2)(x ′ Qx + u′ Ru) where Q and R
are positive definite. In this section we consider the case where the
stage cost is ℓ(y, u) where y = h(x) and the function h(·) is not nec-
essarily invertible. An example is the quadratic stage cost ℓ(y, u) =
(1/2)(y ′ Qy y + u′ Ru) where Qy and R are positive definite, y = Cx,
and C is not invertible; hence the stage cost is (1/2)(x ′ Qx + u′ Ru)
where Q = C ′ Qy C is merely positive semidefinite. Since now ℓ(·) does
not satisfy ℓ(x, u) ≥ α1 (|x|) for all (x, u) ∈ Z and some K∞ function
α1 (·), we have to make an additional assumption in order to estab-
lish asymptotic stability of the origin for the closed-loop system. An
appropriate assumption is input/output-to-state-stability (IOSS), which
ensures the state goes to zero as the input and output go to zero. We
recall Definition B.51, restated here.

Definition 2.22 (Input/output-to-state stable (IOSS)). The system x + =


f (x, u), y = h(x) is IOSS if there exist functions β(·) ∈ KL and γ1 (·),
γ2 (·) ∈ K such that for every initial state x ∈ Rn , every control se-
quence u, and all i ≥ 0

|x(i)| ≤ max{β(|x| , i), γ1 (∥u∥0:i−1 ), γ2 (∥y∥0:i )}

in which x(i) := φ(i; x, u) is the solution of x + = f (x, u) at time i


if the initial state is x and the input sequence is u; y(i) := h(x(i)) is
the output, and ∥d∥a:b := max a≤j≤b d(j) denotes the max norm of a
sequence.

Note that for linear systems, IOSS is equivalent to detectability of


(A, C) (see Exercise 4.5).
We assume as usual that Assumptions 2.2 and 2.3 are satisfied, but
we replace Assumption 2.14 by the following.

Assumption 2.23 (Modified basic stability assumption). Vf (·), Xf and


ℓ(·) have the following properties.
(a) For all x ∈ Xf , there exists a u (such that (x, u) ∈ Z) satisfying

Vf (f (x, u)) − Vf (x) ≤ −ℓ(h(x), u), f (x, u) ∈ Xf



(b) There exist K∞ functions α1 (·) and αf (·) satisfying

ℓ(y, u) ≥ α1 (|(y, u)|) ∀(y, u) ∈ Rp × Rm


Vf (x) ≤ αf (|x|) ∀x ∈ Xf

Note that in the modification of Assumption 2.14 we have changed


only the lower-bound inequality for stage cost ℓ(y, u). With these as-
sumptions we can then establish asymptotic stability of the origin.

Theorem 2.24 (Asymptotic stability with stage cost ℓ(y, u)). Suppose
Assumptions 2.2, 2.3, 2.17 and 2.23 are satisfied, and the system x + =
f (x, u), y = h(x) is IOSS. Then there exists a Lyapunov function in XN (X̄Nc , for each c ∈ R>0 ) for the closed-loop system x + = f (x, κN (x)), and the origin is asymptotically stable in XN (X̄Nc , for each c ∈ R>0 ).

Proof. For the case of bounded U, Assumptions 2.2, 2.3, and 2.23(a)
guarantee the existence of the optimal solution of the MPC problem and
the positive invariance of XN for x + = f (x, κN (x)), but the nonpositive
definite stage cost gives the following modified inequalities

ℓ(h(x), κN (x)) ≤ VN0 (x) ≤ α2 (|x|)
VN0 (f (x, κN (x))) − VN0 (x) ≤ −ℓ(h(x), κN (x))

so VN0 (·) is no longer a Lyapunov function for the closed-loop system.


Because the system is IOSS and ℓ(y, u) ≥ α1 (|(y, u)|), however, The-
orem B.53 in Appendix B provides that for any γ(·) ∈ K∞ there exists
an IOSS-Lyapunov function Λ(·) for which the following holds for all
(x, u) ∈ Z for which f (x, u) ∈ X

γ1 (|x|) ≤ Λ(x) ≤ γ2 (|x|)


Λ(f (x, u)) − Λ(x) ≤ −ρ(|x|) + γ(ℓ(h(x), u))

with γ1 , γ2 ∈ K∞ and continuous function ρ ∈ PD. Note that these


inequalities certainly apply for u = κN (x) since (x, κN (x)) ∈ Z and
f (x, κN (x)) ∈ XN ⊆ X. Therefore we choose the linear K∞ function
γ(·) = (·), take V (·) = VN0 (·) + Λ(·) as our candidate Lyapunov func-
tion, and obtain for all x ∈ XN

α1 (|x|) ≤ V (x) ≤ α2 (|x|)


V (f (x, κN (x))) − V (x) ≤ −ρ(|x|)

with K∞ functions α1 (·) := γ1 (·) and α2 (·) := α2 (·) + γ2 (·). From


Definition 2.12, V (·) is a Lyapunov function in XN for the system x + =
f (x, κN (x)). Therefore the origin is asymptotically stable in XN from
Theorem 2.13. Treat unbounded U as in the proof of Theorem 2.19. ■

Note that we have here the appearance of a Lyapunov function that


is not the optimal value function of the MPC regulation problem. In
earlier MPC literature, observability rather than detectability was often
employed as the extra assumption required to establish asymptotic sta-
bility. Exercise 2.14 discusses that approach.

2.4.5 Time-Varying Systems

Most of the control problems discussed in this book are time invari-
ant. Time-varying problems do arise in practice, however, even if the
system being controlled is time invariant. One example occurs when
an observer or filter is used to estimate the state of the system being
controlled since bounds on the state estimation error are often time
varying. In the deterministic case, for example, state estimation er-
ror decays exponentially to zero. Another example occurs when the
desired equilibrium is not a state-control pair (xs , us ) but a periodic
trajectory. In this section, which may be omitted in the first reading,
we show how MPC may be employed for a class of time-varying systems.
The problem. The time-varying nonlinear system is described by

x + = f (x, u, i)

where x is the current state at time i, u the current control, and x + the
successor state at time i + 1. For each integer i, the function f (·, i) is
assumed to be continuous. The solution of this system at time k ≥ i
given that the initial state is x at time i is denoted by φ(k; x, u, i); the
solution now depends on both the time i and current time k rather than
merely on the difference k − i as in the time-invariant case. The cost
VN (x, u, i) also depends on time i and is defined by
VN (x, u, i) := ∑_{k=i}^{i+N−1} ℓ(x(k), u(k), k) + Vf (x(i + N), i + N)

in which x(k) := φ(k; x, u, i), u = (u(i), u(i + 1), . . . , u(i + N − 1)),


and the stage cost ℓ(·) and terminal cost Vf (·) are time varying. The
state and control constraints are also time varying

x(i) ∈ X(i) u(i) ∈ U(i)



for all i. In addition, there is a time-varying terminal constraint

x(i + N) ∈ Xf (i + N)

in which i is the current time. The time-varying optimal control prob-


lem at event (x, i) is PN (x, i) defined by

PN (x, i) : VN0 (x, i) = min u {VN (x, u, i) | u ∈ UN (x, i)}

in which UN (x, i) is the set of control sequences u = (u(i), u(i + 1), . . . , u(i + N − 1)) satisfying the state, control and terminal constraints, i.e.,

UN (x, i) := {u | (x, u) ∈ ZN (i)}

in which, for each i, ZN (i) ⊂ Rn × RNm is defined by

ZN (i) := {(x, u) | u(k) ∈ U(k), φ(k; x, u, i) ∈ X(k), ∀k ∈ Ii:i+N−1 , φ(i + N; x, u, i) ∈ Xf (i + N)}

For each time i, the domain of VN0 (·, i) is XN (i) where

XN (i) := {x ∈ X(i) | UN (x, i) ≠ ∅}


= {x ∈ X(i) | ∃u such that (x, u) ∈ ZN (i)}

which is the projection of ZN (i) onto X(i). Our standing assumptions


(2.2 and 2.3) are replaced, in the time-varying case, by
Assumption 2.25 (Continuity of system and cost; time-varying case).
The functions (x, u) ↦ f (x, u, i), (x, u) ↦ ℓ(x, u, i) and x ↦ Vf (x,
i) are continuous for all i ∈ I≥0 . Also, for all i ∈ I≥0 , f (0, 0, i) = 0,
ℓ(0, 0, i) = 0 and Vf (0, i) = 0.
Assumption 2.26 (Properties of constraint sets; time-varying case). For
each i ∈ I≥0 , X(i) and Xf (i) are closed, Xf (i) ⊂ X(i), and U(i) is compact; the sets U(i), i ∈ I≥0 , are uniformly bounded by the compact
set Ū. Each set contains the origin.
In making these assumptions we are implicitly assuming, as before,
that the desired setpoint has been shifted to the origin, but in this case,
it need not be constant in time. For example, letting x̄ and ū be the
original positional variables, we can consider a time-varying reference
trajectory (x̄r (i), ūr (i)) by defining x(i) := x̄(i) − x̄r (i) and u(i) :=
ū(i) − ūr (i). Depending on the application, x̄r (i) and ūr (i) could be
constant, periodic, or generally time varying. In any case, because of the
time-varying nature of the problem, we need to extend our definitions
of invariance and control invariance.

Definition 2.27 (Sequential positive invariance and sequential control


invariance).
(a) A sequence of sets (X(i))i≥0 is sequentially positive invariant for
the system x + = f (x, i) if for any i ≥ 0, x ∈ X(i) implies f (x, i) ∈
X(i + 1).

(b) A sequence of sets (X(i))i≥0 is sequentially control invariant for


the system x + = f (x, u, i) if for any i ≥ 0 and x ∈ X(i), there exists
a u ∈ U(i) such that x + = f (x, u, i) ∈ X(i + 1).

Let (X(i))i≥0 be sequentially positive invariant. If x ∈ X(i0 ) for


some i0 ≥ 0, then φ(i; x, i0 ) ∈ X(i) for all i ≥ i0 .
The following results, which are analogs of the results for time-
invariant systems given previously, are stated without proof.

Proposition 2.28 (Continuous system solution; time-varying case). Sup-


pose Assumptions 2.25 and 2.26 are satisfied. For each initial time i0 ≥ 0
and final time i ≥ i0 , the function (x, u) ↦ φ(i; x, u, i0 ) is continuous.

Proposition 2.29 (Existence of solution to optimal control problem;


time-varying case). Suppose Assumptions 2.25 and 2.26 are satisfied.
Then for each time i ∈ I≥0
(a) The function (x, u) ↦ VN (x, u, i) is continuous in ZN (i).

(b) For each x ∈ XN (i), the control constraint set UN (x, i) is compact.

(c) For each x ∈ XN (i), a solution to PN (x, i) exists.

(d) XN (i) is closed and x = 0 ∈ XN (i).


 
(e) If Xf (i) is sequentially control invariant for x + = f (x, u, i),
i∈I≥0
then (XN (i))i∈I≥0 is sequentially control invariant for x + = f (x, u, i)
and sequentially positive invariant for x + = f (x, κN (x, i), i).

Stability. Our definitions of AS (asymptotic stability) and GAS (global


asymptotic stability) also require slight modification for the time-varying
case.

Definition 2.30 (Asymptotically stable and GAS for time-varying sys-


tems). Suppose that the sequence (X(i))i≥0 is sequentially positive in-
variant for x + = f (x, i). The origin is asymptotically stable in the se-
quence (X(i))i≥0 for x + = f (x, i) if the following holds for all i ≥ i0 ≥
0, and x ∈ X(i0 )
|φ(i; x, i0 )| ≤ β(|x| , i − i0 ) (2.19)

in which β ∈ KL and φ(i; x, i0 ) is the solution to x + = f (x, i) at time


i ≥ i0 with initial condition x at time i0 ≥ 0. If X(i) = Rn , the origin is
globally asymptotically stable (GAS).

This definition is somewhat restrictive because φ(i, x, i0 ) depends


on i − i0 rather than on i.

Definition 2.31 (Lyapunov function: time-varying, constrained case).


Let the sequence (X(i))i≥0 be sequentially positive invariant, and let
V (·, i) : X(i) → R≥0 satisfy for all x ∈ X(i), i ∈ I≥0

α1 (|x|A ) ≤ V (x, i) ≤ α2 (|x|A )


∆V (x, i) ≤ −α3 (|x|A )

with ∆V (x, i) := V (f (x, i), i + 1) − V (x, i), α1 , α2 , α3 ∈ K∞ . Then the


function V (·, ·) is a time-varying Lyapunov function in the sequence
(X(i))i≥0 for x + = f (x, i).

This definition requires a single, time-invariant bound for each αj (·), j ∈ {1, 2, 3}, which is not overly restrictive. For example, supposing there is a sequence of lower bounds (α1i (·))i∈I≥0 , it is necessary only that the infimum

α1 (·) := inf i∈I≥0 α1i (·)

is class K∞ . If the system is time invariant or periodic, this property is satisfied (as the inf becomes a min over a finite set), but it does preclude bounds that become arbitrarily flat, such as α1i (s) = s 2 /(i + 1). A similar argument holds for j ∈ {2, 3} (using sup instead of inf for j = 2). We can now state the stability theorem that we employ in this chapter.

Theorem 2.32 (Lyapunov theorem for asymptotic stability (time-vary-


ing, constrained)). Let the sequence (X(i))i≥0 be sequentially positive invariant for the system x + = f (x, i), and V (·, ·) be a time-varying Lyapunov function in the sequence (X(i))i≥0 for x + = f (x, i). Then the
origin is asymptotically stable in X(i) at each time i ≥ 0 for x + = f (x, i).

The proof of this theorem is given in Appendix B (see Theorem B.24).


Model predictive control of time-varying systems. As before, the
receding horizon control law κN (·), which is now time varying, is not
necessarily optimal or stabilizing. By choosing the time-varying ingre-
dients Vf (·) and Xf in the optimal control problem appropriately, how-
ever, stability can be ensured, as we now show. We replace the basic
stability assumption 2.14 by its time-varying extension.

Assumption 2.33 (Basic stability assumption; time-varying case).


(a) For all i ∈ I≥0 , all x ∈ Xf (i), there exists a u ∈ U(i) such that

f (x, u, i) ∈ Xf (i + 1)
Vf (f (x, u, i), i + 1) − Vf (x, i) ≤ −ℓ(x, u, i)

(b) There exist K∞ functions α1 (·) and αf (·) satisfying

ℓ(x, u, i) ≥ α1 (|x|) ∀x ∈ XN (i), ∀u such that (x, u) ∈ ZN (i), ∀i ∈ I≥0


Vf (x, i) ≤ αf (|x|), ∀x ∈ Xf (i), ∀i ∈ I≥0

As in the case of the time-varying Lyapunov function, requiring time-


invariant bounds is typically not restrictive. A direct consequence of
Assumption 2.33 is the descent property given in the following propo-
sition.

Proposition 2.34 (Optimal cost decrease; time-varying case). Suppose


Assumptions 2.25, 2.26, and 2.33 hold. Then

VN0 (f (x, κN (x, i), i), i + 1) ≤ VN0 (x, i) − ℓ(x, κN (x, i), i) (2.20)

for all x ∈ XN (i), all i ∈ I≥0 .

Proposition 2.35 (MPC cost is less than terminal cost). Suppose As-
sumptions 2.25, 2.26, and 2.33 hold. Then

VN0 (x, i) ≤ Vf (x, i) ∀x ∈ Xf (i), ∀i ∈ I≥0

The proofs of Propositions 2.34 and 2.35 are left as Exercises 2.9
and 2.10.

Proposition 2.36 (Optimal value function properties; time-varying case).


Suppose Assumptions 2.25, 2.26, and 2.33 are satisfied. Then there exist
two K∞ functions α1 (·) and α2 (·) such that, for all i ∈ I≥0

VN0 (x, i) ≥ α1 (|x|) ∀x ∈ XN (i)


VN0 (x, i) ≤ α2 (|x|) ∀x ∈ Xf (i)
VN0 (f (x, κN (x, i), i), i + 1) − VN0 (x, i) ≤ −α1 (|x|) ∀x ∈ XN (i)

We can deal with the obstacle posed by the fact that the upper bound
on VN0 (·) holds only in Xf (i) in much the same way as we did previ-
ously for the time-invariant case. In general, we invoke the following
assumption.

Assumption 2.37 (Uniform weak controllability). There exists a K∞


function α(·) such that

VN0 (x, i) ≤ α(|x|) ∀x ∈ XN (i), ∀i ∈ I≥0

It can be shown that Assumption 2.37 holds in a variety of other


circumstances as described in the following proposition.

Proposition 2.38 (Conditions for uniform weak controllability). Sup-


pose the functions f (·), ℓ(·), and Vf (·) are uniformly bounded for all
i ∈ I≥0 , i.e., on any compact set Z ⊂ Rn × Ū, the set

{(f (x, u, i), ℓ(x, u, i), Vf (x, i)) | (x, u) ∈ Z, i ∈ I≥0 }

is bounded. Assumption 2.37 is satisfied if any of the following conditions


holds:
(a) There exists a neighborhood of the origin X such that X ⊆ Xf (i) for
each i ∈ I≥0

(b) For i ∈ I≥0 , the optimal value function VN0 (x, i) is uniformly contin-
uous in x at x = 0

(c) There exists a neighborhood of the origin X and a K function α(·)


such that VN0 (x, i) ≤ α(|x|) for all i ∈ I≥0 and x ∈ X ∩ XN (i)

(d) The functions f (·) and ℓ(·) are uniformly continuous at the origin
(x, u) = (0, 0) for all i ∈ I≥0 , and the system is stabilizable with small
inputs, i.e., there exists a K∞ function γ(·) such that for all i ∈ I≥0 and
x ∈ XN (i), there exists u ∈ UN (x, i) with |u| ≤ γ(|x|).

Proof.
(a) Similar to Proposition 2.16, one can show that the optimal cost

VN0 (x, i) ≤ Vf (x, i) ≤ α2 (|x|) for all x ∈ X

Thus, condition (c) is implied.

(b) From uniform continuity, we know that for each ε > 0, there exists
δ > 0 such that

|x| ≤ δ implies VN0 (x, i) ≤ ε for all i ∈ I≥0



recalling that VN0 (·) is nonnegative and zero at the origin. By Rawlings
and Risbeck (2015, Proposition 13), this is equivalent to the existence
of a K function γ(·) defined on [0, b] (with b > 0) such that
VN0 (x, i) ≤ γ(|x|) for all x ∈ X
with X := {x ∈ Rn | |x| ≤ b} a neighborhood of the origin. Thus,
condition (c) is also implied.

(c) First, we know that VN (·) is uniformly bounded because it is the finite sum and composition of the uniformly bounded functions f (·), ℓ(·), and Vf (·). Thus, VN0 (·) is also uniformly bounded, because

0 ≤ VN0 (x, i) ≤ VN (x, u, i) for all u ∈ UN (x, i)

and VN (·) is uniformly bounded. Next, because X is a neighborhood of the origin, there exists b0 > 0 such that VN0 (x, i) ≤ γ(|x|) whenever x ∈ XN (i) and |x| ≤ b0 , in which γ(·) denotes the K function supplied by condition (c). Following Rawlings and Risbeck (2015, Proposition 14), we choose any strictly increasing and unbounded sequence (bk )∞ k=0 and define

Bk (i) := {x ∈ XN (i) | |x| ≤ bk }

We then compute a new sequence (βk )∞ k=0 as

βk := k + sup {VN0 (x, i) | x ∈ Bk (i), i ∈ I≥0 }

We know that this sequence is well-defined because VN0 (x, i) is uniformly bounded for i ∈ I≥0 on ∪i∈I≥0 Bk (i). We then define

α(s) := (β1 /γ(b0 ))γ(s) for s ∈ [0, b0 )
α(s) := βk+1 + (βk+2 − βk+1 )(s − bk )/(bk+1 − bk ) for s ∈ [bk , bk+1 ), k ∈ I≥0

which is a K∞ function that satisfies

VN0 (x, i) ≤ α(|x|) for all i ∈ I≥0

as desired.

(d) See Exercise 2.22. Note that the uniform continuity of f (·) and ℓ(·)
implies the existence of K function upper bounds of the form
|f (x, u, i)| ≤ αf x (|x|) + αf u (|u|)
ℓ(x, u, i) ≤ αℓx (|x|) + αℓu (|u|)
for all i ∈ I≥0 . ■
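The construction of α(·) in part (c) of the proof is concrete enough to compute. The following sketch (illustrative only; the functions V and γ and the sequence (bk ) are stand-ins chosen for the example, not objects defined in the text) builds the piecewise function α(·) from a local K function bound and the numbers βk , and checks that it dominates the example value function.

# Build the piecewise K-infinity function alpha from the proof of Proposition 2.38(c)
# for an assumed example value function V and local bound gamma, then verify the bound.
import numpy as np

V = lambda s, i: (2 + np.sin(i)) * s**2       # stand-in for VN0(x, i) with s = |x|
gamma = lambda s: 3 * s**2                     # local bound: V(s, i) <= gamma(s) on [0, b0]
b = np.arange(1.0, 50.0)                       # strictly increasing sequence (finite here)
beta = np.array([k + 3 * b[k]**2 for k in range(len(b))])   # beta_k = k + sup V on B_k

def alpha(s):
    if s < b[0]:
        return beta[1] / gamma(b[0]) * gamma(s)             # first piece on [0, b0)
    k = np.searchsorted(b, s, side='right') - 1              # s in [b_k, b_{k+1})
    return beta[k + 1] + (beta[k + 2] - beta[k + 1]) * (s - b[k]) / (b[k + 1] - b[k])

ss = np.linspace(0.0, 40.0, 2001)
ok = all(V(s, i) <= alpha(s) + 1e-9 for s in ss for i in range(20))
print(ok)    # True: V(s, i) <= alpha(s) on the sampled range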

Assumption  Title                          Page
2.2         Continuity of system and cost  97
2.3         Properties of constraint sets  98
2.14        Basic stability assumption     114
2.17        Weak controllability           116

Table 2.1: Stability assumptions; time-invariant case.

Hence, if Assumptions 2.25, 2.26, 2.33, and 2.37 hold it follows from
Proposition 2.36 that, for all i ∈ I≥0 , all x ∈ XN (i)

α1 (|x|) ≤ VN0 (x, i) ≤ α2 (|x|)


VN0 (f (x, κN (x, i), i), i + 1) − VN0 (x, i) ≤ −α1 (|x|) (2.21)

so that, by Definition 2.31, VN0 (·) is a time-varying Lyapunov function


in the sequence (XN (i))i≥0 for x + = f (x, κN (x, i), i). It can be shown,
by a slight extension of the arguments employed in the time-invariant
case, that problem PN (·) is recursively feasible and that (XN (i))i∈I≥0
is sequentially positive invariant for the system x + = f (x, κN (x, i), i).
The sequence (XN (i))i≥0 in the time-varying case replaces the set XN
in the time-invariant case.

Theorem 2.39 (Asymptotic stability of the origin: time-varying MPC).


Suppose Assumptions 2.25, 2.26, 2.33, and 2.37 hold. Then,
(a) There exist K∞ functions α1 (·) and α2 (·) such that, for all i ∈ I≥0
and all x ∈ XN (i), inequalities (2.21) are satisfied.

(b) The origin is asymptotically stable in XN (i) at each time i ≥ 0 for


the time-varying system x + = f (x, κN (x, i), i).

Proof.
(a) It follows from Assumptions 2.25, 2.26, 2.33, and 2.37 and Propo-
sition 2.36 that VN0 (·) satisfies the inequalities (2.21).

(b) It follows from (a) and Definition 2.31 that VN0 (·) is a time-varying
Lyapunov function. It follows from Theorem 2.32 that the origin is
asymptotically stable in XN (i) at each time i ≥ 0 for the time-varying
system x + = f (x, κN (x, i), i).


Assumption  Title                          Page
2.25        Continuity of system and cost  124
2.26        Properties of constraint sets  124
2.33        Basic stability assumption     127
2.37        Uniform weak controllability   128

Table 2.2: Stability assumptions; time-varying case.

2.5 Examples of MPC


We already have discussed the general principles underlying the design
of stabilizing model predictive controllers. The stabilizing conditions
on Xf , ℓ(·), and Vf (·) that guarantee stability can be implemented
in a variety of ways so that MPC can take many different forms. We
present the most useful forms of MPC for applications. These examples
also display the roles of the three main assumptions used to guarantee
closed-loop asymptotic stability. These assumptions are summarized
in Table 2.1 for the time-invariant case, and Table 2.2 for the time-
varying case. Referring back to these tables may prove helpful while
reading this section and comparing the various forms of MPC.
One question that is often asked is whether or not the terminal con-
straint is necessary. Since the conditions given previously are suffi-
cient, necessity cannot be claimed. We discuss this further later. It is
evident that the constraint arises because one often has a local, rather
than a global, control Lyapunov function (CLF) for the system being
controlled. In a few situations, a global CLF is available, in which case
a terminal constraint is not necessary.
All model predictive controllers determine the control action to be
applied to the system being controlled by solving, at each state, an op-
timal control problem that is usually constrained. If the constraints in
the optimal control problem include hard state constraints, then the
feasible region XN is a subset of Rn . The analysis given previously
shows that if the initial state x(0) lies in XN , so do all subsequent
states, a property known as recursive feasibility. It is always possible,
however, for unanticipated events to cause the state to become infea-
sible. In this case, the optimal control problem, as stated, cannot be
solved, and the controller fails. It is therefore desirable, if this does
not conflict with design aims, to employ soft state constraints in place
of hard constraints. Otherwise, any implementation of the algorithms

described subsequently should be modified to include a feature that


enables recovery from faults that cause infeasibility. One remedy is to
replace the hard constraints by soft constraints when the current state
is infeasible, thereby restoring feasibility, and to revert back to the hard
constraints as soon as they can be satisfied at the current state.
In establishing stability of the examples of MPC presented below,
we make use of Theorem 2.19 (or Theorem 2.24) for time-invariant sys-
tems and Theorem 2.39 for time-varying systems. We must therefore
establish that Assumptions 2.2, 2.3, and 2.14 are satisfied in the time-
invariant case and that Assumptions 2.25, 2.26, and 2.33 are satisfied
in the time-varying case. We normally assume that 2.2, 2.3, and 2.14(b)
or 2.25, 2.26, and 2.33(b) are satisfied, so our main task below in each
example is establishing satisfaction of the basic stability assumption
(cost decrease) 2.14(a) or 2.33(a).

2.5.1 The Unconstrained Linear Quadratic Regulator

Consider the linear, time-invariant model x + = Ax + Bu, y = Cx with


quadratic penalties on output and state ℓ(y, u) = (1/2)(y ′ Qy y +
u′ Ru) for both the finite and infinite horizon cases. We first consider
what the assumptions of Theorem 2.24 imply in this case, and compare
these assumptions to the standard LQR assumptions (listed in Exercise
1.20(b)).
Assumption 2.2 is satisfied for f (x, u) = Ax + Bu and ℓ(x, u) =
(1/2)(x ′ C ′ Qy Cx + u′ Ru) for all A, B, C, Qy , R. Assumption 2.3 is sat-
isfied with Z = Rn × Rm and R > 0. Assumption 2.23 implies that
Qy > 0 as well. The system being IOSS implies that (A, C) is detectable
(see Exercise 4.5). We can choose Xf to be the stabilizable subspace
of (A, B) and Assumption 2.23(a) is satisfied. The set XN contains the
system controllability information. The set XN is the stabilizable sub-
space of (A, B), and we can satisfy Assumption 2.23(b) by choosing
Vf (x) = (1/2)x ′ Πx in which Π is the solution to the steady-state Ric-
cati equation for the stabilizable modes of (A, B).
In particular, if (A, B) is stabilizable, then Vf (·) can be chosen to
be Vf (x) = (1/2)x ′ Πx in which Π is the solution to the steady-state
Riccati equation (1.18), which is positive definite. The terminal set can
be taken as any (arbitrarily large) sublevel set of the terminal penalty,
Xf = leva Vf , a > 0, so that any point in Rn is in Xf for large enough a.
We then have XN = Rn for all N ∈ I0:∞ . The horizon N can be finite or
infinite with this choice of Vf (·) and the control law is invariant with

respect to the horizon length, κN (x) = Kx in which K is the steady-


state linear quadratic regulator gain given in (1.18). Theorem 2.24 then
gives that the origin of the closed-loop system x + = f (x, κN (x)) =
(A + BK)x is globally, asymptotically stable. This can be strengthened
to globally, exponentially stable because of the choice of quadratic stage
cost and form of the resulting Lyapunov function in Theorem 2.24.
The standard assumptions for the LQR with stage cost ℓ(y, u) = (1/2)(y ′ Qy y + u′ Ru) are

Qy > 0    R > 0    (A, C) detectable    (A, B) stabilizable

and we see that the standard steady-state LQR of LQ theory is covered by Theorem 2.24. Summarizing we have

Given the standard LQR problem, Assumptions 2.2, 2.3, and


2.23 are satisfied and XN = Xf = Rn . It follows from The-
orems 2.24 and 2.21 that the origin is globally, exponen-
tially stable for the controlled system x + = Ax + BκN (x) =
(A + BK)x.
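As a numerical aside (a sketch with assumed example data for A, B, C, Qy , and R; it relies on scipy.linalg.solve_discrete_are and is not part of the development above), the terminal ingredients for this case can be computed directly and the cost decrease of Assumption 2.23(a) checked at a sample state.

# Compute the LQR terminal ingredients Vf(x) = (1/2) x' Pi x and kappa_N(x) = K x for an
# example (A, B, C) system and check Vf(AK x) - Vf(x) <= -l(x, Kx) numerically.
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[1.0, 1.0], [0.0, 1.0]])   # example system (assumed data)
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Qy = np.array([[2.0]])
R = np.array([[1.0]])

Q = C.T @ Qy @ C                          # stage cost l(x, u) = (1/2)(x'Qx + u'Ru)
Pi = solve_discrete_are(A, B, Q, R)       # steady-state Riccati solution
K = -np.linalg.solve(B.T @ Pi @ B + R, B.T @ Pi @ A)
AK = A + B @ K

x = np.array([1.0, -0.5])
Vf = lambda z: 0.5 * z @ Pi @ z
stage = 0.5 * (x @ Q @ x + (K @ x) @ R @ (K @ x))
print(Vf(AK @ x) - Vf(x) <= -stage + 1e-9)   # True: the required cost decrease holds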

2.5.2 Unconstrained Linear Periodic Systems

In the special case where the system is time varying but periodic, a
global CLF can be determined as in the LQR case. Suppose the objective
function is
ℓ(x, u, i) := (1/2)(x ′ Q(i)x + u′ R(i)u)
with each Q(i) and R(i) positive definite. To start, choose a sequence
of linear control laws
κf (x, i) := K(i)x
and let

AK (i) := A(i) + B(i)K(i)


QK (i) := Q(i) + K(i)′ R(i)K(i)

For integers m and n satisfying m ≥ n ≥ 0, let

A(m, n) := AK (m − 1)AK (m − 2) · · · AK (n + 1)AK (n)

Given these matrices, the closed-loop evolution of the system under


the terminal control law is

x(m) = A(m, n)x(n)



for x + = f (x, κf (x, i), i) = AK (i)x.


Suppose the periodic system (A(i), B(i)) is stabilizable. It follows
that the control laws K(i) can be chosen so that each A(i + T , i) is
stable. Such control laws can be found, e.g., by iterating the periodic
discrete algebraic Riccati equation or by solving the Riccati equation
for a larger, time-invariant system (see Exercise 2.23).
For a terminal cost, we require matrices P (i) that satisfy

AK (i)′ P (i + 1)AK (i) + QK (i) = P (i)

Summing this relationship for i ∈ I0:T −1 gives

A(T , 0)′ P (T )A(T , 0) + ∑_{k=0}^{T −1} A(k, 0)′ QK (k)A(k, 0) = P (0)

and by periodicity, P (T ) = P (0). Noting that A(T , 0) is stable and the summation is positive definite (recall that the first term A(0, 0)′ QK (0)A(0, 0) = QK (0) is positive definite), there exists a unique solution to this equation,
and the remaining P (i) are determined by the recurrence relationship.
Thus, taking

Vf (x, i) = (1/2)x ′ P (i)x

we have, for u = κf (x, i) = K(i)x

Vf (f (x, u, i), i + 1) + ℓ(x, u, i) = (1/2)x ′ AK (i)′ P (i + 1)AK (i)x + (1/2)x ′ QK (i)x = (1/2)x ′ P (i)x ≤ Vf (x, i)

as required. The terminal region can then be taken as Xf (i) = Rn .


Summarizing we have

If the periodic system is stabilizable, there exists a periodic


sequence of controller gains and terminal penalties such
that XN (i) = Xf (i) = Rn for all i ≥ 0. The origin is glob-
ally asymptotically stable by Theorem 2.39, which can be
strengthened to globally exponentially stable due to the quad-
ratic stage cost. The function Vf (·, i) is a global, time-varying
CLF.
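The construction above can be carried out numerically. The sketch below (illustrative data with period T = 2; the gains K(i) are assumed stabilizing and are not derived from the text) computes P (0) from the summed relation over one period using scipy.linalg.solve_discrete_lyapunov and recovers the remaining P (i) from the recurrence.

# Compute periodic terminal penalties P(i) satisfying AK(i)' P(i+1) AK(i) + QK(i) = P(i)
# with P(T) = P(0), for an assumed 2-periodic example.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

T = 2
A = [np.array([[1.0, 0.2], [0.0, 1.0]]), np.array([[1.0, 0.0], [0.1, 0.9]])]
B = [np.array([[0.0], [1.0]]), np.array([[1.0], [0.0]])]
Q = [np.eye(2), 2 * np.eye(2)]
R = [np.eye(1), np.eye(1)]
K = [np.array([[-0.5, -1.0]]), np.array([[-0.6, -0.1]])]  # assumed stabilizing gains

AK = [A[i] + B[i] @ K[i] for i in range(T)]
QK = [Q[i] + K[i].T @ R[i] @ K[i] for i in range(T)]

# Lift over one period: Acal becomes A(T,0); S accumulates A(k,0)' QK(k) A(k,0).
Acal, S = np.eye(2), np.zeros((2, 2))
for i in range(T):
    S += Acal.T @ QK[i] @ Acal
    Acal = AK[i] @ Acal
# Solve A(T,0)' P(0) A(T,0) + S = P(0); solve_discrete_lyapunov solves a X a' + q = X.
P0 = solve_discrete_lyapunov(Acal.T, S)

# Recover P(1), ..., P(T-1) backward from P(i) = AK(i)' P(i+1) AK(i) + QK(i).
P = [None] * (T + 1)
P[0] = P[T] = P0
for i in range(T - 1, 0, -1):
    P[i] = AK[i].T @ P[i + 1] @ AK[i] + QK[i]
print(np.allclose(AK[0].T @ P[1] @ AK[0] + QK[0], P[0]))  # True if the gains stabilize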

2.5.3 Stable Linear Systems with Control Constraints

Usually, when constraints and/or nonlinearities are present, it is im-


possible to obtain a global CLF to serve as the terminal cost function
Vf (·). There are, however, a few special cases where this is possible,
such as the stable linear system.
The system to be controlled is x + = Ax + Bu where A is stable (its
eigenvalues lie strictly inside the unit circle) and the control u is subject
to the constraint u ∈ U where U is compact and contains the origin in its
interior. The stage cost is ℓ(x, u) = (1/2)(x ′ Qx + u′ Ru) where Q and
R are positive definite. To establish stability of the systems under MPC,
we wish to obtain a global CLF to serve as the terminal cost function
Vf (·). This is usually difficult because any linear control law u = Kx,
say, transgresses the control constraint for x sufficiently large. In other
words, it is usually impossible to find a Vf (·) such that there exists a
u ∈ U satisfying Vf (Ax + Bu) ≤ Vf (x) − ℓ(x, u) for all x in Rn . Since
A is stable, however, it is possible to obtain a Lyapunov function for
the autonomous system x + = Ax that is a suitable candidate for Vf (·);
in fact, for all Q > 0, there exists a P > 0 such that

A′ P A + Q = P

Let Vf (·) be defined by

Vf (x) = (1/2)x ′ P x

With f (·), ℓ(·), and Vf (·) defined thus, PN (x) is a parametric quadratic
problem if the constraint set U is polyhedral and global solutions may
be computed online. The terminal cost function Vf (·) satisfies

Vf (Ax) + (1/2)x ′ Qx − Vf (x) = (1/2)x ′ (A′ P A + Q − P )x = 0

for all x ∈ Xf := Rn . We see that for all x ∈ Xf , there exists a u, namely


u = 0, such that Vf (Ax + Bu) ≤ Vf (x) − ℓ(x, u); ℓ(x, u) = (1/2)x ′ Qx
when u = 0. Since there are no state or terminal constraints, XN = Rn .
It follows that there exist positive constants c1 and c2 such that

c1 |x|2 ≤ VN0 (x) ≤ c2 |x|2


VN0 (f (x, κN (x))) − VN0 (x) ≤ −c1 |x|2

for all x ∈ XN = Rn . Summarizing, we have

Assumptions 2.2, 2.3, and 2.14 are satisfied and XN = Xf =


Rn . It follows from Theorems 2.19 and 2.21 that the origin

is globally, exponentially stable for the controlled system


x + = Ax + BκN (x).

An extension of this approach for unstable A is used in Chapter 6.
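A short sketch (example data assumed; not part of the text) of the computation just described: solve the Lyapunov equation for a stable A with scipy.linalg.solve_discrete_lyapunov and confirm that Vf (x) = (1/2)x ′ P x supplies the required cost decrease with u = 0.

# For a stable A, solve A' P A + Q = P and verify Vf(Ax) + (1/2) x'Qx = Vf(x), so u = 0
# provides the decrease required in Assumption 2.14(a).
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

A = np.array([[0.9, 0.5], [0.0, 0.7]])    # stable: eigenvalues 0.9 and 0.7
Q = np.diag([1.0, 2.0])
P = solve_discrete_lyapunov(A.T, Q)       # solves A' P A + Q = P

x = np.array([2.0, -1.0])
Vf = lambda z: 0.5 * z @ P @ z
print(np.isclose(Vf(A @ x) + 0.5 * x @ Q @ x, Vf(x)))   # True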

2.5.4 Linear Systems with Control and State Constraints

We turn now to the consideration of systems with control and state


constraints. In this situation determination of a global CLF is usually
difficult if not impossible. Hence we show how local CLFs may be de-
termined together with an invariant region in which they are valid.
The system to be controlled is x + = Ax + Bu where A is not nec-
essarily stable, the control u is subject to the constraint u ∈ U where
U is compact and contains the origin in its interior, and the state x
is subject to the constraint x ∈ X where X is closed and contains the
origin in its interior. The stage cost is ℓ(x, u) = (1/2)(x ′ Qx + u′ Ru)
where Q and R are positive definite. Because of the constraints, it is
difficult to obtain a global CLF. Hence we restrict ourselves to the more
modest goal of obtaining a local CLF and proceed as follows. If (A, B)
is stabilizable, the solution to the infinite horizon unconstrained optimal control problem Puc∞ (x) is known. The value function for this problem is V∞uc (x) = (1/2)x ′ P x where P is the unique (in the class of positive semidefinite matrices) solution to the discrete algebraic Riccati equation

P = A′K P AK + QK

in which AK := A + BK and QK := Q + K ′ RK; the control law u = Kx, with K defined by

K := −(B ′ P B + R)−1 B ′ P A

is the optimal controller. The value function V∞uc (·) for the infinite horizon unconstrained optimal control problem Puc∞ (x) satisfies

V∞uc (x) = min u {ℓ(x, u) + V∞uc (Ax + Bu)} = ℓ(x, Kx) + V∞uc (AK x)

It is known that P is positive definite. We define the terminal cost Vf (·) by

Vf (x) := V∞uc (x) = (1/2)x ′ P x
If X and U are polyhedral, problem PN (x) is a parametric quadratic pro-
gram that may be solved online using standard software. The terminal
cost function Vf (·) satisfies

Vf (AK x) + (1/2)x ′ QK x − Vf (x) ≤ 0 ∀x ∈ Rn



The controller u = Kx does not necessarily satisfy the control and state
constraints, however. The terminal constraint set Xf must be chosen
with this requirement in mind. We may choose Xf to be the maximal
invariant constraint admissible set for x + = AK x; this is the largest set W with respect to inclusion (that is, W ′ ⊆ W for every set W ′ satisfying conditions (a) and (b) below) satisfying: (a) W ⊆ {x ∈ X | Kx ∈ U}, and
(b) x ∈ W implies x(i) = AiK x ∈ W for all i ≥ 0. Thus Xf , defined this
way, is control invariant for x + = Ax + Bu, u ∈ U. If the initial state
x of the system is in Xf , the controller u = Kx maintains the state
in Xf and satisfies the state and control constraints for all future time
(x(i) = AiK x ∈ Xf ⊂ X and u(i) = Kx(i) ∈ U for all i ≥ 0). Hence,
with Vf (·), Xf , and ℓ(·) as defined previously, Assumptions 2.2, 2.3,
and 2.14 are satisfied. Summarizing, we have

Assumptions 2.2, 2.3, and 2.14 are satisfied, and Xf contains


the origin in its interior. Hence, by Theorems 2.19 and 2.21,
the origin is exponentially stable in XN .

It is, of course, not necessary to choose K and Vf (·) as above. Any


K such that AK = A + BK is stable may be chosen, and P may be
obtained by solving the Lyapunov equation A′K P AK + QK = P . With
Vf (x) := (1/2)x ′ P x and Xf the maximal constraint admissible set for
x + = AK x, the origin may be shown, as above, to be asymptotically
stable with a region of attraction XN for x + = Ax + BκN (x), and ex-
ponentially stable with a region of attraction any sublevel set of VN0 (·).
The optimal control problem is, again, a quadratic program. The ter-
minal set Xf may be chosen, as above, to be the maximal invariant
constraint admissible set for x + = AK x, or it may be chosen to be a
suitably small sublevel set of Vf (·); by suitably small, we mean small
enough to ensure Xf ⊆ X and KXf ⊆ U. The set Xf , if chosen this way,
is ellipsoidal, a subset of the maximal constraint admissible set, and
is positive invariant for x + = AK x. The disadvantage of this choice
is that PN (x) is no longer a quadratic program, though it remains a
convex program for which software exists.
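For the sublevel-set choice of Xf just mentioned, a suitable level a can be computed explicitly when X and U are given by linear inequalities, since the maximum of c ′ x over {x | (1/2)x ′ P x ≤ a} is (2a c ′ P −1 c)1/2 . The following sketch (with assumed box constraints and example data) computes such a level; it is an illustration, not a construction used in the text.

# Choose the terminal set Xf = {x | (1/2) x'Px <= a} small enough that Kx in U and x in X,
# for box constraints |u| <= umax and |x_j| <= xmax. For each row c'x <= d we need
# a <= d^2 / (2 c' P^{-1} c).
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q, R = np.eye(2), np.eye(1)
umax, xmax = 1.0, 5.0

P = solve_discrete_are(A, B, Q, R)
K = -np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)

# Rows c'x <= d describing {x | Kx in U} and {x | x in X}.
Cs = np.vstack([K, -K, np.eye(2), -np.eye(2)])
ds = np.hstack([umax, umax, xmax * np.ones(2), xmax * np.ones(2)])
Pinv = np.linalg.inv(P)
a = min(d**2 / (2 * c @ Pinv @ c) for c, d in zip(Cs, ds))
print(a)   # Xf = {x | (1/2) x'Px <= a} then satisfies Xf in X and K Xf in U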
The choice Vf (·) = V∞uc (·) results in an interesting property of the closed-loop system x + = Ax + BκN (x). Generally, the terminal con-
straint set Xf is not positive invariant for the controlled system x + =
Ax + BκN (x). Thus, in solving PN (x) for an initial state x ∈ Xf , the

“predicted” state sequence x0 (x) = (x 0 (0; x), x 0 (1; x), . . . , x 0 (N; x)) starts and ends in Xf but does not necessarily remain in Xf . Thus

x 0 (0; x) = x ∈ Xf and x 0 (N; x) ∈ Xf , because of the terminal con-


straint in the optimal control problem, but, for any i ∈ I1:N−1 , x 0 (i; x)
may lie outside of Xf . In particular, x + = Ax + BκN (x) = x 0 (1; x)
may lie outside of Xf ; Xf is not necessarily positive invariant for the
controlled system x + = Ax + BκN (x).
Consider now the problem PucN (x) defined in the same way as PN (x)
except that all constraints are omitted so that UN (x) = RNm

PucN (x) : VNuc (x) = min u VN (x, u)

in which VN (·) is defined as previously by


VN (x, u) := ∑_{i=0}^{N−1} ℓ(x(i), u(i)) + Vf (x(N))

with Vf (·) the value function for the infinite horizon unconstrained optimal control problem, i.e., Vf (x) := V∞uc (x) = (1/2)x ′ P x. With these definitions, it follows that

VNuc (x) = V∞uc (x) = Vf (x) = (1/2)x ′ P x
κNuc (x) = Kx, K = −(B ′ P B + R)−1 B ′ P A

for all x ∈ Rn ; u = Kx is the optimal controller for the unconstrained infinite horizon problem. But Xf is positive invariant for x + = AK x. We now claim that with Vf (·) chosen equal to V∞uc (·), the terminal constraint set Xf is positive invariant for x + = Ax + BκN (x). We do this by showing that VN0 (x) = VNuc (x) = V∞uc (x) for all x ∈ Xf , so that the associated control laws are the same, i.e., κN (x) = Kx. First, because PucN (x) is identical with PN (x) except for the absence of all constraints, we have

VNuc (x) = Vf (x) ≤ VN0 (x) ∀x ∈ XN ⊇ Xf

Second, from Proposition 2.18

VN0 (x) ≤ Vf (x) ∀x ∈ Xf

Hence VN0 (x) = VNuc (x) = Vf (x) for all x ∈ Xf . That κN (x) = Kx for
all x ∈ Xf follows from the uniqueness of the solutions to the problems
PN (x) and PucN (x). Summarizing, we have

If Vf (·) is chosen to be the value function for the uncon-


strained infinite horizon optimal control problem, if u =
Kx is the associated controller, and if Xf is invariant for

x + = AK x, then Xf is also positive invariant for the con-


trolled system x + = Ax + BκN (x). Also κN (x) = Kx for all
x ∈ Xf .

2.5.5 Constrained Nonlinear Systems

The system to be controlled is

x + = f (x, u)

in which f (·) is assumed to be twice continuously differentiable. The


system is subject to state and control constraints

x∈X u∈U

in which X is closed and U is compact; each set contains the origin in


its interior. The cost function is defined by
VN (x, u) = ∑_{i=0}^{N−1} ℓ(x(i), u(i)) + Vf (x(N))

in which, for each i, x(i) := φ(i; x, u), the solution of x + = f (x, u) at


time i if the initial state is x at time zero and the control is u. The stage
cost ℓ(·) is defined by

ℓ(x, u) := (1/2)(|x|2Q + |u|2R )

in which Q and R are positive definite. The optimal control problem


PN (x) is defined by

PN (x) : VN0 (x) = min u {VN (x, u) | u ∈ UN (x)}

in which UN (x) is defined by (2.5) and includes the terminal constraint


x(N) = φ(N; x, u) ∈ Xf (in addition to the state and control con-
straints).
Our first task is to choose the ingredients Vf (·) and Xf of the op-
timal control problem to ensure asymptotic stability of the origin for
the controlled system. We obtain a terminal cost function Vf (·) and
a terminal constraint set Xf by linearization of the nonlinear system
x + = f (x, u) at the origin. Hence we assume f (·) and ℓ(·) are twice
continuously differentiable so that Assumption 2.2 is satisfied. Sup-
pose then that the linearized system is

x + = Ax + Bu

where A := fx (0, 0) and B := fu (0, 0). We assume that (A, B) is sta-


bilizable and we choose any controller u = Kx such that the origin is
globally exponentially stable for the system x + = AK x, AK := A + BK,
i.e., such that AK is stable. Suppose also that the stage cost ℓ(·) is
defined by ℓ(x, u) := (1/2)(|x|2Q + |u|2R ) where Q and R are positive
definite; hence ℓ(x, Kx) = (1/2)x ′ QK x where QK := (Q + K ′ RK). Let
P be defined by the Lyapunov equation

A′K P AK + µQK = P

for some µ > 1. The reason for the factor µ will become apparent soon.
Since QK is positive definite and AK is stable, P is positive definite. Let
the terminal cost function Vf (·) be defined by

Vf (x) := (1/2)x ′ P x

Clearly Vf (·) is a global CLF for the linear system x + = Ax+Bu. Indeed,
it follows from its definition that Vf (·) satisfies

Vf (AK x) + (µ/2)x ′ QK x − Vf (x) = 0 ∀x ∈ Rn (2.22)

Consider now the nonlinear system x + = f (x, u) with linear control


u = Kx. The controlled system satisfies

x + = f (x, Kx)

We wish to show that Vf (·) is a local CLF for x + = f (x, u) in some


neighborhood of the origin; specifically, we wish to show there exists
an a ∈ (0, ∞) such that

Vf (f (x, Kx)) + (1/2)x ′ QK x − Vf (x) ≤ 0 ∀x ∈ leva Vf (2.23)

in which, for all a > 0, leva Vf := {x | Vf (x) ≤ a} is a sublevel set of


Vf . Since P is positive definite, leva Vf is an ellipsoid with the origin as
its center. Comparing inequality (2.23) with (2.22), we see that (2.23) is
satisfied if

Vf (f (x, Kx)) − Vf (AK x) ≤ ((µ − 1)/2)x ′ QK x ∀x ∈ leva Vf (2.24)

Let e(·) be defined as follows

e(x) := f (x, Kx) − AK x



so that

Vf (f (x, Kx)) − Vf (AK x) = (AK x)′ P e(x) + (1/2)e(x)′ P e(x) (2.25)

By definition, e(0) = f (0, 0) − AK 0 = 0 and ex (x) = fx (x, Kx) + fu (x,


Kx)K − AK . It follows that ex (0) = 0. Since f (·) is twice continuously
differentiable, for any δ > 0, there exists a cδ > 0 such that |exx (x)| ≤
cδ for all x in δB. From Proposition A.11 in Appendix A
|e(x)| = |e(0) + ex (0)x + ∫_0^1 (1 − s)x ′ exx (sx)x ds|
≤ ∫_0^1 (1 − s)cδ |x|2 ds ≤ (1/2)cδ |x|2

for all x in δB. From (2.25), we see that there exists an ε ∈ (0, δ] such
that (2.24), and, hence, (2.23), is satisfied for all x ∈ εB. Because of
our choice of ℓ(·), there exists a c1 > 0 such that Vf (x) ≥ ℓ(x, Kx) ≥ c1 |x|2 for all x ∈ Rn . It follows that x ∈ leva Vf implies |x| ≤ √(a/c1 ). We can choose a to satisfy √(a/c1 ) = ε. With this choice, x ∈ leva Vf
implies |x| ≤ ε ≤ δ, which, in turn, implies (2.23) is satisfied.
We conclude that there exists an a > 0 such that Vf (·) and Xf :=
leva Vf satisfy Assumptions 2.2 and 2.3. For each x ∈ Xf there exists a u = κf (x) := Kx such that Vf (f (x, u)) ≤ Vf (x) − ℓ(x, u) since ℓ(x, Kx) = (1/2)x ′ QK x, so that our assumption that ℓ(x, u) = (1/2)(x ′ Qx + u′ Ru) where Q and R are positive definite, and our definition of Vf (·), ensure the existence of positive constants c1 , c2 , and c3 such that VN0 (x) ≥ c1 |x|2 for all x ∈ Rn , Vf (x) ≤ c2 |x|2 , and VN0 (f (x, κf (x))) ≤ VN0 (x) − c3 |x|2 for all x ∈ Xf , thereby satisfying Assumption 2.14. Finally, by
definition, the set Xf contains the origin in its interior. Summarizing,
we have
Assumptions 2.2, 2.3, and 2.14 are satisfied, and Xf con-
tains the origin in its interior. In addition α1 (·), α2 (·), and
α3 (·) satisfy the hypotheses of Theorem 2.21. Hence, by
Theorems 2.19 and 2.21, the origin is exponentially stable
for x + = f (x, κN (x)) in XN .
Asymptotic stability of the origin in XN also may be established when
Xf := {0} by assuming a K∞ bound on VN0 (·) as in Assumption 2.17.
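The construction in this section can be mimicked numerically. The sketch below (an illustration with an assumed example model; the sampling test is heuristic and is not a substitute for the Taylor series argument above) linearizes at the origin, solves the Lyapunov equation with the factor µ, and shrinks the level a until the decrease condition (2.23) holds at sampled points of leva Vf .

# Construct candidate terminal ingredients for an assumed nonlinear example by
# linearization, then shrink a until Vf(f(x,Kx)) <= Vf(x) - l(x,Kx) holds on samples.
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

def f(x, u):                      # assumed example dynamics with f(0, 0) = 0
    return np.array([x[0] + 0.1*x[1] + 0.05*x[1]**2,
                     x[1] + 0.1*u[0] + 0.05*x[0]*x[1]])

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # A = fx(0,0), B = fu(0,0)
B = np.array([[0.0], [0.1]])
Q, R, mu = np.eye(2), np.eye(1), 2.0

Pd = solve_discrete_are(A, B, Q, R)
K = -np.linalg.solve(B.T @ Pd @ B + R, B.T @ Pd @ A)
AK, QK = A + B @ K, Q + K.T @ R @ K
P = solve_discrete_lyapunov(AK.T, mu * QK)      # AK' P AK + mu QK = P

Vf = lambda x: 0.5 * x @ P @ x
stage = lambda x: 0.5 * (x @ Q @ x + (K @ x) @ R @ (K @ x))

rng = np.random.default_rng(0)
a = 1.0
for _ in range(60):                              # shrink a until the check passes
    Z = rng.normal(size=(500, 2))
    Z = Z / np.sqrt([Vf(z) for z in Z])[:, None]             # points with Vf = 1
    Z = Z * np.sqrt(a * rng.uniform(size=(500, 1)))           # fill lev_a Vf
    if all(Vf(f(z, K @ z)) <= Vf(z) - stage(z) + 1e-12 for z in Z):
        break
    a /= 2
print(a)    # candidate level for Xf = lev_a Vf (a sampling check, not a proof)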

2.5.6 Constrained Nonlinear Time-Varying Systems

Although Assumption 2.33 (the basic stability assumption) for the time-
varying case suffices to ensure that VN0 (·) has sufficient cost decrease,

it can be asked if there exist a Vf (·) and Xf satisfying the hypotheses of


this assumption, as well as Assumption 2.37. We give a few examples
below.
Terminal equality constraint. Consider a linear time-varying system
described by x + = f (x, u, i) = A(i)x + B(i)u with ℓ(x, u, i) = (1/2)(x ′ Q(i)x + u′ R(i)u). Clearly (x̄, ū) = (0, 0) is an equilib-
rium pair since f (0, 0, i) = 0 for all i ∈ I≥0 . The terminal constraint set
is Xf (i) = {0} for all i ∈ I≥0 , and the cost can be taken as Vf (x, i) ≡ 0.
Assumption 2.33(a) is clearly satisfied. If, in addition, the matrices A(i),
B(i), Q(i), and R(i) can be uniformly bounded from above, and the
system is stabilizable (with U containing a neighborhood of the origin),
then the weak controllability hypothesis implies that Assumption 2.37
is satisfied as well.
If f (·) is nonlinear, Assumption 2.33(a) is satisfied if f (0, 0, i) = 0
for all i ∈ I≥0 . Verifying Assumption 2.37 requires more work in the
nonlinear case, but weak controllability is often the easiest way. In
summary we have

Given the terminal equality constraint and Assumption 2.37,


Theorem 2.39 applies and the origin is asymptotically stable
in XN (i) at each time i ≥ 0 for the time-varying system x + =
f (x, κN (x, i), i).

Periodic target tracking. If the target is a periodic reference signal


and the system is periodic with period T as in Limon, Alamo, de la
Peña, Zeilinger, Jones, and Pereira (2012), Falugi and Mayne (2013b),
and Rawlings and Risbeck (2017), it is possible, under certain condi-
tions, to obtain terminal ingredients that satisfy Assumptions 2.33(a)
and 2.37.
In the general case, terminal region synthesis is challenging. But
given sufficient smoothness in the system model, we can proceed as
follows. First we subtract the periodic state and input references and
work in deviation variables so that the origin is again the target. As-
suming f (·) is twice continuously differentiable in x and u at (0, 0, i),
we can linearize the system to determine

A(i) := (∂f /∂x)(0, 0, i)    B(i) := (∂f /∂u)(0, 0, i)
Assuming the origin is in the interior of each X(i) (but not necessarily
each U(i)), we determine a subspace of unsaturated inputs ũ such that (i) u(i) = F (i)ũ(i), (ii) there exists ϵ > 0 such that F (i)ũ(i) ∈ U(i) for all |ũ(i)| ≤ ϵ, and (iii) the reduced linear system (A(i), B(i)F (i)) is
stabilizable. These conditions ensure that the reduced linear system is
locally unconstrained. Taking a positive definite stage cost

ℓ(x, u, i) := (1/2)(x ′ Q(i)x + u′ R(i)u)

we choose µ > 1 and proceed as in the linear unconstrained case (Sec-
tion 2.5.2) using the reduced model (A(i), B(i)F (i)) and adjusted cost
matrices µQ(i) and µR(i). We thus have the relationship

Vf (A(i)x + B(i)u, i + 1) ≤ Vf (x, i) − µℓ(x, u, i)

with u = κf (x, i) := K(i)x and Vf (x, i) := (1/2)x ′ P (i)x. Two issues


remain: first, it is unlikely that K(i)x ∈ U(i) for all x and i; and second,
the cost decrease holds only for the (approximate) linearized system.
To address the first issue, we start by defining the set

X̄(i) := {x ∈ X(i) | κf (x, i) ∈ U(i) and f (x, κf (x, i), i) ∈ X(i + 1)}

on which κf (·) is valid. We require Xf (i) ⊆ X̄(i) for all i ∈ I≥0 . By


assumption, X̄(i) contains a neighborhood of the origin, and so we can
determine constants a(i) > 0 sufficiently small such that

lev_{a(i)} Vf (·, i) ⊆ X̄(i)        i ≥ 0

For the second issue, we can appeal to Taylor’s theorem as in Sec-


tion 2.5.5 to find constants b(i) ∈ (0, a(i)] such that

Vf (f (x, u, i), i + 1) − Vf (A(i)x + B(i)u, i + 1) ≤ (µ − 1)ℓ(x, u, i)

for all x ∈ levb(i) Vf (·, i) and i ∈ I≥0 . That is, the approximation error
of the linear system is sufficiently small. Thus, adding this inequality
to the approximate cost decrease condition, we recover

Vf (f (x, u, i), i + 1) − Vf (x, i) ≤ −ℓ(x, u, i)

on terminal regions Xf (i) = levb(i) Vf (·, i). That these terminal regions
are positive invariant follows from the cost decrease condition. Note
also that these sets Xf (i) contain the origin in their interiors, and thus
Assumption 2.37 is satisfied. Summarizing we have

Given sufficient smoothness in f (x, u, i), terminal region


synthesis can be accomplished for tracking a periodic refer-
ence. Then the assumptions of Theorem 2.39 are satisfied,

and the origin (in deviation variables; hence, the periodic


reference in the original variables) is asymptotically stable
in XN (i) at each time i ≥ 0 for the time-varying system
x+ = f (x, κN (x, i), i).
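To make the synthesis above concrete, the following minimal Python sketch carries out the linearization and scaled-LQ step for a single reference point (the time-invariant special case, period T = 1); for a genuinely periodic reference the corresponding Riccati recursion must be run over one full period to obtain P (i) and K(i). The model f , the unsaturated-input map F , and all numerical values below are placeholders chosen only for illustration.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def terminal_ingredients(f, xs, us, F, Q, R, mu=1.5, eps=1e-5):
    """Terminal cost Vf(x) = 0.5 (x-xs)'P(x-xs) and law u = us + K(x-xs),
    from the linearization of f at (xs, us), restricted to u = us + F @ utilde."""
    n, m = xs.size, us.size
    A = np.zeros((n, n)); B = np.zeros((n, m))
    for j in range(n):                      # df/dx at the reference (central differences)
        e = np.zeros(n); e[j] = eps
        A[:, j] = (f(xs + e, us) - f(xs - e, us)) / (2 * eps)
    for j in range(m):                      # df/du at the reference
        e = np.zeros(m); e[j] = eps
        B[:, j] = (f(xs, us + e) - f(xs, us - e)) / (2 * eps)
    Bt, Rt = B @ F, F.T @ R @ F             # reduced (unsaturated) input subspace
    P = solve_discrete_are(A, Bt, mu * Q, mu * Rt)   # scaled costs mu*Q, mu*R
    Kt = -np.linalg.solve(mu * Rt + Bt.T @ P @ Bt, Bt.T @ P @ A)
    return P, F @ Kt

# Hypothetical example: a pendulum-like model with one (unsaturated) input.
f = lambda x, u: np.array([x[0] + 0.1 * x[1],
                           x[1] + 0.1 * (u[0] - np.sin(x[0]))])
P, K = terminal_ingredients(f, np.zeros(2), np.zeros(1), np.eye(1),
                            np.eye(2), np.eye(1))
print(P, K)
```

By construction of the Riccati solution, Vf (x, i) = (1/2)x ′ P x and κf (x, i) = Kx satisfy the scaled cost decrease for the linearized model; the constants a(i) and b(i) are then found exactly as described above.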

2.6 Is a Terminal Constraint Set Xf Necessary?


While addition of a terminal cost Vf (·) does not materially affect the
optimal control problem, addition of a terminal constraint x(N) ∈ Xf ,
which is a state constraint, may have a significant effect. In particular,
problems with only control constraints are usually easier to solve. So
if state constraints are not present or if they are handled by penalty
functions (soft constraints), it is highly desirable to avoid the addition
of a terminal constraint. Moreover, it is possible to establish continuity
of the value function for a range of optimal control problems if there
are no state constraints; continuity of the value function ensures a de-
gree of robustness (see Chapter 3). It is therefore natural to ask if the
terminal constraint can be omitted without affecting stability.
A possible procedure is merely to omit the terminal constraint and
to require that the initial state lies in a subset of XN that is sufficiently
small. We examine this alternative here and assume that Vf (·), Xf and
ℓ(·) satisfy Assumptions 2.2, 2.3, and 2.14, and that Xf := {x | Vf (x) ≤
a} for some a > 0.
We assume, as in the examples of MPC discussed in Section 2.5, that
the terminal cost function Vf (·), the constraint set Xf , and the stage
cost ℓ(·) for the optimal control problem PN (x) are chosen to satisfy
Assumptions 2.2, 2.3, and 2.14 so that there exists a local control law
κf : Xf → U such that Xf ⊆ {x ∈ X | κf (x) ∈ U} is positive invariant
for x + = f (x, κf (x)) and Vf (f (x, κf (x)))+ℓ(x, κf (x)) ≤ Vf (x) for all
x ∈ Xf . We assume that the function Vf (·) is defined on X even though
it possesses the property Vf (f (x, κf (x))) + ℓ(x, κf (x)) ≤ Vf (x) only
in Xf . In many cases, even if the system being controlled is nonlinear,
Vf (·) is quadratic and positive definite, and κf (·) is linear. The set Xf
may be chosen to be a sublevel set of Vf (·) so that Xf = W (a) := {x |
Vf (x) ≤ a} for some a > 0. We discuss in the sequel a modified form
of the optimal control problem PN (x) in which the terminal cost Vf (·)
is replaced by βVf (·) and the terminal constraint Xf is omitted, and
show that if β is sufficiently large the solution of the modified optimal
control problem is such that the optimal terminal state nevertheless
lies in Xf so that the terminal constraint is implicitly satisfied.

For all β ≥ 1, let PNβ (x) denote the modified optimal control problem
defined by

V̂Nβ (x) = min_u {VNβ (x, u) | u ∈ ÛN (x)}

in which the cost function to be minimized is now

VNβ (x, u) := ∑_{i=0}^{N−1} ℓ(x(i), u(i)) + βVf (x(N))

in which, for all i, x(i) = φ(i; x, u), the solution at time i of x + = f (x,
u) when the initial state is x and the control sequence is u. The control
constraint set ÛN (x) ensures satisfaction of the state and control con-
straints, but not the terminal constraint, and is defined by

ÛN (x) := {u | (x(i), u(i)) ∈ Z, i ∈ I0:N−1 , x(N) ∈ X}


The cost function VNβ (·) with β = 1 is identical to the cost function
VN (·) employed in the standard problem PN considered previously.
Let X̂N := {x ∈ X | ÛN (x) ≠ ∅} denote the domain of V̂Nβ (·); let uβ (x)
denote the solution of PNβ (x); and let xβ (x) denote the associated op-
timal state trajectory. Thus

uβ (x) = (uβ (0; x), uβ (1; x), . . . , uβ (N − 1; x))
xβ (x) = (x β (0; x), x β (1; x), . . . , x β (N; x))

where x β (i; x) := φ(i; x, uβ (x)) for all i. The implicit MPC control law
is κNβ (·) where κNβ (x) := uβ (0; x). Neither ÛN (x) nor X̂N depend on
the parameter β. It can be shown (Exercise 2.11) that the pair (βVf (·),
Xf ) satisfies Assumptions 2.2–2.14 if β ≥ 1, since these assumptions
are satisfied by the pair (Vf (·), Xf ). The absence of the terminal con-
straint x(N) ∈ Xf in problem PNβ (x), which is otherwise the same as
the normal optimal control problem PN (x) when β = 1, ensures that
V̂N1 (x) ≤ VN0 (x) for all x ∈ XN and that XN ⊆ X̂N where VN0 (·) is the
value function for PN (x) and XN is the domain of VN0 (·).

Problem PNβ (x) and the associated MPC control law κNβ (·) are de-
fined below. Suppose uβ (x) is optimal for the terminally unconstrained
problem PNβ (x), β ≥ 1, and that xβ (x) is the associated optimal state
trajectory.

That the origin is asymptotically stable for x + = f (x, κNβ (x)) and
each β ≥ 1, with a region of attraction that depends on the parameter
β, is established by Limon, Alamo, Salas, and Camacho (2006) via the
following results.

Lemma 2.40 (Entering the terminal region). Suppose uβ (x) is optimal
for the terminally unconstrained problem PNβ (x), with β ≥ 1, and that
xβ (x) is the associated optimal state trajectory. If x β (N; x) ∉ Xf , then
x β (i; x) ∉ Xf for all i ∈ I0:N−1 .

Proof. Since, as shown in Exercise 2.11, βVf (x) ≥ βVf (f (x, κf (x))) +
ℓ(x, κf (x)) and f (x, κf (x)) ∈ Xf for all x ∈ Xf , all β ≥ 1, it follows
that for all x ∈ Xf and all i ∈ I0:N−1

βVf (x) ≥ ∑_{j=i}^{N−1} ℓ(x f (j; x, i), uf (j; x, i)) + βVf (x f (N; x, i)) ≥ V̂^β_{N−i} (x)

in which x f (j; x, i) is the solution of x + = f (x, κf (x)) at time j if the
initial state is x at time i, uf (j; x, i) = κf (x f (j; x, i)), and κf (·) is the
local control law that satisfies the stability assumptions. The second
inequality follows from the fact that the control sequence (uf (j; x, i)),
j ∈ Ii:N−1 , is feasible for P^β_{N−i} (x) if x ∈ Xf . Suppose, contrary to what
is to be proved, that there exists an i ∈ I0:N−1 such that x β (i; x) ∈ Xf .
By the principle of optimality, the control sequence (uβ (i; x), uβ (i +
1; x), . . . , uβ (N − 1; x)) is optimal for P^β_{N−i} (x β (i; x)). Hence

βVf (x β (i; x)) ≥ V̂^β_{N−i} (x β (i; x)) ≥ βVf (x β (N; x)) > βa

since x β (N; x) ∉ Xf , contradicting the fact that x β (i; x) ∈ Xf . This
proves the lemma. ■
For all β ≥ 1, let the set ΓNβ be defined by

ΓNβ := {x | V̂Nβ (x) ≤ Nd + βa}

We assume in the sequel that there exists a d > 0 such that ℓ(x, u) ≥ d for
all x ∈ X \ Xf and all u ∈ U. The following result is due to Limon et al.
(2006).

Theorem 2.41 (MPC stability; no terminal constraint). The origin is
asymptotically or exponentially stable for the closed-loop system x + =
f (x, κNβ (x)) with a region of attraction ΓNβ . The set ΓNβ is positive invari-
ant for x + = f (x, κNβ (x)).

Proof. From the lemma, x β (N; x) ∉ Xf implies x β (i; x) ∉ Xf for all
i ∈ I0:N . This, in turn, implies

V̂Nβ (x) > Nd + βa

so that x ∉ ΓNβ . Hence x ∈ ΓNβ implies x β (N; x) ∈ Xf . It then follows,
since βVf (·) and Xf satisfy Assumptions 2.2 and 2.3, that the origin
is asymptotically or exponentially stable for x + = f (x, κNβ (x)) with a
region of attraction ΓNβ . It also follows that x ∈ ΓNβ implies

V̂Nβ (x β (1; x)) ≤ V̂Nβ (x) − ℓ(x, κNβ (x)) ≤ V̂Nβ (x) ≤ Nd + βa

so that x β (1; x) = f (x, κNβ (x)) ∈ ΓNβ . Hence ΓNβ is positive invariant for
x + = f (x, κNβ (x)). ■
Limon et al. (2006) then proceed to show that ΓNβ increases with β
or, more precisely, that β1 ≤ β2 implies that ΓNβ1 ⊆ ΓNβ2 . They also show
that for any x steerable to the interior of Xf by a feasible control, there
exists a β such that x ∈ ΓNβ . We refer to requiring the initial state x to
lie in ΓNβ as an implicit terminal constraint.
If it is desired that the feasible sets for Pi (x) be nested (Xi ⊂ Xi+1 ,
i = 1, 2, . . . N −1) (thereby ensuring recursive feasibility), it is necessary,
as shown in Mayne (2013), that PN (x) includes a terminal constraint
that is control invariant.
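For a concrete system, membership of an initial state in ΓNβ can be checked directly from these definitions: solve the terminally unconstrained problem PNβ (x) and test whether the optimal terminal state lands in Xf = {x | Vf (x) ≤ a}. The following minimal Python sketch does this for a hypothetical input-constrained linear system; the matrices, horizon, and the constants a and β are illustrative choices only, not values from the text.

```python
import numpy as np
from scipy.linalg import solve_discrete_are
from scipy.optimize import minimize

A = np.array([[1.1, 1.0], [0.0, 0.9]]); B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
N, beta, a = 15, 10.0, 1.0                    # horizon, terminal scaling, Xf level
P = solve_discrete_are(A, B, Q, R)            # unconstrained LQ cost-to-go
Vf = lambda x: 0.5 * x @ P @ x                # terminal penalty; Xf = {Vf <= a}

def rollout(useq, x0):
    """Cost of the terminally unconstrained problem and the terminal state."""
    x, J = x0.copy(), 0.0
    for u in useq.reshape(N, 1):
        J += 0.5 * (x @ Q @ x + u @ R @ u)
        x = A @ x + B @ u
    return J + beta * Vf(x), x

x0 = np.array([2.0, -1.0])
res = minimize(lambda v: rollout(v, x0)[0], np.zeros(N),
               bounds=[(-1.0, 1.0)] * N)      # only the input constraint |u| <= 1
cost, xN = rollout(res.x, x0)
print("terminal state in Xf:", Vf(xN) <= a, " Vf(x(N)) =", float(Vf(xN)))
```

If the reported terminal state is not in Xf , either β must be increased or the chosen initial state lies outside ΓNβ for this horizon.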

2.7 Suboptimal MPC


Overview. There is a significant practical problem that we have not
yet addressed, namely that if the optimal control problem PN (x) solved
online is not convex, which is usually the case when the system is non-
linear, the global minimum of VN (x, u) in UN (x) cannot usually be
determined. Since we assume, in the stability theory given previously,
that the global minimum is achieved, we have to consider the impact of
this unpalatable fact. It is possible, as shown in Scokaert, Mayne, and
Rawlings (1999); Pannocchia, Rawlings, and Wright (2011) to achieve
stability without requiring globally optimal solutions of PN (x). The
basic idea behind the suboptimal model predictive controller is sim-
ple. Suppose the current state is x and that u = (u(0), u(1), . . . ,
u(N − 1)) ∈ UN (x) is a feasible control sequence for PN (x). The
first element u(0) of u is applied to the system x + = f (x, u); let κN (x,
u) denote this control. In the absence of uncertainty, the next state is
equal to the predicted state x + = f (x, u(0)).
Consider the control sequence ũ defined by

ũ = (u(1), u(2), . . . , u(N − 1), κf (x(N)))     (2.26)

in which x(N) = φ(N; x, u) and κf (·) is a local control law with the
property that u = κf (x) satisfies Assumption 2.14 for all x ∈ Xf . The
existence of such a κf (·), which is often of the form κf (x) = Kx, is
implied by Assumption 2.14. Then, since x(N) ∈ Xf and since the sta-
bilizing conditions 2.14 are satisfied, the control sequence ũ ∈ UN (x)
satisfies

VN (x + , ũ) ≤ VN (x, u) − ℓ(x, u(0)) ≤ VN (x, u) − α1 (|x|)     (2.27)

with x + := f (x, u(0)).
No optimization is required to get the cost reduction ℓ(x, u(0))
given by (2.27); in practice the control sequence ũ can be improved
by several iterations of an optimization algorithm. Inequality (2.27) is
reminiscent of the inequality VN0 (x + ) ≤ VN0 (x) − α1 (|x|) that provides
the basis for establishing asymptotic stability of the origin for the con-
trolled systems previously analyzed. This suggests that the simple al-
gorithm described previously, which places very low demands on the
online optimization algorithm, may also ensure asymptotic stability of
the origin.
This is almost true. The obstacle to applying standard Lyapunov
theory is that there is no obvious Lyapunov function V : Rn → R≥0
because, at each state x + , there exist many control sequences u+ satis-
fying VN (x + , u+ ) ≤ VN (x, u) − α1 (|x|). The function (x, u) , VN (x, u)
is not a function of x only and may have many different values for each
x; therefore it cannot play the role of the function VN0 (x) used previ-
ously. Moreover, the controller can generate, for a given initial state,
many different trajectories, all of which have to be considered. We
address these issues next following the recent development in Allan,
Bates, Risbeck, and Rawlings (2017).
A key step is to consider suboptimal MPC as an evolution of an
extended state consisting of the state and warm-start pair. Given a
feasible warm start, optimization algorithms can produce an improved
feasible sequence or, failing even that, simply return the warm start.
The first input is injected and a new warm start can be generated from
the returned control sequence and terminal control law.
Warm start. An admissible warm start ũ must steer the current state
x to the terminal region subject to the input constraints, i.e., ũ ∈
UN (x). It also must satisfy VN (x, ũ) ≤ Vf (x) if x ∈ Xf , which en-
sures that |x| → 0 implies |ũ| → 0. These two conditions define the set
of admissible warm starts

ŨN (x) := {ũ ∈ UN (x) | VN (x, ũ) ≤ Vf (x) if x ∈ Xf }     (2.28)

When x ∈ Xf and ũ ∈ UN (x) but VN (x, ũ) > Vf (x), an admissible
warm start ũf (x) can be recovered using the terminal control law.

Proposition 2.42 (Admissible warm start in Xf ). For any x ∈ Xf , the
following warm start is feasible

ũf (x) := (κf (x), κf (f (x, κf (x))), . . .) ∈ ŨN (x)

The proof of this proposition is discussed in Exercise 2.24.


We define the set of admissible control sequences ǓN (x, ũ) as those
feasible control sequences u that result in a lower cost than the warm
start; the suboptimal control law is the set of first elements of admis-
sible control sequences

ǓN (x, ũ) = {u | u ∈ ŨN (x), VN (x, u) ≤ VN (x, ũ)}
κN (x, ũ) = {u(0) | u ∈ ǓN (x, ũ)}

From its definition, the suboptimal control law is a function of both the
state x and the warm start ũ ∈ ŨN (x).
To complete the algorithm we require a successor warm start for
the successor state x + = f (x, u(0)). First defining
 
ũw (x, u) := (u(1), u(2), . . . , u(N − 1), κf (φ(N; x, u)))

we choose the successor warm start ũ+ ∈ ŨN (x + ) as follows

ũ+ := ũf (x + )     if x + ∈ Xf and VN (x + , ũf (x + )) ≤ VN (x + , ũw (x, u))
ũ+ := ũw (x, u)    otherwise                                          (2.29)

This mapping in (2.29) is denoted ũ+ = ζ(x, u), and Proposition 2.42
ensures that the warm start generated by ζ(x, u) is admissible for x + .
We have the following algorithm for suboptimal MPC.

Algorithm 2.43 (Suboptimal MPC). First, choose Xf and Vf (·) satisfy-
ing Assumption 2.14 and obtain the initial state x ∈ XN and any initial
warm start ũ ∈ ŨN (x). Then repeat

1. Obtain current measurement of state x.

2. Compute any input u ∈ ǓN (x, ũ).

3. Inject the first element of the input sequence u.

4. Compute the next warm start ũ+ = ζ(x, u).
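The bookkeeping in Algorithm 2.43 is perhaps easiest to see in code. The following minimal Python sketch implements the warm-start recursion for a hypothetical input-constrained linear system with LQ terminal ingredients; the optimizer in step 2 is an off-the-shelf routine whose output is accepted only if it is feasible and no worse than the warm start, which is all the algorithm requires. The numerical values are illustrative, and the clipping in κf is only a safeguard.

```python
import numpy as np
from scipy.linalg import solve_discrete_are
from scipy.optimize import minimize

A = np.array([[0.9, 0.2], [0.0, 0.7]]); B = np.array([[0.0], [1.0]])
Q, R, N, umax, b = np.eye(2), np.eye(1), 20, 1.0, 1.0
P = solve_discrete_are(A, B, Q, R)
K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # unconstrained LQ gain

f = lambda x, u: A @ x + B @ u
ell = lambda x, u: 0.5 * (x @ Q @ x + u @ R @ u)
Vf = lambda x: 0.5 * x @ P @ x                       # Xf = {x : Vf(x) <= b}
kappaf = lambda x: np.clip(K @ x, -umax, umax)       # terminal control law

def VN(x, useq):
    """Cost of a control sequence and the resulting terminal state."""
    J = 0.0
    for u in useq:
        J += ell(x, u); x = f(x, u)
    return J + Vf(x), x

def warm_from_terminal(x):       # utilde_f(x), cf. Proposition 2.42
    seq = []
    for _ in range(N):
        u = kappaf(x); seq.append(u); x = f(x, u)
    return np.array(seq)

def zeta(x, useq):               # successor warm start, cf. (2.29)
    xN = VN(x, useq)[1]
    shifted = np.vstack([useq[1:], [kappaf(xN)]])    # utilde_w(x, u)
    xplus = f(x, useq[0])
    if Vf(xplus) <= b:
        cand = warm_from_terminal(xplus)
        if VN(xplus, cand)[0] <= VN(xplus, shifted)[0]:
            return cand
    return shifted

def improve(x, warm):            # step 2: any u no worse than the warm start
    res = minimize(lambda v: VN(x, v.reshape(N, 1))[0], warm.ravel(),
                   bounds=[(-umax, umax)] * N)
    cand = res.x.reshape(N, 1)
    Jc, xNc = VN(x, cand)
    if Jc <= VN(x, warm)[0] and Vf(xNc) <= b:        # keep terminal state in Xf
        return cand
    return warm                                      # fall back to the warm start

x = np.array([2.0, -1.0])
warm = np.zeros((N, 1))          # admissible warm start for this x (x is not in Xf)
for k in range(25):
    useq = improve(x, warm)      # step 2
    warm = zeta(x, useq)         # step 4: successor warm start
    x = f(x, useq[0])            # step 3: inject the first input
print("state after 25 steps:", x)
```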

Because the control law κN (x, ũ) is a function of the warm start ũ
as well as the state x, we extend the meaning of state to include the
warm start.

2.7.1 Extended State

In Algorithm 2.43 we begin with a state and warm-start pair and pro-
ceed from this pair to the next at the start of each time step. We denote
this extended state as z := (x, ũ) for x ∈ XN and ũ ∈ ŨN (x). The ex-
tended state evolves according to

z+ ∈ H(z) := {(x + , ũ+ ) | x + = f (x, u(0)), ũ+ = ζ(x, u), u ∈ ǓN (z)}     (2.30)

in which u(0) is the first element of u. We denote by ψ(k; z) any so-
lution of (2.30) with initial extended state z and denote by φ(k; z) the
accompanying x trajectory. We restrict ZN to the set of z for which
ũ ∈ ŨN (x)

Z̃N := {(x, ũ) | x ∈ XN and ũ ∈ ŨN (x)}

To directly link the asymptotic behavior of z with that of x, the follow-


ing proposition is necessary.

Proposition 2.44 (Linking warm start and state). There exists a function
αr (·) ∈ K∞ such that |ũ| ≤ αr (|x|) for any (x, ũ) ∈ Z̃N .

A proof is given in (Allan et al., 2017, Proposition 10).

2.7.2 Asymptotic Stability of Difference Inclusions

Because the extended state evolves as the difference inclusion (2.30),


we present the following definitions of asymptotic stability and the
associated Lyapunov functions. Consider the difference inclusion z+ ∈
H(z), such that H(0) = {0}.

Definition 2.45 (Asymptotic stability (difference inclusion)). We say the


origin of the difference inclusion z+ ∈ H(z) is asymptotically stable in
a positive invariant set Z if there exists a function β(·) ∈ KL such that
for any z ∈ Z and for all k ∈ I≥0 , all solutions ψ(k; z) satisfy

|ψ(k; z)| ≤ β(|z| , k)



Definition 2.46 (Lyapunov function (difference inclusion)). V (·) is a
Lyapunov function in the positive invariant set Z for the difference
inclusion z+ ∈ H(z) if there exist functions α1 (·), α2 (·), α3 (·) ∈ K∞
such that for all z ∈ Z

α1 (|z|) ≤ V (z) ≤ α2 (|z|)     (2.31)

sup_{z+ ∈H(z)} V (z+ ) ≤ V (z) − α3 (|z|)     (2.32)

Although V (·) is not required to be continuous everywhere, (2.31)


implies that it is continuous at the origin.
Proposition 2.47 (Asymptotic stability (difference inclusion)). If the set
Z contains the origin, is positive invariant for the difference inclusion
z+ ∈ H(z), H(0) = {0}, and it admits a Lyapunov function V (·) in Z,
then the origin is asymptotically stable in Z.
A proof of this proposition is given in (Allan et al., 2017, Proposition
13); it is similar to the proof of Theorem B.15 in Appendix B.
Theorem 2.48 (Asymptotic stability of suboptimal MPC). Suppose As-
sumptions 2.2, 2.3, and 2.14 are satisfied, and that ℓ(x, u) ≥ αℓ (|(x, u)|)
for all (x, u) ∈ Z, and Xf = levb Vf = {x ∈ Rn | Vf (x) ≤ b}, for some
b > 0. Then the function VN (z) is a Lyapunov function in the set Z̃N
for the closed-loop system (2.30) under Algorithm 2.43. Therefore the
origin is asymptotically stable in Z̃N .
Proof. First we show that VN (z) is a Lyapunov function for (2.30) on
the positive invariant set Z̃N . Because u ∈ ǓN (z) and, by construction,
ũ+ ∈ ŨN (x + ), we have that z+ ∈ Z̃N , so that Z̃N is positive invariant.
From the definition of the control law and the warm start, we have that
for all z ∈ Z̃N

VN (z) ≥ VN (x, u) ≥ ∑_{i=0}^{N−1} ℓ(x(i), u(i)) ≥ ∑_{i=0}^{N−1} αℓ (|(x(i), u(i))|)

Next we use (B.1) from Appendix B and the triangle inequality to obtain

∑_{i=0}^{N−1} αℓ (|(x(i), u(i))|) ≥ αℓ ((1/N) ∑_{i=0}^{N−1} |(x(i), u(i))|) ≥ αℓ (|(x, u)| /N)

Finally, using the ℓp -norm property that |(a, b)| ≥ |b| for all vectors a
and b, noting that x(0) = x, and recalling from Proposition 2.44 that
|z| = |(x, ũ)| ≤ |x| + αr (|x|) =: αr′ (|x|), we have that

αℓ (|(x, u)| /N) ≥ αℓ (|x| /N) ≥ αℓ (αr′ −1 (|z|)/N) =: α1 (|z|)

with α1 (·) ∈ K∞ . So we have established the lower bound VN (z) ≥ α1 (|z|)
for all z ∈ Z̃N .
Because of Assumptions 2.2 and 2.3, the set ZN is closed as shown
in Proposition 2.10(c). The cost function VN (z) is continuous on ZN ,
which includes z = 0, so from Proposition B.25 we conclude that there
exists α2 (·) ∈ K∞ such that VN (z) ≤ α2 (|z|) for all z ∈ Z̃N ⊂ ZN , and
the upper-bound condition of Definition 6.2 is satisfied.
As in standard MPC analysis, we have for all z ∈ Z̃N that

VN (z+ ) ≤ VN (x, u) − ℓ(x, u(0)) ≤ VN (x, u) − αℓ (|(x, u(0))|)

Because ũ ∈ ŨN (x), from Proposition 2.44 we have that

|(x, ũ)| ≤ |x| + |ũ| ≤ |x| + αr (|x|) =: αr′ (|x|) ≤ αr′ (|(x, u(0))|)

Therefore, αℓ ◦ αr′ −1 (|(x, ũ)|) ≤ αℓ (|(x, u(0))|). Defining α3 (·) := αℓ ◦
αr′ −1 (·) and because VN (x, u) ≤ VN (x, ũ), we have that

VN (z+ ) ≤ VN (x, ũ) − α3 (|z|) = VN (z) − α3 (|z|)

for all z ∈ Z̃N and z+ ∈ H(z). We conclude that VN (z) is a Lyapunov
function for (2.30) in Z̃N . Asymptotic stability follows directly from
Proposition 2.47. ■

From this result, a bound on just x(k) rather than z(k) = (x(k),
ũ(k)) can also be derived. First we have that for all k ≥ 0 and z ∈ Z̃N

|z(k; z)| ≤ β(|z| , k) = β(|(x, ũ)| , k) ≤ β(|x| + |ũ| , k)

From Proposition 2.44 we then have that

β(|x| + |ũ| , k) ≤ β(|x| + αr (|x|), k) =: β̃(|x| , k)

with β̃(·) ∈ KL. Combining these we have that

|x(k; z)| ≤ |(x(k; z), ũ(k; z))| = |z(k; z)| ≤ β̃(|x| , k)

which implies |x(k; z)| ≤ β̃(|x| , k). So we have a bound on the evolu-
tion of x(k) depending on only the x initial condition. Note that the
evolution of x(k) depends on the initial condition of z = (x, ũ), so it
depends on the initial warm start ũ as well as the initial x. We cannot ignore
this dependence, which is why we had to analyze the extended state in
the first place. For the same reason we also cannot define the invariant
set in which the x(k) evolution takes place without referring to Z̃N .

2.8 Economic Model Predictive Control


Many applications of control are naturally posed as tracking problems.
Vehicle guidance, robotic motion guidance, and low-level objectives
such as maintaining pressures, temperatures, levels, and flows in in-
dustrial processes are typical examples. MPC can certainly provide
feedback control designs with excellent tracking performance for chal-
lenging multivariable, constrained, and nonlinear systems as we have
explored thus far in the text. But feedback control derived from re-
peated online optimization of a process model enables other, higher-
level goals to be addressed as well. In this section we explore using
MPC for optimizing economic performance of a process rather than a
simple tracking objective. As before, we assume the system dynamics
are described by the model

x + = f (x, u)

But here the stage cost is some general function ℓ(x, u) that measures
economic performance of the process. The stage cost is not positive
definite with respect to some target equilibrium point of the model as
in a tracking problem. We set up the usual MPC objective function as a
sum of stage costs over some future prediction horizon
VN (x, u) = ∑_{k=0}^{N−1} ℓ(x(k), u(k)) + Vf (x(N))

subject to the system model with x(0) = x, the initial condition. As


before, we consider constraints on the states and inputs, (x, u) ∈ Z.
So the only significant change in the MPC problem has been the redefi-
nition of the stage cost ℓ(x, u) to reflect the economics of the process.
The terminal penalty Vf (x) may be changed for the same reason. Typ-
ical stage-cost functions would be composed of a sum of prices of the
raw materials and utilities, and the values of the products being man-
ufactured.
We can also define the best steady-state solution of the system from
the economic perspective. This optimal steady-state pair (xs , us ) is
defined as the solution to the optimization problem Ps

(xs , us ) := arg min_{(x,u)∈Z} {ℓ(x, u) | x = f (x, u)}

The standard industrial approach to addressing economic performance


is to calculate this best economic steady state (often on a slower time

scale than the process sample time), and then design an MPC controller
with a different, tracking stage cost to reject disturbances and track this
steady state. In this approach, a typical tracking stage cost would be the
types considered thus far, e.g., ℓt (x, u) = (1/2)(|x − xs |2Q +|u − us |2R ).
In economic MPC, we instead use the same economic stage cost di-
rectly in the dynamic MPC problem. Some relevant questions to be
addressed with this change in design philosophy are: (i) how much
economic performance improvement is possible, and (ii) how differ-
ent is the closed-loop dynamic behavior. For example, we are not even
guaranteed for a nonlinear system that operating at the steady state is
the best possible dynamic behavior of the closed-loop system.
As an introduction to the topic, we next set up the simplest version
of an economic MPC problem, in which we use a terminal constraint. In
the Notes section, we comment on what generalizations are available in
the literature. We now modify the basic assumptions given previously.

Assumption 2.49 (Continuity of system and cost). The functions f :


Z → Rn and ℓ : Z → R are continuous. Vf (·) = 0. There exists at
least one point (xs , us ) ∈ Z satisfying xs = f (xs , us ).

Assumption 2.50 (Properties of constraint sets). The set Z is closed. If


there are control constraints, the set U(x) is compact and is uniformly
bounded in X.

Assumption 2.51 (Cost lower bound).


(a) The terminal set is a single point, Xf = {xs }.

(b) The stage cost ℓ(x, u) is lower bounded for (x, u) ∈ Z.

Note that since we are using a terminal equality constraint, we do


not require the terminal penalty Vf (·), so it is set to zero. For clarity
in this discussion, we do not assume that (xs , us ) has been shifted to
the origin. The biggest change is that we do not assume here that the
stage cost ℓ(x, u) is positive definite with respect to the optimal steady
state, only that it is lower bounded.
Note that the set of steady states, Zs := {(x, u) ∈ Z | x = f (x, u)},
is nonempty due to Assumption 2.49. It is closed because Z is closed
(Assumption 2.50) and f (·) is continuous. But it may not be bounded
so we are not guaranteed that the solution to Ps exists. So we consider
(xs , us ) to be any element of Zs . We may want to choose (xs , us ) to be
an element of the solution to Ps , when it exists, but this is not necessary
to the subsequent development.

The economic optimal control problem PN (x) is the same as in (2.7)

PN (x) :  VN0 (x) := min_u {VN (x, u) | u ∈ UN (x)}

Due to Assumptions 2.49 and 2.50, Proposition 2.4 holds, and the so-
lution to the optimal control problem exists. The control law, κN (·) is
therefore well defined; if it is not unique, we consider as before a fixed
selection map, and the closed-loop system is again given by

x + = f (x, κN (x)) (2.33)

2.8.1 Asymptotic Average Performance

We already have enough structure in this simple problem to establish


that the average cost of economic MPC is better, i.e., not worse, than
any steady-state performance ℓ(xs , us ).

Proposition 2.52 (Asymptotic average performance). Let Assumptions


2.49, 2.50, and 2.51 hold. Then for every x ∈ XN , the following holds
lim sup_{t→∞} (1/t) ∑_{k=0}^{t−1} ℓ(x(k), u(k)) ≤ ℓ(xs , us )

in which x(k) is the closed-loop solution to (2.33) with initial condition


x, and u(k) = κN (x(k)).

Proof. Because of the terminal constraint, we have that

VN0 (f (x, κN (x))) ≤ VN0 (x) − ℓ(x, κN (x)) + ℓ(xs , us )     (2.34)

Performing a sum on this inequality gives

(1/t) ∑_{k=0}^{t−1} ℓ(x(k), u(k)) ≤ ℓ(xs , us ) + (1/t)(VN0 (x(0)) − VN0 (x(t)))

The left-hand side may not have a limit, so we take lim sup of both
sides. Note that from Assumption 2.51(b), ℓ(x, u) is lower bounded
for (x, u) ∈ Z, hence so is VN (x, u) for (x, u) ∈ Z, and VN0 (x) for
x ∈ XN . Denote this bound by M. Then limt→∞ −(1/t)VN0 (x(t)) ≤
limt→∞ −M/t = 0 and we have that
lim sup_{t→∞} (1/t) ∑_{k=0}^{t−1} ℓ(x(k), u(k)) ≤ ℓ(xs , us )     ■

This result does not imply that the economic MPC controller stabi-
lizes the steady state (xs , us ), only that the average closed-loop per-
formance is better than the best steady-state performance. There are
many examples of nonlinear systems for which the time-average of an
oscillation is better than the steady state. For such systems, we would
expect an optimizing controller to destabilize even a stable steady state
to obtain the performance improvement offered by cycling the system.
Note also that the appearance in (2.34) of the term −ℓ(x, κN (x)) +
ℓ(xs , us ), which is sign indeterminate, destroys the cost decrease prop-
erty of VN0 (·) so it no longer can serve as a Lyapunov function in a
closed-loop stability argument. We next examine the stability question.

2.8.2 Dissipativity and Asymptotic Stability

The idea of dissipativity proves insightful in understanding when eco-


nomic MPC is stabilizing (Angeli, Amrit, and Rawlings, 2012). The basic
idea is motivated by considering a thermodynamic system, mechanical
energy, and work. Imagine we supply mechanical energy to a system
by performing work on the system at some rate. We denote the me-
chanical energy as a storage function, i.e., as the way in which the work
performed on the system is stored by the system. If the system has
no dissipation, then the rate of change in storage function (mechanical
energy) is equal to the supply rate (work). However, if the system also
dissipates mechanical energy into heat, through friction for example,
then the change in the storage function is strictly less than the work
supplied. We make this physical idea precise in the following definition.

Definition 2.53 (Dissipativity). The system x + = f (x, u) is dissipative


with respect to supply rate s : Z → R if there exists a storage function
λ : X → R such that for all (x, u) ∈ Z

λ(f (x, u)) − λ(x) ≤ s(x, u) (2.35)

The system is strictly dissipative with respect to supply rate s and


steady-state xs if there exists α(·) ∈ K∞ such that for all (x, u) ∈ Z

λ(f (x, u)) − λ(x) ≤ s(x, u) − α(|x − xs |) (2.36)

Note that we do not assume that λ(·) is continuous, and we define


strict dissipativity with α(·) a K∞ function. In other literature, α(·) is
sometimes assumed to be a continuous, positive definite function.
We require one technical assumption; its usefulness will be apparent
shortly.

Assumption 2.54 (Continuity at the steady state). The function VN0 (·)+
λ(·) : XN → R is continuous at xs .

The following assumption is then sufficient to guarantee that eco-


nomic MPC is stabilizing.

Assumption 2.55 (Strict dissipativity). The system x + = f (x, u) is


strictly dissipative with supply rate

s(x, u) = ℓ(x, u) − ℓ(xs , us )

Theorem 2.56 (Asymptotic stability of economic MPC). Let Assump-


tions 2.49, 2.50, 2.51, 2.54, and 2.55 hold. Then xs is asymptotically
stable in XN for the closed-loop system x + = f (x, κN (x)).

Proof. We know that VN0 (·) is not a Lyapunov function for the given
stage cost ℓ(·), so our task is to construct one. We first introduce a
rotated stage cost as follows (Diehl, Amrit, and Rawlings, 2011)

ℓ̃(x, u) = ℓ(x, u) − ℓ(xs , us ) + λ(x) − λ(f (x, u))

Note from (2.36) and Assumption 2.55 that this stage cost then satisfies
for all (x, u) ∈ Z

ℓ̃(x, u) ≥ α(|x − xs |)        ℓ̃(xs , us ) = 0     (2.37)

and we have the kind of stage cost required for a Lyapunov function.
Next define an N-stage sum of this new stage cost as Ṽ N (x, u) :=
∑_{k=0}^{N−1} ℓ̃(x(k), u(k)) and perform the sum to obtain

Ṽ N (x, u) = (∑_{k=0}^{N−1} ℓ(x(k), u(k))) − Nℓ(xs , us ) + λ(x) − λ(xs )
           = VN (x, u) − Nℓ(xs , us ) + λ(x) − λ(xs )     (2.38)

Notice that Ṽ N (·) and VN (·) differ only by constant terms involving the
steady state, (xs , us ), and the initial condition, x. Therefore because
the optimization of VN (x, u) over u has a solution, so does the opti-
mization of Ṽ N (x, u), and they are the same solution, giving the same
control law κN (x).
Because of the terminal constraint, we know that XN is positive
invariant for the closed-loop system. Next we verify that Ṽ N0 (x) is a

Lyapunov function for the closed-loop system. Since ℓ̃(x, u) is non-
negative, we have from (2.37) and the definition of Ṽ N as a sum of
stage costs, that

Ṽ N0 (x) ≥ α(|x − xs |)

for all x ∈ XN , and we have established the required lower bound. The
cost difference can be calculated to establish the required cost decrease

Ṽ N0 (f (x, κN (x))) ≤ Ṽ N0 (x) − ℓ̃(x, κN (x)) ≤ Ṽ N0 (x) − α(|x − xs |)

for all x ∈ XN . The remaining step is to verify the upper-bounding in-
equality. From Assumption 2.54 and (2.38), we know that Ṽ N0 (·) is also
continuous at xs . Therefore, from Proposition 2.38, we have existence
of α2 (·) ∈ K∞ such that for all x ∈ XN

Ṽ N0 (x) ≤ α2 (|x − xs |)

We have established the three inequalities and Ṽ N0 (·) is therefore a Lya-
punov function in XN for the system x + = f (x, κN (x)) and xs . Theo-
rem 2.13 then establishes that xs is asymptotically stable in XN for the
closed-loop system. ■

These stability results can also be extended to time-varying and pe-


riodic systems.

Example 2.57: Economic MPC versus tracking MPC


Consider the linear system

f (x, u) = Ax + Bu        A = [1/2  1; 0  3/4]        B = [0; 1]

with economic cost function

ℓecon (x, u) = q′ x + r ′ u        q = [−2; 2]        r = −10

and sets X = [−10, 10]2 , U = [−1, 1]. The economically optimal steady
state is xs = (8, 4), us = 1. We compare economic MPC to tracking MPC
with

ℓtrack (x, u) = |x − xs |^2_{10I} + |u − us |^2_{I}
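Because the model and cost are linear, the steady-state problem Ps for this example is a small linear program, and the quoted optimal steady state can be reproduced in a few lines of Python; the variable ordering z = (x1 , x2 , u) and the solver are our choices, not part of the example.

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([-2.0, 2.0, -10.0])            # economic cost q'x + r'u
A_eq = np.array([[0.5, -1.0, 0.0],          # steady-state condition (I - A)x - Bu = 0
                 [0.0, 0.25, -1.0]])
res = linprog(c, A_eq=A_eq, b_eq=np.zeros(2),
              bounds=[(-10, 10), (-10, 10), (-1, 1)])
print(res.x)                                # approximately (8, 4, 1)
```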
Figure 2.5 shows a phase plot of the closed-loop evolution starting from
x = (−8, 8). Both controllers use the terminal constraint Xf = {xs }.

tracking economic
10

6
x2

0
−10 −5 0 5 10 15
x1

Figure 2.5: Closed-loop economic MPC versus tracking MPC starting


at x = (−8, 8) with optimal steady state (8, 4). Both con-
trollers asymptotically stabilize the steady state. Dashed
contours show cost functions for each controller.

While tracking MPC travels directly to the setpoint, economic MPC takes
a detour to achieve lower economic costs.
To prove that the economic MPC controller is stabilizing, we find a
storage function. As a candidate storage function, we take

λ(x) = µ ′ (x − xs ) + (x − xs )′ M(x − xs )

which gives the rotated cost function

ℓ̃(x, u) = ℓecon (x, u) − ℓecon (xs , us ) + λ(x) − λ(f (x, u))

To start, we take µ = (4, 8) from the Lagrange multiplier of the
steady-state problem. With M = 0, ℓ̃(·) is nonnegative but not posi-
tive definite, indicating that the system is dissipative but not strictly
dissipative. To achieve strict dissipativity, we choose M such that
M − A′ MA = 0.01I. Although the resulting ℓ̃(·) function is nonconvex,


Figure 2.6: Closed-loop evolution under economic MPC. The rotated
            cost function Ṽ 0 is a Lyapunov function for the system.

it is nevertheless positive definite on Z, indicating strict dissipativity.


To illustrate, we simulate a variety of initial conditions in Figure 2.6.
Plotting the rotated cost function Ṽ 0 (·), we see that it is indeed a Lya-
punov function for the system. □
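The dissipativity claims in Example 2.57 are also easy to check numerically: compute M from the discrete Lyapunov equation, form the rotated cost ℓ̃, and verify that it is positive on a grid over Z everywhere except at (xs , us ). A minimal Python sketch follows; the grid resolution is an arbitrary choice.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

A = np.array([[0.5, 1.0], [0.0, 0.75]]); B = np.array([0.0, 1.0])
q = np.array([-2.0, 2.0]); r = -10.0
xs, us = np.array([8.0, 4.0]), 1.0
mu = np.array([4.0, 8.0])                            # multiplier from Ps
M = solve_discrete_lyapunov(A.T, 0.01 * np.eye(2))   # M - A'MA = 0.01 I

lecon = lambda x, u: q @ x + r * u
lam = lambda x: mu @ (x - xs) + (x - xs) @ M @ (x - xs)
lrot = lambda x, u: lecon(x, u) - lecon(xs, us) + lam(x) - lam(A @ x + B * u)

vals = []
for x1 in np.linspace(-10, 10, 41):
    for x2 in np.linspace(-10, 10, 41):
        for u in np.linspace(-1, 1, 21):
            if abs(x1 - xs[0]) + abs(x2 - xs[1]) + abs(u - us) > 1e-9:
                vals.append(lrot(np.array([x1, x2]), u))
print("min rotated cost away from (xs, us):", min(vals))   # positive on the grid
```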

2.9 Discrete Actuators


Discrete-valued actuators appear in nearly all large-scale industrial pro-
cesses. These obviously include the on/off equipment switches. But,
as discussed in Chapter 1, processes are often designed with multiple
similar units such as furnaces, heaters, chillers, compressors, etc., op-
erating in parallel. In these designs, an important aspect of the control
problem is to choose how many and which of these several possible
units to employ while the total feed flowrate to the process varies.
In industrial practice, these discrete decisions are usually removed
from the MPC control layer and instead made at a different layer of

the automation system using heuristics or other logical rules. If dis-


crete inputs are chosen optimally, however, process performance can
be greatly improved, and thus we would like to treat discrete decisions
directly in MPC theory.
There are two basic issues brought about by including the discrete
actuators in the control decision u. The first is theoretical: how much
does the established MPC theory have to change to accommodate this
class of decision variables? The second is computational: is it practical
to solve the modified MPC optimal control problem in the available
sample time? We address the theory question here, and find that the
required changes to the existing theory are surprisingly minimal. The
computational question is being addressed by the rapid development
of mixed-integer solvers. It is difficult to predict what limits might
emerge to slow this progress, but current mixed-integer solvers are
already capable of addressing a not uninteresting class of industrial
applications.
Figure 1.2 provides a representative picture of the main issue. From
this perspective, if we embed the discrete decisions in the field of reals,
we are merely changing the feasible region U, from a simply connected
set with an interior when describing only continuous actuators, to a dis-
connected set that may not have an interior when describing mixed con-
tinuous/discrete actuators. So one theoretical approach to the problem
is to adjust the MPC theory to accommodate these types of U regions.
A careful reading of the assumptions made for the results presented
thus far reveals that we have little work to do. We have not assumed
that the equilibrium of interest lies in the interior of U, or even that U
has an interior. The main assumptions about U are Assumption 2.3 for
the time-invariant case, Assumption 2.26 for the time-varying case, and
Assumption 2.50 for the economic MPC problem. The main restrictions
are that U is closed, and sometimes compact, so that the optimization
of VN (x, u) over u has a solution. All of these assumptions admit U
regions corresponding to discrete variables. The first conclusion is that
the results governing nominal closed-loop stability for various forms
of MPC all pass through. These include Theorem 2.19 (time-invariant
case), Theorem 2.39 (time-varying case), Theorem 2.24 (ℓ(y, u) stage
cost), and Theorem 2.56 (economic MPC).
That does not mean that nothing has changed. The admissible re-
gion XN in which the system is stabilized may change markedly, for
example. Proposition 2.10 also passes through in the discrete-actuator
case, so we know that the admissible sets are still nested, Xj ⊆ Xj+1

for all j ≥ 0. But it is not unusual for systems with even linear dy-
namics to have disconnected admissible regions, which is not possible
for linear systems with only continuous actuators and convex U. When
tracking a constant setpoint, the design of terminal regions and penal-
ties must account for the fact that the discrete actuators usually remain
at fixed values in a small neighborhood of the steady state of interest,
and can be used only for rejecting larger disturbances and enhancing
transient performance back to the steady state. Fine control about the
steady state must be accomplished by the continuous actuators that
are unconstrained in a neighborhood of the steady state. But this is
the same issue that is faced when some subset of the continuous ac-
tuators are saturated at the steady state of interest (Rao and Rawlings,
1999), which is a routine situation in process control problems. We
conclude the chapter with an example illustrating these issues.

Example 2.58: MPC with mixed continuous/discrete actuators


Consider a constant-volume tank that needs to be cooled. The system is
diagrammed in Figure 2.7. The two cooling units operate such that they
can be either on or off, and if on, the heat duty must be between Q̇min
and Q̇max . After nondimensionalizing, the system evolves according to
dT1 /dt = −α(T1 − T0 ) − ρ1 (T1 − T2 )
dT2 /dt = −ρ2 (T2 − T1 ) − β Q̇

with α = 2 and β = ρ1 = ρ2 = 1. The system states are (T1 , T2 ), and
the inputs are (Q̇, nq ) with

U = {(Q̇, nq ) ∈ R × {0, 1, 2} | nq Q̇min ≤ Q̇ ≤ nq Q̇max }

in which Q̇ is the total cooling duty and nq chooses the number of


cooling units that are on at the given time. For T0 = 40 and Q̇max = 10,
we wish to control the system to the steady state xs = (35, 25), us =
(10, 1), using costs Q = I and R = 10^{−3} I. The system is discretized
with ∆ = 0.25.
To start, we choose a terminal region and control law. Assuming
Q̇min > 0, both components of u are at constraints at the steady state,
and thus we cannot use them in a linear terminal control law. The
system is stable for κf (x) = us , however, and a valid terminal cost
is Vf (x) = (x − xs )′ P (x − xs ) with P satisfying A′ P A − P = −Q. As
a terminal set we take Xf = {x | Vf (x) ≤ 1}, although any level set


Figure 2.7: Diagram of tank/cooler system. Each cooling unit can be


either on or off, and if on, it must be between its (possibly
nonzero) minimum and maximum capacities.

would suffice. With this terminal region, Figure 2.8 shows the feasible
sets for Q̇min = 0 and Q̇min = 9. Note that for Q̇min > 0, the projection
of U onto the total heat duty Q̇ is a disconnected set of possible heat
duties, leading to disconnected sets XN for N ≤ 5. (The sets XN for
N ≥ 6 are connected.)
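The linear algebra behind these terminal ingredients is routine: write the model in deviation variables (where it is linear), discretize it exactly over the sample time ∆, and solve the discrete Lyapunov equation for P . The following minimal Python sketch reproduces those steps; it is our reconstruction, and the intermediate variable names are not part of the example.

```python
import numpy as np
from scipy.linalg import expm, solve_discrete_lyapunov

alpha, beta, rho1, rho2, Delta = 2.0, 1.0, 1.0, 1.0, 0.25
Ac = np.array([[-(alpha + rho1), rho1],      # d/dt of (T1, T2) deviations
               [rho2, -rho2]])
Bc = np.array([[0.0], [-beta]])              # input: deviation in the duty Qdot

Maug = np.zeros((3, 3))                      # exact discretization via expm
Maug[:2, :2], Maug[:2, 2:] = Ac, Bc
Md = expm(Delta * Maug)
A, B = Md[:2, :2], Md[:2, 2:]

P = solve_discrete_lyapunov(A.T, np.eye(2))  # solves A'PA - P = -Q with Q = I
print("A =", A, "B =", B, "P =", P, sep="\n")
# Vf(x) = (x - xs)' P (x - xs) and Xf = {x : Vf(x) <= 1}, as in the example.
```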
To control the system, we solve the standard MPC problem with
horizon N = 8. Figure 2.9 shows a phase portrait of closed-loop evolu-
tion for various initial conditions with Q̇min = 9. Each evaluation of the
control law requires solving a mixed-integer, quadratically constrained
QP (with the quadratic constraint due to the terminal region). In gen-
eral, the controller chooses u2 = 1 near the setpoint and u2 ∈ {0, 2}
far from it, although this behavior is not global. Despite the discon-
nected nature of U, all initial conditions are driven asymptotically to
the setpoint. □
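To give a feel for the mixed-integer structure of the online problem, the following brute-force Python sketch evaluates one (much simplified) MPC step for this example by enumerating the discrete choices nq over a short horizon and optimizing the continuous duties for each choice. It omits the state constraints and the terminal constraint, uses a crude terminal penalty, and is not how the example's controller was implemented; a practical implementation uses a mixed-integer QP solver as described above.

```python
import itertools
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

Ac = np.array([[-3.0, 1.0], [1.0, -1.0]]); Bc = np.array([[0.0], [-1.0]])
Maug = np.zeros((3, 3)); Maug[:2, :2], Maug[:2, 2:] = Ac, Bc
Md = expm(0.25 * Maug); A, B = Md[:2, :2], Md[:2, 2:]   # as in the previous sketch

Q, R, N = np.eye(2), 1e-3 * np.eye(1), 3
Qmin, Qmax, us = 9.0, 10.0, 10.0             # duty limits and steady-state duty
x0 = np.array([10.0, -5.0])                  # deviation from xs = (35, 25)

def cost(dq, x):                             # dq: deviations of the duty from us
    J = 0.0
    for d in dq.reshape(N, 1):
        J += 0.5 * (x @ Q @ x + d @ R @ d)
        x = A @ x + B @ d
    return J + 10.0 * (x @ x)                # crude terminal penalty only

best = None
for nq in itertools.product([0, 1, 2], repeat=N):        # enumerate on/off choices
    bnds = [(n * Qmin - us, n * Qmax - us) for n in nq]  # nq*Qmin <= Qdot <= nq*Qmax
    guess = [min(max(0.0, lo), hi) for lo, hi in bnds]
    res = minimize(cost, guess, args=(x0,), bounds=bnds)
    if best is None or res.fun < best[0]:
        best = (res.fun, nq, res.x)
print("best nq sequence:", best[1], " first duty:", best[2][0] + us)
```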

2.10 Concluding Comments


MPC is an implementation, for practical reasons, of receding horizon
control (RHC), in which offline determination of the RHC law κN (·) is
replaced by online determination of its value κN (x), the control action,
at each state x encountered during its operation. Because the optimal

(Panels: Q̇min = 0, Q̇max = 10 (left) and Q̇min = 9, Q̇max = 10 (right); the sets Xf and X1 through X7 are shown in the (T1 , T2 ) plane.)

Figure 2.8: Feasible sets XN for two values of Q̇min . Note that for
Q̇min = 9 (right-hand side), XN for N ≤ 4 are discon-
nected sets.

control problem that defines the control is a finite horizon problem, nei-
ther stability nor optimality of the cost function is necessarily achieved
by a receding horizon or model predictive controller.
This chapter shows how stability may be achieved by adding a ter-
minal cost function and a terminal constraint to the optimal control
problem. Adding a terminal cost function adds little or no complexity
to the optimal control problem that has to be solved online, and usu-
ally improves performance. Indeed, the infinite horizon value function
V∞0 (·) for the constrained problem would be an ideal choice for the ter-
minal penalty because the value function VN0 (·) for the online optimal
control problem would then be equal to V∞0 (·), and the controller would
inherit the performance advantages of the infinite horizon controller.
In addition, the actual trajectories of the controlled system would be
precisely equal, in the absence of uncertainty, to those predicted by
the online optimizer. Of course, if we knew V∞0 (·), the optimal infinite
horizon controller κ∞ (·) could be determined and there would be no
reason to employ MPC.
The infinite horizon cost V∞0 (·) is known globally only for special


Figure 2.9: Phase portrait for closed-loop evolution of cooler system


with Q̇min = 9. Line colors show value of discrete actuator
u2 .

cases, however, such as the linear quadratic (LQ) unconstrained prob-


lem. For more general problems in which constraints and/or nonlin-
earity are present, its value—or approximate value—in a neighborhood
of the setpoint can usually be obtained and the use of this local con-
trol Lyapunov function (CLF) should, in general, enhance performance.
Adding a terminal cost appears to be generally advantageous.
The reason for the terminal constraint is precisely the fact that the
terminal penalty is usually merely a local CLF defined in the set Xf ; to
benefit from the terminal cost, the terminal state must be constrained
to lie in Xf . Unlike the addition of a terminal penalty, however, addition
of a terminal constraint may increase complexity of the optimal con-
trol problem considerably. Because efficient programs exist for solving
quadratic programs (QPs), in which the cost function to be minimized is
quadratic and the constraints polyhedral, there is an argument for us-
ing polyhedral constraints. Indeed, a potential terminal constraint set
for the constrained LQ optimal control problem is the maximal con-

straint admissible set, which is polyhedral. This set is complex, how-


ever, i.e., defined by many linear inequalities, and would appear to be
unsuitable for the complex control problems routinely encountered in
industry.
A terminal constraint set that is considerably simpler is a suitable
sublevel set of the terminal penalty, which is often a simple positive
definite quadratic function resulting in a convex terminal constraint
set. A disadvantage is that the terminal constraint set is now ellipsoidal
rather than polytopic, and conventional QPs cannot be employed for the
LQ constrained optimal control problem. This does not appear to be
a serious disadvantage, however, because the optimal control problem
remains convex, so interior point methods may be readily employed.
In the nonlinear case, adding an ellipsoidal terminal constraint set
does not appreciably affect the complexity of the optimal control prob-
lem. A more serious problem, when the system is nonlinear, is that the
optimal control problem is then usually nonconvex so that global solu-
tions, on which many theoretical results are predicated, are usually too
difficult to obtain. A method for dealing with this difficulty, which also
has the advantage of reducing online complexity, is suboptimal MPC,
described in this chapter and also in Chapter 6.
The current chapter also presents some results that contribute to
an understanding of the subject but do not provide practical tools. For
example, it is useful to know that the domain of attraction for many
of the controllers described here is XN , the set of initial states con-
trollable to the terminal constraint set, but this set cannot usually be
computed. The set is, in principle, computable using the dynamic pro-
gramming (DP) equations presented in this chapter, and may be com-
puted if the system is linear and the constraints, including the terminal
constraint, are polyhedral, provided that the state dimension and the
horizon length are suitably small—considerably smaller than in prob-
lems routinely encountered in industry. In the nonlinear case, this set
cannot usually be computed. Computation difficulties are not resolved
if XN is replaced by a suitable sublevel set of the value function VN0 (·).
Hence, in practice, both for linear and nonlinear MPC, this set has to be
estimated by simulation.

2.11 Notes
MPC has an unusually rich history, making it impossible to summarize
here the many contributions that have been made. Here we restrict

attention to a subset of this literature that is closely related to the ap-


proach adopted in this book. A fuller picture is presented in the review
paper (Mayne, Rawlings, Rao, and Scokaert, 2000).
The success of conventional MPC derives from the fact that for de-
terministic problems (no uncertainty), feedback is not required so the
solution to the open-loop optimal control problem solved online for a
particular initial state is the same as that obtained by solving the feed-
back problem using DP, for example. Lee and Markus (1967) pointed
out the possibility of MPC in their book on optimal control.

One technique for obtaining a feedback controller synthe-


sis is to measure the current control process state and then
compute very rapidly the open-loop control function. The
first portion of this function is then used during a short time
interval after which a new measurement of the process state
is made and a new open-loop control function is computed
for this new measurement. The procedure is then repeated.

Even earlier, Propoi (1963) proposed a form of MPC utilizing linear


programming, for the control of linear systems with hard constraints
on the control. A big surge in interest in MPC occurred when Richalet,
Rault, Testud, and Papon (1978b) advocated its use for process con-
trol. A whole series of papers, such as (Richalet, Rault, Testud, and
Papon, 1978a), (Cutler and Ramaker, 1980), (Prett and Gillette, 1980),
(Garcı́a and Morshedi, 1986), and (Marquis and Broustail, 1988) helped
cement its popularity in the process control industries, and MPC soon
became the most useful method in modern control technology for con-
trol problems with hard constraints—with thousands of applications
to its credit.
The basic question of stability, an important issue since optimizing
a finite horizon cost does not necessarily yield a stabilizing control,
was not resolved in this early literature. Early academic research in
MPC, reviewed in Garcı́a, Prett, and Morari (1989), did not employ Lya-
punov theory and therefore restricted attention to control of uncon-
strained linear systems, studying the effect of control and cost hori-
zons on stability. Similar studies appeared in the literature on gener-
alized predictive control (GPC) (Ydstie, 1984; Peterka, 1984; De Keyser
and Van Cauwenberghe, 1985; Clarke, Mohtadi, and Tuffs, 1987) that
arose to address deficiencies in minimum variance control. Inter-
estingly enough, earlier research on RHC (Kleinman, 1970; Thomas,
1975; Kwon and Pearson, 1977) had shown indirectly that the impo-

sition of a terminal equality constraint in the finite horizon optimal


control problem ensured closed-loop stability for linear unconstrained
systems. That a terminal equality constraint had an equally benefi-
cial effect for constrained nonlinear discrete time systems was shown
by Keerthi and Gilbert (1988) and for constrained nonlinear continu-
ous time systems by Chen and Shaw (1982) and Mayne and Michalska
(1990). In each of these papers, Lyapunov stability theory was em-
ployed in contrast to the then current literature on MPC and GPC.
The next advance showed that incorporation of a suitable termi-
nal cost and terminal constraint in the finite horizon optimal control
problem ensured closed-loop stability; the terminal constraint set is
required to be control invariant, and the terminal cost function is re-
quired to be a local CLF. Perhaps the earliest proposal in this direction
is the brief paper by Sznaier and Damborg (1987) for linear systems
with polytopic constraints; in this prescient paper the terminal cost is
chosen to be the value function for the unconstrained infinite horizon
optimal control problem, and the terminal constraint set is the maxi-
mal constraint admissible set (Gilbert and Tan, 1991) for the optimal
controlled system.6 A suitable terminal cost and terminal constraint
set for constrained nonlinear continuous time systems was proposed
in Michalska and Mayne (1993) in the context of dual-mode MPC. In
a paper that has had considerable impact, Chen and Allgöwer (1998)
showed that similar ingredients may be employed to stabilize con-
strained nonlinear continuous time systems when conventional MPC
is employed. Related results were obtained by Parisini and Zoppoli
(1995), and De Nicolao, Magni, and Scattolini (1996).
Stability proofs for the form of MPC proposed, but not analyzed,
in Sznaier and Damborg (1987) were finally provided by Chmielewski
and Manousiouthakis (1996) and Scokaert and Rawlings (1998). These
papers also showed that optimal control for the infinite horizon con-
strained optimal control problem for a specified initial state is achieved
if the horizon is chosen sufficiently long. A terminal constraint is not
required if a global, rather than a local, CLF is available for use as a
terminal cost function. Thus, for the case when the system being con-
trolled is linear and stable, and subject to a convex control constraint,
Rawlings and Muske (1993) showed, in a paper that raised consider-
able interest, that closed-loop stability may be obtained if the terminal

6 If the optimal infinite horizon controlled system is described by x + = AK x and if
the constraints are u ∈ U and x ∈ X, then the maximal constraint admissible set is
{x | AK^i x ∈ X, KAK^i x ∈ U ∀i ∈ I≥0 }.

constraint is omitted and the infinite horizon cost using zero control is
employed as the terminal cost. The resultant terminal cost is a global
CLF.
The basic principles ensuring closed-loop stability in these and many
other papers including (De Nicolao, Magni, and Scattolini, 1998), and
(Mayne, 2000) were distilled and formulated as “stability axioms” in
the review paper (Mayne et al., 2000); they appear as Assumptions 2.2,
2.3, and 2.14 in this chapter. These assumptions provide sufficient
conditions for closed-loop stability for a given horizon. There is an al-
ternative literature that shows that closed-loop stability may often be
achieved if the horizon is chosen to be sufficiently long. Contributions
in this direction include (Primbs and Nevistić, 2000), (Jadbabaie, Yu,
and Hauser, 2001), as well as (Parisini and Zoppoli, 1995; Chmielewski
and Manousiouthakis, 1996; Scokaert and Rawlings, 1998) already men-
tioned. An advantage of this approach is that it avoids addition of an
explicit terminal constraint, although this may be avoided by alterna-
tive means as shown in Section 2.6. A significant development of this
approach (Grüne and Pannek, 2017) gives a comprehensive investiga-
tion and extension of the conditions that ensure recursive feasibility
and stability of MPC that does not have a terminal constraint. On the
other hand, it has been shown (Mayne, 2013) that an explicit or implicit
terminal constraint is necessary if positive invariance and the nested
property Xj+1 ⊃ Xj , j ∈ I≥0 of the feasible sets are required; the nested
property ensures recursive feasibility.
Recently several researchers (Limon, Alvarado, Alamo, and Cama-
cho, 2008, 2010; Fagiano and Teel, 2012; Falugi and Mayne, 2013a;
Müller and Allgöwer, 2014; Mayne and Falugi, 2016) have shown how
to extend the region of attraction XN , and how to solve the related
problem of tracking a randomly varying reference—thereby alleviating
the disadvantage caused by the reduction in the region of attraction
due to the imposition of a terminal constraint. Attention has also been
given to the problem of tracking a periodic reference using model pre-
dictive control (Limon et al., 2012; Falugi and Mayne, 2013b; Rawlings
and Risbeck, 2017).
Regarding the analysis of nonpositive stage costs in Section 2.4.4,
Grimm, Messina, Tuna, and Teel (2005) use a storage function like Λ(·)
to compensate for a semidefinite stage cost. Cai and Teel (2008) give a
discrete time converse theorem for IOSS for all Rn . Allan and Rawlings
(2018) give a converse theorem for IOSS on closed positive invariant
sets and provide a lemma for changing the supply rate function.

Suboptimal MPC based on a warm start was proposed and analyzed


by Scokaert et al. (1999). Pannocchia et al. (2011) establish that this
form of suboptimal MPC is robustly stable for systems without state
constraints if the terminal constraint is replaced with an enlarged ter-
minal penalty. As noted by Yu, Reble, Chen, and Allgöwer (2014), how-
ever, the assumptions used for these results are strong enough to im-
ply that the optimal value function is continuous. Allan et al. (2017)
establish robustness for systems with discontinuous feedback and dis-
continuous optimal value function.
Lazar and Heemels (2009) analyze robustness of suboptimal MPC
with respect to state disturbances under the condition that the sub-
optimal controller is able to find a solution within a specific degree
of suboptimality from the global solution. Roset, Heemels, Lazar, and
Nijmeijer (2008), show how to extend the analysis to treat measure-
ment disturbances as well as state disturbances. Because this type of
suboptimal MPC is defined in terms of the globally optimal cost, its
implementation requires, in principle, global solvers.
Economic MPC was introduced in Rawlings and Amrit (2009), but
designing process controllers other than MPC to optimize process eco-
nomics has been a part of industrial practice for a long time. When
using an economic (as opposed to tracking) stage cost, cost inequali-
ties and conditions for asymptotic stability have been established for
time-invariant systems with a steady state (Diehl et al., 2011; Amrit,
Rawlings, and Angeli, 2011; Angeli et al., 2012; Ellis, Durand, and Chris-
tofides, 2014). Such results have been extended in Zanon, Gros, and
Diehl (2013) to the time-varying periodic case under the assumptions
of a linear storage function and Lipschitz continuity of the model and
stage cost; Rawlings and Risbeck (2017) require only continuity of the
model and stage cost, and allow a more general form for the storage
function.
For the case of a time-invariant system with optimal periodic opera-
tion, convergence to the optimal periodic solution can be shown using
similar notions of dissipativity (Müller and Grüne, 2015); but this case is
different than the case treated by Rawlings and Risbeck (2017) because
clock time does not appear. In Müller, Angeli, and Allgöwer (2015),
the authors establish the interesting result that a certain dissipativ-
ity condition is also necessary for asymptotic stability. For periodic
processes, stability has been investigated by converting to deviation
variables (Huang, Harinath, and Biegler, 2011; Rawlings and Risbeck,
2017).
Various results on stability of MPC with discrete actuators have ap-
peared in the literature. In Bemporad and Morari (1999), convergence
to the origin is shown for mixed-logical-dynamical systems based on
certain positive definite restrictions on the stage cost, although Lya-
punov stability is not explicitly shown. For piecewise affine systems,
Baotic, Christophersen, and Morari (2006) establish asymptotic sta-
bility for an infinite horizon control law via Lyapunov function argu-
ments. In Di Cairano, Heemels, Lazar, and Bemporad (2014), a hybrid
Lyapunov function is directly embedded within the optimal control
problem, enforcing cost decrease as a hard constraint and ensuring
closed-loop asymptotic stability. Alternatively, practical stability (i.e.,
boundedness) can often be shown by treating discretization of inputs
as a disturbance and deriving error bounds with respect to the relaxed
continuous-actuator system (Quevedo, Goodwin, and De Doná, 2004;
Aguilera and Quevedo, 2013; Kobayshi, Shein, and Hiraishi, 2014). Fi-
nally, Picasso, Pancanti, Bemporad, and Bicchi (2003) show asymptotic
stability for open-loop stable linear systems with only practical stabil-
ity for open-loop unstable systems. All of these results are concerned
with stability of a steady state.
The approach presented in this chapter, which shows that current
MPC asymptotic stability theorems based on Lyapunov functions also
cover general nonlinear systems with mixed continuous/discrete actu-
ators, was developed by Rawlings and Risbeck (2017).
2.12 Exercises

Exercise 2.1: Discontinuous MPC


In Example 2.8, compute U3(x), V3^0(x), and κ3(x) at a few points on the unit circle.

Exercise 2.2: Boundedness of discrete time model


Consider the continuous time differential equation ẋ = fc (x, u), and its discrete time
counterpart x + = f (x, u). Suppose that fc (·) is continuous, and there exists a positive
constant c such that
|fc(x', u) − fc(x, u)| ≤ c |x' − x|    for all x, x' ∈ Rn, u ∈ U

Show that f (·) is bounded on bounded sets. Moreover, if U is bounded, show that
f^{−1}(·) is bounded on bounded sets.

Exercise 2.3: Destabilization with state constraints


Consider a state feedback regulation problem with the origin as the setpoint (Muske
and Rawlings, 1993). Let the system be
" # " #
4/3 −2/3 1
A= B= C = [−2/3 1]
1 0 0

and the controller objective function tuning matrices be


Q=I R=I N=5
(a) Plot the unconstrained regulator performance starting from initial condition
x(0) = [3 3]'.

(b) Add the output constraint y(k) ≤ 0.5. Plot the response of the constrained
regulator (both input and output). Is this regulator stabilizing? Can you modify
the tuning parameters Q, R to affect stability as in Section 1.3.4?

(c) Change the output constraint to y(k) ≤ 1 + ϵ, ϵ > 0. Plot the closed-loop re-
sponse for a variety of ϵ. Are any of these regulators destabilizing?

(d) Set the output constraint back to y(k) ≤ 0.5 and add the terminal constraint
x(N) = 0. What is the solution to the regulator problem in this case? Increase
the horizon N. Does this problem eventually go away?

Exercise 2.4: Computing the projection of Z onto XN


Given a polytope
Z := {(x, u) ∈ Rn × Rm | Gx + Hu ≤ ψ}
write an Octave or MATLAB program to determine X, the projection of Z onto Rn
X = {x ∈ Rn | ∃u ∈ Rm such that (x, u) ∈ Z}
Use algorithms 3.1 and 3.2 in Keerthi and Gilbert (1987).
To check your program, consider a system
" # " #
1 1 0
x+ = x+ u
0 1 1

subject to the constraints X = {x | x1 ≤ 2} and U = {u | −1 ≤ u ≤ 1}. Consider the


MPC problem with N = 2, u = (u(0), u(1)), and the set Z given by
Z = {(x, u) | x, φ(1; x, u), φ(2; x, u) ∈ X and u(0), u(1) ∈ U}
Verify that the set
X2 := {x ∈ R2 | ∃u ∈ R2 such that (x, u) ∈ Z}
is given by

X2 = {x ∈ R2 | Px ≤ p}        P = [1 0; 1 1; 1 2]        p = [2; 2; 3]
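The exercise asks for the Keerthi and Gilbert algorithms; as a rough way to check small examples, the Octave/MATLAB sketch below instead performs a plain Fourier-Motzkin elimination of the input variables (no redundancy removal). The function name and tolerances are our own choices, so treat this only as a starting point, not as the algorithm the exercise requests.

    % Minimal sketch (not Keerthi-Gilbert): project {z | M z <= psi} onto the
    % first n coordinates by Fourier-Motzkin elimination of the other variables.
    function [P, p] = fm_project(M, psi, n)
      while size(M, 2) > n
        c   = M(:, end);                     % coefficients of the variable to drop
        Z   = find(abs(c) <= 1e-10);         % rows not involving it
        Pos = find(c >  1e-10);
        Neg = find(c < -1e-10);
        Mnew = M(Z, 1:end-1);  pnew = psi(Z);
        for i = Pos(:)'                      % combine every upper/lower bound pair
          for j = Neg(:)'
            Mnew = [Mnew; M(i,1:end-1)/c(i) - M(j,1:end-1)/c(j)];
            pnew = [pnew; psi(i)/c(i) - psi(j)/c(j)];
          end
        end
        M = Mnew;  psi = pnew;
      end
      P = M;  p = psi;
    end

For the test problem, stack the constraints x ∈ X, φ(1; x, u) ∈ X, φ(2; x, u) ∈ X, and u(0), u(1) ∈ U into a single inequality M[x; u(0); u(1)] ≤ ψ and call fm_project(M, psi, 2); redundant rows still need to be pruned before comparing with the P and p given above.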

Exercise 2.5: Computing the maximal output admissible set


Write an Octave or MATLAB program to determine the maximal constraint admissible
set for the system x + = F x, y = Hx subject to the hard constraint y ∈ Y in which
Y = {y | Ey ≤ e}. Use algorithm 3.2 in Gilbert and Tan (1991).
To check your program, verify for the system
" #
0.9 1 h i
F= H= 1 1
0 0.09
subject to the constraint Y = {y | −1 ≤ y ≤ 1}, and that the maximal output admissi-
ble set is given by

O∞ = {x ∈ R2 | Ax ≤ b}

A = [ 1      1
      −1     −1
      0.9    1.09
      −0.9   −1.09
      0.81   0.9981
      −0.81  −0.9981 ]        b = [1; 1; 1; 1; 1; 1]
Show that t*, the smallest integer t such that Ot = O∞, satisfies t* = 2.
What happens to t* as F22 increases and approaches one? What do you conclude
for the case F22 ≥ 1?
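The following Octave/MATLAB sketch mimics the idea behind Gilbert and Tan's procedure (it is not a transcription of their algorithm 3.2): keep adding the constraints E H F^t x ≤ e and stop when the constraints at t + 1 are redundant, checked with one LP per row. It assumes a linprog solver (Optimization Toolbox in MATLAB; substitute glpk in Octave), and the tolerance is arbitrary.

    % Sketch of the maximal output admissible set computation for Exercise 2.5.
    F = [0.9 1; 0 0.09];   H = [1 1];
    E = [1; -1];           e = [1; 1];        % Y = {y | -1 <= y <= 1}
    A = [];  b = [];  t = 0;
    while true
      A = [A; E*H*F^t];  b = [b; e];          % constraints defining O_t
      Anew = E*H*F^(t+1);                     % candidate constraints for O_{t+1}
      done = true;
      for i = 1:size(Anew, 1)
        % maximize Anew(i,:)*x over O_t (linprog minimizes, so negate the objective)
        [~, fval, flag] = linprog(-Anew(i,:)', A, b);
        if flag ~= 1 || -fval > e(i) + 1e-8   % unbounded or violated => not redundant
          done = false;  break
        end
      end
      if done
        tstar = t                             % O_t = O_inf; expect tstar = 2 here
        break
      end
      t = t + 1;
    end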

Exercise 2.6: Terminal constraint and region of attraction


Consider the system
x + = Ax + Bu
subject to the constraints
x∈X u∈U
in which
" # " #
2 1 1 0
A= B=
0 2 0 1
X = {x ∈ R2 | x1 ≤ 5} U = {u ∈ R2 | −1 ≤ u ≤ 1}
and 1 ∈ R2 is a vector of ones. The MPC cost function is

VN(x, u) = Σ_{i=0}^{N−1} ℓ(x(i), u(i)) + Vf(x(N))

in which

ℓ(x, u) = (1/2)( |x|²_Q + |u|² )        Q = [α 0; 0 α]

and Vf(·) is the terminal penalty on the final state.

Figure 2.10: Region of attraction (shaded region) for constrained MPC controller of Exercise 2.6.

(a) Implement unconstrained MPC with no terminal cost (Vf (·) = 0) for a few values
of α. Choose a value of α for which the resultant closed loop is unstable. Try
N = 3.

(b) Implement constrained MPC with no terminal cost or terminal constraint for the
value of α obtained in the previous part. Is the resultant closed loop stable or
unstable?

(c) Implement constrained MPC with terminal equality constraint x(N) = 0 for the
same value of α. Find the region of attraction for the constrained MPC controller
using the projection algorithm from Exercise 2.4. The result should resemble
Figure 2.10.
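A sketch for part (a): with no constraints and Vf = 0, the horizon-N MPC law is linear, u = K_N x, and K_N follows from a backward Riccati recursion with zero terminal weight. The snippet below (plain Octave/MATLAB, no toolboxes; the α values are illustrative) prints the closed-loop spectral radius for a few values of α; values larger than one indicate an unstable closed loop.

    % Exercise 2.6(a): spectral radius of A + B*K_N for the unconstrained MPC law
    % with zero terminal penalty.
    A = [2 1; 0 2];  B = eye(2);  R = eye(2);  N = 3;
    for alpha = [1e-5 1e-2 1 10]
      Q = alpha*eye(2);
      P = zeros(2);                            % V_f = 0
      for k = 1:N                              % backward Riccati recursion
        K = -((R + B'*P*B)\(B'*P*A));          % gain for this stage
        P = Q + A'*P*A + A'*P*B*K;             % cost-to-go update
      end
      fprintf('alpha = %8.1e   spectral radius = %.3f\n', alpha, max(abs(eig(A + B*K))));
    end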

Exercise 2.7: Infinite horizon cost to go as terminal penalty


Consider the system
x + = Ax + Bu
subject to the constraints
x∈X u∈U
in which

A = [2 1; 0 2]        B = [1 0; 0 1]

and

X = {x ∈ R2 | −5 ≤ x1 ≤ 5}        U = {u ∈ R2 | −1 ≤ u ≤ 1}

Figure 2.11: The region Xf, in which the unconstrained LQR control law is feasible, for Exercise 2.7.

The cost is

VN(x, u) := Σ_{i=0}^{N−1} ℓ(x(i), u(i)) + Vf(x(N))

in which

ℓ(x, u) = (1/2)( |x|²_Q + |u|² )        Q = [α 0; 0 α]

and Vf(·) is the terminal penalty on the final state and 1 ∈ R2 is a vector of all ones.
Use α = 10^{−5} and N = 3 and terminal cost Vf(x) = (1/2)x'Πx where Π is the solution
to the steady-state Riccati equation.
(a) Compute the infinite horizon optimal cost and control law for the unconstrained
system.

(b) Find the region Xf , the maximal constraint admissible set using the algorithm in
Exercise 2.5 for the system x + = (A + BK)x with constraints x ∈ X and Kx ∈ U.
You should obtain the region shown in Figure 2.11.

(c) Add a terminal constraint x(N) ∈ Xf and implement constrained MPC. Find XN ,
the region of attraction for the MPC problem with Vf (·) as the terminal cost and
x(N) ∈ Xf as the terminal constraint. Contrast it with the region of attraction
for the MPC problem in Exercise 2.6 with a terminal constraint x(N) = 0.

(d) Estimate X̄N , the set of initial states for which the MPC control sequence for
horizon N is equal to the MPC control sequence for an infinite horizon.
Hint: x ∈ X̄N if x^0(N; x) ∈ int(Xf). Why?
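One possible starting point in Octave/MATLAB for parts (a) and (b), assuming the control package's dare() and the admissible-set routine written for Exercise 2.5; everything after the Riccati call just assembles the data that routine needs.

    % Exercise 2.7(a)-(b): unconstrained LQR and data for the set Xf.
    A = [2 1; 0 2];   B = eye(2);
    alpha = 1e-5;     Q = alpha*eye(2);   R = eye(2);
    [Pi, ~, G] = dare(A, B, Q, R);   % Pi solves the steady-state Riccati equation
    K = -G;                          % unconstrained optimal law u = K x
    F = A + B*K;                     % closed-loop dynamics under that law
    % Constraints x in X and Kx in U written as E*y <= e with y = H*x:
    H = [1 0; K];                    % y = (x1, u1, u2)
    E = [ 1 0 0; -1 0 0;             % -5 <= x1 <= 5
          0 1 0;  0 -1 0;            % -1 <= u1 <= 1
          0 0 1;  0 0 -1];           % -1 <= u2 <= 1
    e = [5; 5; 1; 1; 1; 1];
    % Feed (F, H, E, e) to the maximal admissible set code from Exercise 2.5 to get Xf.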

Exercise 2.8: Terminal penalty with and without terminal constraint


Consider the system
x + = Ax + Bu
subject to the constraints
x∈X u∈U
in which

A = [2 1; 0 2]        B = [1 0; 0 1]

and

X = {x ∈ R2 | −15 ≤ x1 ≤ 15}        U = {u ∈ R2 | −5·1 ≤ u ≤ 5·1}
The cost is

VN(x, u) = Σ_{i=0}^{N−1} ℓ(x(i), u(i)) + Vf(x(N))

in which

ℓ(x, u) = (1/2)( |x|²_Q + |u|² )        Q = [α 0; 0 α]

Vf(·) is the terminal penalty on the final state, and 1 ∈ R2 is a vector of ones.
Use α = 10^{−5} and N = 3 and terminal cost Vf(x) = (1/2)x'Πx where Vf(·) is the
infinite horizon optimal cost for the unconstrained problem.
(a) Add a terminal constraint x(N) ∈ Xf , in which Xf is the maximal constraint
admissible set for the system x + = (A + BK)x and K is the optimal controller
gain for the unconstrained problem. Using the code developed in Exercise 2.7,
estimate XN , the region of attraction for the MPC problem with this terminal
constraint and terminal cost. Also estimate X̄N , the region for which the MPC
control sequence for horizon N is equal to the MPC control sequence for infinite
horizon. Your results should resemble Figure 2.12.

(b) Remove the terminal constraint and estimate the domain of attraction X̂N (by
simulation). Compare this X̂N with XN and X̄N obtained previously.

(c) Change the terminal cost to Vf (x) = (3/2)x ′ Πx and repeat the previous part.

Exercise 2.9: Decreasing property for the time-varying case


Consider the time-varying optimal control problem specified in Section 2.4.5. Suppose that
Vf(·) and Xf satisfy the basic stability assumption, Assumption 2.33. Prove that the value
function VN^0(·) has the decreasing property

VN^0((x, i)^+) ≤ VN^0(x, i) − ℓ(x, i, κN(x, i))

for all (x, i) ∈ XI.

Exercise 2.10: Terminal cost bound for the time-varying case


Refer to Section 2.4.5. Prove that the value function VN^0(·) satisfies

VN^0(x, i) ≤ Vf(x, i + N)    for all (x, i) ∈ Xf(i + N) × I≥0

Figure 2.12: The region of attraction XN for terminal constraint x(N) ∈ Xf and terminal penalty Vf(x) = (1/2)x'Πx, and the estimate of X̄N, for Exercise 2.8.

Exercise 2.11: Modification of terminal cost


Refer to Section 2.6. Show that the pair (βVf (·), Xf ) satisfies Assumption 2.14 if
(Vf (·), Xf ) satisfies this assumption, β ≥ 1, and ℓ(·) satisfies Assumption 2.2.

Exercise 2.12: A Lyapunov theorem for asymptotic stability


Prove the asymptotic stability result for Lyapunov functions.

Theorem 2.59 (Lyapunov theorem for asymptotic stability). Given the dynamic system
x + = f (x) 0 = f (0)
The origin is asymptotically stable if there exist K functions α, β, γ, and a scalar r > 0, such that
a Lyapunov function V satisfies, for all x ∈ rB,
α(|x|) ≤ V (x) ≤ β(|x|)
V (f (x)) − V (x) ≤ −γ(|x|)

Exercise 2.13: An MPC stability result


Given the following nonlinear model and objective function
x + = f (x, u), 0 = f (0, 0)
x(0) = x
VN(x, u) = Σ_{k=0}^{N−1} ℓ(x(k), u(k))

Consider the terminal constraint MPC regulator


min_u VN(x, u)
subject to
x + = f (x, u) x(0) = x x(N) = 0
and denote the first move in the optimal control sequence as u0 (x). Given the closed-
loop system
x + = f (x, u0 (x))
(a) Prove that the origin is asymptotically stable for the closed-loop system. State
the cost function assumption and controllability assumption required so that
the control problem is feasible for some set of defined initial conditions.

(b) What assumptions about the cost function ℓ(x, u) are required to strengthen the
controller so that the origin is exponentially stable for the closed-loop system?
How does the controllability assumption change for this case?

Exercise 2.14: Stability using observability instead of IOSS


Assume that the system x+ = f(x, u), y = h(x) is ℓ-observable, i.e., there exists a function
α ∈ K and an integer No ≥ 1 such that

Σ_{i=0}^{No−1} ℓ(y(i), u(i)) ≥ α(|x|)

for all x and all u; here x(i) := φ(i; x, u) and y(i) := h(x(i)). Prove the result given in
Section 2.4.4 that the origin is asymptotically stable for the closed-loop system x + =
f (x, κN (x)) using the assumption that x + = f (x, u), y = h(x) is ℓ-observable rather
than IOSS. Assume that N ≥ No .

Exercise 2.15: Input/output-to-state stability (IOSS) and convergence


Proposition 2.60 (Convergence of state under IOSS). Assume that the system x + = f (x,
u), y = h(x) is IOSS and that u(i) → 0 and y(i) → 0 as i → ∞. Then x(i) = φ(i; x,
u) → 0 as i → ∞ for any initial state x.

Prove Proposition 2.60. Hint: consider the solution at time k + l using the state at
time k as the initial state.

Exercise 2.16: Equality for quadratic functions


Prove the following result which is useful for analyzing the unreachable setpoint prob-
lem.

Lemma 2.61 (An equality for quadratic functions). Let X be a nonempty compact sub-
set of Rn , and let ℓ(·) be a strictly convex quadratic function on X defined by ℓ(x) :=
(1/2)x'Qx + q'x + c, Q > 0. Consider a sequence (x(i))_{i∈I_{1:P}} with mean
x̄P := (1/P) Σ_{i=1}^{P} x(i). Then the following holds

Σ_{i=1}^{P} ℓ(x(i)) = (1/2) Σ_{i=1}^{P} |x(i) − x̄P|²_Q + P ℓ(x̄P)

It follows from this lemma that ℓ(x̄P) ≤ (1/P) Σ_{i=1}^{P} ℓ(x(i)), which is Jensen's
inequality for the special case of a quadratic function.
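The identity is easy to spot-check numerically; the short Octave/MATLAB script below uses arbitrary data (the particular Q, q, c, and sample size are ours) and prints two numbers that should agree to rounding error.

    % Numerical check of Lemma 2.61 with arbitrary data.
    n = 3;  P = 7;
    Q = diag([3 2 4]);  q = [1; -1; 2];  c = 0.5;      % strictly convex quadratic
    ell = @(x) 0.5*x'*Q*x + q'*x + c;
    X = randn(n, P);                                    % columns are x(1),...,x(P)
    xbar = mean(X, 2);
    lhs = 0;  quad = 0;
    for i = 1:P
      lhs  = lhs  + ell(X(:,i));
      quad = quad + 0.5*(X(:,i) - xbar)'*Q*(X(:,i) - xbar);
    end
    disp([lhs, quad + P*ell(xbar)])                     % the two entries agree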

Exercise 2.17: Unreachable setpoint MPC and evolution in a compact set


Prove the following lemma, which is useful for analyzing the stability of MPC with an
unreachable setpoint.

Lemma 2.62 (Evolution in a compact set). Suppose x(0) = x lies in the set XN . Then
the state trajectory (x(i)), in which x(i) = φf(i; x) for each i, of the controlled system
x + = f (x) evolves in a compact set.

Exercise 2.18: MPC and multivariable, constrained systems


Consider a two-input, two-output process with the following transfer function

G(s) = [ 2/(10s + 1)    2/(s + 1)
         1/(s + 1)      4/(s + 1) ]

(a) Consider a unit setpoint change in the first output. Choose a reasonable sample
time, ∆. Simulate the behavior of an offset-free discrete time MPC controller
with Q = I, S = I and large N.

(b) Add the constraint −1 ≤ u(k) ≤ 1 and simulate the response.

(c) Add the constraint −0.1 ≤ ∆u/∆ ≤ 0.1 and simulate the response.

(d) Add significant noise to both output measurements (make the standard devia-
tion in each output about 0.1). Retune the MPC controller to obtain good perfor-
mance. Describe which controller parameters you changed and why.
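One way to get started in Octave (control package) or MATLAB (Control System Toolbox) for part (a); the sample time and the pairing of numerators and denominators in G(s) follow the reconstruction given above, so treat them as assumptions.

    % Exercise 2.18(a): build G(s) and sample it to get a discrete time model.
    pkg load control                 % Octave only; omit in MATLAB
    s = tf('s');
    G = [2/(10*s + 1)  2/(s + 1);
         1/(s + 1)     4/(s + 1)];
    Delta = 1;                       % sample time; dominant time constant is 10
    sysd = c2d(ss(G), Delta);
    [A, B, C, D] = ssdata(sysd);     % discrete model for the MPC controller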

Exercise 2.19: LQR versus LAR


We are now all experts on the linear quadratic regulator (LQR), which employs a linear
model and quadratic performance measure. Let’s consider the case of a linear model
but absolute value performance measure, which we call the linear absolute regulator
(LAR)7

min_u Σ_{k=0}^{N−1} ( q|x(k)| + r|u(k)| ) + q(N)|x(N)|
For simplicity consider the following one-step controller, in which u and x are scalars

min_{u(0)} V(x(0), u(0)) = |x(1)| + |u(0)|

subject to
x(1) = Ax(0) + Bu(0)
Draw a sketch of x(1) versus u(0) (recall x(0) is a known parameter) and show
the x-axis and y-axis intercepts on your plot. Now draw a sketch of V (x(0), u(0))
versus u(0) in order to see what kind of optimization problem you are solving. You
may want to plot both terms in the objective function individually and then add them
together to make your V plot. Label on your plot the places where the cost function V

7 Laplace would love us for making this choice, but Gauss would not be happy.

suffers discontinuities in slope. Where is the solution in your sketch? Does it exist for
all A, B, x(0)? Is it unique for all A, B, x(0)?
The motivation for this problem is to change the quadratic program (QP) of the
LQR to a linear program (LP) in the LAR, because the computational burden for LPs is
often smaller than QPs. The absolute value terms can be converted into linear terms
with the introduction of slack variables.
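A quick way to see the geometry is to plot the two absolute value terms and their sum for scalar data; the numbers below are arbitrary example values.

    % Exercise 2.19: one-step LAR cost V(x(0),u(0)) = |x(1)| + |u(0)| for scalars.
    A = 1.5;  B = 1;  x0 = 2;                 % arbitrary example values
    u = linspace(-5, 5, 1001);
    V = abs(A*x0 + B*u) + abs(u);
    plot(u, abs(A*x0 + B*u), '--', u, abs(u), '--', u, V, '-');
    xlabel('u(0)');  ylabel('cost');
    legend('|x(1)|', '|u(0)|', 'V(x(0),u(0))');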

Exercise 2.20: Unreachable setpoints in constrained versus unconstrained


linear systems
Consider the linear system with input constraint

x + = Ax + Bu u∈U

We examine here both unconstrained systems in which U = Rm and constrained sys-


tems in which U ⊂ Rm is a convex polyhedron. Consider the stage cost defined in terms
of setpoints for state and input xsp, usp

ℓ(x, u) = (1/2)( |x − xsp|²_Q + |u − usp|²_R )

in which we assume for simplicity that Q, R > 0. For the setpoint to be unreachable in
an unconstrained problem, the setpoint must be inconsistent, i.e., not a steady state of
the system, or
xsp ≠ Axsp + Busp
Consider also using the stage cost centered at the optimal steady state (xs , us )

ℓs(x, u) = (1/2)( |x − xs|²_Q + |u − us|²_R )

The optimal steady state satisfies

(xs, us) = arg min_{x,u} ℓ(x, u)

subject to

[I − A   −B] [x; u] = 0        u ∈ U
Figure 2.13 depicts an inconsistent setpoint, and the optimal steady state for uncon-
strained and constrained systems.
(a) For unconstrained systems, show that optimizing the cost function with terminal
constraint
V(x, u) := Σ_{k=0}^{N−1} ℓ(x(k), u(k))
subject to
x + = Ax + Bu x(0) = x x(N) = xs
gives the same solution as optimizing the cost function
Vs(x, u) := Σ_{k=0}^{N−1} ℓs(x(k), u(k))

subject to the same model constraint, initial condition, and terminal constraint.
Therefore, there is no reason to consider the unreachable setpoint problem fur-
ther for an unconstrained linear system. Shifting the stage cost from ℓ(x, u) to
ℓs (x, u) provides identical control behavior and is simpler to analyze.

Figure 2.13: Inconsistent setpoint (xsp, usp), unreachable stage cost ℓ(x, u), optimal steady states (xs, us), and stage costs ℓs(x, u) for constrained and unconstrained systems.

Hint. First define a third stage cost l(x, u) = ℓ(x, u) − λ′ ((I − A)x −
Bu), and show, for any λ, optimizing with l(x, u) as stage cost is
the same as optimizing using ℓ(x, u) as stage cost. Then set λ =
λs , the optimal Lagrange multiplier of the steady-state optimization
problem.

(b) For constrained systems, provide a simple example that shows optimizing the
cost function V (x, u) subject to

x + = Ax + Bu x(0) = x x(N) = xs u(k) ∈ U for all k ∈ I0:N−1

does not give the same solution as optimizing the cost function Vs (x, u) sub-
ject to the same constraints. For constrained linear systems, these problems
are different and optimizing the unreachable stage cost provides a new design
opportunity.

Exercise 2.21: Filing for patent


An excited graduate student shows up at your office. He begins, “Look, I have discov-
ered a great money-making scheme using MPC.” You ask him to tell you about it. “Well,”
he says, “you told us in class that the optimal steady state is asymptotically stable even
if you use the stage cost measuring distance from the unreachable setpoint, right?” You
reply, “Yes, that’s what I said.” He continues, “OK, well look at this little sketch I drew,”
and he shows you a picture like Figure 2.14. “So imagine I use the infinite horizon cost
function so the open-loop and closed-loop trajectories are identical. If the best steady
state is asymptotically stable, then the stage cost asymptotically approaches ℓ(xs , us ),
right?” You reply, “I guess that looks right.” He then says, “OK, well if I look at the
optimal cost using state x at time k and state x + at time k + 1, by the principle of
optimality I get the usual cost decrease”

V 0 (x + ) ≤ V 0 (x) − ℓ(x, u0 (x)) (2.39)



Figure 2.14: Stage cost versus time for the case of unreachable setpoint. The cost V^0(x(k)) is the area under the curve to the right of time k.

You interrupt, “Wait, these V 0 (·) costs are not bounded in this case!” Unfazed, the
student replies, “Yeah, I realize that, but this sketch is basically correct regardless.
Say we just make the horizon really long; then the costs are all finite and this equation
becomes closer and closer to being true as we make the horizon longer and longer.” You
start to feel a little queasy at this point. The student continues, “OK, so if this inequality
basically holds, V 0 (x(k)) is decreasing with k along the closed-loop trajectory, it is
bounded below for all k, it converges, and, therefore, ℓ(x(k), u0 (x(k))) goes to zero
as k goes to ∞.” You definitely don’t like where this is heading, and the student finishes
with, “But ℓ(x, u) = 0 implies x = xsp and u = usp , and the setpoint is supposed to
be unreachable. But I have proven that infinite horizon MPC can reach an unreachable
setpoint. We should patent this!”
How do you respond to this student? Here are some issues to consider.
1. Does the principle of optimality break down in the unreachable setpoint case?
2. Are the open-loop and closed-loop trajectories identical in the limit of an infinite
horizon controller with an unreachable setpoint?
3. Does inequality (2.39) hold as N → ∞? If so, how can you put it on solid footing?
If not, why not, and with what do you replace it?
4. Do you file for patent?

Exercise 2.22: Stabilizable with small controls


Consider a time-varying system x(i + 1) = f (x, u, i) with stage cost ℓ(x, u, i) and
terminal cost Vf (x, i) satisfying Assumptions 2.25, 2.26, and 2.33. Suppose further
that functions f (·) and ℓ(·) are uniformly bounded by K∞ functions αf and αℓ , i.e.,

|f(x, u, i)| ≤ αfx(|x|) + αfu(|u|)

ℓ(x, u, i) ≤ αℓx(|x|) + αℓu(|u|)

for all i ∈ I≥0 . Prove that if there exists a K∞ function γ(·) such that for each x ∈
XN (i), there exists u ∈ UN (x, i) with |u| ≤ γ(|x|), then there exists a K∞ function
α(·) such that
V 0 (x, i) ≤ α(|x|)

for all i ∈ I≥0 and x ∈ XN (i).


Hint: the following inequality may prove useful: for any K∞ function α (see (B.1))

α(s1 + s2 + · · · + sN ) ≤ α(Ns1 ) + α(Ns2 ) + · · · + α(NsN )

Exercise 2.23: Power lifting


Consider a stabilizable T -periodic linear system

x(i + 1) = A(i)x(i) + B(i)u(i)

with positive definite stage cost

ℓ(x, u, i) := (1/2)( x'Q(i)x + u'R(i)u )
Suppose there exist periodic control laws K(i) and cost matrices P (i) satisfying the
periodic Riccati equation

P(i) = Q(i) + A(i)'P(i + 1)( A(i) + B(i)K(i) )

K(i) = −( B(i)'P(i + 1)B(i) + R(i) )^{−1} B(i)'P(i + 1)A(i)

Show that the control law K := diag(K(0), . . . , K(T − 1)) and cost P := diag(P (0), . . . ,
P (T − 1)) satisfy the Riccati equation for the time-invariant lifted system

A := [ 0       0       ···   0         A(T−1)
       A(0)    0       ···   0         0
       0       A(1)    ···   0         0
       ⋮       ⋮              ⋮         ⋮
       0       0       ···   A(T−2)    0      ]

B := [ 0       0       ···   0         B(T−1)
       B(0)    0       ···   0         0
       0       B(1)    ···   0         0
       ⋮       ⋮              ⋮         ⋮
       0       0       ···   B(T−2)    0      ]
Q := diag(Q(0), . . . , Q(T − 1))
R := diag(R(0), . . . , R(T − 1))

By uniqueness of solutions to the Riccati equation, this system can be used to synthe-
size control laws for periodic systems.
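Assembling the lifted matrices is mostly bookkeeping; a small Octave/MATLAB helper like the following may be convenient when checking the result numerically. The function name and storage convention (periodic data stored in cell arrays A{1},...,A{T} for A(0),...,A(T−1), and similarly for B, Q, R) are our own choices.

    % Build the lifted matrices of Exercise 2.23 from periodic data in cell arrays.
    function [Al, Bl, Ql, Rl] = lift_periodic(A, B, Q, R)
      T = numel(A);
      n = size(A{1}, 1);  m = size(B{1}, 2);
      Al = zeros(n*T, n*T);  Bl = zeros(n*T, m*T);
      for i = 1:T
        j = mod(i, T) + 1;                 % block row receiving A(i-1), B(i-1)
        Al((j-1)*n+1:j*n, (i-1)*n+1:i*n) = A{i};
        Bl((j-1)*n+1:j*n, (i-1)*m+1:i*m) = B{i};
      end
      Ql = blkdiag(Q{:});  Rl = blkdiag(R{:});
    end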

Exercise 2.24: Feasible warm start in Xf


Establish Proposition 2.42, which states that for any x ∈ Xf , the following warm start
is feasible

ũf(x) := ( κf(x), κf(f(x, κf(x))), . . . ) ∈ ŨN(x)

Recall that a warm start ũ is a member of ŨN(x) if all elements of the sequence of
controls are members of U, the state trajectory φ(k; x, ũ) terminates in Xf, and VN(x, ũ)
is less than Vf(x).

Exercise 2.25: The geometry of cost rotation


Let’s examine the rotated cost function in the simplest possible setting to understand
what “rotation” means in this context. Consider the discrete time dynamic model and
strictly convex quadratic cost function
x+ = f(x, u)        ℓ(x, u) = (1/2)( |x − xsp|²_Q + |u − usp|²_R )

with x ∈ Rn , u ∈ Rm , ℓ : Rn × Rm → R≥0 , Q ∈ Rn×n , R ∈ Rm×m with Q, R > 0.


We define the feasible control region as u ∈ U for some nonempty set U. We wish to
illustrate the ideas with the following simple linear system

f (x, u) = Ax + Bu A = 1/2 B = 1/2

subject to polyhedral constraint

U = {u | −1 ≤ u ≤ 1}

We choose an unreachable setpoint that is not a steady state, and cost matrices as
follows

(usp , xsp ) = (2, 3) Q=R=2

The optimal steady state (us , xs ) is given by the solution to the following optimization

(us, xs) = arg min_{u,x} { ℓ(x, u) | u ∈ U, x = f(x, u) }        (2.40)

(a) Solve this quadratic program and show that the solution is (xs , us ) = (1, 1).
What is the Lagrange multiplier for the equality constraint?

(b) Next we define the rotated cost function following Diehl et al. (2011)
ℓ̃(x, u) = ℓ(x, u) − λ'(x − f(x, u)) − ℓ(xs, us)

Plot the contour of zero rotated cost ℓ̃(x, u) = 0 for three λ values, λ = 0, −8, −12.
Compare your contours to those shown in Figure 2.15.
Notice that as you decrease λ, you rotate (and enlarge) the zero cost contour of
ℓ(x, u) about the point (xs , us ), hence the name rotated stage cost.

(c) Notice that the original cost function, which corresponds to λ = 0, has negative
cost values (interior of the circle) that are in the feasible region. The zero contour
for λ = −8 has become tangent to the feasible region, so the cost is nonnegative
in the feasible region. But for λ = −12, the zero contour has been over rotated
so that it again has negative values in the feasible region.
How does the value λ = −8 compare to the Lagrange multiplier of the optimal
steady-state problem?

(d) Explain why the MPC value function based on the rotated stage cost is a Lyapunov
function for the closed-loop system.
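A small Octave/MATLAB script reproduces the three zero contours of Figure 2.15 for this scalar example; the grid limits and plotting details are arbitrary.

    % Exercise 2.25(b): zero contours of the rotated stage cost, scalar example.
    f    = @(x, u) 0.5*x + 0.5*u;                     % model
    ell  = @(x, u) (x - 3).^2 + (u - 2).^2;           % (1/2)(2(x-3)^2 + 2(u-2)^2)
    ells = ell(1, 1);                                 % cost at (xs, us) = (1, 1)
    [U, X] = meshgrid(linspace(-2, 10, 400), linspace(-6, 6, 400));
    hold on
    for lam = [0 -8 -12]
      ellrot = ell(X, U) - lam*(X - f(X, U)) - ells;  % rotated stage cost
      contour(U, X, ellrot, [0 0]);                   % its zero level set
    end
    plot([-1 -1], [-6 6], 'k--', [1 1], [-6 6], 'k--');  % boundaries of U
    plot(1, 1, 'ko', 2, 3, 'kx');                     % (us, xs) and (usp, xsp)
    xlabel('u');  ylabel('x');  hold off

Observe that the λ = −8 contour is tangent to the feasible region, in agreement with part (c).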

Exercise 2.26: Strong duality implies dissipativity


Consider again the steady-state economic problem Ps for the optimal steady state (xs ,
us )
ℓ(xs, us) := min_{(x,u)∈Z} { ℓ(x, u) | x = f(x, u) }

Figure 2.15: Rotated cost-function contour ℓ̃(x, u) = 0 (circles) for λ = 0, −8, −12. Shaded region shows feasible region where ℓ̃(x, u) < 0.

Form the Lagrangian and show that the solution is given also by

ℓ(xs, us) = min_{(x,u)∈Z} max_λ [ ℓ(x, u) − λ'(x − f(x, u)) ]

Switching the order of min and max gives

min_{(x,u)∈Z} max_λ [ ℓ(x, u) − λ'(x − f(x, u)) ] ≥ max_λ min_{(x,u)∈Z} [ ℓ(x, u) − λ'(x − f(x, u)) ]

due to weak duality. The strong duality assumption states that equality is achieved in
the inequality above, so that

ℓ(xs, us) = max_λ min_{(x,u)∈Z} [ ℓ(x, u) − λ'(x − f(x, u)) ]

Let λs denote the optimal Lagrange multiplier in this problem. (For a brief review of
these concepts, see also Exercises C.4, C.5, and C.6 in Appendix C.)
Show that the strong duality assumption implies that the system x + = f (x, u) is
dissipative with respect to the supply rate s(x, u) = ℓ(x, u) − ℓ(xs , us ).
Bibliography

R. P. Aguilera and D. E. Quevedo. Stability analysis of quadratic MPC with a


discrete input alphabet. IEEE Trans. Auto. Cont., 58(12):3190–3196, 2013.

D. A. Allan and J. B. Rawlings. An input/output-to-state stability converse the-


orem for closed positive invariant sets. Technical Report 2018–01, TWCCC
Technical Report, December 2018.

D. A. Allan, C. N. Bates, M. J. Risbeck, and J. B. Rawlings. On the inherent


robustness of optimal and suboptimal nonlinear MPC. Sys. Cont. Let., 106:
68–78, August 2017.

R. Amrit, J. B. Rawlings, and D. Angeli. Economic optimization using model


predictive control with a terminal cost. Annual Rev. Control, 35:178–186,
2011.

D. Angeli, R. Amrit, and J. B. Rawlings. On average performance and stability


of economic model predictive control. IEEE Trans. Auto. Cont., 57(7):1615–
1626, 2012.

M. Baotic, F. J. Christophersen, and M. Morari. Constrained optimal control of


hybrid systems with a linear performance index. IEEE Trans. Auto. Cont., 51
(12):1903–1919, 2006.

A. Bemporad and M. Morari. Control of systems integrating logic, dynamics,


and constraints. Automatica, 35:407–427, 1999.

F. Blanchini and S. Miani. Set-Theoretic methods in Control. Systems & Control:


Foundations and Applications. Birkhäuser, 2008.

C. Cai and A. R. Teel. Input–output-to-state stability for discrete-time systems.


Automatica, 44(2):326 – 336, 2008.

C. C. Chen and L. Shaw. On receding horizon feedback control. Automatica,


16(3):349–352, 1982.

H. Chen and F. Allgöwer. A quasi-infinite horizon nonlinear model predictive


control scheme with guaranteed stability. Automatica, 34(10):1205–1217,
1998.

D. Chmielewski and V. Manousiouthakis. On constrained infinite-time linear


quadratic optimal control. Sys. Cont. Let., 29:121–129, 1996.


D. W. Clarke, C. Mohtadi, and P. S. Tuffs. Generalized predictive control—Part


I. The basic algorithm. Automatica, 23(2):137–148, 1987.

C. R. Cutler and B. L. Ramaker. Dynamic matrix control—a computer control


algorithm. In Proceedings of the Joint Automatic Control Conference, 1980.

R. M. C. De Keyser and A. R. Van Cauwenberghe. Extended prediction self-


adaptive control. In H. A. Barker and P. C. Young, editors, Proceedings of
the 7th IFAC Symposium on Identification and System Parameter Estimation,
pages 1255–1260. Pergamon Press, Oxford, 1985.

G. De Nicolao, L. Magni, and R. Scattolini. Stabilizing nonlinear receding hori-


zon control via a nonquadratic penalty. In Proceedings IMACS Multiconfer-
ence CESA, volume 1, pages 185–187, Lille, France, 1996.

G. De Nicolao, L. Magni, and R. Scattolini. Stabilizing receding-horizon con-


trol of nonlinear time-varying systems. IEEE Trans. Auto. Cont., 43(7):1030–
1036, 1998.

S. Di Cairano, W. P. M. H. Heemels, M. Lazar, and A. Bemporad. Stabilizing dy-


namic controllers for hybrid systems: A hybrid control Lyapunov function
approach. IEEE Trans. Auto. Cont., 59(10):2629–2643, 2014.

M. Diehl, R. Amrit, and J. B. Rawlings. A Lyapunov function for economic


optimizing model predictive control. IEEE Trans. Auto. Cont., 56(3):703–
707, 2011.

M. Ellis, H. Durand, and P. D. Christofides. A tutorial review of economic model


predictive control methods. J. Proc. Cont., 24(8):1156–1178, 2014.

L. Fagiano and A. R. Teel. On generalised terminal state constraints for model


predictive control. In Proceedings of 4th IFAC Nonlinear Model Predictive
Control Conference, pages 299–304, 2012.

P. Falugi and D. Q. Mayne. Model predictive control for tracking random refer-
ences. In Proceedings of European Control Conference (ECC), pages 518–523,
2013a.

P. Falugi and D. Q. Mayne. Tracking a periodic reference using nonlinear model


predictive control. In Proceedings of 52nd IEEE Conference on Decision and
Control, pages 5096–5100, Florence, Italy, December 2013b.

C. E. Garcı́a and A. M. Morshedi. Quadratic programming solution of dynamic


matrix control (QDMC). Chem. Eng. Commun., 46:73–87, 1986.

C. E. Garcı́a, D. M. Prett, and M. Morari. Model predictive control: Theory and


practice—a survey. Automatica, 25(3):335–348, 1989.

E. G. Gilbert and K. T. Tan. Linear systems with state and control constraints:
The theory and application of maximal output admissible sets. IEEE Trans.
Auto. Cont., 36(9):1008–1020, September 1991.

G. Grimm, M. J. Messina, S. E. Tuna, and A. R. Teel. Model predictive control:


For want of a local control Lyapunov function, all is not lost. IEEE Trans.
Auto. Cont., 50(5):546–558, 2005.

L. Grüne and J. Pannek. Nonlinear Model Predictive Control: Theory and Algo-
rithms. Communications and Control Engineering. Springer-Verlag, London,
second edition, 2017.

R. Huang, E. Harinath, and L. T. Biegler. Lyapunov stability of economically


oriented NMPC for cyclic processes. J. Proc. Cont., 21:501–509, 2011.

A. Jadbabaie, J. Yu, and J. Hauser. Unconstrained receding horizon control of


nonlinear systems. IEEE Trans. Auto. Cont., 46(5):776–783, 2001.

S. S. Keerthi and E. G. Gilbert. Computation of minimum-time feedback control


laws for systems with state-control constraints. IEEE Trans. Auto. Cont., 32:
432–435, 1987.

S. S. Keerthi and E. G. Gilbert. Optimal infinite-horizon feedback laws for a


general class of constrained discrete-time systems: Stability and moving-
horizon approximations. J. Optim. Theory Appl., 57(2):265–293, May 1988.

D. L. Kleinman. An easy way to stabilize a linear constant system. IEEE Trans.


Auto. Cont., 15(12):692, December 1970.

K. Kobayshi, W. W. Shein, and K. Hiraishi. Large-scale MPC with


continuous/discrete-valued inputs: Compensation of quantization errors,
stabilization, and its application. SICE J. Cont., Meas., and Sys. Integr., 7(3):
152–158, 2014.

W. H. Kwon and A. E. Pearson. A modified quadratic cost problem and feed-


back stabilization of a linear system. IEEE Trans. Auto. Cont., 22(5):838–842,
October 1977.

M. Lazar and W. P. M. H. Heemels. Predictive control of hybrid systems: Input-


to-state stability results for sub-optimal solutions. Automatica, 45(1):180–
185, 2009.

E. B. Lee and L. Markus. Foundations of Optimal Control Theory. John Wiley


and Sons, New York, 1967.

D. Limon, T. Alamo, F. Salas, and E. F. Camacho. On the stability of MPC without


terminal constraint. IEEE Trans. Auto. Cont., 51(5):832–836, May 2006.

D. Limon, I. Alvarado, T. Alamo, and E. F. Camacho. MPC for tracking piece-


wise constant references for constrained linear systems. Automatica, pages
2382–2387, 2008.

D. Limon, I. Alvarado, T. Alamo, and E. F. Camacho. Robust tube-based MPC for


tracking of constrained linear systems with additive disturbances. J. Proc.
Cont., 20:248–260, 2010.

D. Limon, T. Alamo, D. M. de la Peña, M. N. Zeilinger, C. N. Jones, and M. Pereira.


MPC for tracking periodic reference signals. In 4th IFAC Nonlinear Model
Predictive Control Conference, pages 490–495, 2012.

P. Marquis and J. P. Broustail. SMOC, a bridge between state space and model
predictive controllers: Application to the automation of a hydrotreating
unit. In T. J. McAvoy, Y. Arkun, and E. Zafiriou, editors, Proceedings of the
1988 IFAC Workshop on Model Based Process Control, pages 37–43. Perga-
mon Press, Oxford, 1988.

D. Q. Mayne. Nonlinear model predictive control: challenges and opportuni-


ties. In F. Allgöwer and A. Zheng, editors, Nonlinear Model Predictive Control,
pages 23–44. Birkhäuser Verlag, Basel, 2000.

D. Q. Mayne. An apologia for stabilising conditions in model predictive control.


Int. J. Control, 86(11):2090–2095, 2013.

D. Q. Mayne and P. Falugi. Generalized stabilizing conditions for model pre-


dictive control. J. Optim. Theory Appl., 169:719–734, 2016.

D. Q. Mayne and H. Michalska. Receding horizon control of non-linear systems.


IEEE Trans. Auto. Cont., 35(5):814–824, 1990.

D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. M. Scokaert. Constrained


model predictive control: Stability and optimality. Automatica, 36(6):789–
814, 2000.

E. S. Meadows, M. A. Henson, J. W. Eaton, and J. B. Rawlings. Receding horizon


control and discontinuous state feedback stabilization. Int. J. Control, 62
(5):1217–1229, 1995.

H. Michalska and D. Q. Mayne. Robust receding horizon control of constrained


nonlinear systems. IEEE Trans. Auto. Cont., 38(11):1623–1633, 1993.

M. A. Müller and F. Allgöwer. Distributed economic MPC: a framework for


cooperative control problems. In Proceedings of the 19th World Congress of
the International Federation of Automatic Control, pages 1029–1034, Cape
Town, South Africa, 2014.

M. A. Müller and L. Grüne. Economic model predictive control without terminal


constraints: Optimal periodic operation. In 2015 54th IEEE Conference on
Decision and Control (CDC), pages 4946–4951, 2015.

M. A. Müller, D. Angeli, and F. Allgöwer. On necessity and robustness of dis-


sipativity in economic model predictive control. IEEE Trans. Auto. Cont., 60
(6):1671–1676, June 2015.

K. R. Muske and J. B. Rawlings. Model predictive control with linear models.


AIChE J., 39(2):262–287, 1993.

G. Pannocchia, J. B. Rawlings, and S. J. Wright. Conditions under which subop-


timal nonlinear MPC is inherently robust. Sys. Cont. Let., 60:747–755, 2011.

G. Pannocchia, J. B. Rawlings, D. Q. Mayne, and G. Mancuso. Whither discrete


time model predictive control? IEEE Trans. Auto. Cont., 60(1):246–252, Jan-
uary 2015.

T. Parisini and R. Zoppoli. A receding-horizon regulator for nonlinear systems


and a neural approximation. Automatica, 31(10):1443–1451, 1995.

V. Peterka. Predictor-based self-tuning control. Automatica, 20(1):39–50, 1984.

B. Picasso, S. Pancanti, A. Bemporad, and A. Bicchi. Receding–horizon control


of LTI systems with quantized inputs. In Analysis and Design of Hybrid
Systems 2003 (ADHS 03): A Proceedings Volume from the IFAC Conference,
St. Malo, Brittany, France, 16-18 June 2003, volume 259, 2003.

D. M. Prett and R. D. Gillette. Optimization and constrained multivariable


control of a catalytic cracking unit. In Proceedings of the Joint Automatic
Control Conference, pages WP5–C, San Francisco, CA, 1980.

J. A. Primbs and V. Nevistić. Feasibility and stability of constrained finite re-


ceding horizon control. Automatica, 36:965–971, 2000.

A. I. Propoi. Use of linear programming methods for synthesizing sampled-


data automatic systems. Autom. Rem. Control, 24(7):837–844, July 1963.

D. E. Quevedo, G. C. Goodwin, and J. A. De Doná. Finite constraint set receding


horizon quadratic control. Int. J. Robust and Nonlinear Control, 14(4):355–
377, 2004.

C. V. Rao and J. B. Rawlings. Steady states and constraints in model predictive


control. AIChE J., 45(6):1266–1278, 1999.

J. B. Rawlings and R. Amrit. Optimizing process economic performance using


model predictive control. In L. Magni, D. M. Raimondo, and F. Allgöwer,
editors, Nonlinear Model Predictive Control, volume 384 of Lecture Notes in
Control and Information Sciences, pages 119–138. Springer, Berlin, 2009.

J. B. Rawlings and K. R. Muske. Stability of constrained receding horizon con-


trol. IEEE Trans. Auto. Cont., 38(10):1512–1516, October 1993.

J. B. Rawlings and M. J. Risbeck. On the equivalence between statements with


epsilon-delta and K-functions. Technical Report 2015–01, TWCCC Technical
Report, December 2015.

J. B. Rawlings and M. J. Risbeck. Model predictive control with discrete actua-


tors: Theory and application. Automatica, 78:258–265, 2017.

J. Richalet, A. Rault, J. L. Testud, and J. Papon. Model predictive heuristic con-


trol: Applications to industrial processes. Automatica, 14:413–428, 1978a.

J. Richalet, A. Rault, J. L. Testud, and J. Papon. Algorithmic control of industrial


processes. In Proceedings of the 4th IFAC Symposium on Identification and
System Parameter Estimation, pages 1119–1167. North-Holland Publishing
Company, 1978b.

B. J. P. Roset, W. P. M. H. Heemels, M. Lazar, and H. Nijmeijer. On robustness of


constrained discrete-time systems to state measurement errors. Automat-
ica, 44(4):1161 – 1165, 2008.

P. O. M. Scokaert and J. B. Rawlings. Constrained linear quadratic regulation.


IEEE Trans. Auto. Cont., 43(8):1163–1169, August 1998.

P. O. M. Scokaert, D. Q. Mayne, and J. B. Rawlings. Suboptimal model predictive


control (feasibility implies stability). IEEE Trans. Auto. Cont., 44(3):648–654,
March 1999.

M. Sznaier and M. J. Damborg. Suboptimal control of linear systems with state


and control inequality constraints. In Proceedings of the 26th Conference on
Decision and Control, pages 761–762, Los Angeles, CA, 1987.

Y. A. Thomas. Linear quadratic optimal estimation and control with receding


horizon. Electron. Lett., 11:19–21, January 1975.

B. E. Ydstie. Extended horizon adaptive control. In J. Gertler and L. Keviczky,


editors, Proceedings of the 9th IFAC World Congress, pages 911–915. Perga-
mon Press, Oxford, 1984.

S. Yu, M. Reble, H. Chen, and F. Allgöwer. Inherent robustness properties of


quasi-infinite horizon nonlinear model predictive control. Automatica, 50
(9):2269 – 2280, 2014.

M. Zanon, S. Gros, and M. Diehl. A Lyapunov function for periodic economic


optimizing model predictive control. In Decision and Control (CDC), 2013
IEEE 52nd Annual Conference on, pages 5107–5112, 2013.
3
Robust and Stochastic Model Predictive
Control

3.1 Introduction
3.1.1 Types of Uncertainty

Robust and stochastic control concern control of systems that are un-
certain in some sense so that predicted behavior based on a nominal
model is not identical to actual behavior. Uncertainty may arise in dif-
ferent ways. The system may have an additive disturbance that is un-
known, the state of the system may not be perfectly known, or the
model of the system that is used to determine control may be inaccu-
rate.
A system with additive disturbance satisfies the following difference
equation
x + = f (x, u, w)

If the disturbance w in constrained optimal control problems is bounded,


it is often possible to design a model predictive controller that ensures
the state and control constraints are satisfied for all possible distur-
bance sequences (robust MPC). If the disturbance w is unbounded, it
is impossible to ensure that the usual state and control constraints are
satisfied for all disturbance sequences. The model predictive controller
is then designed to ensure that the constraints are satisfied on average
or, more usually, with a prespecified probability (stochastic MPC).
The situation in which the state is not perfectly measured may be
treated in several ways. For example, inherent robustness is often stud-
ied using the model x + = f (x + e, u, w) where e denotes the error in
measuring the state. In the stochastic optimal control literature, where


the measured output is y = Cx + v and the disturbance w and mea-


surement noise v are usually assumed to be Gaussian white noise pro-
cesses, the state or hyperstate of the optimal control problem is the
conditional density of the state x at time k given prior measurements

y(0), y(1), . . . , y(k − 1) . Because this density usually is difficult to
compute and use, except in the linear case when it is provided by the
Kalman filter, a suboptimal procedure often is adopted. In this subop-
timal approach, the state x is replaced by its estimate x̂ in a control law
determined under the assumption that the state is accessible. This pro-
cedure is usually referred to as certainty equivalence, a term that was
originally employed for the linear quadratic Gaussian (LQG) or similar
cases when this procedure did not result in loss of optimality. When
f (·) is linear, the evolution of the state estimate x̂ may be expressed
by a difference equation

x̂ + = g(x̂, u) + ξ

in which ξ is the innovation process. In controlling x̂, we should ensure


that the actual state x, which lies in a bounded, possibly time-varying
set if the innovation process is bounded, satisfies the constraints of
the optimal control problem certainly (robust MPC). If the innovation
process is not bounded, the constraints should be satisfied with a pre-
specified probability (stochastic MPC).
A system that has parametric uncertainty may be modeled as

x + = f (x, u, θ)

in which θ represents parameters of the system that are known only


to the extent that they belong to a compact set Θ. A much-studied
example is
x + = Ax + Bu
in which θ := (A, B) may take any value in Θ := co{(Ai , Bi ) | i ∈ I}
where I = {1, 2, . . . , I}, say, is an index set.
Finally the system description may not include all the dynamics. For
example, fast dynamics may be ignored to simplify the system descrip-
tion, or a system described by a partial differential equation may be
modeled by an ordinary differential equation (ODE).
It is possible, of course, for all these types of uncertainty to occur in
a single application. In this chapter we focus on the effect of additive
disturbance. Output MPC—in which the controller employs an estimate
of the state, rather than the state itself—is discussed in Chapter 5.

3.1.2 Feedback Versus Open-Loop Control

It is well known that feedback is required only when uncertainty is


present; in the absence of uncertainty, feedback control and open-loop
control are equivalent. Indeed, when uncertainty is not present, as for
the systems studied in Chapter 2, the optimal control for a given initial
state may be computed using either dynamic programming (DP) that
provides an optimal control policy or sequence of feedback control
laws, or an open-loop optimal control that merely provides a sequence
of control actions. A simple example illustrates this fact. Consider the
deterministic linear dynamic system defined by

x+ = x + u

The optimal control problem, with horizon N = 3, is

P3(x):        V3^0(x) = min_u V3(x, u)

in which u = (u(0), u(1), u(2)) and

V3(x, u) := (1/2) Σ_{i=0}^{2} [ x(i)² + u(i)² ] + (1/2)x(3)²

in which, for each i, x(i) = φ(i; x, u) = x + u(0) + u(1) + . . . + u(i −


1), the solution of the difference equation x + = x + u at time i if
the initial state is x(0) = x and the control (input) sequence is u =
(u(0), u(1), u(2)); in matrix operations u is taken to be the column
vector [u(0), u(1), u(2)]′ . Thus

V3 (x, u) = (1/2) x 2 + (x + u(0))2 + (x + u(0) + u(1))2 +

(x + u(0) + u(1) + u(2))2 + u(0)2 + u(1)2 + u(2)2
h i
= (3/2)x 2 + x 3 2 1 u + (1/2)u′ P3 u

in which  
4 2 1
 
P3 =  2 3 1 
1 1 2
Therefore, the vector form of the optimal open-loop control sequence
for an initial state of x is
h i′ h i′
u0 (x) = −P3−1 3 2 1 x = − 0.615 0.231 0.077 x

and the optimal control and state sequences are

u0 (x) = (−0.615x, −0.231x, −0.077x)


x0 (x) = (x, 0.385x, 0.154x, 0.077x)

To compute the optimal feedback control, we use the DP recursions

Vi^0(x) = min_{u∈R} { x²/2 + u²/2 + V_{i−1}^0(x + u) }

κi^0(x) = arg min_{u∈R} { x²/2 + u²/2 + V_{i−1}^0(x + u) }

with boundary condition

V0^0(x) = (1/2)x²

This procedure gives the value function Vi^0(·) and the optimal control law κi^0(·) at
each i, where the subscript i denotes time to go. Solving the DP recursion, for all x ∈ R,
all i ∈ {1, 2, 3}, yields

V1^0(x) = (3/4)x²        κ1^0(x) = −(1/2)x
V2^0(x) = (4/5)x²        κ2^0(x) = −(3/5)x
V3^0(x) = (21/26)x²      κ3^0(x) = −(8/13)x

Starting at state x at time zero, and applying the optimal control laws iteratively to
the deterministic system x+ = x + u (recalling that at time i the optimal control law is
κ_{3−i}^0(·) since, at time i, 3 − i is the time to go) yields

x^0(0) = x              u^0(0) = −(8/13)x
x^0(1) = (5/13)x        u^0(1) = −(3/13)x
x^0(2) = (2/13)x        u^0(2) = −(1/13)x
x^0(3) = (1/13)x

so that the optimal control and state sequences are, respectively,

u0 (x) = (−(8/13)x, −(3/13)x, −(1/13)x)


x0 (x) = (x, (5/13)x, (2/13)x, (1/13)x)

which are identical with the optimal open-loop values computed above.
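This equivalence is easy to confirm numerically; the short Octave/MATLAB script below computes the batch (open-loop) solution from P3 and the feedback solution from the scalar DP recursion and prints both input sequences.

    % Check that the open-loop (batch) and DP (feedback) solutions coincide.
    x0 = 1;
    P3 = [4 2 1; 2 3 1; 1 1 2];
    uol = -(P3\[3; 2; 1])*x0;              % batch: gradient of V3 w.r.t. u set to zero
    p = 0.5;  K = zeros(3, 1);             % DP: V_i^0(x) = p_i x^2, kappa_i(x) = K(i) x
    for i = 1:3
      K(i) = -p/(0.5 + p);
      p = 0.5 + 0.5*K(i)^2 + p*(1 + K(i))^2;
    end
    x = x0;  ufb = zeros(3, 1);
    for k = 1:3
      ufb(k) = K(4 - k)*x;                 % time to go is 3, 2, 1
      x = x + ufb(k);
    end
    disp([uol ufb])                        % columns agree: -8/13, -3/13, -1/13 for x0 = 1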
Consider next an uncertain version of the dynamic system in which
uncertainty takes the simple form of an additive disturbance w; the
system is defined by
x+ = x + u + w

in which the only knowledge of w is that it lies in the compact set


W := [−1, 1]. Let φ(i; x, u, w) denote the solution of this system at
time i if the initial state is x at time zero, and the input and distur-
bance sequences are, respectively, u and w := (w(0), w(1), w(2)). The
cost now depends on the disturbance sequence—but it also depends, in
contrast to the deterministic problem discussed above, on whether the
control is open-loop or feedback. To discuss the latter case, we define
a feedback policy µ to be a sequence of control laws

µ := (µ0 (·), µ1 (·), µ2 (·))

in which µi : R → R, i = 0, 1, 2; under policy µ, if the state at time i is x,


the control is µi (x). Let M denote the class of admissible policies, for
example, those policies for which each control law µi (·) is continuous.
Then, φ(i; x, µ, w) denotes the solution at time i ∈ {0, 1, 2, 3} of the
following difference equation

x(i + 1) = x(i) + µi (x(i)) + w(i) x(0) = x

An open-loop control sequence u = (u(0), u(1), u(2)) is then merely a


degenerate policy µ = (µ0 (·), µ1 (·), µ2 (·)) where each control law µi (·)
satisfies
µi (x) = u(i)
for all x ∈ R and all i ∈ {0, 1, 2}. The cost V3(·) may now be defined as

V3(x, µ, w) := (1/2) Σ_{i=0}^{2} [ x(i)² + u(i)² ] + (1/2)x(3)²

where, now, x(i) = φ(i; x, µ, w) and u(i) = µi (x(i)). Since the distur-
bance is unpredictable, the value of w is not known at time zero, so the
optimal control problem must “eliminate” it in some meaningful way
so that the solution µ0 (x) does not depend on w. To eliminate w, the
optimal control problem P3*(x) is defined by

P3*(x):        V3^0(x) := inf_{µ∈M} J3(x, µ)

in which the cost J3 (·) is defined in such a way that it does not depend
on w; inf is used rather than min in this definition since the minimum
may not exist. The most popular choice for J3 (·) in the MPC literature
is
J3(x, µ) := max_{w∈W} V3(x, µ, w)

Figure 3.1: Open-loop and feedback trajectories. (a) Open-loop trajectories; (b) feedback trajectories.

in which the disturbance w is assumed to lie in W, a bounded class


of admissible disturbance sequences. Alternatively, if the disturbance
sequence is random, the cost J3 (·) may be chosen to be

J3 (x, µ) := EV3 (x, µ, w)

in which E denotes “expectation” or average, over random disturbance


sequences. For our purpose here, we adopt the simple cost

J3 (x, µ) := V3 (x, µ, 0)

in which 0 := (0, 0, 0) is the zero disturbance sequence. In this case,


J3 (x, µ) is the nominal cost, i.e., the cost associated with the nominal
system x + = x +u in which the disturbance is neglected. With this cost
function, the solution to P∗3 (x) is the DP solution, obtained previously,
to the deterministic nominal optimal control problem.
We now compare two solutions to P3 (x): the open-loop solution
in which M is restricted to be the set of control sequences, and the
feedback solution in which M is the class of admissible policies. The
solution to the first problem is the solution to the deterministic prob-
lem discussed previously; the optimal control sequence is

u0 (x) = (−(8/13)x, −(3/13)x, −(1/13)x)

in which x is the initial state at time zero. The solution to the second
problem is the sequence of control laws determined previously, also for

the deterministic  problem, using dynamic programming; the optimal


policy is µ = µ00 (·), µ10 (·), µ2 (·) where the control laws (functions)
0

µi (·), i = 0, 1, 2, are defined by

µ00 (x) := κ30 (x) = −(8/13)x ∀x ∈ R


µ10 (x) := κ20 (x) = −(3/5)x ∀x ∈ R
µ20 (x) := κ10 (x) = −(1/2)x ∀x ∈ R

The two solutions, u0 (·) and µ0 , when applied to the uncertain system
x + = x + u + w, do not yield the same trajectories for all disturbance
sequences. This is illustrated in Figure 3.1 for the three disturbance
sequences, w0 := (0, 0, 0), w1 := (1, 1, 1), and w2 := (−1, −1, −1); and
initial state x = 1 for which the corresponding state trajectories, de-
noted x0 , x1 , and x2 , are

Open-loop solution.

x0 = (1, (5/13), (2/13), (1/13))


x1 = (1, (18/13), (28/13), (40/13))
x2 = (1, −(8/13), −(24/13), −(38/13))

Feedback solution.

x0 = (1, (5/13), (2/13), (1/13))


x1 = (1, (18/13), (101/65), (231/130))
x2 = (1, −(8/13), −(81/65), −(211/130))

Even for the short horizon of 3, the superiority of the feedback so-
lution can be seen although the feedback was designed for the de-
terministic (nominal) system and therefore did not take the distur-
bance into account. For the open-loop solution x 2 (3) − x 1 (3) =
6, whereas for the feedback case x 2 (3) − x 1 (3) = 3.4; the open-
loop solution does not restrain the spread of the trajectories result-
ing from the disturbance w. If the horizon length is N, for the open-
loop solution, x 2 (N) − x 1 (N) = 2N, whereas for the feedback case
x 2 (N) − x 1 (N) → 3.24 as N → ∞. The obvious and well-known con-
clusion is that feedback control is superior to open-loop control when
uncertainty is present. Feedback control requires determination of a
control policy, however, which is a difficult task if nonlinearity and/or
constraints are features of the optimal control problem.

3.1.3 Robust and Stochastic MPC

An important feature of conventional, or deterministic, MPC discussed


in Chapter 2 is that the solution of the open-loop optimal control prob-
lem solved online is identical to that obtained by DP for the given ini-
tial state. When uncertainty is present and the state is known or ob-
servations of the state are available, feedback control is superior to
open-loop control. The optimal control problem solved online must,
therefore, permit feedback in order for its solution to coincide with
the DP solution. In robust and stochastic MPC, the decision variable
is µ, a sequence of control laws, rather than u, a sequence of control
actions. MPC in which the decision variable is a policy has been termed
feedback MPC to distinguish it from conventional MPC. Both forms of
MPC naturally provide feedback control since the control that is imple-
mented depends on the current state x in both cases. But the control
that is applied depends on whether the optimal control problem solved
is open loop, in which case the decision variable is a control sequence,
or feedback, in which case the decision variable is a feedback policy.

 solution to the optimal controlproblem PN (x)
In feedback MPC the
is the policy µ0 (x) = µ00 (·; x), µ10 (·; x), . . . , µN−1
0
(·; x) . The constitu-
ent control laws are restrictions of those determined by DP and there-
fore depend on the initial state x as implied by the notation. Thus, only
the value u0 (x) = µ0 (x; x) of the control law µ0 (·; x) at the initial state
x need be determined, while successive laws need only be determined
over a limited subset of the state space. In the example illustrated in
Figure 3.1, µ0 (·; x) need be determined only at the point x = 1, µ1 (·; x)
need only be determined in the subset [−8/13, 18/13], and µ2 (·; x) in
the subset [−81/65, 101/65], whereas in the DP solution these control
laws are defined over the infinite interval (−∞, ∞).
While feedback MPC is superior in the presence of uncertainty, the
associated optimal control problem is vastly more complex than the
optimal control problem employed in deterministic MPC. The decision
variable µ, being a sequence of control laws, is infinite dimensional;
each law or function requires, in general, an infinite dimensional grid
to specify it. The complexity is comparable to solving the DP equation,
so that MPC, which in the deterministic case replaces DP with a solvable
open-loop optimization problem, is not easily solved when uncertainty
is present. Hence much research effort has been devoted to forms of
feedback MPC that sacrifice optimality for simplicity. As in the early
days of adaptive control, many different proposals have been made.

These proposals for robust MPC are all simpler to implement than the
optimal solution provided by DP.
At the current stage of research it is perhaps premature to select
a particular approach; we have, nevertheless, selected one approach,
tube-based MPC that we describe here and in Chapter 5. There is a good
reason for our choice. It is well known that standard mathematical op-
timization algorithms may be used to obtain an optimal open-loop con-
trol sequence for an optimal control problem. What is perhaps less well
known is that there exist algorithms, the second variation algorithms,
that provide not only an optimal control sequence but also a local time-
varying feedback law of the form u(k) = ū(k) + K(k)(x(k) − x̄(k)) in
which (ū(k)) is the optimal open-loop control sequence and (x̄(k)) the
corresponding optimal open-loop state sequence. This policy provides
feedback control for states x(k) close to the nominal states x̄(k).
The second variation algorithms are perhaps too complex for rou-
tine use in MPC because they require computation of the second deriva-
tives with respect to (x, u) of f (·) and ℓ(·). When the system is linear,
the cost quadratic, and the disturbance additive, however, the opti-
mal control law for the unconstrained infinite horizon case is u = Kx.
This result may be expressed as a time-varying control law u(k) =
ū(k) + K(x(k) − x̄(k)) in which the state and control sequences (x̄(k))
and (ū(k)) satisfy the nominal difference equations x̄ + = Ax̄ + B ū,
ū = Kz, i.e., the sequences (x̄(k)) and (ū(k)) are optimal open-loop
solutions for zero disturbance and some initial state. The time-varying
control law u(k) = ū(k) + K(x(k) − x̄(k)) is clearly optimal in the
unconstrained case; it remains optimal for the constrained case in the
neighborhood of the nominal trajectory (x̄(k)) if (x̄(k)) and (ū(k)) lie
in the interior of their respective constraint sets.
These comments suggest that a time-varying policy of the form u(x,
k) = ū(k) + K(x − x̄(k)) might be adequate, at least when f (·) is
linear. The nominal control and state sequences, (ū(k)) and (x̄(k)),
respectively, can be determined by solving a standard open-loop op-
timal control problem of the form usually employed in MPC, and the
feedback matrix K can be determined offline. We show that this form
of robust MPC has the same order of online complexity as that conven-
tionally used for deterministic systems. It requires a modified form of
the online optimal control problem in which the constraints are simply
tightened to allow for disturbances, thereby constraining the trajecto-
ries of the uncertain system to lie in a tube centered on the nominal
trajectories. Offline computations are required to determine the mod-

ified constraints and the feedback matrix K. We also present, in the


last section of this chapter, a modification of this tube-based proce-
dure for nonlinear systems for which a nonlinear local feedback policy
is required.
A word of caution is necessary. Just as nominal model predictive
controllers presented in Chapter 2 may fail in the presence of uncer-
tainty, the controllers presented in this chapter may fail if the actual
uncertainty does not satisfy our assumptions. In robust MPC this may
occur when the disturbance that we assume to be bounded exceeds the
assumed bounds; the controlled systems are robust only to the speci-
fied uncertainties. As always, online fault diagnosis and safe recovery
procedures may be required to protect the system from unanticipated
events.

3.1.4 Tubes

The approach that we adopt is motivated by the following observation.


Both open-loop and feedback control generate, in the presence of un-
certainty, a bundle or tube of trajectories, each trajectory in the bundle
or tube corresponding to a particular realization of the uncertainty.
In Figure 3.1(a), the tube corresponding to u = u0 (x) and initial state
x = 1, is (X0 , X1 , X2 , X3 ) where X0 = {1}; for each i, Xi = {φ(i; x,
u, w) | w ∈ W }, the set of states at time i generated by all possible
realizations of the disturbance sequence. In robust MPC the state con-
straints must be satisfied by every trajectory in the tube. In stochastic
MPC the tube has the property that state sequences lie within this tube
with a prespecified probability.
Control of uncertain systems is best viewed as control of tubes
rather than trajectories; the designer chooses, for each initial state,
a tube in which all realizations of the state trajectory are controlled to
lie (robust MPC), or in which the realizations lie with a given probabil-
ity (stochastic MPC). By suitable choice of the tube, satisfaction of state
and control constraints may be guaranteed for every realization of the
disturbance sequence, or guaranteed with a given probability.
Determination of a suitable tube (X0 , X1 , . . .) corresponding to a
given initial state x and policy µ is difficult even for linear systems,
however, and even more difficult for nonlinear systems. Hence, in the
sequel, we show for robust MPC how simple tubes that bound all real-
izations of the state trajectory may be constructed. For example, for
linear systems with convex constraints, a tube (X0 , X1 , . . .) may be
designed to bound all realizations of the state trajectory; for each i,

Xi = {x̄(i)} ⊕ S, x̄(i) is the state at time i of a deterministic system, Xi


is a polytope, and S is a positive invariant set. This construction per-
mits robust model predictive controllers to be designed with not much
more computation online than that required for deterministic systems.
The stochastic MPC controllers are designed to satisfy constraints with
a given probability.

3.1.5 Difference Inclusion Description of Uncertain Systems

Here we introduce some notation that will be useful in the sequel. A


deterministic discrete time system is usually described by a difference
equation
x + = f (x, u) (3.1)
We use φ(k; x, i, u) to denote the solution of (3.1) at time k when the ini-
tial state at time i is x and the control sequence is u = (u(0), u(1), . . .);
if the initial time i = 0, we write φ(k; x, u) in place of φ(k; x, 0, u).
Similarly, an uncertain system may be described by the difference equa-
tion

x + = f (x, u, w) (3.2)

in which the variable w that represents the uncertainty takes values


in a specified set W. We use φ(k; x, i, u, w) to denote the solution
of (3.2) when the initial state at time i is x and the control and dis-
turbance sequences are, respectively, u = (u(0), u(1), . . .) and w =
(w(0), w(1), . . .). The uncertain system may alternatively be described
by a difference inclusion of the form

x + ∈ F (x, u)

in which F (·) is a set-valued map. We use the notation F : Rn × Rm ⇉ Rn or1 F : Rn × Rm → 2^(Rn) to denote a function that maps points in
Rn × Rm into subsets of Rn . If the uncertain system is described by
(3.2), then

F (x, u) = f (x, u, W) := {f (x, u, w) | w ∈ W}

If x is the current state, and u the current control, the successor state
x + lies anywhere in the set F (x, u). When the control policy µ :=
(µ0 (·), µ1 (·), . . .) is employed, the state evolves according to

x + ∈ F (x, µk (x)), k+ = k + 1 (3.3)


1 For any set X, 2^X denotes the set of all subsets of X.

in which x is the current state, k the current time, and x + the successor
state at time k+ = k + 1. The system described by (3.3) does not have a
single solution for a given initial state; it has a solution for each possible
realization w of the disturbance sequence. We use S(x, i) to denote the
set of solutions of (3.3) if the initial state is x at time i. If φ∗ (·) ∈ S(x,
i) then
φ∗ (t) = φ(t; x, i, µ, w)

for some admissible disturbance sequence w in which φ(t; x, i, µ, w)


denotes the solution at time t of

x + = f (x, µk (x), w)

when the initial state is x at time i and the disturbance sequence is


w. The policy µ is defined, as before, to be the sequence of control
laws (µ0 (·), µ1 (·), . . . , µN−1 (·)). The tube X = (X0 , X1 , . . .), discussed in
Section 3.5, generated when policy µ is employed, satisfies

Xk+1 = F(Xk , µk (·)) := {f (x, µk (x), w) | x ∈ Xk , w ∈ W}
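This set recursion can be explored numerically by pushing a finite sample of the disturbance set through the closed-loop map. The following Python sketch is illustrative only: the scalar dynamics f, the feedback law mu, and the sampled disturbance set W_samples are assumptions chosen for the example, and sampling produces an inner approximation of each set Xk rather than the exact tube.

import numpy as np

# Minimal sketch (illustrative assumptions): approximate the tube
# X_{k+1} = {f(x, mu_k(x), w) | x in X_k, w in W} by sampling W.

def f(x, u, w):
    return x + u + w                      # assumed scalar dynamics

def mu(k, x):
    return -0.5 * x                       # assumed feedback law mu_k

W_samples = np.linspace(-1.0, 1.0, 5)     # finite sample of W = [-1, 1]

def propagate_tube(X0, N):
    """Return a list; element k collects sampled states reachable at time k."""
    tube = [np.array(X0, dtype=float)]
    for k in range(N):
        successors = [f(x, mu(k, x), w) for x in tube[k] for w in W_samples]
        tube.append(np.unique(np.round(successors, 6)))
    return tube

if __name__ == "__main__":
    for k, Xk in enumerate(propagate_tube([1.0], 3)):
        print(f"X_{k}: [{Xk.min():.3f}, {Xk.max():.3f}] ({Xk.size} sampled states)")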

3.2 Nominal (Inherent) Robustness


3.2.1 Introduction

Because feedback MPC is complex, it is natural to inquire if nominal


MPC, i.e., MPC based on the nominal system ignoring uncertainty, is
sufficiently robust to uncertainty. Before proceeding with a detailed
analysis, a few comments may be helpful.
MPC uses, as a Lyapunov function, the value function of a paramet-
ric optimal control problem. Often the value function is continuous,
but this is not necessarily the case, especially if state and/or terminal
constraints are present. It is also possible for the value function to be
continuous but the associated control law to be discontinuous; this can
happen, for example, if the minimizing control is not unique.
It is important to realize that a control law may be stabilizing but
not robustly stabilizing; arbitrary perturbations, no matter how small,
can destabilize the system. This point is illustrated in Teel (2004) with
the following discontinuous autonomous system (n = 2, x = (x1 , x2 ))

x + = f (x),      f (x) = (0, |x|) if x1 ≠ 0,      f (x) = (0, 0) otherwise


If the initial state is x = (1, 1), then φ(1; x) = (0, 2) and φ(2; x) = (0,
0), with similar behavior for other initial states. In fact, all solutions
satisfy
|φ(k; x)| ≤ β(|x| , k)
in which β(·), defined by

β(|x| , k) := 2(1/2)^k |x|

is a KL function, so that the origin is globally asymptotically stable


(GAS). Consider now a perturbed system satisfying

x + = (δ, |x| + δ)

in which δ > 0 is a constant perturbation that causes x1 to remain strictly positive. If the initial state is x = ε(1, 1), then x1 (k) = δ for k ≥ 1, and x2 (k) > ε√2 + kδ → ∞ as k → ∞, no matter how small δ
and ε are. Hence the origin is unstable in the presence of an arbitrarily
small perturbation; global asymptotic stability is not a robust property
of this system.
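A few lines of Python make the contrast concrete. The sketch below simulates the nominal map and the perturbed map side by side; the perturbation size delta and the number of steps are arbitrary choices for illustration.

import numpy as np

# Minimal sketch: the nominal system is GAS, but a small constant
# perturbation delta prevents x2 from converging (illustrative values).

def f_nominal(x):
    x1, x2 = x
    return (0.0, float(np.hypot(x1, x2))) if x1 != 0 else (0.0, 0.0)

def f_perturbed(x, delta):
    return (delta, float(np.hypot(*x)) + delta)

x_nom = x_pert = (1.0, 1.0)
delta = 0.1
for k in range(10):
    print(f"k={k}: nominal x2 = {x_nom[1]:.4f}   perturbed x2 = {x_pert[1]:.4f}")
    x_nom = f_nominal(x_nom)
    x_pert = f_perturbed(x_pert, delta)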
This example may appear contrived but, as Teel (2004) points out,
a similar phenomenon can arise in receding horizon optimal control of
a continuous system. Consider the following system

x + = (x1 (1 − u), |x| u)

in which the control u is constrained to lie in the set U = [−1, 1]. Sup-
pose we choose a horizon length N = 2 and choose Xf to be the origin.
If x1 ≠ 0, the only feasible control sequence steering x to 0 in two
steps is u = (1, 0); the resulting state sequence is (x, (0, |x|), (0, 0)).
Since there is only one feasible control sequence, it is also optimal, and
κ2 (x) = 1 for all x such that x1 ≠ 0. If x1 = 0, then the only optimal
control sequence is u = (0, 0) and κ2 (x) = 0. The resultant closed-loop
system satisfies

x + = f (x) := (x1 (1 − κ2 (x)), |x| κ2 (x))

in which κ2 (x) = 1 if x1 ≠ 0, and κ2 (x) = 0 otherwise. Thus

f (x) = (0, |x|) if x1 ≠ 0,      f (x) = (0, 0) otherwise        (3.4)

The system x + = f (x) is the discontinuous system analyzed previ-


ously. Thus, receding horizon optimal control of a continuous system
has resulted in a discontinuous system that is globally asymptotically
stable (GAS) but has no robustness.

3.2.2 Difference Inclusion Description of Discontinuous Systems

Consider a system
x + = f (x)

in which f (·) is not continuous. An example of such a system occurred


in the previous subsection where f (·) satisfies (3.4). Solutions of this
system are very sensitive to the value of x1 . An infinitesimal change
in x1 at time zero, say, from 0 can cause a substantial change in the
subsequent trajectory resulting, in this example, in a loss of robustness.
To design a robust system, one must take into account, in the design
process, the system’s extreme sensitivity to variations in state. This
can be done by regularizing the system (Teel, 2004). If f (·) is locally
bounded,2 the regularization of x + = f (x) is defined to be

x + ∈ F (x) := ∩δ>0 cl f ({x} ⊕ δB)

in which B is the closed unit ball so that {x} ⊕ δB = {z | |z − x| ≤ δ} and cl A denotes the closure of set A. At points where f (·) is continuous,
F (x) = {f (x)}, i.e., F (x) is the single point f (x). If f (·) is piecewise
continuous, e.g., if f (x) = x if x < 1 and f (x) = 2x if x ≥ 1, then
F (x) = {limxi →x f (xi )}, the set of all limits of f (xi ) as xi → x. For
our example immediately above, F (x) = {x} if x < 1 and F (x) = {2x}
if x > 1. When x = 1, the limit of f (xi ) as xi → 1 from below is 1 and
the limit of f (xi ) as xi → 1 from above is 2, so that F (1) = {1, 2}. The
regularization of x + = f (x) where f (·) is defined in (3.4) is x + ∈ F (x)
where F (·) is defined by

F (x) = {(0, |x|)}                 x1 ≠ 0        (3.5)
F (x) = {(0, |x|), (0, 0)}         x1 = 0        (3.6)

2 A function f : Rp → Rn is locally bounded if, for every x ∈ Rp , there exists a neighborhood N of x and a c > 0 such that |f (z)| ≤ c for all z ∈ N .

If the initial state is x = (1, 1), as before, then the difference inclusion
generates the following tube

X0 = {(1, 1)},    X1 = {(0, √2)},    X2 = {(0, √2), (0, 0)},    . . .

with Xk = X2 for all k ≥ 2. The set Xk of possible states clearly does


not converge to the origin even though the trajectory generated by the
original system does.
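For this example the difference-inclusion tube can be enumerated exactly in a few lines of Python; the sketch below reproduces the sets X0, X1, X2 listed above.

import numpy as np

# Minimal sketch: propagate the set-valued regularization F of (3.5)-(3.6)
# from X0 = {(1, 1)} and list the reachable sets X_k.

def F(x):
    x1, x2 = x
    norm = round(float(np.hypot(x1, x2)), 12)
    if x1 != 0:
        return {(0.0, norm)}
    return {(0.0, norm), (0.0, 0.0)}       # both limits are possible when x1 = 0

X = {(1.0, 1.0)}
for k in range(4):
    print(f"X_{k} = {sorted(X)}")
    X = set().union(*(F(x) for x in X))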

3.2.3 When Is Nominal MPC Robust?

The discussion in Section 2.4.1 shows that nominal MPC is not nec-
essarily robust. It is therefore natural to ask under what conditions
nominal MPC is robust. To answer this, we have to define robustness
precisely. In Appendix B, we define robust stability, and robust asymp-
totic stability, of a set. We employ this concept later in this chapter in
the design of robust model predictive controllers that for a given ini-
tial state in the region of attraction, steer every realization of the state
trajectory to this set. Here, however, we address a slightly different
question: when is nominal MPC that steers every trajectory in the re-
gion of attraction to the origin robust? Obviously, the disturbance will
preclude the controller from steering the state of the perturbed system
to the origin; the best that can be hoped for is that the controller will
steer the state to some small neighborhood of the origin. Let the nom-
inal (controlled) system be described by x + = f (x) in which f (·) is
not necessarily continuous, and let the perturbed system be described
by x + = f (x + e) + w. Also let Sδ (x) denote the set of solutions for
the perturbed system with initial state x and perturbation sequences
e := (e(0), e(1), e(2), . . .) and w := (w(0), w(1), w(2), . . .) satisfying
max{∥e∥ , ∥w∥} ≤ δ where, for any sequence ν, ∥ν∥ denotes the sup
norm, supk≥0 |ν(k)|. The definition of robustness that we employ is
(Teel, 2004)
Definition 3.1 (Robust global asymptotic stability). Let A be compact,
and let d(x, A) := mina {|a − x| | a ∈ A}, and |x|A := d(x, A). The
set A is robustly globally asymptotically stable (RGAS) for x + = f (x) if
there exists a class KL function β(·) such that for each ε > 0 and each
compact set C, there exists a δ > 0 such that for each x ∈ C and each
φ ∈ Sδ (x), there holds |φ(k; x)|A ≤ β(|x|A , k) + ε for all k ∈ I≥0 .
Taking the set A to be the origin (A = {0}) so that |x|A = |x|, we
see that if the origin is robustly asymptotically stable for x + = f (x),

then, for each ε > 0, there exists a δ > 0 such that every trajectory of the
perturbed system x + = f (x+e)+w with max{∥e∥ , ∥w∥} ≤ δ converges
to εB (B is the closed unit ball); this is the attractivity property. Also,
if the initial state x satisfies |x| ≤ β−1 (ε, 0), then |φ(k; x)| ≤ β(β−1 (ε, 0), 0) + ε = 2ε for all k ∈ I≥0 and for all φ ∈ Sδ , which is the Lyapunov stability property. Here the function β−1 (· , 0) is the inverse of the function α ↦ β(α, 0).
We return to the question: under what conditions is asymptotic
stability robust? We first define a slight extension to the definition of
a Lyapunov function given in Chapter 2: A function V : Rn → R≥0 is
defined to be a Lyapunov function for x + = f (x) in X and set A if there
exist functions αi ∈ K∞ , i = 1, 2 and a continuous, positive definite
function α3 (·) such that, for any x ∈ X

α1 (|x|A ) ≤ V (x) ≤ α2 (|x|A )


V (f (x)) ≤ V (x) − α3 (|x|A )

in which |x|A is defined to be distance d(x, A) of x from the set A. The


following important result (Teel, 2004; Kellett and Teel, 2004) answers
the important question, “When is asymptotic stability robust?”
Theorem 3.2 (Lyapunov function and RGAS). Suppose A is compact and
that f (·) is locally bounded.3 The set A is RGAS for the system x + =
f (x) if and only if the system admits a continuous global Lyapunov
function for A.
This result proves the existence of a δ > 0 that specifies the per-
mitted magnitude of the perturbations, but does not give a value for
δ. Robustness against perturbations of a specified magnitude may be
required in practice; in the following section we show how to achieve
this aim if it is possible.
In MPC, the value function of the finite horizon optimal control prob-
lem that is solved online is used as a Lyapunov function. In certain
cases, such as linear systems with polyhedral constraints, the value
function is known to be continuous; see Proposition 7.13. Theorem 3.2,
suitably modified because the region of attraction is not global, then
shows that asymptotic stability is robust, i.e., that asymptotic stability
is not destroyed by small perturbations.
Theorem 3.2 characterizes robust stability of the set A for the sys-
tem x + = f (x) in the sense that it shows robust stability is equivalent
3 A function f : X → Y is locally bounded if, for every x ∈ X, there exists a neighborhood N of x such that the set f (N ) in Y is bounded.

to the existence of a continuous global Lyapunov function for the sys-


tem. It also is possible to characterize robustness of x + = f (x) by
global asymptotic stability of its regularization x + ∈ F (x). It is shown
in Appendix B that for the system x + ∈ F (x), the set A is GAS if there
exists a KL function β(·) such that for each x ∈ Rn and each solution
φ(·) ∈ S(x) of x + ∈ F (x) with initial state x, |φ(k)|A ≤ β(|x|A , k) for all k ∈ I≥0 . The following alternative characterization of robust stability
of A for the system x + = f (x) appears in (Teel, 2004).

Theorem 3.3 (Robust global asymptotic stability and regularization).


Suppose A is compact and that f (·) is locally bounded. The set A is
RGAS for the system x + = f (x) if and only if the set A is GAS for
x + ∈ F (x), the regularization of x + = f (x).

We saw previously that for f (·) and F (·) defined respectively in


(3.4) and (3.6), the origin is not globally asymptotically stable for the
regularization x + ∈ F (x) of x + = f (x) since not every solution of
x + ∈ F (x) converges to the origin. Hence the origin is not RGAS for
this system.

3.2.4 Robustness of Nominal MPC

If the origin is asymptotically stable for the nominal version of an un-


certain system, it is sometimes possible to establish that there exists a
set A that is asymptotically stable for the uncertain system. We con-
sider the uncertain system described by

x + = f (x, u, w) (3.7)

in which w is a bounded additive disturbance and f (·) is continuous.


The system is subject to the control constraints

u(i) ∈ U ∀i ∈ I≥0

The set U is compact and contains the origin in its interior. The dis-
turbance w may take any value in the set W. As before, u denotes
the control sequence (u(0), u(1), . . .) and w the disturbance sequence
(w(0), w(1), . . .); φ(i; x, u, w) denotes the solution of (3.7) at time i if
the initial state is x, and the control and disturbance sequences are,
respectively, u and w. The nominal system is described by

x + = f¯(x, u) := f (x, u, 0) (3.8)



and φ̄(i; x, u) denotes the solution of the nominal system (3.8) at time
i if the initial state is x and the control sequence is u. The nominal
control problem, defined subsequently, includes, for reasons discussed
in Chapter 2, a terminal constraint

x(N) ∈ Xf

The nominal optimal control problem is

PN (x) :    VN0 (x) = min_u {VN (x, u) | u ∈ UN (x)}
            u0 (x) = arg min_u {VN (x, u) | u ∈ UN (x)}

in which u0 = (u00 (x), u01 (x), . . . , u0N−1 (x)) and the nominal cost VN (·) is defined by

VN (x, u) := Σ_{i=0}^{N−1} ℓ(x(i), u(i)) + Vf (x(N))        (3.9)

In (3.9) and (3.10), x(i) := φ̄(i; x, u), the state of the nominal system
at time i, for all i ∈ I0:N−1 = {0, 1, 2, . . . , N − 1}. The set of admissible
control sequences UN (x) is defined by

UN (x) := {u | u(i) ∈ U ∀i ∈ I0:N−1 , x(N) ∈ Xf } (3.10)

which is the set of control sequences such that the nominal system
satisfies the nominal control and terminal constraints when the initial
state at time zero is x. Thus, UN (x) is the set of feasible controls for
the nominal optimal control problem PN (x). The set XN ⊂ Rn , defined
by
XN := {x ∈ Rn | UN (x) ≠ ∅}
is the domain of the value function VN0 (·), i.e., the set of x for which
PN (x) has a solution; XN is also the domain of the minimizer u0 (x).
The value of the nominal control at state x is u0 (0; x), the first control
in the sequence u0 (x). Hence the implicit nominal MPC control law is
κN : XN → U defined by

κN (x) = u0 (0; x)

We assume, as before, that ℓ(·) and Vf (·) are defined by

ℓ(x, u) := (1/2)(x ′ Qx + u′ Ru) Vf (x) := (1/2)x ′ Pf x



in which Q, R, and Pf are all positive definite. We also assume that


Vf (·) and Xf := {x | Vf (x) ≤ cf } for some cf > 0 satisfy the standard
stability assumption that, for all x ∈ Xf , there exists a u = κf (x) ∈ U
such that Vf (f¯(x, u)) ≤ Vf (x) − ℓ(x, u) and f¯(x, u) ∈ Xf . Because Vf (·) is quadratic, there exist positive constants c1^f and c2^f such that c1^f |x|2 ≤ Vf (x) ≤ c2^f |x|2 and Vf (f¯(x, κf (x))) ≤ Vf (x) − c1^f |x|2 .
Under these assumptions, as shown in Chapter 2, there exist posi-
tive constants c1 and c2 , c2 > c1 , satisfying

c1 |x|2 ≤ VN0 (x) ≤ c2 |x|2 (3.11)


VN0 (f¯(x, κN (x))) ≤ VN0 (x) − c1 |x|2 (3.12)

for all x ∈ XN . It then follows that

VN0 (x + ) ≤ γVN0 (x)

for all x ∈ XN with x + := f¯(x, κN (x)) and γ := (1 − c1 /c2 ) ∈ (0, 1). Hence, VN0 (x(i)) decays exponentially to zero as i → ∞; moreover, VN0 (x(i)) ≤ γ^i VN0 (x(0)) for all i ∈ I≥0 . From (3.11), the origin is exponentially stable, with a region of attraction XN for the nominal system
under MPC.
We now examine the consequences of applying the nominal model
predictive controller κN (·) to the uncertain system (3.7). The controlled
uncertain system satisfies the difference equation

x + = f (x, κN (x), w) (3.13)

in which w can take any value in W. It is obvious that the state x(i)
of the controlled system (3.13) cannot tend to the origin as i → ∞;
the best that can be hoped for is that x(i) tends to and remains in
some neighborhood Rb of the origin. We shall establish this, if the
disturbance w is sufficiently small, using the value function VN0 (·) of
the nominal optimal control problem as a Lyapunov function for the
controlled uncertain system (3.13).
To analyze the effect of the disturbance w we employ the follow-
ing useful technical result (Allan, Bates, Risbeck, and Rawlings, 2017,
Proposition 20).

Proposition 3.4 (Bound for continuous functions). Let C ⊆ D ⊆ Rn with


C compact and D closed. If f (·) is continuous, there exists an α(·) ∈ K∞
such that, for all x ∈ C and y ∈ D, we have that |f (x) − f (y)| ≤
α(|x − y|).

Since XN is not necessarily robustly positive invariant (see Defini-


tion 3.6) for the uncertain system x + = f (x, κN (x), w), we replace it
by a subset, Rc := levc VN0 ={x | VN0 (x) ≤ c} for c > 0, a sublevel set of
VN0 (·) contained in XN . Let Rb denote levb VN0 ={x | VN0 (x) ≤ b}, the
smallest sublevel set containing Xf . Because VN0 (·) is lower semicontin-
uous (Rockafellar and Wets, 1998, Exercise 1.19) and VN0 (x) ≥ c1 |x|2 ,
both Rb and Rc are compact. We show below, if W is sufficiently small,
then Rb and Rc are robustly positive invariant for the uncertain system
x + = f (x, κN (x), w), w ∈ W and every trajectory of x + = f (x, κN (x),
w), commencing at a state x ∈ Rc , converges to Rb and thereafter
remains in this set.
Satisfaction of the terminal constraint. Our first task is to show that
the terminal constraint x(N) ∈ Xf is satisfied by the uncertain system if W is sufficiently small. Let u∗ (x) := (u01 (x), u02 (x), . . . , u0N−1 (x)) and let ũ(x) := (u∗ (x), κf (x 0 (N; x))). Since Vf∗ (·), defined by

Vf∗ (x, u) := Vf (φ̄(N; x, u))

is continuous, it follows from Proposition 3.4 that there exists a K∞


function αa (·) such that

|Vf∗ (x + , u) − Vf∗ (x̄ + , u)| ≤ αa (|x + − x̄ + |)

for all (x̄ + , u) ∈ Rc × UN and all (x + , u) ∈ f (Rc , U, W) × UN . This result


holds, in particular, for x̄ + := f (x, κN (x), 0), x + := f (x, κN (x), w) and
u = ũ(x) with x ∈ Rc . As shown in Chapter 2, x 0 (N; x̄ + ) ∈ Xf ; we wish to show x 0 (N; x + ) ∈ Xf .
Since Vf∗ (x̄ + , ũ(x)) = Vf (f (x 0 (N; x), κf (x 0 (N; x)))) ≤ γf cf and since Vf∗ (x + , ũ(x)) ≤ Vf∗ (x̄ + , ũ(x)) + αa (|x + − x̄ + |), it follows that Vf (x 0 (N; x + )) ≤ Vf (x 0 (N; x̄ + )) + αa (|x + − x̄ + |) ≤ γf cf + αa (|x + − x̄ + |). Hence, x 0 (N; x) ∈ Xf implies x 0 (N; x + ) ∈ Xf if αa (|x + − x̄ + |) ≤ (1 − γf )cf .
Robust positive invariance of Rc for the controlled uncertain sys-
tem. Suppose x ∈ Rc . Since VN (·) is continuous, it follows from
Proposition 3.4 that there exists a K∞ function αb (·) such that

|VN (x + , u) − VN (x̄ + , u)| ≤ αb (|x + − x̄ + |)

for all (x + , u) ∈ f (Rc , U, W) × UN , all (x̄ + , u) ∈ Rc × UN . This result


holds in particular for x + = f (x, κN (x), w), x̄ + = f (x, κN (x), 0) and
u = ũ(x) with x ∈ Rc . Hence, if x ∈ Rc

VN (x + , ũ(x)) ≤ VN (x̄ + , ũ(x)) + αb (|x + − x̄ + |)

Since VN (x̄ + , ũ(x)) ≤ VN0 (x) − c1 |x|2 and since the control ũ(x), x ∈ Rc , satisfies both the control and terminal constraints if αa (|x + − x̄ + |) ≤ (1 − γf )cf , it follows that

VN0 (x + ) ≤ VN (x + , ũ(x)) ≤ VN0 (x) − c1 |x|2 + αb (|x + − x̄ + |)

so that

VN0 (x + ) ≤ γVN0 (x) + αb (|x + − x̄ + |)

Hence x ∈ Rc implies x + = f (x, κN (x), w) ∈ Rc for all w ∈ W if


αa (|x + − x̄ + |) ≤ (1 − γf )cf and αb (|x + − x̄ + |) ≤ (1 − γ)c.
Robust positive invariance of Rb for the controlled uncertain sys-
tem. Similarly, x ∈ Rb implies x + = f (x, κN (x), w) ∈ Rb for all
w ∈ W if αa (|x + − x̄ + |) ≤ (1 − γf )cf and αb (|x + − x̄ + |) ≤ (1 − γ)b.
Descent property of VN0 (·) in Rc \ Rb . Suppose that x ∈ Rc \ Rb and
that αa (|x + − x̄ + |) ≤ (1 − γf )cf . Then because ũ(x) ∈ UN (x + ), we have that PN (x + ) is feasible and thus VN0 (x + ) is well defined. As above,
we have that VN0 (x + ) ≤ γVN0 (x) + αb (|x + − x̄ + |). Let γ ∗ ∈ (γ, 1). If
αb (|x + − x̄ + |) ≤ (γ ∗ − γ)b, we have that

VN0 (x + ) ≤ γVN0 (x) + (γ ∗ − γ)b


< γVN0 (x) + (γ ∗ − γ)VN0 (x)
= γ ∗ VN0 (x)

because VN0 (x) > b.


Summary. These conditions can be simplified if we assume that f (·) is uniformly Lipschitz continuous in w with Lipschitz constant L so that |f (x, κN (x), w) − f (x, κN (x), 0)| ≤ L |w| for all (x, w) ∈ Rc × W. The function f (·) has this property with L = 1 if f (x, u, w) = f ′ (x, u) + w. Under this assumption, the four conditions become

1. αa (L |w|) ≤ (1 − γf )cf
2. αb (L |w|) ≤ (1 − γ)c
3. αb (L |w|) ≤ (1 − γ)b
4. αb (L |w|) ≤ (γ ∗ − γ)b
Figure 3.2: The sets XN , Rb , and Rc .

Let δ∗ > 0 denote the largest δ such that all four conditions are satisfied
if w ∈ W with |W| ≤ δ.4 Condition 3 can be satisfied if b ≥ δ∗ /(1 − γ).

Proposition 3.5 (Robustness of nominal MPC). Suppose all assumptions


in Section 3.2.4 are satisfied and that |W| ≤ δ∗ and c > b. Then, any ini-
tial state x ∈ Rc of the controlled system x + = f (x, κN (x), w) is steered
to the set Rb in finite time for all admissible disturbance sequences w
satisfying w(i) ∈ W for all i ∈ I≥0 . Thereafter, the state remains in Rb
for all admissible disturbance sequences.

Figure 3.2 illustrates this result.

3.3 Min-Max Optimal Control: Dynamic Programming Solution
3.3.1 Introduction

In this section we show how robust control of an uncertain system may


be achieved using dynamic programming (DP). Our purpose here is to
use DP to gain insight. The results we obtain here are not of practical
use for complex systems, but reveal the nature of the problem and
show what the ideal optimal control problem solved online should be.
4 |W| := max w {|w| | w ∈ W}

In Section 3.2 we examined the inherent robustness of an asymp-


totically stable system. If uncertainty is present, and it always is, it is
preferable to design the controller to be robust, i.e., able to cope with
some uncertainty. In this section we discuss the design of a robust
controller for the system

x + = f (x, u, w) (3.14)

in which a bounded disturbance input w models the uncertainty. The


disturbance is assumed to satisfy w ∈ W where W is compact convex,
and contains the origin in its interior. The controlled system is required
to satisfy the same state and control constraints as above, namely (x,
u) ∈ Z as well as a terminal constraint x(N) ∈ Xf . The constraint (x,
u) ∈ Z may be expressed equivalently as x ∈ X and u ∈ U(x) in which X = {x | ∃ u such that (x, u) ∈ Z} and U(x) = {u | (x, u) ∈ Z}. Because of the disturbance, superior control may be achieved
by employing feedback, in the form of a control policy, i.e., a sequence
of control laws rather than employing open-loop control in the form of
a sequence of control actions. Each control law is a function that maps
states into control actions; if the control law at time i is µi (·), then
the system at time i satisfies x(i + 1) = f (x(i), µi (x(i))). Because of
uncertainty, feedback and open-loop control for a given initial state are
not equivalent.
The solution at time k of (3.14) with control and disturbance se-
quences u = (u(0), . . . , u(N − 1)) and w = (w(0), . . . , w(N − 1)) if the
initial state is x at time 0 is φ(k; x, u, w). Similarly, the solution at
time k due to feedback policy µ = (µ0 (·), . . . , µN−1 (·)) and disturbance
sequence w is denoted by φ(k; x, µ, w). As discussed previously, the
cost may be taken to be that of the nominal trajectory, or the average,
or maximum taken over all possible realizations of the disturbance se-
quence. Here we employ, as is common in the literature, the maximum
over all realizations of the disturbance sequence w, and define the cost
due to policy µ with initial state x to be

VN (x, µ) := max_w {JN (x, µ, w) | w ∈ W }        (3.15)

in which W = WN is the set of admissible disturbance sequences, and


JN (x, µ, w) is the cost due to an individual realization w of the distur-
bance process and is defined by

JN (x, µ, w) := Σ_{i=0}^{N−1} ℓ(x(i), u(i), w(i)) + Vf (x(N))        (3.16)

in which µ = (µ0 (·), µ1 (·), . . . , µN−1 (·)), x(i) = φ(i; x, µ, w), and u(i) =
µi (x(i)). Let M(x) denote the set of feedback policies µ that for a
given initial state x satisfy: the state and control constraints, and the
terminal constraint for every admissible disturbance sequence w ∈ W .
The first control law µ0 (·) in µ may be replaced by a control action
u0 = µ0 (x) to simplify optimization, since the initial state x is known
whereas future states are uncertain. The set of admissible control poli-
cies M(x) is defined by

M(x) := {µ | µ0 (x) ∈ U(x), φ(i; x, µ, w) ∈ X, µi (φ(i; x, µ, w)) ∈ U(x) ∀i ∈ I0:N−1 , φ(N; x, µ, w) ∈ Xf , ∀w ∈ W }

The robust optimal control problem is

PN (x) :    inf_µ {VN (x, µ) | µ ∈ M(x)}        (3.17)

The solution to PN (x), if it exists, is the policy µ0 (x)


 
µ0 (x) = (µ00 (· ; x), µ10 (· ; x), . . . , µ0N−1 (· ; x))

and the value function is VN0 (x) = VN (x, µ0 (x)).


Dynamic programming solves problem PN (x) with horizon N for
all x such that the problem is feasible, yielding the optimal control
policy µ0 (·) = (µ0 (·), . . . , µN−1 (·)) for the optimal control problem with
horizon N. In doing so, it also solves, for each i ∈ I1:N , problem Pi (x)
yielding the optimal control policy for the problem with horizon i.

3.3.2 Properties of the Dynamic Programming Solution

As for deterministic optimal control, the value function and implicit


control law may, in principle, be obtained by DP. But DP is, in most
cases, impossible to use because of its large computational demands.
There are, of course, important exceptions such as H2 and H∞ opti-
mal control for unconstrained linear systems with quadratic cost func-
tions. DP also can be used for low dimensional constrained optimal
control problems when the system is linear, the constraints are affine,
and the cost is affine or quadratic. Even when DP is computationally
prohibitive, however, it remains a useful tool because of the insight
it provides. Because of the cost definition, min-max DP is required.
For each i ∈ {0, 1, . . . , N}, let Vi0 (·) and κi (·) denote, respectively, the
partial value function and the optimal solution to the optimal control

problem Pi defined by (3.17) with i replacing N. The DP recursion


equations for computing these functions are

Vi0 (x) = min_{u∈U(x)} max_{w∈W} {ℓ(x, u, w) + V0i−1 (f (x, u, w)) | f (x, u, W) ⊆ Xi−1 }
κi (x) = arg min_{u∈U(x)} max_{w∈W} {ℓ(x, u, w) + V0i−1 (f (x, u, w)) | f (x, u, W) ⊆ Xi−1 }
Xi = {x ∈ X | ∃ u ∈ U(x) such that f (x, u, W) ⊆ Xi−1 }

with boundary conditions

V00 (x) = Vf (x)        X0 = Xf

In these equations, the subscript i denotes time to go so that κi (·) :=


µN−i (·) (equivalently µi (·) := κN−i (·)). In particular, κN (·) = µ0 (·). For
each i, Xi is the domain of Vi0 (·) (and κi (·)) and is therefore the set of
states x for which a solution to problem Pi (x) exists. Thus Xi is the
set of states that can be robustly steered by state feedback, i.e., by a
policy µ ∈ M(x), to Xf in i steps or less satisfying all constraints for
all disturbance sequences. It follows from these definitions that

Vi0 (x) = max_{w∈W} {ℓ(x, κi (x), w) + V0i−1 (f (x, κi (x), w))}        (3.18)

as discussed in Exercise 3.1.
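For low-dimensional problems the recursion can be carried out approximately on a grid. The sketch below is illustrative only: the scalar dynamics, stage and terminal costs, constraint sets, grids, and the nearest-neighbor table lookup are all assumptions made for the example. It computes grid approximations of Vi0 and κi and reports the resulting feasible sets Xi.

import numpy as np

# Minimal sketch of min-max DP on a grid (all problem data are assumptions).
xs = np.linspace(-3.0, 3.0, 61)            # gridded state set X
us = np.linspace(-1.0, 1.0, 21)            # gridded control set U
ws = np.array([-0.1, 0.0, 0.1])            # gridded disturbance set W
N  = 3

def f(x, u, w): return x + u + w           # assumed dynamics
def stage(x, u, w): return x**2 + u**2     # assumed stage cost
Vf = xs**2                                 # terminal cost V_f on the grid
Xf_mask = np.abs(xs) <= 0.5                # terminal set X_f

def lookup(V, mask, x):
    """Nearest-grid value of V at x; +inf once x leaves the feasible grid."""
    if x < xs[0] or x > xs[-1]:
        return np.inf
    k = int(np.argmin(np.abs(xs - x)))
    return V[k] if mask[k] else np.inf

V, mask = Vf.copy(), Xf_mask.copy()        # V_0^0 = V_f, X_0 = X_f
for i in range(1, N + 1):                  # i = time to go
    Vnew = np.full_like(V, np.inf)
    kappa = np.zeros_like(V)               # grid values of the law kappa_i
    for k, x in enumerate(xs):
        best = np.inf
        for u in us:
            worst = max(stage(x, u, w) + lookup(V, mask, f(x, u, w)) for w in ws)
            if worst < best:
                best, kappa[k] = worst, u
        Vnew[k] = best
    V, mask = Vnew, np.isfinite(Vnew)      # X_i = domain of V_i^0
    print(f"i={i}: X_{i} approx. [{xs[mask].min():.2f}, {xs[mask].max():.2f}]")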


As in the deterministic case studied in Chapter 2, we are interested
in obtaining conditions that ensure that the optimal finite horizon con-
trol law κ00 (·) is stabilizing. To do this we replace the stabilizing As-
sumption 2.14 in Section 2.4.2 of Chapter 2 by conditions appropriate
to the robust control problem. The presence of a disturbance requires
us to generalize some earlier definitions; we therefore define the terms
robustly control invariant and robustly positive invariant that general-
ize our previous definitions of control invariant and positive invariant
respectively.

Definition 3.6 (Robust control invariance). A set X ⊆ Rn is robustly


control invariant for x + = f (x, u, w), w ∈ W if, for every x ∈ X, there
exists a u ∈ U(x) such that f (x, u, W) ⊆ X.

Definition 3.7 (Robust positive invariance). A set X is robustly positive


invariant for x + = f (x, w), w ∈ W if, for every x ∈ X, f (x, W) ⊆ X.

As in Chapter 2, stabilizing conditions are imposed on the ingredi-


ents ℓ(·), Vf (·), and Xf of the optimal control problem to ensure that

the resultant controlled system has desirable stability properties; the


solution to a finite horizon optimal control problem does not necessar-
ily ensure stability. Our new assumption is a robust generalization of
the stabilizing Assumption 2.2 employed in Chapter 2.
Assumption 3.8 (Basic stability assumption; robust case).
(a) For all x ∈ Xf there exists a u = κf (x) ∈ U(x) such that

Vf (f (x, u, 0)) ≤ Vf (x) − ℓ(x, u, 0) and f (x, u, w) ∈ Xf ∀w ∈ W

(b) Xf ⊆ X

(c) There exist K∞ functions α1 (·) and αf (·) satisfying

ℓ(x, u, w) ≥ α1 (|x|) ∀(x, w) ∈ Rn × W ∀u such that (x, u) ∈ Z


Vf (x) ≤ αf (|x|), ∀x ∈ Xf

Assumption 3.8(a) replaces the unrealistic assumption in the first


edition that, for each x ∈ Xf , there exists a u ∈ U such that, for all
w ∈ W, Vf (f (x, u, w)) ≤ Vf (x) − ℓ(x, u, w) and f (x, u, w) ∈ Xf . Let
δ ∈ R≥0 be defined by

δ := max_{(x,w)∈Xf ×W} {Vf (f (x, κf (x), w)) − Vf (x) + ℓ(x, κf (x), w)}

so that, if Assumption 3.8 holds

Vf (f (x, κf (x), w)) ≤ Vf (x) − ℓ(x, κf (x), w) + δ        ∀(x, w) ∈ Xf × W        (3.19)

If δ = 0, the controller κf (·) can steer any x ∈ Xf to the origin despite


the disturbance.
Theorem 3.9 (Recursive feasibility of control policies). Suppose As-
sumption 3.8 holds. Then
(a) XN ⊇ XN−1 ⊇ . . . ⊇ X1 ⊇ X0 = Xf

(b) Xi is robustly control invariant for x + = f (x, u, w) ∀i ∈ I0:N

(c) Xi is robustly positive invariant for x + = f (x, κi (x), w), ∀i ∈ I0:N


(d) [V0i+1 − Vi0 ](x) ≤ max_{w∈W} {[Vi0 − V0i−1 ](f (x, κi (x), w))} ∀x ∈ Xi , ∀i ∈ I1:N−1 . Also Vi0 (x) − V0i−1 (x) ≤ δ ∀x ∈ Xi−1 , ∀i ∈ {1, . . . , N} and Vi0 (x) ≤ Vf (x) + iδ ∀x ∈ Xf , ∀i ∈ I1:N
 
(e) For any x ∈ XN , (κN (x), κN−1 (·), . . . , κ1 (·), κf (·)) is a feasible policy for PN+1 (x), and, for any x ∈ XN−1 , (κN−1 (x), κN−2 (·), . . . , κ1 (·), κf (·)) is a feasible policy for PN (x).

Proof.
(a)–(c) Suppose, for some i, Xi is robust control invariant so that any
point x ∈ Xi can be robustly steered into Xi . By construction, Xi+1
is the set of all points x that can be robustly steered into Xi . Also
Xi+1 ⊇ Xi so that Xi+1 is robust control invariant. But X0 = Xf is
robust control invariant. Both (a) and (b) follow by induction. Part (c)
follows from (b).

(d) From (3.18) we have

[V0i+1 − Vi0 ](x) = max_{w∈W} {ℓ(x, κi+1 (x), w) + Vi0 (f (x, κi+1 (x), w))}
                  − max_{w∈W} {ℓ(x, κi (x), w) + V0i−1 (f (x, κi (x), w))}
                  ≤ max_{w∈W} {ℓ(x, κi (x), w) + Vi0 (f (x, κi (x), w))}
                  − max_{w∈W} {ℓ(x, κi (x), w) + V0i−1 (f (x, κi (x), w))}

for all x ∈ Xi since κi (·) may not be optimal for problem Pi+1 (x). We
now use the fact that max w {a(w)} − max w {b(w)} ≤ max w {a(w) −
b(w)}, which is discussed in Exercise 3.2, to obtain

[V0i+1 − Vi0 ](x) ≤ max_{w∈W} {[Vi0 − V0i−1 ](f (x, κi (x), w))}

for all x ∈ Xi . Also, for all x ∈ X0 = Xf

[V10 − V00 ](x) = max_{w∈W} {ℓ(x, κ1 (x), w) + Vf (f (x, κ1 (x), w)) − Vf (x)} ≤ δ

in which the last inequality follows from Assumption 3.8. By induction,


Vi0 (x) − V0i−1 (x) ≤ δ ∀x ∈ Xi−1 , ∀i ∈ {1, . . . , N}. It follows that Vi0 (x) ≤ Vf (x) + iδ for all x ∈ Xf , all i ∈ {1, . . . , N}.

(e) Suppose x ∈ XN . Then κ0 (x) = (κN (x), κN−1 (·), . . . , κ1 (·)) is a


feasible and optimal policy for problem PN (x), and steers every tra-
jectory emanating from x into X0 = Xf in N time steps. Because Xf is robustly positive invariant for x + = f (x, κf (x), w), w ∈ W, the policy (κN (x), κN−1 (·), . . . , κ1 (·), κf (·)) is feasible for problem PN+1 (x). Similarly, the policy (κN−1 (x), κN−2 (·), . . . , κ1 (·)) is feasible and optimal for problem PN−1 (x), and steers every trajectory emanating from x ∈ XN−1 into X0 = Xf in N − 1 time steps. Therefore the policy (κN−1 (x), κN−2 (·), . . . , κ1 (·), κf (·)) is feasible for PN (x) for any x ∈ XN−1 . ■

3.4 Robust Min-Max MPC


Because use of dynamic programming (DP) is usually prohibitive, ob-
taining an alternative, robust min-max model predictive control, is de-
sirable. We present here an analysis that uses the improved stability
condition Assumption 3.8. The system to be controlled is defined in
(3.14) and the cost function VN (·) in (3.15) and (3.16). The decision
variable, which, in DP, is a sequence µ = (µ0 (·), µ1 (·), . . . , µN−1 (·))
of control laws, each of which is an arbitrary function of the state
x, is too complex for online optimization; so, we replace µ by the
simpler object µ(v) := (µ(·, v0 ), µ(·, v1 ), . . . µ(·, vN−1 )) in which v =
(v0 , v1 , . . . , vN−1 ) is a sequence of parameters with µ(·) parameterized
by vi , i ∈ I0:N−1 .
A simple parameterization is µ(v) = v = (v0 , v1 , . . . , vN−1 ), a se-
quence of control actions rather than control laws. The decision vari-
able v in this case is similar to the control sequence u used in deter-
ministic MPC, and is simple enough for implementation; the disadvan-
tage is that feedback is not allowed in the optimal control problem
PN (x). Hence the predicted trajectories may diverge considerably. An
equally simple parameterization that has proved to be useful when the
system being controlled is linear and time invariant is µ(v) = (µ( · ,
v0 ) . . . , µ( · , vN−1 )) in which, for each i, µ(x, vi ) := vi + Kx; if f (x,
u, w) = Ax + Bu + w, K is chosen so that AK := A + BK is Hurwitz.
More generally, µ(x, vi ) := Σ_{j∈J} vi^j θj (x) = ⟨vi , θ(x)⟩, θ(x) := (θ1 (x), θ2 (x), . . . , θJ (x)). Hence the policy sequence µ(v) is parameterized by
the vector sequence v = (v0 , v1 , . . . , vN−1 ). Choosing appropriate basis
functions θj (·), j ∈ J, is not simple. The decision variable is the vector
sequence v.
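To make the parameterization concrete, the sketch below evaluates the worst-case cost VN(x, µ(v)) of the affine policy µ(x, vi) = vi + Kx for a scalar linear example. All numerical data (A, B, K, Q, R, Pf, W, N) are assumptions; enumerating vertex disturbance sequences is sufficient here because the cost is convex in w for linear dynamics. In a min-max MPC implementation this evaluation would sit inside an outer minimization over v, subject to the constraints defining VN(x).

import numpy as np
from itertools import product

# Minimal sketch (illustrative data): worst-case cost of the policy
# mu(x, v_i) = v_i + K x for a scalar linear system with interval W.
A, B, K = 1.0, 1.0, -0.5
Q, R, Pf = 1.0, 0.1, 2.0
W_vertices = (-0.1, 0.1)
N = 3

def cost_of_realization(x0, v, w_seq):
    x, J = x0, 0.0
    for vi, wi in zip(v, w_seq):
        u = vi + K * x                     # parameterized control law mu(x, v_i)
        J += Q * x**2 + R * u**2           # stage cost
        x = A * x + B * u + wi
    return J + Pf * x**2                   # terminal cost

def worst_case_cost(x0, v):
    """Max over vertex disturbance sequences; the maximum of a convex
    function of w over a box is attained at a vertex."""
    return max(cost_of_realization(x0, v, w) for w in product(W_vertices, repeat=N))

if __name__ == "__main__":
    print("worst-case cost with v = 0:", worst_case_cost(1.0, np.zeros(N)))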
With this parameterization, the optimal control problem PN (x) be-
comes

PN (x) :    VN0 (x) = min_v {VN (x, µ(v)) | v ∈ VN (x)}

in which

VN (x, µ(v)) := max_w {JN (x, µ(v), w) | w ∈ WN }

JN (x, µ(v), w) := Σ_{i=0}^{N−1} ℓ(x(i), u(i), w(i)) + Vf (x(N))

VN (x) := {v | (x(i), u(i)) ∈ Z, ∀i ∈ I0:N−1 , x(N) ∈ Xf , ∀w ∈ WN }



with x(i) := φ(i; x, µ(v), w) and u(i) = µ(x(i), vi ); φ(i; x, µ(v), w)


denotes the solution at time i of x(i + 1) = f (x(i), u(i), w(i)) with
x(0) = x, u(i) = µ(x(i), vi ) for all i ∈ I0:N−1 , and disturbance se-
quence w. Let v0 (x) denote the minimizing value of the decision vari-
able v, µ0 (x) := µ(v0 (x)) the corresponding optimal control policy,
and let VN0 (x) := VN (x, µ0 (x)) denote the value function. We implic-
itly assume that a solution to PN (x) exists for all x ∈ XN := {x | VN (x) ≠ ∅} and that XN is not empty. The MPC action at state x
is µ00 (x) = µ(x, v00 (x)), with v00 (x) the first element of the optimal
decision variable sequence v0 (x). The implicit MPC law is µ00 (·). To
complete the problem definition, we assume that Vf (·) and ℓ(·) sat-
isfy Assumption 3.8.
It follows from Assumption 3.8 that there exists a K∞ function α1 (·)
such that VN0 (x) ≥ α1 (|x|) for all x ∈ XN , the domain of VN0 (·). Deter-
mination of an upper bound for VN0 (·) is difficult, so we assume that
there exists a K∞ function α2 (·) such that VN0 (x) ≤ α2 (|x|) for all
x ∈ XN . We now consider the descent condition, i.e., we determine an
upper bound for VN0 (x + ) − VN0 (x) as well as a warm start for obtaining,
via optimization, the optimal decision sequence v0 (x + ) given v0 (x).
Suppose that, at state x, the value function VN0 (x) and the optimal
decision sequence v0 (x) have been determined, as well as the control
action µ00 (x). The subsequent state is x + = f (x, µ00 (x), w0 ), with w0
the value of the additive disturbance (w(t) if the current time is t). Let

µ∗ (x) := µ01:N−1 (x) = (µ( · , v10 (x)), µ( · , v20 (x)), . . . , µ( · , v0N−1 (x)))

denote µ0 (x) with its first element µ( · , v00 (x)) removed; µ∗ (x) is a sequence of N − 1 control laws. In addition let µ̃(x) be defined by

µ̃(x) := (µ∗ (x), κf (·))

µ̃(x) is a sequence of N control laws.
For any sequence z let za:b denote the subsequence (z(a), z(a + 1), . . . , z(b)); as above, z := z0:N−1 . Because x ∈ XN is feasible for the opti-
mal control problem PN (x), every random trajectory with disturbance
sequence w = w0:N−1 ∈ WN emanating from x ∈ XN under the control
policy µ0 (x) reaches the terminal state xN = φ(N; x, µ0 (x), w) ∈ Xf
in N steps. Since w(0) is the first element of w, w = (w(0), w1:N−1 ).
Hence the random trajectory with control sequence µ01:N−1 (x) and dis-
turbance sequence w1:N−1 emanating from x + = f (x, µ00 (x), w(0))
reaches xN ∈ Xf in N − 1 steps. Clearly
JN−1 (x + , µ01:N−1 (x), w1:N−1 ) = JN (x, µ0 (x), w) − ℓ(x, µ00 (x), w0 )

By Assumption 3.8, ℓ(x, µ00 (x), w(0)) = ℓ(x, κN (x), w(0)) ≥ α1 (|x|)
and

JN−1 (x + , µ01:N−1 (x), w1:N−1 ) ≤ JN (x, µ0 (x), w) − α1 (|x|)

The policy sequence µ̃(x), which appends κf (·) to µ01:N−1 (x), steers x + to xN in N − 1 steps and then steers xN ∈ Xf to x(N + 1) = f (xN ,
κf (xN ), wN ) that lies in the interior of Xf . Using Assumption 3.8, we
obtain

JN (x + , µ̃(x), w1:N ) ≤ JN (x, µ0 (x), w) − α1 (|x|) + δ

Using this inequality with w0:N = (w(0), w0 (x + ))5 so that w1:N = w0 (x + ) and w = w0:N−1 = (w(0), w00:N−2 (x + )) yields

VN0 (x + ) = JN (x + , µ0 (x + ), w0 (x + )) ≤ JN (x + , µ̃(x), w0 (x + ))
          ≤ JN (x, µ0 (x), (w(0), w00:N−2 (x + ))) − α1 (|x|) + δ
          ≤ VN0 (x) − α1 (|x|) + δ

The last inequality follows from the fact that the disturbance sequence (w(0), w00:N−2 (x + )) does not necessarily maximize w ↦ JN (x, µ0 (x), w).
Assume now that ℓ(·) is quadratic and positive definite so that
α1 (|x|) ≥ c1 |x|2 . Assume also that VN0 (x) ≤ c2 |x|2 so that for all
x ∈ XN
VN0 (x + ) ≤ γVN0 (x) + δ

with γ = 1 − c1 /c2 ∈ (0, 1). Let ε > 0. It follows that for all x ∈ XN
such that VN0 (x) ≥ c := (δ + ε)/(1 − γ)

VN0 (x + ) ≤ γVN0 (x) + δ ≤ VN0 (x) − (1 − γ)c + δ ≤ VN0 (x) − ε

since VN0 (x) ≥ c and, by definition, (1 − γ)c = δ + ε. Secondly, if x lies


in levc VN0 , then
VN0 (x + ) ≤ γc + δ ≤ c − ε

since VN0 (x) ≤ c and, by definition, c = γc + δ + ε. Hence x ∈ levc VN0


implies x + ∈ f (x, µ00 (x), W) ⊂ levc VN0 .

5 w0 (x + ) := arg max w∈WN JN (x + , µ0 (x + ), w).



Summary. If δ < (1 − γ)c (c > δ/(1 − γ)) and levc VN0 ⊂ XN , every
initial state x ∈ XN of the closed-loop system x + = f (x, µ00 (x), w)
is steered to the sublevel set levc VN0 in finite time for all disturbance
sequences w satisfying w(i) ∈ W, all i ≥ 0, and thereafter remains
in this set; the set levc VN0 is positive invariant for x + = f (x, µ00 (x),
w), w ∈ W. The policy sequence µ̃(x), easily obtained from µ0 (x), is feasible for PN (x + ) and is a suitable warm start for computing µ0 (x + ).

3.5 Tube-Based Robust MPC


3.5.1 Introduction

It was shown in Section 3.4 that it is possible to control an uncertain


system robustly using a version of MPC that requires solving online an
optimal control problem of minimizing a cost subject to satisfaction
of state and control constraints for all possible disturbance sequences.
For MPC with horizon N and qx state constraints, the number of state
constraints in the optimal control problem is Nqx . Since the state con-
straints should be satisfied for all disturbance sequences, the number
of state constraints for the uncertain case is MNqx , with M equal to the
number of disturbance sequences. For linear MPC, M can be as small as
V N with V equal to the number of vertices of W with W polytopic. For
nonlinear MPC, Monte Carlo optimization must be employed, in which
case M can easily be several thousand to achieve constraint satisfac-
tion with high probability. The number of constraints MNqx can thus
exceed 106 in process control applications.
It is therefore desirable to find approaches for which the online com-
putational requirement is more modest. We describe, in this section,
a tube-based approach. We show that all trajectories of the uncertain
system lie in a bounded neighborhood of a nominal trajectory. This
bounded neighborhood is called a tube. Determination of the tube en-
ables satisfaction of the constraints by the uncertain system for all
disturbance sequences to be obtained by ensuring that the nominal
trajectory satisfies suitably tightened constraints. If the nominal tra-
jectory satisfies the tightened constraints, every random trajectory in
the associated tube satisfies the original constraints. Computation of
the tightened constraints may be computationally expensive but can
be done offline; the online computational requirements are similar to
those for nominal MPC.
To describe tube-based MPC, we use some concepts in set algebra.
Given two subsets A and B of Rn , we define set addition, set subtrac-

tion (sometimes called Minkowski or Pontryagin set subtraction), set


multiplication, and Hausdorff distance between two sets as follows.

Definition 3.10 (Set algebra and Hausdorff distance).


(a) Set addition: A ⊕ B := {a + b | a ∈ A, b ∈ B}

(b) Set subtraction: A ⊖ B := {x ∈ Rn | {x} ⊕ B ⊆ A}

(c) Set multiplication: Let K ∈ Rm×n ; then KA := {Ka | a ∈ A}

(d) The Hausdorff distance dH (·) between two subsets A and B of Rn


is defined by

dH (A, B) := max{sup_{a∈A} d(a, B), sup_{b∈B} d(b, A)}

in which d(x, S) denotes the distance of a point x ∈ Rn from a set


S ⊂ Rn and is defined by

d(x, S) := inf_y {d(x, y) | y ∈ S}        d(x, y) := |x − y|

In these definitions, {x} denotes the set consisting of a single point


x, and {x} ⊕ B therefore denotes the set {x + b | b ∈ B}; the set A ⊖ B
is the largest set C such that B ⊕ C ⊆ A. A sequence (x(i)) is said to
converge to a set S if d(x(i), S) → 0 as i → ∞. If dH (A, B) ≤ ε, then
the distance of every point a ∈ A from B is less than or equal to ε, and
that the distance of every point b ∈ B from A is less than or equal to
ε. We say that the sequence of sets (A(i)) converges, in the Hausdorff
metric, to the set B if dH (A(i), B) → 0 as i → ∞.
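For the interval (one-dimensional box) sets that appear in the scalar examples of this chapter, these operations reduce to endpoint arithmetic. The following sketch is a minimal illustration for intervals only; general polytopes require polyhedral computations that are outside its scope.

# Minimal sketch of the operations in Definition 3.10 for intervals
# A = [a1, a2], represented as tuples (a1, a2).

def add(A, B):            # A ⊕ B
    return (A[0] + B[0], A[1] + B[1])

def subtract(A, B):       # A ⊖ B (Pontryagin difference); None if empty
    lo, hi = A[0] - B[0], A[1] - B[1]
    return (lo, hi) if lo <= hi else None

def scale(k, A):          # KA for scalar K
    lo, hi = k * A[0], k * A[1]
    return (min(lo, hi), max(lo, hi))

def hausdorff(A, B):      # d_H(A, B) for nonempty intervals
    return max(abs(A[0] - B[0]), abs(A[1] - B[1]))

if __name__ == "__main__":
    W = (-1.0, 1.0)
    print(add(W, scale(0.5, W)))                 # (-1.5, 1.5)
    print(subtract((-5.0, 5.0), (-2.0, 2.0)))    # (-3.0, 3.0)
    print(hausdorff((-1.0, 1.0), (-2.0, 2.0)))   # 1.0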
Our first task is to generate an outer-bounding tube. An excellent
background for the following discussion is provided in Kolmanovsky
and Gilbert (1998).

3.5.2 Outer-Bounding Tube for a Linear System with Additive Disturbance

Consider the following linear system

x + = Ax + Bu + w

in which w ∈ W, a compact convex subset of Rn containing the origin.


We assume that W contains the origin in its interior. Let φ(i; x, u, w)
denote the solution of x + = Ax + Bu + w at time i if the initial state at

time zero is x, and the control and disturbance sequences are, respec-
tively, u and w.
Let the nominal system be described by

x̄ + = Ax̄ + B ū

and let φ̄(i; x̄, u) denote the solution of x̄ + = Ax̄ + B ū at time i if the
initial state at time zero is x̄. Then e := x− x̄, the deviation of the actual
state x from the nominal state x̄, satisfies the difference equation

e+ = Ae + w

so that

e(i) = A^i e(0) + Σ_{j=0}^{i−1} A^j w(i − 1 − j)

in which e(0) = x(0) − x̄(0). If e(0) = 0, then e(i) ∈ S(i) where the set S(i) is defined by

S(i) := Σ_{j=0}^{i−1} A^j W = W ⊕ AW ⊕ · · · ⊕ A^{i−1} W

in which Σ and ⊕ denote set addition. It follows from our assumptions
on W that S(i) contains the origin in its interior for all i ≥ n.
We first consider the tube X(x, u) generated by the open-loop con-
trol sequence u when x(0) = x̄(0) = x, and e(0) = 0. It is easily seen
that X(x, u) = (X(0; x), X(1; x, u), . . . , X(N; x, u)) with

X(i; x) := {x̄(i)} ⊕ S(i)

and x̄(i) = φ̄(i; x, u), the state at time i of the nominal system, is
the center of the tube. So it is relatively easy to obtain the exact tube
generated by an open-loop control if the system is linear and has a
bounded additive disturbance, provided that one can compute the sets
S(i).
If A is stable, then, as shown in Kolmanovsky and Gilbert (1998), S(∞) := Σ_{j=0}^{∞} A^j W exists and is positive invariant for x + = Ax + w, i.e.,
x ∈ S(∞) implies that Ax + w ∈ S(∞) for all w ∈ W; also S(i) → S(∞)
in the Hausdorff metric as i → ∞. The set S(∞) is known to be the
minimal robust positive invariant set6 for x + = Ax + w, w ∈ W. Also
6 Every other robust positive invariant set X satisfies X ⊇ S(∞).

S(i) ⊆ S(i + 1) ⊆ S(∞) for all i ∈ I≥0 so that the tube X̂(x, u) defined
by

X̂(x, u) := (X̂(0; x), X̂(1; x, u), . . . , X̂(N; x, u))

in which

X̂(0; x) = {x} ⊕ S(∞) X̂(i; x, u) = {x̄(i)} ⊕ S(∞)

is an outer-bounding tube with constant “cross section” S(∞) for the


exact tube X(x, u) (X(i; x, u) ⊆ X̂(i; x, u) for all i ∈ I≥0 ). It is sometimes
more convenient to use the constant cross-section outer-bounding tube
X̂(x, u) in place of the exact tube X(x, u). If we restrict attention to
the interval [0, N] as we do in computing the MPC action, then replac-
ing S(∞) by S(N) yields a less conservative, constant cross-section,
outer-bounding tube for the interval [0, N].
Use of the exact tube X(x, u) and the outer-bounding tube X̂(x, u)
may be limited for reasons discussed earlier—the sets S(i) may be un-
necessarily large simply because an open-loop control sequence rather
than a feedback policy was employed to generate the tube. For example,
if W = [−1, 1] and x + = x + u + w, then S(i) = (i + 1)W increases with-
out bound as time i increases. We must introduce feedback to contain
the size of S(i), but wish to do so in a simple way because optimizing
over arbitrary policies is prohibitive. The feedback policy we propose
is
u = ū + K(x − x̄)
in which x is the current state of the system x + = Ax +Bu+w, x̄ is the
current state of a nominal system defined below, and ū is the current
input to the nominal system. With this feedback policy, the state x
satisfies the difference equation

x + = Ax + B ū + BKe + w

in which e := x − x̄ is the deviation of the actual state from the nominal


state. The nominal system corresponding to the uncertain system x + =
Ax + B ū + BKe + w is
x̄ + = Ax̄ + B ū
The deviation e = x − x̄ now satisfies the difference equation

e + = AK e + w AK := A + BK

which is the same equation used previously except that A, which is


possibly unstable, is replaced by AK , which is stable by design. If K is

chosen so that AK is stable, then the corresponding uncertainty sets


SK (i) defined by

SK (i) := Σ_{j=0}^{i−1} AK^j W

can be expected to be smaller than the original uncertainty sets S(i),


i ∈ I≥0 , considerably smaller if A is unstable and i is large. Our as-
sumptions on W imply that SK (i), like S(i), contains the origin in its interior for each i. Since AK is stable, the set SK (∞) := Σ_{j=0}^{∞} AK^j W ex-
ists and is positive invariant for e+ = AK e + w. Also, SK (i) → SK (∞)
in the Hausdorff metric as i → ∞. Since K is fixed, the feedback policy
u = ū + K(x − x̄) is simply parameterized by the open-loop control
sequence ū. If x(0) = x̄(0) = x, the tube generated by the feedback
policy u = ū+K(x − x̄) is X(x, ū) = (X(0; x), X(1; x, ū), . . . , X(N; x, ū))
in which
X(0; x) = {x} X(i; x, ū) := {x̄(i)} ⊕ SK (i)

and x̄(i) is the solution of the nominal system x̄ + = Ax̄ + B ū at


time i if the initial state x̄(0) = x, and the control sequence is ū.
For given initial state x and control sequence ū, the solution of x + =
Ax + B(ū + Ke) + w lies in the tube X(x, ū) for every admissible distur-
bance sequence w. As before, SK (i) may be replaced by SK (∞) to get
an outer-bounding tube. If attention is confined to the interval [0, N],
SK (i) may be replaced by SK (N) to obtain a less conservative outer-
bounding tube. If we consider again our previous example, W = [−1,
1] and x + = x + u + w, and choose K = −(1/2), then AK = 1/2,
SK (i) = (1 + 0.5 + · · · + 0.5^{i−1} )W ⊂ 2W, and SK (∞) = 2W = [−2, 2]. In
contrast, S(i) → [−∞, ∞] as i → ∞.
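The interval arithmetic below reproduces these numbers. The scalar data A = 1, AK = 1/2, and W = [−1, 1] are taken from the example above; the code assumes a nonnegative scalar A so that endpoint arithmetic is valid.

# Minimal sketch: accumulate S(i) and S_K(i) for the scalar example.
def disturbance_sets(A, W, imax):
    """Intervals S(i) = W ⊕ A W ⊕ ... ⊕ A^(i-1) W for i = 1, ..., imax (A >= 0)."""
    S, powerA, sets = (0.0, 0.0), 1.0, []
    for _ in range(imax):
        S = (S[0] + powerA * W[0], S[1] + powerA * W[1])
        powerA *= A
        sets.append(S)
    return sets

W = (-1.0, 1.0)
for i, (S, SK) in enumerate(zip(disturbance_sets(1.0, W, 8),
                                disturbance_sets(0.5, W, 8)), start=1):
    print(f"i={i}:  S(i) = [{S[0]:5.1f}, {S[1]:5.1f}]   "
          f"S_K(i) = [{SK[0]:7.4f}, {SK[1]:7.4f}]")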
In the preceding discussion, we required x(0) = x̄(0) so that e(0) =
0 in order to ensure e(i) ∈ S(i) or e(i) ∈ SK (i). When AK is stable,
however, it is possible to relax this restriction. This follows from the
previous statement that SK (∞) exists and is robustly positive invariant
for e+ = AK e + w, i.e., e ∈ SK (∞) implies e+ ∈ SK (∞) for all e+ ∈
{AK e} ⊕ W. Hence, if e(0) ∈ SK (∞), then e(i) ∈ SK (∞) for all i ∈ I≥0 ,
all w ∈ Wi .
In tube-based MPC, we ensure that x̄(i) → 0 as i → ∞, so that x(i),
which lies in the sequence of sets ({x̄(i)} ⊕ SK (i))0:∞ , converges to the
set SK (∞) as i → ∞. Figure 3.3 illustrates this result (S := SK (∞)).
Even though SK (∞) is difficult to compute, this is a useful theoretical
property of the controlled system.
Figure 3.3: Outer-bounding tube X(x̄, ū); Xi = {x̄(i)} ⊕ SK (∞).

The controller is required to ensure that state-control constraint


(x, u) ∈ Z is not transgressed. Let Z̄ be defined by

Z̄ := Z ⊖ (SK (∞) × KSK (∞))

since it follows from the definition of the set operation ⊖ that Z̄ ⊕


(SK (∞) × KSK (∞)) ⊆ Z. In the simple case when Z = X × U

Z̄ = X̄ × Ū X̄ = X ⊖ SK (∞) Ū = U ⊖ KSK (∞)
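For the scalar example used earlier in this section, with SK(∞) = [−2, 2] and K = −1/2, the tightening reduces to interval endpoint arithmetic; the sets X and U below are assumptions for illustration.

# Minimal sketch: tightened sets X_bar = X ⊖ S_K(inf), U_bar = U ⊖ K S_K(inf).
def pontryagin_diff(A, B):                 # A ⊖ B for intervals; None if empty
    lo, hi = A[0] - B[0], A[1] - B[1]
    return (lo, hi) if lo <= hi else None

K = -0.5
SK_inf = (-2.0, 2.0)                       # from the scalar example above
K_SK_inf = tuple(sorted((K * SK_inf[0], K * SK_inf[1])))

X, U = (-10.0, 10.0), (-3.0, 3.0)          # assumed original constraint sets
print("X_bar =", pontryagin_diff(X, SK_inf))      # (-8.0, 8.0)
print("U_bar =", pontryagin_diff(U, K_SK_inf))    # (-2.0, 2.0)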

Computation of the set SK (∞)—which is known to be difficult—is not


required, as we show later. It follows from the preceding discussion
that if the nominal state and control trajectories x̄ and ū satisfy the
tightened constraint (x̄(i), ū(i)) ∈ Ẑ ⊂ Z̄ for all i ∈ I0:N−1 , the state
and control trajectories x and u of the uncertain system then satisfy
the original constraints (x(i), u(i)) ∈ Z for all i ∈ I0:N−1 . This is the
basis for tube-based robust MPC discussed next.

3.5.3 Tube-Based MPC of Linear Systems with Additive Disturbances

The tube-based controller has two components: (i) a nominal state-


control trajectory (x̄(i), ū(i))i∈I≥0 that commences at the initial state
x and that satisfies the tightened constraint, and (ii) a feedback con-
troller u = ū + K(x − x̄) that attempts to steer the uncertain state-
control trajectory to the nominal trajectory. The nominal state-control

trajectory may be generated at the initial time or generated sequen-


tially using standard MPC for deterministic systems. The latter gives
more flexibility to cope with changing conditions, such as changing
setpoint. Assume, then, that a controller ū = κ̄N (x̄) for the nominal
system x̄ + = Ax̄ + B ū has been determined using results in Chapter 2
by solving the standard optimal control problem of the form

P̄N (x̄) :    V̄N0 (x̄) = min_ū {V̄N (x̄, ū) | ū ∈ ŪN (x̄)}

V̄N (x̄, ū) = Σ_{i=0}^{N−1} ℓ(x̄(i), ū(i)) + Vf (x̄(N))

ŪN (x̄) = {ū | (x̄(i), ū(i)) ∈ Z̄, i ∈ I0:N−1 , x̄(N) ∈ Xf }

in which x̄(i) = φ̄(i; x, ū). Under usual conditions, the origin is asymp-
totically stable for the controlled nominal system described by

x̄ + = Ax̄ + B κ̄N (x̄)

and the controlled system satisfies the constraint (x̄(i), ū(i)) ∈ Z̄ for
all i ∈ I≥0 . Let X̄N denote the set {x̄ | ŪN (x̄) ≠ ∅}. Of course, deter-
mination of the control κ̄N (x̄) requires solving online the constrained
optimal control problem P̄N (x̄).
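The two-component structure can be summarized in a few lines of Python. In this sketch the function kappa_bar stands in for the nominal MPC law κ̄N(·) that would be obtained by solving P̄N(x̄) online; it is replaced here by a saturated linear law, and all numerical data are assumptions chosen purely to illustrate the pairing of the nominal trajectory with the feedback u = ū + K(x − x̄).

import numpy as np

# Minimal sketch of the tube controller for a scalar example (assumed data).
A, B, K = 1.0, 1.0, -0.5

def kappa_bar(x_bar):
    """Placeholder for the nominal MPC law; a saturated linear law here."""
    return float(np.clip(-0.6 * x_bar, -0.8, 0.8))

rng = np.random.default_rng(0)
x = x_bar = 2.0                            # start with e(0) = 0
for k in range(10):
    u_bar = kappa_bar(x_bar)
    u = u_bar + K * (x - x_bar)            # tube feedback
    w = rng.uniform(-0.1, 0.1)             # disturbance realization in W
    x = A * x + B * u + w                  # uncertain system
    x_bar = A * x_bar + B * u_bar          # nominal system, no disturbance
    print(f"k={k}: x = {x:7.4f}   x_bar = {x_bar:7.4f}   e = {x - x_bar:8.5f}")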
The feedback controller, given the state x of the system being con-
trolled, and the state x̄ of the nominal system, generates the control
u = κ̄N (x̄) + K(x − x̄). The composite system with state (x, x̄) satisfies

x + = Ax + B κ̄N (x̄) + BK(x − x̄) + w


x̄ + = Ax̄ + B κ̄N (x̄)

The system with state (e, x̄), e := x − x̄, satisfies a simpler difference
equation

e + = AK e + w
x̄ + = Ax̄ + B κ̄N (x̄)

The two states (x, x̄) and (e, x̄) are related by

(e, x̄) = T (x, x̄)        T := [ I  −I ;  0  I ]

Since T is invertible, the two systems with states (x, x̄) and (e, x̄) are
equivalent. Hence, to establish robust stability it suffices to consider
the simpler system with state (e, x̄). First, we define robustly asymp-
totically stable (RAS).

Definition 3.11 (Robust asymptotic stability of a set). Suppose the sets


S1 and S2 , S2 ⊂ S1 , are robustly positive invariant for the system z+ =
f (z, w), w ∈ W. The set S2 is RAS for z+ = f (z, w) in S1 if there exists
a KL function β(·) such that every solution φ( · ; z, w) of z+ = f (z, w)
with initial state z ∈ S1 and any disturbance sequence w ∈ W∞ satisfies

|φ(i; z, w)|S2 ≤ β(|z|S2 , i) ∀i ∈ I≥0

In this definition, |z|S := d(z, S), the distance of z from set S.


We now assume that κ̄N (·) and Z̄ have been determined to ensure
the origin is asymptotically stable in a positive invariant set X̄ for the
controlled nominal system x̄ + = Ax̄ + B κ̄N (x̄). Under this assumption
we have

Proposition 3.12 (Robust asymptotic stability of tube-based MPC for


linear systems). The set SK (∞) × {0} is RAS for the composite system
(e+ = AK e + w, x̄ + = Ax̄ + B κ̄N (x̄)) in the positive invariant set SK (∞) ×
X̄N .

Proof. Because the origin is asymptotically stable for x̄ + = Ax̄+B κ̄N (x̄),
there exists a KL function β(·) such that every solution φ̄( · ; x̄) of the
controlled nominal system with initial state x̄ ∈ X̄N satisfies

|φ̄(i; x̄)| ≤ β(|x̄| , i) ∀i ∈ I≥0

Since e(0) ∈ SK (∞) implies e(i) ∈ SK (∞) for all i ∈ I≥0 , it follows that

|(e(i), φ̄(i; x̄))|SK (∞)×{0} ≤ |e(i)|SK (∞) + |φ̄(i; x̄)| ≤ β(|x̄| , i)

Hence the set SK (∞) × {0} is RAS in SK (∞) × X̄N for the composite
system (e+ = AK e + w, x̄ + = Ax̄ + B κ̄N (x̄)). ■

It might be of interest to note that (see Exercise 3.4)

dH ({φ̄(i; x̄)} ⊕ SK (∞), SK (∞)) ≤ |φ̄(i; x̄)| ≤ β(|x̄| , i)

for every solution φ̄(·) of the nominal system with initial state x̄ ∈ X̄N .
Finally we show how suitable tightened constraints may be deter-
mined. It was shown above that the nominal system should satisfy
the tightened constraint (x̄, ū) ∈ Z̄ = Z ⊖ (SK (∞) × KSK (∞)). Since
SK (∞) is difficult to compute and use, and impossible to obtain for many
process control applications, we present an alternative. Suppose Z is polytopic
and is described by a set of scalar inequalities of the form c ′ z ≤ d

(c_x′ x + c_u′ u ≤ d). We show next how each constraint of this form may
be tightened so that satisfaction of the tightened constraint by the nominal
system ensures satisfaction of the original constraint by the uncertain
system. For all j ∈ I≥0 , let

θj := max_e {c′(e, Ke) | e ∈ SK (j)} = max_w { ∑_{i=0}^{j−1} c′(I, K)A_K^i w(i) | w ∈ W^{0:j−1} }

in which c′(e, Ke) = c_x′ e + c_u′ Ke and c′(I, K)A_K^i w(i) = c_x′ A_K^i w(i) + c_u′ KA_K^i w(i).
Satisfaction of the constraint c ′ z̄ ≤ d − θ∞ by the nominal system en-
sures satisfaction of c ′ z ≤ d, z = z̄ + (e, Ke), by the uncertain system;
however, computation of θ∞ is impractical so we adopt the approach
in (Raković, Kerrigan, Kouramas, and Mayne, 2005a). Because AK is
Hurwitz, for all α ∈ (0, 1) there exists a finite integer N such that
A_K^N W ⊂ αW and KA_K^N W ⊂ αKW. It follows that

θ∞ ≤ θN + αθ∞

so that
θ∞ ≤ (1 − α)−1 θN
Hence, satisfaction of the tightened constraint c ′ z̄ ≤ d − (1 − α)−1 θN
by the nominal system ensures that the uncertain system satisfies the
original constraint c ′ z ≤ d. The tightened constraint set Z̄ is defined
by these modified constraints.

Example 3.13: Calculation of tightened constraints


Consider the system

x + = [ 1  1 ; 0  1 ] x + [ 0 ; 1 ] u + w

with W := {w | |w|∞ ≤ 0.1}, Z := {(x, u) | |x|∞ ≤ 1, |u| ≤ 1}, and
nominal control law K := [−0.4  −1.2]. For increasing values of N, we
calculate α such that A_K^N W ⊂ αW and KA_K^N W ⊂ αKW.
Because W is a box, it is sufficient to check only its vertices, i.e., the
four elements w ∈ {−0.1, 0.1}^2 . Thus, we have

α = max{ (max_{w∈W} |A_K^N w|∞)/(max_{w∈W} |w|∞) , (max_{w∈W} |KA_K^N w|∞)/(max_{w∈W} |Kw|∞) }

These values are shown in Figure 3.4. From here, we see that N ≥ 3
is necessary for the approximation to hold. With the values of α, the

Figure 3.4: Minimum feasible α for varying N. Note that we require α ∈ [0, 1).

tightened constraint sets Z̄ can then be computed as above. Once again,


because of the structure of W, we need only check the vertices. Due to
the symmetry of the system, each set is of the form

Z̄ = {(x, u) | |x1 | ≤ χ1 , |x2 | ≤ χ2 , |u| ≤ µ}

The bounds χ1 , χ2 , and µ are shown in Figure 3.5. Note that while
N = 3 gives a feasible value of α, we require at least N = 4 for Z̄ to be
nonempty. □
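The numbers in this example are easy to reproduce; the sketch below (ours, for illustration only) enumerates the vertices of W to compute α for a given N and then evaluates the tightened bound d − (1 − α)^{−1} θN for each row of Z.

# Reproduce the alpha and constraint-tightening calculations of Example 3.13.
import itertools
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[-0.4, -1.2]])
AK = A + B @ K
W_vertices = np.array(list(itertools.product([-0.1, 0.1], repeat=2)))

def alpha_for(N):
    """Smallest alpha with A_K^N W in alpha*W and K A_K^N W in alpha*K*W (vertex check)."""
    AKN = np.linalg.matrix_power(AK, N)
    num_x = max(np.max(np.abs(AKN @ w)) for w in W_vertices)
    num_u = max(np.max(np.abs(K @ AKN @ w)) for w in W_vertices)
    den_x = max(np.max(np.abs(w)) for w in W_vertices)
    den_u = max(np.max(np.abs(K @ w)) for w in W_vertices)
    return max(num_x / den_x, num_u / den_u)

def theta_N(c, N):
    """theta_N = max over disturbance sequences of sum_i c'(I,K) A_K^i w(i);
    each term is linear in w(i), so the maximum is attained at a vertex."""
    return sum(max(float(c @ np.linalg.matrix_power(AK, i) @ w)
                   for w in W_vertices) for i in range(N))

N = 10
alpha = alpha_for(N)
# Rows of Z written as cx'x + cu'u <= d; by symmetry the lower bounds match.
rows = [(np.array([1.0, 0.0]), 0.0, 1.0),     # x1 <= 1
        (np.array([0.0, 1.0]), 0.0, 1.0),     # x2 <= 1
        (np.array([0.0, 0.0]), 1.0, 1.0)]     #  u <= 1
for cx, cu, d in rows:
    c_eff = cx + cu * K.ravel()               # c'(I, K) for this row
    bound = d - theta_N(c_eff, N) / (1.0 - alpha)
    print(f"tightened bound: {bound:.4f}  (alpha = {alpha:.2e})")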

Time-varying constraint set Z̄(i). The tube-based model predictive


controller is conservative in that the feasible set for P̄N (x̄) is unneces-
sarily small due to use of a constant constraint set Z̄ = Z ⊖ (SK (∞) ×
KSK (∞)). This reduces the region of attraction X̄N , the set of states
for which P̄N (x̄) is feasible. Tube-based model predictive control can
be made less conservative by using time-varying constraint sets Z̄(i) =
Z ⊖ (SK (i) × KSK (i)), i ∈ I0:N−1 , for the initial optimal control prob-
lem that generates the control sequence ū0 (x̄).

Figure 3.5: Bounds on tightened constraint set Z̄ for varying N. Bounds are |x1 | ≤ χ1 , |x2 | ≤ χ2 , and |u| ≤ µ.

The control applied to the uncertain system is ū(i) + Ke(i); the infinite sequence ū is


constructed as follows. The sequence (ū(0), ū(1), . . . , ū(N − 1)) is set
equal to ū0 (x̄), the solution of the nominal optimal control problem at
the initial state x̄, with time-varying constraint sets Z̄(i) and terminal
constraint set X̄f . The associated state sequence is (x̄(0), x̄(1), . . . ,
x̄(N)) with x̄(N) ∈ X̄f . For i ∈ I≥N , ū(i) and x̄(i) are obtained as the
solution at time i of

x̄ + = Ax̄ + Bκf (x̄), ū = κf (x̄)

with initial state x̄(N) at time N. We now assume that X̄f satisfies
X̄f ⊕SK (∞) ⊂ X. Since x̄(N) ∈ X̄f it follows that x̄(i) ∈ X̄f and x(i) ∈ X
for all i ∈ I≥N . Also, for all i ∈ I0:N−1 , x̄(i) ∈ X̄(i) = X ⊖ SK (i) and
e(i) ∈ SK (i) so that x(i) = x̄(i) + e(i) ∈ X. Hence x(i) ∈ X for all
i ∈ I≥0 . Since x̄(i) → 0, the state x(i) of the uncertain system tends to
SK (∞) as i → ∞. Since Z̄(i) ⊃ Z̄, the region of attraction is larger than
that for tube-based MPC using a constant constraint set.

3.5.4 Improved Tube-Based MPC of Linear Systems with Additive Disturbances

In this section we describe a version of the tube-based model predictive


controller that has pleasing theoretical properties. We omitted, in the
previous section, to make use of an additional degree of freedom avail-
able to the controller, namely the ability to change the state x̄ of the
nominal system. In Chisci, Rossiter, and Zappa (2001), x̄ is set equal to
x, the current state of the uncertain system, but there is no guarantee
that an initial state x is superior to x̄ in the sense of enhancing conver-
gence to the origin of the nominal trajectory. To achieve more rapid
convergence, we propose that an improved tube center x̄ ∗ is chosen
by minimizing the value function V̄N0 (·) of the nominal optimal control
problem. It is necessary that the current state x remains in the tube
with new center x̄ ∗ . To achieve this, at state (x, x̄), a new optimal con-
trol problem P̄∗ N (x), is solved online, to determine an improved center
x̄ ∗ and, simultaneously the subsequent center x̄ + . We assume, for sim-
plicity, that Z = X × U and Z̄ = X̄ × Ū. The new optimal control problem
P∗N (x) that replaces P̄N (x̄) is defined by

P∗N (x) : V̄N∗ (x) = min_z {V̄N0 (z) | x ∈ {z} ⊕ SK (∞), z ∈ X̄}
        = min_{z,ū} {V̄N (z, ū) | ū ∈ ŪN (z), x ∈ {z} ⊕ SK (∞), z ∈ X̄}

The solution to problem P∗N (x) is (x̄ ∗ (x), ū∗ (x)). The constraint x ∈
{z} ⊕ SK (∞) ensures that the current state x lies in {x̄ ∗ (x)} ⊕ SK (∞), the
first element of the “new tube.” The argument of P∗N (x) is x because of
the constraint x ∈ {z} ⊕ SK (∞); the solution to the problem generates
both the improved current nominal state x̄ ∗ (x) as well as its successor
x̄ + . If (x, x̄) satisfies x̄ ∈ X̄N and x ∈ {x̄} ⊕ SK (∞), then (x̄, ũ(x̄)) is
a warm start for P∗N (x); here ũ(x̄) is a warm start for P̄N (x̄). The
successor nominal state is

x̄ + = (x̄ ∗ (x))+ = Ax̄ ∗ (x) + B κ̄N (x̄ ∗ (x))

in which, as usual, κ̄N (x̄ ∗ (x)) is the first element in the control se-
quence ū∗ (x). It follows that

V̄N∗ (x) = V̄N0 (x̄ ∗ (x)) ≤ V̄N0 (x̄), ū∗ (x) = ū0 (x̄ ∗ (x))

The control applied to the uncertain system at state x is



κ̄N (x) := κ̄N (x̄ ∗ (x)) + K(x − x̄ ∗ (x))

so the closed-loop uncertain system satisfies

x + = Ax + B(κ̄N (x̄ ∗ (x)) + K(x − x̄ ∗ (x))) + w

and e = x − x̄ ∗ (x) satisfies

e+ = x + − (x̄ ∗ (x))+ = Ae + BKe + w = AK e + w

as before so that if e ∈ SK (∞), then e+ ∈ SK (∞); hence x ∈ {x̄ ∗ (x)} ⊕


SK (∞) implies x + ∈ {(x̄ ∗ (x))+ } ⊕ SK (∞).
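The only structural change relative to the nominal problem of Section 3.5.3 is that the tube center becomes a decision variable. The sketch below (an illustration, not the book's code) uses a polytopic outer approximation {e | He ≤ h} in place of SK (∞), which must itself be robustly positive invariant for the argument above to apply; the box bounds again stand in for X̄ and Ū.

# Sketch of the improved problem P*_N(x): the tube center z is optimized
# together with the nominal inputs, subject to x - z lying in an assumed
# invariant outer approximation {e | H e <= h} of S_K(infinity).
import numpy as np
import cvxpy as cp

def improved_tube_mpc_step(A, B, Q, R, P, K, H, h, x, N, xbar_max, ubar_max):
    n, m = B.shape
    z = cp.Variable((n, N + 1))          # nominal trajectory; z[:,0] is x̄*(x)
    v = cp.Variable((m, N))              # nominal inputs ū
    cost = 0
    cons = [H @ (x - z[:, 0]) <= h]      # x in {z(0)} + (approximate) S_K(inf)
    for i in range(N):
        cost += 0.5 * (cp.quad_form(z[:, i], Q) + cp.quad_form(v[:, i], R))
        cons += [z[:, i + 1] == A @ z[:, i] + B @ v[:, i],
                 cp.abs(z[:, i]) <= xbar_max,
                 cp.abs(v[:, i]) <= ubar_max]
    cost += 0.5 * cp.quad_form(z[:, N], P)
    cp.Problem(cp.Minimize(cost), cons).solve()
    xstar = z.value[:, 0]                # improved tube center x̄*(x)
    u = v.value[:, 0] + K @ (x - xstar)  # control applied to the plant
    xbar_next = A @ xstar + B @ v.value[:, 0]
    return u, xbar_next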
Suppose then that x̄ ∈ X̄N ⊆ X̄ and x ∈ {x̄} ⊕ SK (∞) so that x ∈ X.
If the usual assumptions for the nominal optimal control problem P̄N
are satisfied and ℓ(·) is quadratic and positive definite it follows that
V̄N∗ (x) = V̄N0 (x̄ ∗ (x)) ≥ c1 |x̄ ∗ (x)|²
V̄N∗ (x) = V̄N0 (x̄ ∗ (x)) ≤ c2 |x̄ ∗ (x)|²
V̄N∗ (x + ) = V̄N0 (x̄ ∗ (x + )) ≤ V̄N0 ((x̄ ∗ (x))+ ) ≤ V̄N0 (x̄ ∗ (x)) − c1 |x̄ ∗ (x)|²

The last inequality follows from the fact that x̄ + = (x̄ ∗ (x))+ = Ax̄ ∗ (x) +
B κ̄N (x̄ ∗ (x)) and the descent property of the solution to P̄N (x̄ ∗ (x)).

Proposition 3.14 (Recursive feasibility of tube-based MPC). Suppose


that at time zero, (x, x̄) ∈ ({x̄} ⊕ SK (∞)) × X̄N . Then, Problem P̄∗ N is
recursively feasible: (x, x̄) ∈ ({x̄}⊕SK (∞))× X̄N implies (x, x̄)+ = (x + ,
x̄ + ) ∈ ({x̄ + } ⊕ SK (∞)) × X̄N .

Proof. Suppose that (x, x̄) satisfies x ∈ {x̄}⊕SK (∞) and x̄ ∈ X̄N . From
the definition of P̄∗N , any solution satisfies the tightened constraints so
that x̄ ∗ (x) ∈ X̄N . The terminal conditions ensure, by the usual argu-
ment, that the successor state x̄ ∗ (x)+ also lies in X̄N . The condition
x ∈ {z} ⊕ SK (∞) in P∗N (x) then implies that x ∈ {x̄ ∗ (x)} ⊕ SK (∞) so
that x + ∈ {x̄ + } ⊕ SK (∞) (e+ ∈ SK (∞)). ■

Proposition 3.15 (Robust exponential stability of improved tube-based


MPC). The set SK (∞) is robustly exponentially stable in X̄N ⊕ SK (∞) for

the system x + = Ax + B(κ̄N (x̄ ∗ (x)) + K(x − x̄ ∗ (x))) + w.

Proof. It follows from the upper and lower bounds on V̄N0 (x̄ ∗ (x)), and
the descent property listed above that

V̄N0 (x̄ ∗ (x + )) ≤ γ V̄N0 (x̄ ∗ (x))

with γ = (1 − c1 /c2 ) ∈ (0, 1). Hence, if x(i) denotes the solution at
time i of x + = Ax + B(κ̄N (x̄ ∗ (x)) + K(x − x̄ ∗ (x))) + w, V̄N0 (x̄ ∗ (x(i)))
decays exponentially fast to zero. It then follows from the upper bound
on V̄N0 (x̄ ∗ (x)) that x̄ ∗ (x(i)) also decays exponentially to zero. Because
x(i) ∈ {x̄ ∗ (x(i))} ⊕ SK (∞) for all i ∈ I≥0 , it follows, similarly to the proof of
Proposition 3.12, that the set SK (∞) is robustly exponentially stable in
X̄N ⊕ SK (∞) for the system x + = Ax + B(κ̄N (x̄ ∗ (x)) + K(x − x̄ ∗ (x))) +
w. ■

3.6 Tube-Based MPC of Nonlinear Systems


Satisfactory control in the presence of uncertainty requires feedback.
As shown in Section 3.5, MPC of uncertain systems ideally requires
optimization over control policies rather than control sequences, re-
sulting in an optimal control problem that is often impossibly com-
plex. Practicality demands simplification. Hence, in tube-based MPC
of constrained linear systems we replace the general control policy
µ = (µ0 (·), µ1 (·), . . . , µN−1 (·)), in which each element µi (·) is an arbi-
trary function, by the simpler policy µ in which each element has the
simple form µi (x) = ū(i) + K(x − x̄(i)); ū(i) and x̄(i), the control and
state of the nominal system at time i, are determined using conven-
tional MPC.
The feedback gain K, which defines the local control law, is de-
termined offline; it can be chosen so that all possible trajectories of
the uncertain system lie in a tube centered on the nominal trajectory
(x̄(0), x̄(1), . . .). The “cross section” of the tube is a constant set SK (∞)
so that every possible state of the uncertain system at time i lies in the
set {x̄(i)} ⊕ SK (∞). This enables the nominal trajectory to be deter-
mined using MPC, to ensure that all possible trajectories of the un-
certain system satisfy the state and control constraints, and that all
trajectories converge to an invariant set centered on the origin.
It would be desirable to extend this methodology to the control
of constrained nonlinear systems, but we face some formidable chal-
lenges. It is possible to define a nominal system and, as shown in Chap-
ter 2, to determine, using MPC with “tightened” constraints, a nominal
trajectory that can serve as the center of a tube. But it seems to be
prohibitively difficult to determine a local control law that steers all
trajectories of the uncertain system toward the nominal trajectory, and
a set centered on the nominal trajectory in which these trajectories
can be guaranteed to lie.
We can overcome these difficulties by first generating a nominal
trajectory—either by MPC as in the linear case or by a single solution

of an optimal control problem—and then using MPC to steer the state


of the uncertain system toward the nominal trajectory x̄(·). The lat-
ter MPC controller replaces the linear controller u = ū + K(x − x̄)
employed in the linear case, and thereby avoids the difficulty of de-
termining a local nonlinear version of this linear controller. The value
function (x, i) ↦ VN0 (x, i) of the optimal control problem that is used to
determine the MPC controller is time varying and has the property that
VN0 (x̄(i), i) = 0 for all i. The tube is now a sequence of sublevel sets
(levc VN0 (·, i))_{i∈I≥0} and therefore, unlike the linear case, has a varying
cross section. We show that if the initial state x(0) lies in levc VN0 (·, 0),
then subsequent states x(i) of the controlled system lie in levc VN0 (·, i)
for all i ∈ I≥0 .
The system to be controlled is described by a nonlinear difference
equation
x + = f (x, u, w) (3.20)
in which the disturbance w is assumed to lie in the compact set W
that contains the origin. The state x and the control u are required to
satisfy the constraints
x∈X u∈U
Both X and U are assumed to be compact and to contain the origin in
their interiors. The solution of (3.20) at time i, if the initial state at time
zero is x0 and the control is generated by policy µ, is φ(i; x0 , µ, w), in
which w denotes, as usual, the disturbance sequence (w(0), w(1), . . .).
Similarly, φ(i; x0 , κ, w) denotes the solution of (3.20) at time i, if the
initial state at time zero is x0 and the control is generated by a time-
invariant control law κ(·).
The nominal system is obtained by neglecting the disturbance w
and is therefore described by

x̄ + = f¯(x̄, ū) := f (x̄, ū, 0)

Its solution at time i, if its initial state is x̄0 , is denoted by φ̄(i; x̄0 ,
ū), in which ū := (ū(0), ū(1), . . .) is the nominal control sequence. The
deviation between the actual and nominal state is e := x−x̄ and satisfies

e+ = f (x, u, w) − f (x̄, ū, 0) = f (x, u, w) − f¯(x̄, ū)

Because f (·) is nonlinear, this difference equation cannot be simplified


as in the linear case when e+ is independent of x and x̄, and depends
only on their difference e and w.

3.6.1 The Nominal Trajectory

The nominal trajectory is a feasible trajectory for the nominal system


that is sufficiently far from the boundaries of the original constraints
to enable the model predictive controller for the uncertain system to
satisfy these constraints. It is generated by the solution to a nominal
optimal control problem P̄N (x̄) in which x̄ is the state of the nomi-
nal system. The cost function V̄N (·) for the nominal optimal control
problem is defined by

V̄N (x̄, ū) := ∑_{i=0}^{N−1} ℓ(x̄(i), ū(i))                    (3.21)

in which x̄(i) = φ̄(i; x̄, ū) and x̄ is the initial state. The function ℓ(·)
is defined by

ℓ(x̄, ū) := (1/2)(|x̄|²_Q + |ū|²_R)

in which Q and R are positive definite, |x̄|²_Q := x̄′Qx̄, and |ū|²_R :=
ū′Rū. We impose the following state and control constraints on the
nominal system
x̄ ∈ X̄ ū ∈ Ū

in which X̄ ⊂ X and Ū ⊂ U. The choice of X̄ and Ū is more difficult


than in the linear case because it is difficult to bound the deviation
e = x − x̄ of the state x of the uncertain system from the state x̄
of the nominal system; this is discussed below. The optimal nominal
trajectories ū0 and x̄0 are determined by minimizing V̄N (x̄0 , ū) subject
to: x̄0 = x0 , the state and control constraints specified above, and
the terminal constraint x̄(N) = 0 (we omit the initial state x̄0 = x0 in
ū0 and x̄0 to simplify notation). The state and control of the nominal
system satisfy x̄(i) = 0 and ū(i) = 0 for all i ≥ N. This simplifies both
analysis and implementation in that the control reverts to conventional
MPC for all i ≥ N.

3.6.2 Model Predictive Controller

The purpose of the model predictive controller is to maintain the state


of the uncertain system x + = f (x, u, w) close to the trajectory of the
nominal system. This controller replaces the controller u = v + K(x −
x̄) employed in the linear case. Given the current state/time (x, t) of
the uncertain system, we determine a control sequence that minimizes
with respect to the control sequence u, the cost over a horizon N of

the deviation between the state and control of the nominal system,
with initial state x and control sequence u, and the state and control
of the nominal system, with initial state x̄ 0 (t) and control sequence

ūt0 := (ū0 (t), ū0 (t + 1), . . . , ū0 (t + N − 1)). The cost VN (x, t, u) that
measures the distance between these two trajectories is defined by
VN (x, t, u) := ∑_{i=0}^{N−1} ℓ(x(i) − x̄ 0 (t + i), u(i) − ū0 (t + i)) + Vf (x(N))       (3.22)
in which x(i) = φ̄(i; x, u). The optimal control problem solved online
is defined by

PN (x, t) : VN0 (x, t) = min_u {VN (x, t, u) | u ∈ UN }

The only constraint in PN (x, t) is the control constraint. The con-


trol applied to the uncertain system is κN (x, t), the first element of

u0 (x, t) = (u0 (0; x, t), u0 (1; x, t), . . . , u0 (N − 1; x, t)), the optimizing
control sequence. The associated optimal state sequence is x0 (x, t) =
(x 0 (0; x, t), x 0 (1; x, t), . . . , x 0 (N − 1; x, t)). The terminal penalty
Vf (·), and the functions f¯(·) and ℓ(·) are assumed to satisfy the usual
assumptions 2.2, 2.3, and 2.14 for the nominal system x̄ + = f¯(x̄, ū).
In addition, f : x ↦ f (x, t, u) is assumed to be Lipschitz continuous
for all x ∈ Rn , uniformly in (t, u) ∈ I0:N × U, and ℓ(·) is assumed to be
quadratic and positive definite. Also, the linearization of f¯(·) at (0, 0)
is assumed to be stabilizable.
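To make the structure of PN (x, t) concrete, the following sketch (ours, illustrative only) evaluates the cost (3.22) for a given nominal model f̄ and minimizes it over the input sequence with a general-purpose NLP solver; the stored nominal trajectories x̄ 0 (·) and ū0 (·), the weights, and the terminal penalty are assumed to be supplied, and only the input bounds are imposed, as in the text.

# Sketch of the ancillary (tube) controller for nonlinear systems: minimize
# the cost (3.22), which penalizes deviation from the stored nominal
# trajectory (xbar0, ubar0), subject only to input bounds.
import numpy as np
from scipy.optimize import minimize

def ancillary_cost(useq, x, t, fbar, xbar0, ubar0, Q, R, Vf, N, m):
    u = useq.reshape(N, m)
    cost, xi = 0.0, np.array(x, dtype=float)
    for i in range(N):
        dx = xi - xbar0(t + i)                    # x(i) - xbar0(t+i)
        du = u[i] - ubar0(t + i)                  # u(i) - ubar0(t+i)
        cost += 0.5 * (dx @ Q @ dx + du @ R @ du)
        xi = fbar(xi, u[i])                       # nominal model x+ = fbar(x, u)
    return cost + Vf(xi)                          # terminal penalty beta*Vf'(x(N))

def ancillary_mpc(x, t, fbar, xbar0, ubar0, Q, R, Vf, N, m, u_lo, u_hi):
    u0 = np.tile(ubar0(t), N)                     # warm start at the nominal input
    bounds = [(u_lo, u_hi)] * (N * m)             # u(i) in U (box)
    res = minimize(ancillary_cost, u0,
                   args=(x, t, fbar, xbar0, ubar0, Q, R, Vf, N, m),
                   bounds=bounds, method="L-BFGS-B")
    return res.x.reshape(N, m)[0]                 # kappa_N(x, t): first input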
We first address the problem that PN (·) has no terminal constraint.
The function Vf′ (·) and associated controller κf (·) is chosen, as in Sec-
tion 2.5.5, to be a local Lyapunov function for the nominal system
x̄ + = f¯(x̄, ū). The terminal cost Vf (·) is set equal to βVf′ (·) with β
chosen as shown in the following proposition. The associated terminal
constraint Xf := {x | Vf′ (x) ≤ α} for some α > 0 is not employed in the
optimal control problem, but is needed for analysis. For any state se-
quence x̄ let Xc (x̄) denote the tube (sequence of sets) (X0c (x̄), X1c (x̄), . . .)
in which the ith element of the sequence is Xic (x̄) := {x | VN0 (x, i) ≤ c}.
The tube Xd (x̄) is similarly defined.

Proposition 3.16 (Implicit satisfaction of terminal constraint). For all


c > 0 there exists a βc := c/α such that, for any i ∈ I≥0 and any
x ∈ Xic (x̄0 ), the terminal state x 0 (N; x, i) lies in Xf if β ≥ βc .

Proof. Since x ∈ Xic (x̄0 ) implies VN0 (x, i) ≤ c, we know Vf (x 0 (N; x,
i)) = βVf′ (x 0 (N; x, i)) ≤ c so that x 0 (N; x, i) ∈ Xf if β ≥ βc . ■

Proposition 3.16 shows that the constraint that the terminal state
lies in Xf is implicitly satisfied if β ≥ βc and the initial state lies in
Xic (x̄0 ) for any i ∈ I≥0 . The next Proposition establishes important
properties of the value function VN0 (·).

Proposition 3.17 (Properties of the value function). Suppose β ≥ βc .


There exist constants c1 > 0 and c2 > 0 such that
(a) VN0 (x, t) ≥ c1 |x − x̄ 0 (t)|²  ∀(x, t) ∈ Rn × I≥0
(b) VN0 (x, t) ≤ c2 |x − x̄ 0 (t)|²  ∀(x, t) ∈ Rn × I≥0
(c) VN0 ((x, t)+ ) ≤ VN0 (x, t) − c1 |x − x̄ 0 (t)|²  ∀(x, t) ∈ Xic (x̄ 0 ) × I≥0
in which (x, t)+ = (x + , t + ) = (f¯(x, κN (x, t)), t + 1).

It should be recalled that x̄ 0 (t) = 0 and ū0 (t) = 0 for all t ≥ N; the
controller reverts to conventional MPC for t ≥ N.

Proof.
(a) This follows from the fact that VN0 (x, t) ≥ ℓ(x − x̄ 0 (t), u − ū0 (t))
so that, by the assumptions on ℓ(·), VN0 (x, t) ≥ c1 |x − x̄ 0 (t)|² for all
(x, t) ∈ Rn × I≥0 .

(b) We have that VN0 (x, t) = VN (x, t, u0 (x, t)) ≤ VN (x, t, ūt0 ) with

VN (x, t, ūt0 ) = ∑_{i=0}^{N−1} ℓ(x 0 (i; x, t) − x̄ 0 (t + i), 0) + Vf (x 0 (N; x, t) − x̄ 0 (t + N))

and ūt0 := (ū0 (t), ū0 (t + 1), ū0 (t + 2), . . .). Lipschitz continuity of f (·)
in x gives |φ̄(i; x, ūt0 ) − x̄ 0 (i + t)| ≤ L^i |x − x̄ 0 (t)|. Since ℓ(·) and
Vf (·) are quadratic, it follows that VN0 (x, t) ≤ c2 |x − x̄ 0 (t)|² for all
(x, t) ∈ Rn × I≥0 , for some c2 > 0.

(c) It follows from Proposition 3.16 that the terminal state x 0 (N; x,
t) ∈ Xf so that the usual stabilizing condition is satisfied and

VN0 ((x, t)+ ) ≤ VN0 (x, t) − ℓ(x, κN (x, t))

The desired result follows from the lower bound on ℓ(·). ■

It follows that the origin is asymptotically stable in the tube Xc (x̄0 )


for the time-varying nominal system (x, i)+ = f¯(x, κN (x, i)). How-
ever, our main interest is the behavior of the uncertain system with the

controller κN (·). Before proceeding, we note that the tube Xc (x̄0 ) is a


“large” neighborhood of x̄0 in the sense that any state/time (x, i) in
this set can be controlled to Xf in N − i steps by a control subject only
to the control constraint. We wish to determine, if possible, a “small”
neighborhood Xd (x̄0 ) of x̄0 , d < c, in which the trajectories of the un-
certain system are contained by the controller κN (·). The size of these
neighborhoods, however, is dictated by the size of the disturbance set
W as we show next.

Proposition 3.18 (Neighborhoods of the uncertain system). Suppose


β ≥ βc .
(a) VN0 ((x, t)+ ) ≤ γVN0 (x, t) for all (x, t) ∈ Xtd (x̄0 ) × I≥0 , with (x, t)+ =
(x + , t + ) = (f (x, κN (x, t), 0), t + 1) and γ := 1 − c1 /c2 ∈ (0, 1).

(b) x ↦ VN0 (x, t) is Lipschitz continuous with Lipschitz constant c3 > 0


in the compact set Xtc (x̄0 ) = {x | VN0 (x, t) ≤ c} for all t ∈ I0:N .

(c) VN0 (f (x, κN (x, t), w), t + 1) ≤ γVN0 (x, t) + c3 |w| for all (x, t) ∈
(Xic (x̄0 ) ⊕ W) × I≥0 .

Proof.
(a) This inequality follows directly from Proposition 3.17.

(b) This follows, as shown in Theorem C.29 in Appendix C, from the


fact that x , VN0 (x, t) is Lipschitz continuous on bounded sets for
each t ∈ I0:N , since VN (·) is Lipschitz continuous on bounded sets
and u lies in the compact set UN .

(c) The final inequality follows from (a), (b), and Proposition 3.17. ■

Proposition 3.19 (Robust positive invariance of tube-based MPC for


nonlinear systems).
(a) Suppose β ≥ βc and VN0 (x, t) ≤ d < c (x ∈ Xtd (x̄0 )); then VN0 ((x,
t)+ ) ≤ d (x + ∈ X^d_{t+1} (x̄0 )) with (x, t)+ = (x + , t + ) = (f (x, κN (x, t), w),
t + 1) if d ≥ (c3 /(1 − γ)) |W|, |W| := max_w {|w| | w ∈ W}.

(b) Suppose ε > 0. Then VN0 ((x, t)+ ) ≤ VN0 (x, t) − ε if VN0 (x, t) ≥ dε :=
(c3 /(1 − γ)) |W| + ε.

Proof.
(a) It follows from Proposition 3.17 that

VN0 (f (x, κN (x, t), w), t + 1) ≤ γd + c3 |w|



If d ≥ (c3 /(1 − γ)) |W|, then

VN0 (f (x, κN (x, t), w), t+1) ≤ [(γc3 )/(1−γ)+c3 ] |W| ≤ [c3 /(1−γ)] |W|

(b) VN0 (f (x, κN (x, t), w), t + 1) ≤ VN0 (x, t) − ε if γVN0 (x, t) + c3 |W| ≤ VN0 (x, t) − ε, i.e.,
if VN0 (x, t) ≥ [c3 /(1 − γ)] |W| + ε. ■

These results show that—provided the inequalities c ≥ (c3 /(1 −


γ)) |W| and d ≥ (c3 /(1 − γ)) |W| are satisfied—the tubes Xc (x̄0 ) and
Xd (x̄0 ) are robustly positive invariant for (x, t)+ = (f (x, κN (x, t), w),
t + 1), w ∈ W in the sense that if x ∈ Xtc (x̄0 ) (x ∈ Xtd (x̄0 )), then
c d
x + ∈ Xt+1 (x̄0 ) (x + ∈ Xt+1 (x̄0 )). The tubes Xc (x̄0 ) and Xd (x̄0 ) may
be regarded as analogs of the sublevel sets levc VN0 (·) and levd VN0 (·)
for time-invariant systems controlled by conventional MPC. If d = dε
and c is large (which implies β = c/α is large), are such that tube
Xd (x̄0 ) ⊂ Xc (x̄0 ), then any trajectory commencing at x ∈ Xtc (x̄0 ) con-
verges to the tube Xd (x̄0 ) in finite time and thereafter remains in the
tube Xd (x̄0 ). It follows that dH (Xid (x̄0 ), XNc (x̄0 )) becomes zero when i
exceeds some finite time not less than N.

3.6.3 Choosing the Nominal Constraint Sets Ū and X̄

The first task is to choose d as small as possible given the constraint


d ≥ dε , and to choose c large. If the initial state x0 lies in X0c (x̄0 )
(this can be ensured by setting x̄0 = x0 ), then all state trajectories of
the uncertain system lie in the tube Xc (x̄0 ) and converge to the tube
Xd (x̄0 ). As d → 0, the tube Xd (x̄0 ) shrinks to the nominal trajectory
x̄0 . If d is sufficiently small, and if X̄ is a sufficiently small subset of X,
all state trajectories of the uncertain system lie in the state constraint
set X. This is, of course, a consequence of the fact that the nominal
trajectory x̄0 lies in the tightened constraint set X̄.
The set Ū is chosen next. Since U is often a box constraint, a simple
choice would be Ū = θU with θ ∈ (0, 1). This choice determines how
much control is devoted to controlling x̄0 to 0, and how much to reduce
the effect of the disturbance w. It is possible to change this choice
online.
The main task is to choose X̄. This can be done as follows. Assume
that X is defined by a set of inequalities of the form gi (x) ≤ hi , i ∈ I1:J .
Then X̄ may be defined by the set of “tightened” inequalities gi (x) ≤
αi hi , i ∈ I1:J , in which each αi ∈ (0, 1). Let α := (α1 , α2 , . . . , αJ ). Then
the “design parameter” α is chosen to satisfy the constraint that the
state trajectory of the controlled uncertain system lies in X for all x0 ∈

X0 (the set of potential initial states), and all disturbance sequences


w ∈ WN . This is a complex semi-infinite optimization problem, but
can be solved offline using recent results in Monte Carlo optimization
that show the constraints can be satisfied with “practical certainty,” i.e.,
with probability exceeding 1 − β, β ≪ 1, using a manageable number
of random samples of w.
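The sampling idea can be sketched as follows (our illustration, not the text's code); closed_loop_step stands for one step of the closed-loop uncertain system under the tube-based controller designed with the candidate parameter α, and the number of samples is assumed chosen large enough that acceptance implies constraint satisfaction with practical certainty.

# Monte Carlo check of a candidate tightening parameter vector alpha:
# simulate the controlled uncertain system for sampled disturbance
# sequences and verify the original state constraints g_i(x) <= h_i.
import numpy as np

def constraints_hold(alpha, closed_loop_step, sample_x0, sample_w,
                     g_list, h_list, n_samples, horizon, rng):
    for _ in range(n_samples):
        x = sample_x0(rng)
        for k in range(horizon):
            w = sample_w(rng)
            x = closed_loop_step(x, w, alpha)     # x+ under the controller
            if any(g(x) > h for g, h in zip(g_list, h_list)):
                return False                       # constraint violated
    return True                                    # satisfied on all samples

# Usage sketch: shrink alpha until the sampled runs satisfy the constraints.
# rng = np.random.default_rng(0)
# while not constraints_hold(alpha, step, sample_x0, sample_w, g, h, M, T, rng):
#     alpha = 0.9 * alpha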

Example 3.20: Robust control of an exothermic reaction


Consider the control of a continuous-stirred-tank reactor. We use a
model derived in Hicks and Ray (1971) and modified by Kameswaran
and Biegler (2006). The reactor is described by the second-order differ-
ential equation

ẋ1 = (1/θ)(1 − x1 ) − kx1 exp(−M/x2 )


ẋ2 = (1/θ)(xf − x2 ) + kx1 exp(−M/x2 ) − αu(x2 − xc ) + w

in which x1 is the product concentration, x2 is the temperature, and


u is the coolant flowrate. The model parameters are θ = 20, k = 300,
M = 5, xf = 0.3947, xc = 0.3816, and α = 0.117. The state, control,
and disturbance constraint sets are

X = {x ∈ R2 | x1 ∈ [0, 2], x2 ∈ [0, 2]}


U = {u ∈ R | u ∈ [0, 2]}
W = {w ∈ R | w ∈ [−0.001, 0.001]}
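The model is straightforward to transcribe for simulation. The sketch below (ours, a convenience for reproducing the simulations and not part of the original text) codes the right-hand side and a fourth-order Runge-Kutta step for the sample time ∆ = 3 used in the example.

# CSTR model of Example 3.20: x1 is concentration, x2 is temperature, u is
# coolant flowrate, w an additive disturbance on the temperature equation.
import numpy as np

theta, k, M, xf, xc, alpha = 20.0, 300.0, 5.0, 0.3947, 0.3816, 0.117

def cstr_rhs(x, u, w=0.0):
    x1, x2 = x
    r = k * x1 * np.exp(-M / x2)                 # reaction term
    dx1 = (1.0 / theta) * (1.0 - x1) - r
    dx2 = (1.0 / theta) * (xf - x2) + r - alpha * u * (x2 - xc) + w
    return np.array([dx1, dx2])

def rk4_step(x, u, w=0.0, dt=3.0):
    """One explicit RK4 step over the sample time dt (here dt = 3)."""
    k1 = cstr_rhs(x, u, w)
    k2 = cstr_rhs(x + 0.5 * dt * k1, u, w)
    k3 = cstr_rhs(x + 0.5 * dt * k2, u, w)
    k4 = cstr_rhs(x + dt * k3, u, w)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

# e.g., one step from the initial steady state x(0) = (0.9831, 0.3918) with u = 1:
# print(rk4_step(np.array([0.9831, 0.3918]), 1.0))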

The controller is required to steer the system from a locally stable


steady state x(0) = (0.9831, 0.3918) at time zero, to a locally unsta-
ble steady state ze = (0.2632, 0.6519). Because the desired terminal
state is ze rather than the origin, the stage cost ℓ(z, v) is replaced by
ℓ(z − ze , v − ve ) where ℓ(z, v) := (1/2)(|z|2 + v 2 ) and (ze , ve ) is an
equilibrium pair satisfying ze = f (ze , ve ); the terminal constraint set
Zf is chosen to be {ze }. The constraint sets for the nominal control
problem are Z = X and V = [0.02, 2]. Since the state constraints are
not activated, there is no need to tighten X. The disturbance is cho-
sen to be w(t) = A sin(ωt) where A and ω are independent uniformly
distributed random variables, taking values in the sets [0, 0.001] and
[0, 1], respectively. The horizon length is N = 40 and the sample time
is ∆ = 3 giving a horizon time of 120. The model predictive controller
uses ℓa (x, u) = (1/2)(|x|2 + u2 ), and the same horizon and sample
time.

Figure 3.6: Comparison of 100 realizations of standard and tube-based MPC for the chemical reactor example.

For comparison, the performance of a standard MPC controller, us-


ing the same stage cost and the same terminal constraint set as that
employed in the central-path controller, is simulated. Figure 3.6 (left)
illustrates the performance of standard MPC, and Figure 3.6 (right) the
performance of tube-based MPC for 100 realizations of the disturbance
sequence. Tube-based MPC, as expected, has a smaller spread of state
trajectories than is the case for standard MPC. Because each controller
has the same stage cost and terminal constraint, the spread of trajec-
tories in the steady-state phase is the same for the two controllers. Be-
cause the control constraint set for the central-path controller is tighter
than that for the standard controller, the tube-based controller is some-
what slower than the standard controller.
The model predictive controller may be tuned to reduce more effec-
tively the spread of trajectories due to the external disturbance. The
main purpose of the central-path controller is to steer the system from
one equilibrium state to another, while the purpose of the ancillary
model predictive controller is to reduce the effect of the disturbance.
These different objectives may require different stage costs.

Figure 3.7: Comparison of standard and tube-based MPC with an aggressive model predictive controller.

The next simulation compares the performance of the standard and


tube-based MPC when a more “aggressive” stage cost is employed for
the model predictive controller. Figure 3.7 shows the performance of
these two controllers when the central-path and standard MPC con-
troller employ ℓ(z − ze , v − ve ) with ℓ(z, v) := (1/2) |z|2 + 5v 2 , and
the ancillary model predictive controller employs ℓa (x, u) = 50 |x|2 +
(1/20)u2 . The tube-based MPC controller reduces the spread of the
trajectories during both the transient and the steady-state phases.
It is also possible to tune the sample time of the ancillary model
predictive controller. This feature may be useful when the disturbance
frequency lies outside the pass band of the central-path controller. Fig-
ure 3.8 shows how concentration varies with time when the disturbance
is w(t) = 0.002 sin(0.4t), the sample time of the central-path con-
troller is 12, whereas the sample time of the ancillary model predictive
controller is 12 (left figure) and 8 (right figure). The central-path con-
troller employs ℓ(z − ze , v − ve ) where ℓ(z, v) := (1/2)(|z|2 + v 2 ),
and the model predictive controller employs the same stage cost ℓa (x, u) = ℓ(x, u).

Figure 3.8: Concentration versus time for the ancillary model predictive controller with sample time ∆ = 12 (left) and ∆ = 8 (right).

The model predictive controller with the smaller sample
time is more effective in rejecting the disturbance. □

3.7 Stochastic MPC


3.7.1 Introduction

In stochastic MPC, as in robust MPC, the system to be controlled is


usually described by x + = f (x, u, w), in which the disturbance w
is a random variable that is assumed to take values in W. The con-
straint set W is not necessarily assumed to be bounded as it is in
robust MPC, although, to date, implementable versions appear to re-
quire boundedness of W. The decision variable µ is usually assumed,
as in robust MPC, to be a policy µ = (µ0 (·), µ1 (·), . . . , µN−1 (·)) (a se-
quence of control laws) in order to contain the spread of trajectories
that may result in a high cost and constraint violation. The functions

µi (·) are usually parameterized to simplify optimization. A parameter-


ization that is widely used when the system being controlled is linear
is µi (x) = Kx + vi , in which case the decision variable is simply the
sequence v = (v0 , v1 , . . . , vN−1 ). Let φ(i; x, µ, w) denote the solution
of x + = f (x, u, w) at time i if the initial state at time zero is x, the
control at (x, i) is µi (x), and the disturbance sequence is w.
The cost that is minimized online is usually defined to be
VN (x, µ) = E|x (JN (x, µ, w))

JN (x, µ, w) = ∑_{i=0}^{N−1} ℓ(x(i), µi (x(i))) + Vf (φ(N; x, µ, w))

in which E|x (·) = E(· | x(0) = x), E(·) is the expectation under the
probability measure of the underlying probability space, and x(i) =
φ(i; x, µ, w). For simplicity, the nominal cost VN (x, µ) = E|x (JN (x,
µ, 0)) is sometimes employed; here 0 is defined to be the sequence
(0, 0, . . . , 0).
We consider briefly below three versions of MPC associated with
three versions of the optimal control problem PN (x) solved online. In
the first version there are no constraints, permitting the disturbance to
be unbounded. In the second version the hard constraints x ∈ X, u ∈ U
and the terminal constraint x(N) ∈ Xf are required to be satisfied.
While satisfaction of the constraint x ∈ X almost surely is desirable,
this constraint is often regarded as too conservative. The third ver-
sion, therefore, replaces the hard constraint x ∈ X by the probabilistic
(chance) constraint of the form
Pr|x (x(i) ∈ X) ≥ 1 − ε
for some suitably small ε ∈ [0, 1]. Some papers propose treating the
hard control constraint u ∈ U similarly. This approach is not appro-
priate for process control since hard actuator constraints have to be
satisfied; a valve cannot be more than fully open or less than fully
closed. In a similar vein, softening of the terminal constraint may re-
sult in instability. Hence, the constraints in the third version on the
system being controlled take the form
Pr|x (x(i) ∈ X) ≥ 1 − ε
u(i) ∈ U
for all i ∈ I0:N . Pr(·) denotes the probability measure of the underlying
probability space and Pr|x (·) the probability measure conditional on
x(0) = x. Also x(i) := φ(i; x, µ, w) and u(i) = µi (x(i)).

Let ΠN (x) denote the set of parameterized policies that satisfy the
constraints appropriate to the version being considered and the initial
state is x. The optimal control problem PN (x) that is solved online can
now be defined by

PN (x) : VN0 (x) = min_{µ∈ΠN (x)} VN (x, µ)

subject to the constraints defined above as well as the hard terminal


stability constraint x(N) ∈ Xf . The solution to this problem, if it ex-
ists, is µ0 (x) = (µ_0^0 (x), µ_1^0 (x), . . . , µ_{N−1}^0 (x)). The control applied to the
uncertain system at state x is κN (x) := µ_0^0 (x).

3.7.2 Stability of Stochastic MPC

Because the optimal control problem solved online has a finite hori-
zon the resultant control law is not necessarily stabilizing. Stabiliz-
ing conditions involving the addition of a terminal cost and a terminal
constraint set have been developed for deterministic and robust MPC
but, as pointed out in Chatterjee and Lygeros (2015), no approaches to
stochastic MPC prior to 2015 dealt “directly with stability under reced-
ing horizon control as a standalone and fundamental problem.”
Version 1. A major contribution to stability and performance of stoch-
astic MPC in the absence of hard constraints is given in the paper by
Chatterjee and Lygeros that is the first paper proposing “standalone”
stability conditions for unconstrained stochastic MPC. The problem
considered in this paper is as stated above except that there are no
constraints (X = Xf = Rn , U = Rm ) and the random disturbance w is
merely assumed to take values in a measurable set W that is not neces-
sarily bounded. The stabilizing assumption in Chatterjee and Lygeros
(2015) is

Assumption 3.21 (Stabilizing conditions, stochastic MPC: Version 1).


There exists a measurable control law κf (·) : Rn → Rm , a number b,
and a bounded measurable set K such that

E(Vf (f (x, κf (x), w))) ≤ Vf (x) − ℓ(x, κf (x)) ∀x ∉ K

sup_{x∈K} {E(Vf (f (x, κf (x), w))) − (Vf (x) − ℓ(x, κf (x)))} ≤ b

Under the basic assumptions that (i) the cost VN (x, µ) is finite for
all x ∈ Rn and all µ ∈ ΠN (x); (ii) for all x ∈ Rn , there exists a solution

µ0 (x) that solves PN (x); and (iii) the stage cost ℓ(·) satisfies some
modest conditions, it is shown in (Chatterjee and Lygeros, 2015) that,
if Assumption 3.21 holds, then, for all x ∈ Rn

E|x (VN0 (x + )) ≤ VN0 (x) − ℓ(x, κN (x)) + b

Chatterjee and Lygeros then show that VN0 (·) satisfies the geometric
drift condition E|x (VN0 (x + )) ≤ VN0 (x) − ℓ(x, κN (x)) outside of some
compact subset of Rn , and that the sequence (E|x (VN0 (x(t))))t∈I≥0 is
bounded.
Version 2. While the results in Chatterjee and Lygeros (2015) hold for
situations in which the disturbance is not restricted to lie in a compact
set, they do require the absence of hard state constraints. In addi-
tion, determination of a function satisfying Assumption 3.21 is diffi-
cult. Stabilizing conditions suitable for version 2 of stochastic MPC (all
constraints are hard and W is compact) are given in Mayne and Falugi
(2019).

Assumption 3.22 (Stabilizing conditions, stochastic MPC: Version 2).


Vf (·), Xf and ℓ(·) have the following properties.
(a) For all x ∈ Xf there exists a u = κf (x) ∈ U such that Vf (f (x,
κf (x), 0)) ≤ Vf (x) − ℓ(x, κf (x)) and f (x, κf (x), w) ∈ Xf , ∀w ∈ W

(b) There exists a δ ∈ (0, ∞) such that for all x ∈ Xf

E|x (Vf (f (x, κf (x), w))) ≤ Vf (x) − ℓ(x, κf (x)) + δ

(c) Xf ⊆ X, W is compact.

(d) There exist constants c2 > c1 > 0 and a > 0 such that

ℓ(x, u) ≥ c1 |x|a , ∀x ∈ X, ∀u ∈ U
VN0 (x) ≤ c2 |x|a , ∀x such that ΠN (x) ≠ ∅

If this assumption is satisfied, it follows (Mayne and Falugi, 2019)


that, for x such that ΠN (x) is not empty, the optimal control problem
PN (x) is recursively feasible and

E|x (VN0 (x + )) ≤ VN0 (x) − ℓ(x, κN (x)) + δ (3.23)

which is a modified descent property. Consider now an infinite random


sequence (x(i))i∈I≥0 generated by the control algorithm and stochastic
system.

Proposition 3.23 (Expected cost bound). If Assumption 3.22 holds, then


there exists λ ∈ (0, 1) such that the closed-loop trajectory x(k) satisfies

E|x (VN0 (x(k))) ≤ λk VN0 (x) + δ/(1 − λ)

for all k ∈ I≥0 and x ∈ X such that ΠN (x) ≠ ∅ and x(0) = x.

Proof. We proceed as in the deterministic case to obtain

E|x (VN0 (x + )) ≤ VN (x) − c1 |x|a + δ


≤ VN (x) − (c1 /c2 )VN (x) + δ
≤ λVN (x) + δ

with λ = 1 − c1 /c2 . Then E|x(0) (VN0 (x(1))) ≤ λVN0 (x(0)) + δ and, by law
of iterated expectation, E|x(0) (VN0 (x(k))) = E|x(0) (E|x(k−1) (VN0 (x(k)))).
By iterating we obtain our stability condition

E|x(0) (VN0 (x(k))) ≤ λk VN (x) + δ/(1 − λ)

and the proof is complete. ■

Remark. Version 1 is applicable to model predictive control of sys-


tems that have unbounded disturbances but do not have hard state
and control constraints. Moreover it requires determination of a global
Lyapunov-like function Vf (·) defined in Assumption 3.21. Version 2
requires compactness of W since it is applicable to model predictive
control of systems that require solution of a complex optimal control
problem in which hard constraints have to be satisfied for all permitted
disturbance sequences.

We turn next to an implementable version of stochastic MPC.

3.7.3 Tube-based stochastic MPC

To date, it appears that all implementable versions of stochastic MPC


assume boundedness of the disturbance since, otherwise, it is difficult,
if not impossible, to satisfy hard constraints. Even if the disturbance is
bounded, however, satisfaction of hard constraints for all disturbance
sequences is not simple. The tube-based approach, introduced in Chisci
et al. (2001); Mayne and Langson (2001) appears to be the most prac-
tical method for handling hard constraints. The reason for this is the
state and control of the uncertain system are forced to satisfy hard con-
straints merely by requiring the state and control of the nominal, deter-
ministic, system to satisfy tighter versions of the same constraints—a

much simpler problem than forcing satisfaction of the constraints for


all disturbance sequences using optimization. We present below a sim-
ilar simple approach to stochastic MPC, the major difference from ro-
bust MPC being a minor modification required to tighten the constraints
to permit a small probability of nonsatisfaction.
Control strategy. We present a control strategy that ensures stability
in the sense that under reasonable assumptions the state converges to
the optimal solution of the unconstrained linear system. The uncertain
system to be controlled is described by

x + = Ax + Bu + w

and is subject to the constraints x ∈ X and u ∈ U; X is closed, U is com-


pact, and each set contains the origin in its interior. The disturbance
w is a stationary random process and is assumed to lie in the compact
set W that contains the origin in its interior. The nominal system and
error are described by

x̄ + = Ax̄ + B ū e := x − x̄

Given state x at time t, we denote the solution of the nominal system


by x̄(i) for given controls ū(i), i ∈ I0:N−1 where the initial state of the
nominal system is x̄(0) = x. As in robust tube-based MPC, we employ
the control policy µ = (µ0 (·), µ1 (·), . . . , µN−1 (·)) in which for each i,
µi (·) is defined for all x by

µi (x) := ū(i) + K(x − x̄(i))

so that u(i) = µi (x(i)) = ū(i) + K(x(i) − x̄(i)). The (x, e) pair then
evolve as

x + = Ax + B ū + BKe + w e + = AK e + w AK := A + BK

The feedback matrix K is chosen so that AK is Hurwitz. For practical


reasons we assume that w and, hence, e are bounded. If w is an infinite
sequence of independent, identically distributed, zero-mean random
variables, then an optimal K may be obtained from the solution to the
unconstrained problem
min lim_{N→∞} E|x { (1/N) ∑_{i=0}^{N−1} ℓ(x(i), u(i)) }

in which ℓ(x, u) = (1/2)(x ′ Qx + u′ Ru) with both Q and R positive


definite. Then K = −(R + B ′ P B)−1 B ′ P A, with P the solution of the
matrix Riccati equation P = Q + A′ P A − A′ P B(R + B ′ P B)−1 B ′ P A; (1/2)x ′ P x is the


minimum cost for the deterministic problem in which the disturbance
w is identically zero.
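The gain K and matrix P are obtained from the discrete algebraic Riccati equation; a short sketch (ours, for illustration) using scipy is given below and, for the scalar system used in Example 3.25 (A = B = 1, Q = 1/2, R = 1), returns P = 1 and K = −1/2, so that AK = 1/2.

# Unconstrained LQR gain for the tube controller: solve the discrete algebraic
# Riccati equation and form K = -(R + B'PB)^{-1} B'PA.
import numpy as np
from scipy.linalg import solve_discrete_are

def lqr_gain(A, B, Q, R):
    P = solve_discrete_are(A, B, Q, R)
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Scalar system of Example 3.25: A = B = 1, Q = 1/2, R = 1.
A = np.array([[1.0]]); B = np.array([[1.0]])
Q = np.array([[0.5]]); R = np.array([[1.0]])
K, P = lqr_gain(A, B, Q, R)
print(K, A + B @ K)   # K = [[-0.5]], A_K = [[0.5]]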
Stochastic MPC differs from robust MPC in the definition of the con-
trol objective and in softening of the constraints that now take the form

Pr|x (x(i) ∈ X) ≥ 1 − ε, i ∈ I≥0


u(i) ∈ U, i ∈ I≥0

in which x(i) = φ(i; x, µ, w). The control u(i) applied to the system
at time i is
u(i) = ū(i) + K(x(i) − x̄(i))
With this control policy, e(i) := x(i) − x̄(i) is the solution at time i of
the difference equation

e+ = AK e + w, e(0) = 0
As shown earlier e(i) ∈ SK (i) := ⊕_{j=0}^{i−1} A_K^j W for all i. Because AK is Hur-
witz, e(t) converges to a stationary process e∞ as t → ∞. To achieve
robustness of stochastic MPC we adopt a policy similar to that em-
ployed in robust MPC. For each i, the control constraints are tightened
by determining, for each i, a set Ū(i) that ensures ū + Ke(i) ∈ U for all
ū ∈ Ū(i). The state constraints are tightened by determining, for each
i, a set X̄(i) that ensures Pr(x̄ + e(i) ∈ X) ≥ 1 − ε for all x̄ ∈ X̄(i).
A model predictive controller is employed to steer the state and
control of the nominal system, subject to the tightened constraints, to
the origin. Since x(t) = x̄(t) + e(t) and u(t) = ū(t) + Ke(t) it follows
that x(t) converges to e∞ and u(t) converges to Ke∞ as t → ∞.
To implement this control it seems, at first sight, that we have to
determine the tightened constraints for all i ∈ I≥0 . We propose two
practical alternatives. The first, which is similar to that employed for
robust MPC, is determination of constant constraint sets Ū∞ and X̄∞
satisfying, respectively, Ū∞ ⊕ KSK (∞) ⊂ U and P{x̄ + e∞ ∈ X} ≥ 1 − ε
for all x̄ ∈ X̄∞ . At each time i, when the composite state is (x(i), x̄(i)),
a standard nominal optimal control problem P̄N (x̄(i)) with constraints
x̄ ∈ X̄∞ , ū ∈ Ū∞ and the usual terminal constraint is solved. If standard
stability conditions are satisfied, x̄(i) and ū(i) converge to zero while
satisfying the tightened constraints ū ∈ Ū∞ and x̄ ∈ X̄∞ as i → ∞. The
control applied to the system at time i is u(i) = ū(i) + K(x(i) − x̄(i)).
This procedure is conservative in that the constraints are tighter than
necessary.

A better, albeit more complex, alternative is to solve, at time zero,


a standard nominal optimal control problem P̄N (x̄) with a sequence
of tightened control constraints (Ū(0), Ū(1), . . . , Ū(N − 1)) and state
constraints (X̄(1), X̄(2), . . . , X̄(N − 1), Xf ) specified below; the solution
to this problem yields the nominal control and state sequences (ū0 (0),
ū0 (1), . . . , ū0 (N−1)) and (x̄ 0 (1), x̄ 0 (2), . . . , x̄ 0 (N)) satisfying ū0 (i) ∈ Ū(i),
x̄ 0 (i) ∈ X̄(i), and x̄ 0 (N) ∈ Xf . At time i = N and thereafter the control
ū(i) is set equal to κf (x̄(i)) so that x̄(i + 1) = Ax̄(i) + Bκf (x̄(i)).
If the usual stability conditions are satisfied, the nominal state x̄(i)
remains in Xf for all i ≥ N. The procedure therefore yields a control
sequence consisting of the sequence (ū0 (0), ū0 (1), . . . , ū0 (N − 1)) followed
by the infinite control sequence (κf (x̄(N)), κf (x̄(N + 1)), . . .). Moreover,
the control law κf (·) ensures that ū(i) → 0 and x̄(i) → 0 as i → ∞. To
implement this procedure we require an additional assumption.

Assumption 3.24 (Robust terminal set condition). The terminal set sat-
isfies Xf ⊕ SK (∞) ⊂ X.

Both procedures ensure that x(t) converges to the zero mean sta-
tionary process e∞ to which e(t) converges, and that u(t) converges to
Ke∞ as t → ∞.

Determination of tightened constraints. The tightened control con-


straints must satisfy Ū(i) ⊕ KSK (i) ⊂ U or, equivalently, Ū(i) ⊂ U ⊖
KSK (i) for all i ∈ I0:N−1 . Provided we are able to tractably calculate
SK (i) for any i ∈ I0:N−1 , we may simply define Ū(i) := U ⊖ KSK (i). Al-
ternatively, we may use any conservative estimate S̃K (i) ⊃ SK (i) and
define Ū(i) := U ⊖ K S̃K (i). For example, we may choose S̃K (i) = SK (∞)
and thereby define Ū(i) = Ū∞ for all i ∈ I≥0 .
We now consider determination, for any i ∈ I0:N−1 , of the state
constraint set X̄(i) that satisfies Pr(x̄+e(i) ∈ X) ≥ 1−ε for all x̄ ∈ X̄(i).
This is a stochastic optimization problem, a field in which, fortunately,
there has been considerable recent progress. Tempo, Calafiore, and
Dabbene (2013) give an excellent exposition of this subject.
Suppose X is defined by a single constraint of the form {x | c ′ x ≤
d}. For each i ∈ I0:N−1 , we wish to determine a tighter constraint c ′ x̄ ≤
d̄ := d − f , f ∈ [0, d], such that c ′ x̄(i) ≤ d̄ implies c ′ x(i) = c ′ x̄(i) +
c ′ e(i) ≤ d with probability not less than 1−ε. To achieve this objective,
we solve the stochastic problem P defined by

min_{f ∈[0,d]} {f | Pr(c ′ e(i) ≤ f ) ≥ 1 − ε}

with ε chosen to be suitably small; c ′ e ≤ f and c ′ x̄ ≤ d − f imply c ′ x ≤


d. In Calafiore and Campi (2006), the complex probability constrained
problem P is replaced by a scenario convex optimization problem Ps
defined by
min_{f ∈[0,d]} {f | c ′ e(i; wj ) ≤ f , ∀j ∈ I1:M }                 (3.24)

Here wj denotes the j th sample of the finite sequence {w(0), w(1),
w(2), . . . , w(i − 1)} and e(i) is replaced by e(i; wj ) to denote its depen-
dence on the random sequence wj .
It is shown in Calafiore and Campi (2006) and Tempo et al. (2013)
that given (ε, β), there exists a relatively modest number of samples
M ∗ (ε, β) such that if M ≥ M ∗ , one of the following two conditions
hold. For each i ∈ I0:N−1 , either problem Ps is infeasible, in which case
the robust control problem is infeasible; or its solution f 0 (i) satisfies

Pr(c ′ e(i) ≤ f 0 (i)) ≥ 1 − ε

with probability 1 − β (i.e., with practical certainty if β is chosen suffi-


ciently small). The tightened state constraint set is X̄(i) = {x | c ′ x ≤
d − f 0 (i)} ⊂ X. Note that X̄∞ ⊂ X̄(i), i.e., using X̄(i) is less conservative
than X̄∞ . Tempo et al. (2013) give the value
M ∗ (ε, β) = (2/ε)(log(1/β) + nθ )                  (3.25)

If X := {x | Cx ≤ d} in which d ∈ Rp , we apply the procedure above
to each row ck′ x ≤ dk of the constraint, yielding fk0 (i) satisfying

Pr(ck′ e(i) ≤ fk0 (i)) ≥ 1 − εk

for each k = 1, . . . , p if the associated scenario problem is feasible. The
probability that x(i) ∈ X is then not less than 1 − ε with ε = ∑_{k=1}^{p} εk .

Example 3.25: Constraint tightening via sampling


Consider the scalar system x + = x +u+w, with X = U = [−1, 1] and w
uniformly distributed in W = [−1/2, 1/2]. Using costs Q = 1/2, R = 1,
the LQR gain is K = −1/2, which gives AK = 1/2, and thus
SK (i) := ⊕_{j=0}^{i−1} A_K^j W = [−(1 − 2^{−i} ), 1 − 2^{−i} ]

for all i ∈ I≥0 . Tightening the set U, we have

Ū(i) := U ⊖ KSK (i) = (1/2)[−(1 + 2^{−i} ), 1 + 2^{−i} ] for all i ∈ I≥0 .



Figure 3.9: Observed probability εtest of constraint violation for i = 10. Distribution is based on 500 trials for each value of ε. Dashed line shows the outcome predicted by formula (3.25), i.e., εtest = ε.

Note that the first control constraint does not need to be
tightened at all, Ū(0) = U, and all subsequent control constraints are
less conservative than Ū∞ = [−1/2, 1/2].
To compute the tightened sets X̄(i), we apply the sampling pro-
cedure for each i ∈ I0:N−1 . For various values of ε, we compute the
number of samples M = M ∗ (ε, β) using (3.25) with β = 0.01. Then,
we choose M samples of w and solve (3.24) for f 0 (i). To evaluate the
actual probability of constraint violation, we then test the constraint vi-
olation using Mtest = 106 different samples wtest . That is, we compute
j 
εtest := Pr c ′ e(i; wtest ) > f 0 (i), ∀j ∈ I1:Mtest

for each i ∈ I0:N−1 . Note that since εtest is now a random variable de-
pending on the particular M samples chosen, we repeat the process 500
times for each value of M. The distribution of εtest for i = 10 is shown
in Figure 3.9. Notice that the formula (3.25) is slightly conservative,
i.e., the observed probability εtest is half of the chosen probability ε for
99% of samples (with probability 1 − β). This gap holds throughout the

entire range of the test. □
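The sampling procedure used in this example takes only a few lines; the sketch below (ours, illustrative only) draws M = M ∗ (ε, β) disturbance sequences, forms e(i; wj ), and returns the scenario solution f 0 (i) = max_j c′e(i; wj ) when it lies in [0, d].

# Scenario-based constraint tightening for the scalar example: x+ = x + u + w,
# A_K = 1/2, W = [-1/2, 1/2], X = [-1, 1] (so c = 1, d = 1).
import numpy as np

def n_samples(eps, beta, n_theta=1):
    # M*(eps, beta) from (3.25); n_theta = 1 decision variable (f)
    return int(np.ceil((2.0 / eps) * (np.log(1.0 / beta) + n_theta)))

def scenario_f(i, eps, beta, AK=0.5, w_half=0.5, c=1.0, d=1.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    M = n_samples(eps, beta)
    w = rng.uniform(-w_half, w_half, size=(M, i))      # M sampled sequences
    powers = AK ** np.arange(i - 1, -1, -1)            # A_K^{i-1}, ..., A_K^0
    e = w @ powers                                      # e(i; w^j) for each sample j
    f = max(0.0, np.max(c * e))                         # solution of (3.24)
    return f if f <= d else None                        # None: P_s infeasible

f0 = scenario_f(i=10, eps=0.05, beta=0.01)
print(f0)   # tightened state constraint: c'xbar <= d - f0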

Evaluation. How does tube-based MPC compare with other methods?


We discuss here two aspects: constraint handling and performance.
Traditional MPC handles constraints by solving online a finite hori-
zon optimal control problem that involves minimization of a cost sub-
ject to control and state constraints for every realization of the distur-
bance process, a computationally expensive requirement. Tube-based
MPC for linear systems appears to be the only method for avoiding this
online expense but can only be employed for linear systems.
Tube-based MPC takes two different forms. One, introduced in
Chisci et al. (2001), solves an online optimal control problem PN (x)
at the current state x and uses a tube to determine the set of all pos-
sible trajectories emanating from the current state x; the motive is to
minimize the cost at each current state. The second form, introduced
in Mayne and Langson (2001), solves a nominal optimal control prob-
lem PN (x(0)) at the initial state x(0) and uses a single tube emanating
from x(0). The first form is closer to traditional MPC and attempts
to minimize the cost of the particular trajectory generated by the con-
troller as in deterministic MPC. The second form, on the other hand,
minimizes the average cost over all trajectories emanating from the
initial state x(0) as in classical stochastic control in which controllers
are developed offline.
While the first approach is closer to traditional MPC, its implemen-
tation is more difficult for the following reason: at each state x the suc-
cessor state x + does not lie on the optimal trajectory emanating from
x due to the disturbance w; recursive feasibility is therefore lost. Al-
gorithmic modifications that are fairly complex have to be introduced;
see, for example, Kouvaritakis and Cannon (2016); Lorenzen, Dabbene,
Tempo, and Allgöwer (2016). These modifications increase computa-
tional expense and their effect on performance has not yet been stud-
ied.
The big advantage of the second approach is its simplicity; it is no
more difficult to implement than traditional MPC, requiring only the
determination of a nominal trajectory that converges to the origin. It
is also possible to get an indication of its performance. If there are no
constraints, if the horizon of the optimal control problem is infinite,
and if (w(i))i∈I≥0 is a sequence of independent, identically distributed
random variables, then the optimal controller gain is u = K(x) for both
the stochastic and nominal systems. Thus, at composite state (x, x̄),

the optimal control for the stochastic system at state x is u0 = Kx and


the optimal control for the nominal system at state x̄ is ū0 = K x̄. The
control u determined by the control algorithm is u = ū0 +K(x 0 − x̄ 0 ) =
K x̄ 0 +Kx 0 −K x̄ 0 = Kx 0 and is therefore optimal. Next, if we accept that
the control u is parameterized by u = ū+K(x − x̄) so that the decision
variable is ū, then, since x = x̄ + e and u = ū + Ke, the performance
index for the parameterized stochastic system is
 
N−1
X
VN (x(0), ū) = E|x(0)  ℓ(x(i), u(i)) + Vf (x(N))
i=0
N−1
X
= ℓ(x̄(i), ū(i)) + Vf (x̄(N)) + c
i=0

= V̄N (x̄(0), ū) + c

in which V̄N is the performance index for the nominal system


 
c = E { ∑_{i=0}^{N−1} ℓ(e(i), Ke(i)) + Vf (e(N)) }

and ℓ(·) and Vf (·) are quadratic functions. If, in addition, the system
being controlled satisfies its control and probabilistic constraints if and
only if the nominal system satisfies its tightened constraints, then the
solution ū0 (x(0)) of the nominal optimal control problem P̄N (x(0))
is also the solution of the parameterized stochastic optimal control
problem PN (x(0)).

3.8 Notes
Robust MPC. There is now a considerable volume of research on ro-
bust MPC; for a review of the literature up to 2000 see Mayne, Rawlings,
Rao, and Scokaert (2000). Early literature examines robustness of nomi-
nal MPC under perturbations in Scokaert, Rawlings, and Meadows (1997);
and robustness under model uncertainty in De Nicolao, Magni, and Scat-
tolini (1996) and Magni and Sepulchre (1997). Sufficient conditions
for robust stability of nominal MPC with modeling error are provided
in Santos and Biegler (1999). Teel (2004) provides an excellent dis-
cussion of the interplay between nominal robustness and continuity
of the Lyapunov function, and also presents some illuminating exam-
ples of nonrobust MPC. Robustness of the MPC controller described in

Chen and Allgöwer (1998), when employed to control a system with-


out state constraints, is established in Yu, Reble, Chen, and Allgöwer
(2011). The theory of inherent robustness is usefully extended in Pan-
nocchia, Rawlings, and Wright (2011); Allan et al. (2017); and applied
to optimal and suboptimal MPC.
Many papers propose solving online an optimal control problem
in which the decision variable is a sequence of control actions that
takes into account future disturbances. Thus, it is shown in Limon,
Álamo, and Camacho (2002) that it is possible to determine a sequence
of constraint sets that become tighter with time, and that ensure the
state constraint is not transgressed if the control sequence satisfies
these tightened constraints. This procedure was extended in Grimm,
Messina, Tuna, and Teel (2007), who do not require the value function
to be continuous and do not require the terminal cost to be a control
Lyapunov function.
Predicted trajectories when the decision variable is a control se-
quence can diverge considerably with time, making satisfaction of state
and terminal constraints difficult or even impossible. This has moti-
vated the introduction of “feedback” MPC, in which the decision vari-
able is a policy (sequence of control laws) rather than a sequence of
control actions (Mayne, 1995; Kothare, Balakrishnan, and Morari, 1996;
Mayne, 1997; Lee and Yu, 1997; Scokaert and Mayne, 1998). If arbitrary
control laws are admissible, the implicit MPC control law is identical
to that obtained by dynamic programming; see Section 3.1.3 and pa-
pers such as Magni, De Nicolao, Scattolini, and Allgöwer (2003), where
an H∞ MPC control law is obtained. But such results are conceptual be-
cause the decision variable is infinite dimensional. Hence practical con-
trollers employ suboptimal policies that are finitely parameterized—an
extreme example being nominal MPC. A widely used parameterization
is u = v + Kx, particularly when the system being controlled is lin-
ear; this parameterization was first proposed in Rossiter, Kouvaritakis,
and Rice (1998). The matrix K is chosen to stabilize the unconstrained
linear system, and the decision variable is the sequence (v(i))0:N−1 .
The robust suboptimal controllers discussed in this chapter employ
the concept of tubes introduced in the pioneering papers by Bertsekas
and Rhodes (1971a,b), and developed for continuous time systems by
Aubin (1991) and Khurzhanski and Valyi (1997). In robust MPC, local
feedback is employed to confine all trajectories resulting from the ran-
dom disturbance to lie in a tube that surrounds a nominal trajectory
chosen to ensure the whole tube satisfies the state and control con-
straints. Robustly positive invariant sets are employed to construct


the tubes as shown in (Chisci et al., 2001) and (Mayne and Langson,
2001). Useful references are the surveys by Blanchini (1999), and Kol-
manovsky and Gilbert (1995), as well as the recent book by Blanchini
and Miani (2008). Kolmanovsky and Gilbert (1995) provide extensive
coverage of the theory and computation of minimal and maximal ro-
bust (disturbance) invariant sets.
The computation of approximations to robust invariant sets that are
themselves invariant is discussed in a series of papers by Raković and
colleagues (Raković, Kerrigan, Kouramas, and Mayne, 2003; Raković
et al., 2005a; Raković, Mayne, Kerrigan, and Kouramas, 2005b; Koura-
mas, Raković, Kerrigan, Allwright, and Mayne, 2005). The tube-based
controllers described above are based on the papers (Langson, Chrys-
sochoos, Raković, and Mayne, 2004; Mayne, Serón, and Raković, 2005).
Construction of robust invariant sets is restricted to systems of rela-
tively low dimension, and is avoided in Section 3.6.3 by employing op-
timization directly to determine tightened constraints. A tube-based
controller for nonlinear systems is presented in Mayne, Kerrigan, van
Wyk, and Falugi (2011).
Because robust MPC is still an active area of research, other meth-
ods for achieving robustness have been proposed. Diehl, Bock, and
Kostina (2006) simplify the robust nonlinear MPC problem by using
linearization, also employed in (Nagy and Braatz, 2004), and present
some efficient numerical procedures to determine an approximately
optimal control sequence. Goulart, Kerrigan, and Maciejowski (2006)
propose a control that is an affine function of current and past states;
the decision variables are the associated parameters. This method sub-
sumes the tube-based controllers described in this chapter, and has the
advantage that a separate nominal trajectory is not required. A disad-
vantage is the increased complexity of the decision variable, although
an efficient computational procedure that reduces computational time
per iteration from O(N 6 ) to O(N 3 ) has been developed in Goulart, Ker-
rigan, and Ralph (2008). Interesting extensions to tube-based MPC are
presented in Raković (2012), and Raković, Kouvaritakis, Cannon, Panos,
and Findeisen (2012). The introduction of a novel parameterization by
Raković (2012) enables him to establish that the solution obtained is
equivalent to dynamic programming in at least three cases.
Considerable attention has recently been given to input-to-state sta-
bility of uncertain systems. Thus Limon, Alamo, Raimondo, de la Peña,
Bravo, and Camacho (2008) present the theory of input-to-state sta-
bility as a unifying framework for robust MPC, generalize the tube-
based MPC described in (Langson et al., 2004), and extend existing
results on min-max MPC. Another example of research in this vein is
the paper by Lazar, de la Peña, Heemels, and Alamo (2008) that utilizes
input-to-state practical stability to establish robust stability of feedback
min-max MPC. A different approach is described by Angeli, Casavola,
Franzè, and Mosca (2008) where it is shown how to construct, for each
time i, an ellipsoidal inner approximation Ei to the set Ti of states that
can be robustly steered in i steps to a robust control invariant set T .
All that is required from the online controller is the determination of
the minimum i such that the current state x lies in Ei and a control
that steers x ∈ Ei into the set Ei−1 ⊂ Ei .
Stochastic MPC. Interest in stochastic MPC has increased consider-
ably. An excellent theoretical foundation is provided in Chatterjee and
Lygeros (2015). Most papers address the stochastic constrained linear
problem and propose that the online optimal control problem PN (x)
(x is the current state) minimizes a suitable objective function sub-
ject to satisfaction of state constraints with a specified probability as
discussed above. If time-invariant probabilistic state constraints are
employed, a major difficulty with this approach, as pointed out in Kou-
varitakis, Cannon, Raković, and Cheng (2010) in the context of stoch-
astic MPC for constrained linear systems, is that recursive feasibility is
lost unless further measures are taken. It is assumed in this paper, as
well as in a later paper Lorenzen et al. (2016), that the disturbance is
bounded, enabling a combination of stochastic and hard constraints to
be employed.
In contrast to these papers, which employ the control policy pa-
rameterization u = Kx + v, Chatterjee, Hokayem, and Lygeros (2011)
employ the parameterization, first proposed in Goulart et al. (2006), in
which the control law is an affine function of a finite number of past dis-
turbances. This parameterization, although not parsimonious, results
in a convex optimal control problem, which is advantageous. Recur-
sive feasibility is easily achieved in the tube-based controller proposed
above, since it requires online solution of PN (x̄) rather than PN (x).
Tube-based MPC is well suited to handle hard constraints via con-
straint tightening (Michalska and Mayne, 1993; Chisci et al., 2001;
Mayne and Langson, 2001) and many subsequent papers. It has more
recently been used for stochastic MPC in Lorenzen et al. (2016); Mayne
(2016).
Another difficulty that arises in stochastic MPC, as pointed out above,
is determination of suitable terminal conditions. It is impossible, for


example, to obtain a terminal cost Vf (·) and local controller κf (·) such
that Vf (x + ) < Vf (x) for all x ∈ Xf , x ̸= 0, and all x + = f (x, κf (x), w).
For this reason, Chatterjee and Lygeros (2015) propose that it should
be possible to decrease Vf (x) outside of the terminal constraint set Xf ,
but that Vf (x) should be permitted to increase by a bounded amount
inside Xf . The terminal ingredients, Vf (·) and Xf , that we propose for
robust MPC in Assumption 3.8 have this property, a difference being
that Chatterjee and Lygeros (2015) require Vf (·) to be a global (stoch-
astic) Lyapunov function.
In most proposals, PN (x) is a stochastic optimization problem, an
area of study in which there have been recent significant advances dis-
cussed briefly above. Despite this, the computational requirements for
solving stochastic optimization problems online seems excessive for
process control applications. It is therefore desirable that as much
computation as possible is done offline as proposed in Kouvaritakis
et al. (2010); Lorenzen et al. (2016); Mayne (2016); and above. In these
papers, offline optimization is employed to choose tightened constraints
that, if satisfied by the nominal system, ensure that the original con-
straints are satisfied by the uncertain system. It also is desirable, in
process control applications, to avoid computation of polytopic sets,
as in Section 3.6.3, since they cannot be reliably computed for complex
systems.
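As a simple illustration of choosing tightened constraints offline (a hypothetical scalar example, not a method taken verbatim from the references above), one can estimate by sampling the margin needed so that the nominal system satisfying the tightened constraint implies the uncertain system satisfies the original constraint with the desired probability.

# Minimal sketch (hypothetical numbers): offline, sampling-based choice of a
# tightened constraint so that xbar <= x_max - margin for the nominal system
# gives Pr(x <= x_max) ≈ 1 - eps for the uncertain system x = xbar + e.
import numpy as np

rng = np.random.default_rng(0)
AK = 0.5                                  # stable scalar closed-loop gain
x_max, eps = 2.0, 0.05
nsamp, nsteps = 20000, 200

e = np.zeros(nsamp)                       # error samples, e+ = AK e + w
for _ in range(nsteps):                   # run long enough to be near steady state
    e = AK * e + rng.uniform(-0.1, 0.1, size=nsamp)

margin = np.quantile(e, 1.0 - eps)        # empirical (1 - eps)-quantile of e
print("tightened constraint: xbar <=", x_max - margin)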
Robustness against unstructured uncertainty has been considered
in Løvaas, Serón, and Goodwin (2008); Falugi and Mayne (2011).
3.9 Exercises

Exercise 3.1: Removing the outer min in a min-max problem


Show that V_i^0 : X_i → R and κ_i : X_i → U defined by

    V_i^0(x) = min_{u∈U} max_{w∈W} {ℓ(x, u, w) + V_{i−1}^0(f (x, u, w)) | f (x, u, W) ⊂ X_{i−1}}
    κ_i(x) = arg min_{u∈U} max_{w∈W} {ℓ(x, u, w) + V_{i−1}^0(f (x, u, w)) | f (x, u, W) ⊂ X_{i−1}}
    X_i = {x ∈ X | ∃u ∈ U such that f (x, u, W) ⊂ X_{i−1}}

satisfy

    V_i^0(x) = max_{w∈W} {ℓ(x, κ_i(x), w) + V_{i−1}^0(f (x, κ_i(x), w))}
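A brute-force implementation of this recursion on a gridded scalar example may help in visualizing V_i^0, κ_i, and X_i before attempting the proof; the dynamics, stage cost, grids, and disturbance samples below are hypothetical choices, not part of the exercise.

# Minimal sketch (hypothetical scalar example): brute-force evaluation of the
# robust DP recursion on a grid, with f(x, u, W) ⊂ X_{i-1} enforced by
# requiring every sampled successor to remain on the grid.
import numpy as np

f = lambda x, u, w: x + u + w                 # dynamics
stage = lambda x, u: x**2 + u**2              # stage cost l(x, u, w) (w-independent here)
X = np.linspace(-2.0, 2.0, 81)                # grid for the state constraint set
U = np.linspace(-1.0, 1.0, 21)                # control candidates
W = np.array([-0.1, 0.0, 0.1])                # disturbance samples from W

V = np.zeros_like(X)                          # V_0 ≡ 0 on X_0 = X
for i in range(1, 4):                         # a few backward recursion steps
    Vnew = np.full_like(V, np.inf)            # inf marks x outside X_i
    for ix, x in enumerate(X):
        best = np.inf
        for u in U:
            xplus = f(x, u, W)                # all sampled successors
            if xplus.min() < X.min() or xplus.max() > X.max():
                continue                      # violates f(x, u, W) ⊂ X_{i-1}
            Vsucc = np.interp(xplus, X, V)
            if not np.all(np.isfinite(Vsucc)):
                continue
            best = min(best, np.max(stage(x, u) + Vsucc))
        Vnew[ix] = best
    V = Vnew                                  # V_i on the grid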

Exercise 3.2: Maximizing a difference


Prove the claim used in the proof of Theorem 3.9 that

    max_w {a(w)} − max_w {b(w)} ≤ max_w {a(w) − b(w)}

Also show the following minimization version

    min_w {a(w)} − min_w {b(w)} ≥ min_w {a(w) − b(w)}

Exercise 3.3: Equivalent constraints


Assuming that S is a polytope and, therefore, defined by linear inequalities, show that
the constraint x ∈ {z} ⊕ S (on z for given x) may be expressed as Bz ≤ b + Bx, i.e., z
must lie in a polytope. If S is symmetric (x ∈ S implies −x ∈ S), show that x ∈ {z} ⊕ S
is equivalent to z ∈ {x} ⊕ S.

Exercise 3.4: Hausdorff distance between translated sets


Prove that the Hausdorff distance between two sets {x} ⊕ S and {y} ⊕ S, where S is a
compact subset of Rn and x and y are points in Rn , is |x − y|.

Exercise 3.5: Exponential convergence of X(i)


Complement the proof of Proposition 3.12 by proving the sequence of sets (X(i))0:∞ ,
X(i) := {x̄(i)} ⊕ SK (∞), converges exponentially fast to the set SK (∞) as i → ∞ if x̄(i)
converges exponentially fast to 0 as i → ∞.

Exercise 3.6: Simulating a robust MPC controller


This exercise explores robust MPC for linear systems with an additive bounded distur-
bance
x + = Ax + Bu + w
The first task, using the tube-based controller described in Section 3.5.3 is to determine
state and control constraint sets Z and V such that if the nominal system z+ = Az + Bv
satisfies z ∈ Z and v ∈ V, then the actual system x + = Ax + Bu + w with u =
v + K(x − z) where K is such that A + BK is strictly stable, satisfies the constraints
x ∈ X and u ∈ U.
Figure 3.10: Closed-loop robust MPC state evolution in the (x1, x2) plane
with uniformly distributed |w| ≤ 0.1 from four different x0.

(a) To get started, consider the scalar system

x+ = x + u + w

with constraint sets X = {x | x ≤ 2}, U = {u | |u| ≤ 1}, and W = {w | |w| ≤


0.1}. Choose K = −(1/2) so that AK = 1/2. Determine Z and V so that if the
nominal system z+ = z + v satisfies z ∈ Z and v ∈ V, the uncertain system
x + = Ax + Bu + w, u = v + K(x − z) satisfies x ∈ X, u ∈ U.
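A minimal numerical sketch for the scalar case follows; since AK = 1/2, the disturbance-invariant set is the interval S = Σ_i (1/2)^i W, and the tightened sets are obtained by subtracting its radius (the printed values are easy to confirm by hand).

# Minimal sketch for part (a): S = sum_i AK^i W is an interval, and the
# tightened sets are Z = X ⊖ S and V = U ⊖ K S.
AK, K = 0.5, -0.5
w_max = 0.1
s_max = w_max / (1.0 - abs(AK))       # radius of S = [-0.2, 0.2]
x_max, u_max = 2.0, 1.0               # X = {x <= 2}, U = {|u| <= 1}

z_max = x_max - s_max                 # Z = {z <= 1.8}
v_max = u_max - abs(K) * s_max        # V = {|v| <= 0.9}
print(s_max, z_max, v_max)            # 0.2 1.8 0.9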

(b) Repeat part (a) for the following uncertain system

    x + = [1 1; 0 1] x + [0; 1] u + w

with the constraint sets X = {x ∈ R2 | x1 ≤ 2}, U = {u ∈ R | |u| ≤ 1}, and
W = [−0.1, 0.1]. Choose K = [−0.4 −1.2].

(c) Determine a model predictive controller for the nominal system and constraint
sets Z and V used in (b).

(d) Implement robust MPC for the uncertain system and simulate the closed-loop
system for a few initial states and a few disturbance sequences for each initial
state. The phase plot for initial states [−1, −1], [1, 1], [1, 0], and [0, 1] should
resemble Figure 3.10.
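One possible starting point for parts (c) and (d) is sketched below. It assumes the cvxpy package for the nominal quadratic program, takes W = [−0.1, 0.1] in each component, replaces the exact disturbance-invariant set by a crude componentwise outer bound, re-optimizes the nominal initial state at every step (one common tube-MPC variant), and omits a proper terminal set; parts (a)–(c) are where those simplifications should be replaced by the exact constructions.

# Minimal sketch (one possible implementation) of tube-based MPC for part (d).
# Assumes cvxpy; the margins s below are a crude box outer bound, not exact.
import numpy as np
import cvxpy as cp

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[-0.4, -1.2]])
AK = A + B @ K
N = 15

s = np.zeros(2)                              # componentwise outer bound on S
M = np.eye(2)
for _ in range(100):
    s += 0.1 * np.abs(M).sum(axis=1)
    M = AK @ M
x1_max = 2.0 - s[0]                          # tightened state constraint
v_max = 1.0 - (np.abs(K) @ s).item()         # tightened input constraint

def tube_mpc(x0):
    z = cp.Variable((2, N + 1))
    v = cp.Variable((1, N))
    cost = 100 * cp.sum_squares(z[:, N])     # terminal penalty (no terminal set)
    cons = [cp.abs(x0 - z[:, 0]) <= s]       # x0 in {z0} ⊕ S (box relaxation)
    for k in range(N):
        cost += cp.sum_squares(z[:, k]) + cp.sum_squares(v[:, k])
        cons += [z[:, k + 1] == A @ z[:, k] + B @ v[:, k],
                 z[0, k] <= x1_max, cp.abs(v[0, k]) <= v_max]
    cp.Problem(cp.Minimize(cost), cons).solve()
    return z.value[:, 0], v.value[0, 0]

rng = np.random.default_rng(1)
x = np.array([-1.0, -1.0])
for t in range(25):
    z0, v0 = tube_mpc(x)
    u = v0 + (K @ (x - z0)).item()           # u = v + K(x - z)
    w = rng.uniform(-0.1, 0.1, size=2)
    x = A @ x + B.flatten() * u + w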
Bibliography

D. A. Allan, C. N. Bates, M. J. Risbeck, and J. B. Rawlings. On the inherent


robustness of optimal and suboptimal nonlinear MPC. Sys. Cont. Let., 106:
68–78, August 2017.

D. Angeli, A. Casavola, G. Franzè, and E. Mosca. An ellipsoidal off-line MPC


scheme for uncertain polytopic discrete-time systems. Automatica, 44:
3113–3119, 2008.

J. P. Aubin. Viability Theory. Systems & Control: Foundations & Applications.


Birkhauser, Boston, Basel, Berlin, 1991.

D. P. Bertsekas and I. B. Rhodes. Recursive state estimation for a set-


membership description of uncertainty. IEEE Trans. Auto. Cont., 16:117–
128, 1971a.

D. P. Bertsekas and I. B. Rhodes. On the minimax reachability of target sets


and target tubes. Automatica, 7(2):233–247, 1971b.

F. Blanchini. Set invariance in control. Automatica, 35:1747–1767, 1999.

F. Blanchini and S. Miani. Set-Theoretic methods in Control. Systems & Control:


Foundations and Applications. Birkhäuser, 2008.

G. C. Calafiore and M. C. Campi. The scenario approach to robust control


design. IEEE Trans. Auto. Cont., 51(5):742–753, May 2006.

D. Chatterjee and J. Lygeros. On stability and performance of stochastic pre-


dictive control techniques. IEEE Trans. Auto. Cont., 60(2):509–514, 2015.

D. Chatterjee, P. Hokayem, and J. Lygeros. Stochastic receding horizon control


with bounded control inputs: a vector space approach. IEEE Trans. Auto.
Cont., 56(11):2704–2710, November 2011.

H. Chen and F. Allgöwer. A quasi-infinite horizon nonlinear model predictive


control scheme with guaranteed stability. Automatica, 34(10):1205–1217,
1998.

L. Chisci, J. A. Rossiter, and G. Zappa. Systems with persistent disturbances:


predictive control with restricted constraints. Automatica, 37(7):1019–1028,
2001.

G. De Nicolao, L. Magni, and R. Scattolini. Robust predictive control of systems


with uncertain impulse response. Automatica, 32(10):1475–1479, 1996.


M. Diehl, H. G. Bock, and E. Kostina. An approximation technique for robust


nonlinear optimization. Math. Prog., 107:213–230, 2006.

P. Falugi and D. Q. Mayne. Tube-based model predictive control for nonlinear


systems with unstructured uncertainty. In Proceedings of 50th IEEE Con-
ference on Decision and Control, pages 2656–2661, Orlando, Florida, USA,
December 2011.

P. J. Goulart, E. C. Kerrigan, and J. M. Maciejowski. Optimization over state


feedback policies for robust control with constraints. Automatica, 42:523–
533, 2006.

P. J. Goulart, E. C. Kerrigan, and D. Ralph. Efficient robust optimization for


robust control with constraints. Math. Prog., 114(1):115–147, July 2008.

G. Grimm, M. J. Messina, S. E. Tuna, and A. R. Teel. Nominally robust model


predictive control with state constraints. IEEE Trans. Auto. Cont., 52(10):
1856–1870, October 2007.

G. A. Hicks and W. H. Ray. Approximation methods for optimal control syn-


thesis. Can. J. Chem. Eng., 49:522–528, August 1971.

S. Kameswaran and L. T. Biegler. Simultaneous dynamic optimization strate-


gies: Recent advances and challenges. Comput. Chem. Eng., 30:1560–1575,
September 2006.

C. M. Kellett and A. R. Teel. Discrete-time asymptotic controllability implies


smooth control-Lyapunov function. Sys. Cont. Let., 52:349–359, 2004.

A. B. Khurzhanski and I. Valyi. Ellipsoidal-valued dynamics for estimation and


control. Systems & Control: Foundations & Applications. Birkhauser, Boston,
Basel, Berlin, 1997.

I. Kolmanovsky and E. G. Gilbert. Maximal output admissible sets for discrete-


time systems with disturbance inputs. In Proceedings of the American Con-
trol Conference, Seattle, June 1995.

I. Kolmanovsky and E. G. Gilbert. Theory and computation of disturbance


invariant sets for discrete-time linear systems. Math. Probl. Eng., 4(4):317–
367, 1998.

M. V. Kothare, V. Balakrishnan, and M. Morari. Robust constrained model pre-


dictive control using linear matrix inequalities. Automatica, 32(10):1361–
1379, 1996.

K. I. Kouramas, S. V. Raković, E. C. Kerrigan, J. C. Allwright, and D. Q. Mayne. On


the minimal robust positively invariant set for linear difference inclusions.

In Proceedings of the 44th IEEE Conference on Decision and Control and Eu-
ropean Control Conference ECC 2005, pages 2296–2301, Sevilla, Spain, De-
cember 2005.

B. Kouvaritakis and M. Cannon. Model predictive control. Springer International


Publishing, Switzerland, 2016.

B. Kouvaritakis, M. Cannon, S. V. Raković, and Q. Cheng. Explicit use of proba-


bilistic distributions in linear predictive control. Automatica, 46:1719–1724,
2010.

W. Langson, I. Chryssochoos, S. V. Raković, and D. Q. Mayne. Robust model


predictive control using tubes. Automatica, 40:125–133, January 2004.

M. Lazar, D. M. de la Peña, W. P. M. H. Heemels, and T. Alamo. On input-to-state


stability of min-max nonlinear model predictive control. Sys. Cont. Let., 57:
39–48, 2008.

J. H. Lee and Z. Yu. Worst-case formulations of model predictive control for


systems with bounded parameters. Automatica, 33(5):763–781, 1997.

D. Limon, T. Álamo, and E. F. Camacho. Stability analysis of systems with


bounded additive uncertainties based on invariant sets: stability and fea-
sibility of MPC. In Proceedings of the American Control Conference, pages
364–369, Anchorage, Alaska, May 2002.

D. Limon, T. Alamo, D. M. Raimondo, D. M. de la Peña, J. M. Bravo, and E. F.


Camacho. Input-to-state stability: a unifying framework for robust model
predictive control. In L. Magni, D. M. Raimondo, and F. Allgöwer, editors,
International Workshop on Assessment and Future Directions of Nonlinear
Model Predictive Control, Pavia, Italy, September 2008.

M. Lorenzen, F. Dabbene, R. Tempo, and F. Allgöwer. Constraint-tightening


and stability in stochastic model predictive control. IEEE Trans. Auto. Cont.,
62(7):3165–3177, 2016.

C. Løvaas, M. M. Serón, and G. C. Goodwin. Robust output feedback model


predictive control for systems with unstructured uncertainty. Automatica,
44(8):1933–1943, August 2008.

L. Magni and R. Sepulchre. Stability margins of nonlinear receding-horizon


control via inverse optimality. Sys. Cont. Let., 32:241–245, 1997.

L. Magni, G. De Nicolao, R. Scattolini, and F. Allgöwer. Robust model predictive


control for nonlinear discrete-time systems. Int. J. Robust and Nonlinear
Control, 13:229–246, 2003.

D. Q. Mayne. Optimization in model based control. In Proceedings of the


IFAC Symposium Dynamics and Control of Chemical Reactors, Distillation
Columns and Batch Processes, pages 229–242, Helsingor, Denmark, June
1995.

D. Q. Mayne. Nonlinear model predictive control: An assessment. In J. C. Kan-


tor, C. E. Garcı́a, and B. Carnahan, editors, Proceedings of Chemical Process
Control – V, pages 217–231. CACHE, AIChE, 1997.

D. Q. Mayne. Robust and stochastic model predictive control: Are we going in


the right direction? Annual Rev. Control, 2016.

D. Q. Mayne and P. Falugi. Stabilizing conditions for model predictive control.


Int. J. Robust and Nonlinear Control, 29(4):894–903, 2019.

D. Q. Mayne and W. Langson. Robustifying model predictive control of con-


strained linear systems. Electron. Lett., 37(23):1422–1423, 2001.

D. Q. Mayne, J. B. Rawlings, C. V. Rao, and P. O. M. Scokaert. Constrained


model predictive control: Stability and optimality. Automatica, 36(6):789–
814, 2000.

D. Q. Mayne, M. M. Serón, and S. V. Raković. Robust model predictive control


of constrained linear systems with bounded disturbances. Automatica, 41
(2):219–224, February 2005.

D. Q. Mayne, E. C. Kerrigan, E. J. van Wyk, and P. Falugi. Tube based robust


nonlinear model predictive control. Int. J. Robust and Nonlinear Control, 21
(11):1341–1353, 2011.

H. Michalska and D. Q. Mayne. Robust receding horizon control of constrained


nonlinear systems. IEEE Trans. Auto. Cont., 38(11):1623–1633, 1993.

Z. Nagy and R. Braatz. Open-loop and closed-loop robust optimal control of


batch processes using distributional and worst-case analysis. J. Proc. Cont.,
pages 411–422, 2004.

G. Pannocchia, J. B. Rawlings, and S. J. Wright. Conditions under which subop-


timal nonlinear MPC is inherently robust. Sys. Cont. Let., 60:747–755, 2011.

S. V. Raković. Invention of prediction structures and categorization of robust


MPC syntheses. In Proceedings of 4th IFAC Nonlinear Model Predictive Con-
trol Conference, pages 245–273, Noordwijkerhout, NL, August 2012.

S. V. Raković, E. C. Kerrigan, K. I. Kouramas, and D. Q. Mayne. Approximation of


the minimal robustly positively invariant set for discrete-time LTI systems
with persistent state disturbances. In Proceedings 42nd IEEE Conference
on Decision and Control, volume 4, pages 3917–3918, Maui, Hawaii, USA,
December 2003.

S. V. Raković, E. C. Kerrigan, K. I. Kouramas, and D. Q. Mayne. Invariant ap-


proximations of the minimal robustly positively invariant sets. IEEE Trans.
Auto. Cont., 50(3):406–410, 2005a.

S. V. Raković, D. Q. Mayne, E. C. Kerrigan, and K. I. Kouramas. Optimized


robust control invariant sets for constrained linear discrete-time systems.
In Proceedings of 16th IFAC World Congress on Automatic Control, Prague,
Czechoslavakia, 2005b.

S. V. Raković, B. Kouvaritakis, M. Cannon, C. Panos, and R. Findeisen. Pa-


rameterized tube model predictive control. IEEE Trans. Auto. Cont., 57(11):
2746–2761, 2012.

R. T. Rockafellar and R. J.-B. Wets. Variational Analysis. Springer-Verlag, 1998.

J. A. Rossiter, B. Kouvaritakis, and M. J. Rice. A numerically robust state-space


approach to stable-predictive control strategies. Automatica, 34(1):65–73,
1998.

L. O. Santos and L. T. Biegler. A tool to analyze robust stability for model


predictive control. J. Proc. Cont., 9:233–245, 1999.

P. O. M. Scokaert and D. Q. Mayne. Min–max feedback model predictive control


for constrained linear systems. IEEE Trans. Auto. Cont., 43(8):1136–1142,
August 1998.

P. O. M. Scokaert, J. B. Rawlings, and E. S. Meadows. Discrete-time stability with


perturbations: Application to model predictive control. Automatica, 33(3):
463–470, 1997.

A. R. Teel. Discrete time receding horizon control: is the stability robust. In


Marcia S. de Queiroz, Michael Malisoff, and Peter Wolenski, editors, Optimal
control, stabilization and nonsmooth analysis, volume 301 of Lecture notes
in control and information sciences, pages 3–28. Springer, 2004.

R. Tempo, G. C. Calafiore, and F. Dabbene. Randomized algorithms for anal-


ysis and control of uncertain systems: With applications. Springer, second
edition, 2013.

S. Yu, M. Reble, H. Chen, and F. Allgöwer. Inherent robustness properties of


quasi-infinite horizon MPC. In Proceedings of the 18th IFAC World Congress,
Milano, Italy, August, September 2011.
4 State Estimation

4.1 Introduction
We now turn to the general problem of estimating the state of a noisy
dynamic system given noisy measurements. We assume that the sys-
tem generating the measurements is given by

x + = f (x, w)
y = h(x) + v (4.1)

with the state x ∈ X ⊆ Rn , measurement y ∈ Y ⊆ Rp , process dis-


turbance, w ∈ W ⊆ Rg , measurement disturbance, v ∈ V ⊆ Rp , and
system initial state, x(0) ∈ X. One of our main purposes is to provide
a state estimate to the MPC regulator as part of a feedback control sys-
tem, in which case the model changes to x + = f (x, u, w) with both
process disturbance w and control input u. But state estimation is a
general technique that is often used in monitoring applications without
any feedback control. So for simplicity of presentation, we start with
state estimation as an independent subject and neglect the control in-
put u as part of the system model as in (4.1).
Finally, in Section 4.5, we briefly treat the problem of combined MHE
estimation and MPC regulation. In Chapter 5, we discuss the combined
use of MHE and MPC in more detail.

4.2 Full Information Estimation


Of the estimators considered in this chapter, full information estima-
tion will prove to have the best theoretical properties in terms of stabil-
ity and optimality. Unfortunately, it will also prove to be computation-
ally intractable except for the simplest cases, such as a linear system
model. Its value therefore lies in clearly defining what is desirable in a


                                System     Decision    Optimal
                                variable   variable    decision
    state                          x          χ           x̂
    process disturbance            w          ω           ŵ
    measured output                y          η           ŷ
    measurement disturbance        v          ν           v̂

Table 4.1: System and state estimator variables.

state estimator. One method for practical estimator design therefore


is to come as close as possible to the properties of full information es-
timation (FIE) while maintaining a tractable online computation. This
design philosophy leads directly to moving horizon estimation (MHE).
First we define some notation necessary to distinguish the system
variables from the estimator variables. We have already introduced the
system variables (x, w, y, v). In the estimator optimization problem,
these have corresponding decision variables, which we denote (χ, ω, η,
ν). The optimal decision variables are denoted (x̂, ŵ, ŷ, v̂), and these
optimal decisions are the estimates provided by the state estimator.
This notation is summarized in Table 4.1. Next we summarize the re-
lationships between these variables

    x + = f (x, w)      y = h(x) + v
    χ + = f (χ, ω)      y = h(χ) + ν
    x̂ + = f (x̂, ŵ)      y = h(x̂) + v̂

Notice that it is always the system measurement y that appears in the


second column of equations. We also can define the decision variable
output, η = h(χ), but notice that ν measures the fitting error, ν =
y − h(χ), and we must use the system measurement y and not η in
this relationship. Therefore, we do not satisfy a relationship like η =
h(χ) + ν, but rather

y = h(χ) + ν η = h(χ)
y = h(x̂) + v̂ ŷ = h(x̂)

We begin with a reasonably general definition of the full information


estimator that produces an estimator that is stable, which we also shall
define subsequently. The full information objective function is

    VT (χ(0), ω) = ℓx (χ(0) − x 0 ) + Σ_{i=0}^{T−1} ℓ(ω(i), ν(i))        (4.2)

subject to
χ + = f (χ, ω) y = h(χ) + ν
in which T is the current time, y(i) is the measurement at time i, and x 0
is the prior estimate of the initial state.1 Occasionally we shall consider
input disturbances to an explicitly given nominal input. If we denote
this nominal input trajectory as w, then we adjust the model constraint
to χ + = f (χ, w + ω), so that ω measures the difference from the nom-
inal model’s input. We recover the standard problem by setting w = 0.
Because ν = y − h(χ) is the error in fitting the measurement y, ℓ(ω,
ν) penalizes the model disturbance and the fitting error. These are the
two error sources we reconcile in all state estimation problems.
The full information estimator is then defined as the solution to

    PT (x 0 , w0:T−1 , y0:T−1 ) := min_{χ(0),ω} VT (χ(0), ω)        (4.3)

and we use the notation PT (x 0 , y0:T−1 ) for the usual case when the
nominal input is w = 0. The solution to the optimization exists for all
T ∈ I≥0 because VT (·) is continuous, due to the continuity of f (·) and
h(·), and because VT (·) is an unbounded function of its arguments, as
will be clear after stage costs ℓx (·) and ℓ(·) are defined. We denote the
solution as x̂(0|T ), ŵ(i|T ), 0 ≤ i ≤ T − 1, T ≥ 1, and the optimal cost
as VT0 . We also use x̂(T ) := x̂(T |T ) to simplify the notation.
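To make the optimization concrete, the following sketch solves PT by direct minimization over (χ(0), ω) for a small hypothetical scalar model using scipy; the quadratic stage costs are one admissible choice of ℓx(·) and ℓ(·), not the only one, and none of the numbers come from the text.

# Minimal sketch (hypothetical example): full information estimation by
# direct optimization over the initial state and the disturbance sequence.
import numpy as np
from scipy.optimize import minimize

f = lambda x, w: 0.9 * x + 0.2 * np.sin(x) + w        # x+ = f(x, w)
h = lambda x: x + 0.1 * x**2                          # y = h(x) + v

rng = np.random.default_rng(0)
T = 30
x0_prior = 0.0
x = 1.0                                               # true initial state
y = np.zeros(T)
for i in range(T):
    y[i] = h(x) + 0.01 * rng.standard_normal()        # measurement y(i)
    x = f(x, 0.05 * rng.standard_normal())            # process disturbance

def cost(dec):                                        # dec = (χ(0), ω(0..T-1))
    chi, om = dec[0], dec[1:]
    J = 10.0 * (chi - x0_prior) ** 2                  # ℓx(χ(0) − prior)
    for i in range(T):
        nu = y[i] - h(chi)                            # fitting error ν(i)
        J += om[i] ** 2 + 100.0 * nu ** 2             # ℓ(ω(i), ν(i))
        chi = f(chi, om[i])
    return J

sol = minimize(cost, np.zeros(T + 1), method="BFGS")
chi = sol.x[0]
for i in range(T):                                    # roll out the estimate
    chi = f(chi, sol.x[1 + i])
print("estimate of x(T):", chi, " true x(T):", x)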
We require a definition of state estimation general enough to include
this optimization approach. Attempting to express the state estimate
as a finite dimensional dynamical system, as we do with the Kalman
filter for linear systems, is not sufficient here. Instead we consider the
state estimate at any time k ∈ I≥0 to be a function of the prior x 0 ,
nominal input (if nonzero), w0:T −1 , and the measurement y0:T −1 .

Definition 4.1 (State Estimator). A state estimator is a sequence of func-


tions (ΨT )T ≥0 defined ΨT : X × WT × YT → X for all T ∈ I≥0 , and the
[Footnote 1: Notice that we have dropped the final measurement y(T ) compared to the
problem considered in Chapter 1 to formulate the prediction form rather than the filtering
form of the state estimation problem. So what we denote here as x̂(T |T ) would be x̂ − (T )
in the notation of Chapter 1. This change is purely for notational convenience, and all
results developed in this chapter also can be expressed in the filtering form of MHE.]
state estimate at time T is denoted

x̂(T ) = ΨT (x 0 , w0:T −1 , y0:T −1 )

If the nominal input sequence is w = 0, as is usually the case, then we


drop the second argument and write simply

x̂(T ) = ΨT (x 0 , y0:T −1 )

In the full information estimator, the function Ψ (·) denotes the fi-
nal element of the state trajectory in the solution to (4.3). One impor-
tant characteristic of optimization-based estimation worth bearing in
mind as we progress is that x̂(T ) = ΨT (x 0 , y0:T −1 ) does not imply that
x̂(T + 1) = Ψ1 (x̂(T ), yT ), even though y0:T := (y0:T −1 , yT ). In (non-
linear) full information estimation, we have no convenient means to
move from x̂(T ) to x̂(T + 1), and must instead recompute the entire
optimal trajectory with ΨT +1 (x 0 , y0:T ). As we shall see subsequently,
this confers some desirable properties on the estimator, but renders
its online computation intractable since the size of the optimization
problem increases with time.
Next we require a definition of robust stability suitable for state
estimation in this general form. The standard attempt2 would be to
use the following type of bound in the definition of robust stability

    |x(k) − x̂(k)| ≤ αx (|x(0) − x 0 |, k) + γw (∥w∥0:k−1 ) + γv (∥v∥0:k−1 )        (4.4)

for all k ∈ I≥0 with αx (·) ∈ KL and γw (·), γv (·) ∈ K. But, for the gen-
eral class of estimators under consideration here, an inequality of this
type does not ensure that the estimate error converges to zero when
the disturbances converge to zero. To ensure this desirable property
we strengthen the definition of estimator stability to the following.
Definition 4.2 (Robustly globally asymptotically stable estimation). A
state estimator (Ψk )k≥0 is robustly globally asymptotically stable (RGAS)
if there exist KL-functions αx , αw , αv such that

    |x(k) − x̂(k)| ≤ αx (|x(0) − x 0 |, k) ⊕ max_{j∈I0:k−1} αw (|w(j)|, k − j − 1)
                                           ⊕ max_{j∈I0:k−1} αv (|v(j)|, k − j − 1)        (4.5)

for all k ∈ I≥0 , x(0), x 0 ∈ X, and w ∈ W, v ∈ V.


[Footnote 2: See the previous printings of this chapter, for example.]
We have chosen the convolution maximization form for the stability


definition, where the notation a ⊕ b denotes max(a, b) for a, b ∈ R. We
choose to maximize on time index j rather than sum on j since we do
not know a priori that the KL-functions αw , αv decrease sufficiently
quickly to ensure that the sums converge as k → ∞.
We can then readily establish the following convergence result (Al-
lan and Rawlings, 2020, Proposition 3.11)

Proposition 4.3 (RGAS plus convergent disturbances imply convergent


estimates). If an estimator is RGAS and ((w(k), v(k)))k≥0 converges to
zero, then the estimate error converges to zero.

The proof of this proposition is discussed in Exercise 4.13.

Example 4.4: The Kalman filter of a linear system is RGAS


Show that the steady-state Kalman filter (predictor) of a detectable, sta-
bilizable linear system

x + = Ax + Gw y = Cx + v

is RGAS and satisfies both (4.5) as well as (4.4).

Solution
For (A, C) detectable and (A, G) stabilizable, the steady-state Kalman
predictor is nominally exponentially stable as discussed in Exercise
4.17. The steady-state estimator takes the form

x̂ + = Ax̂ + L(y − C x̂) x̂(0) = x 0

where L satisfies a steady-state Riccati equation and AL := (A − LC) is


a stable matrix. Subtracting the estimator from the system gives

(x − x̂)+ = AL (x − x̂) + Gw − Lv

Solving this linear system gives

    x(k) − x̂(k) = A_L^k (x(0) − x 0 ) + Σ_{j=0}^{k−1} A_L^{k−j−1} (Gw(j) − Lv(j))

Since AL is stable, we have the bound (Horn and Johnson, 1985, p.299)
|A_L^i| ≤ cλ^i in which max eig(AL ) < λ < 1. Taking norms and using
this bound gives for all k ≥ 0

    |x(k) − x̂(k)| ≤ cλ^k |x(0) − x 0 | + c Σ_{j=0}^{k−1} (|G| |w(j)| + |L| |v(j)|) λ^{k−j−1}        (4.6)

Taking the largest disturbance terms outside and performing the sum
then gives
    |x(k) − x̂(k)| ≤ cλ^k |x(0) − x 0 | + (c/(1 − λ)) (|G| ∥w∥0:k−1 + |L| ∥v∥0:k−1 )

So we have that (4.4) is satisfied after defining αx (r , k) := cr λk , which


is an exponential KL-function, and γw (r ) := (c |G| /(1 − λ)) r and
γv (r ) := (c |L| /(1 − λ)) r , which are linear K-functions.
To obtain the stronger convolution maximization form, first note
that for 0 ≤ λ < 1 and z(j) > 0

    Σ_{j=0}^{k−1} z(j)λ^{k−j−1} = Σ_{j=0}^{k−1} z(j)λ^{(k−j−1)/2} λ^{(k−j−1)/2}
                                ≤ (1/(1 − √λ)) max_{j∈I0:k−1} z(j)λ^{(k−j−1)/2}

Using this result in (4.6) and letting η := √λ > λ so that 0 ≤ η < 1, we
have that

    |x(k) − x̂(k)| ≤ cη^k |x(0) − x 0 | + (c|G|/(1 − η)) max_{j∈I0:k−1} |w(j)| η^{k−j−1}
                                        + (c|L|/(1 − η)) max_{j∈I0:k−1} |v(j)| η^{k−j−1}

Finally, using Exercise 4.6(d), we convert the sum to maximization and


satisfy (4.5) with

    αx (r, k) := 3crη^k      αw (r, k) := (3c|G|/(1 − η)) rη^k      αv (r, k) := (3c|L|/(1 − η)) rη^k

and the steady-state Kalman predictor is RGAS, and the KL-functions


αx , αw , αv are of exponential form. □
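The steady-state gain and the error recursion used above are easy to reproduce numerically; the sketch below uses hypothetical system matrices and scipy's discrete algebraic Riccati solver, and simply simulates the resulting predictor.

# Minimal sketch (hypothetical data): steady-state Kalman predictor and the
# estimate error recursion (x - x̂)+ = (A - LC)(x - x̂) + Gw - Lv of Example 4.4.
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[0.9, 0.2], [0.0, 0.8]])
G = np.eye(2)
C = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.01]])

P = solve_discrete_are(A.T, C.T, G @ Q @ G.T, R)      # predictor Riccati equation
L = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + R)      # steady-state gain
AL = A - L @ C
print("spectral radius of A - LC:", max(abs(np.linalg.eigvals(AL))))

rng = np.random.default_rng(0)
x = np.array([1.0, -1.0])
xhat = np.zeros(2)                                    # prior estimate is zero
for k in range(50):
    v = 0.1 * rng.standard_normal(1)
    w = 0.1 * rng.standard_normal(2)
    y = C @ x + v                                     # measurement of current x
    xhat = A @ xhat + L @ (y - C @ xhat)              # predictor update
    x = A @ x + G @ w
# the error stays bounded because AL is stable and (w, v) are bounded in practice
print("final estimate error:", x - xhat)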
The next order of business is to decide what class of systems to con-
sider if the goal is to obtain a stable state estimator. A standard choice
in most nonlinear estimation literature is to assume system observabil-


ity. The drawback with this choice is that it is overly restrictive, even for
linear systems. As discussed in Chapter 1, for linear systems we require
only detectability for stable estimation (Exercise 1.33). We therefore
start instead with an assumption of detectability that is appropriate
for nonlinear systems, called incremental input/output-to-state stabil-
ity (i-IOSS) Sontag and Wang (1997). This definition is an incremen-
tal property in which we compare two trajectories starting at different
initial conditions x1 , x2 ∈ X and experiencing different disturbance se-
quences, w1 , w2 ∈ W∞ . We use x(k; x, w) to denote the solution to (4.1)
for initial condition x and disturbance sequence w. To compress the
notation, we define the incremental differences in state ∆x(k) := x(k,
x1 , w1 ) − x(k, x2 , w2 ), input difference ∆w(k) := w1 (k) − w2 (k), and
output ∆y(k) := h(x(k, x1 , w1 )) − h(x(k, x2 , w2 )). For convenience,
we choose a detectability assumption that is similar in structure to our
choice of stability definition.

Definition 4.5 (i-IOSS). The system x + = f (x, w), y = h(x) is in-


crementally input/output-to-state stable (i-IOSS) if there exist functions
βx (·), βw (·), βv (·) ∈ KL such that

    |∆x(k)| ≤ βx (|∆x(0)|, k) ⊕ max_{j∈I0:k−1} βw (|∆w(j)|, k − j − 1)
                               ⊕ max_{j∈I0:k−1} βv (|∆y(j)|, k − j − 1)        (4.7)

for all k ∈ I≥0 , all initial states x1 , x2 ∈ X, and all disturbance se-
quences w1 , w2 ∈ W∞ .

In previous versions of the text we used the more traditional defi-


nition of i-IOSS that has a single KL-function βx and two asymptotic
gain K-functions γw , γv and the following bound in place of (4.7)

|∆x(k)| ≤ βx (|∆x(0)| , k) + γw (|∆w|) + γv (|∆y|) (4.8)

It is straightforward to show that the bound in (4.7) implies the bound


in (4.8). Although it is not straightforward, Allan, Rawlings, and Teel
(2020, Proposition 4) show that the bound in (4.8) also implies the one
in (4.7). Therefore the choice of the form of the bound in Definition 4.5
is indeed one of convenience, as we shall see in the proof of the next
proposition.
System properties such as i-IOSS are generically difficult to check
for a given nonlinear application of interest. It is therefore important
to ask whether the assumption is overly restrictive. We show that it is


not overly restrictive if the goal is to build an RGAS estimator for the
system (Allan et al., 2020, Proposition 5).

Proposition 4.6 (RGAS estimator implies i-IOSS). If a system admits an


RGAS estimator (Ψk )k≥0 , then the system is i-IOSS.

Proof. Consider two initial conditions denoted x1,0 and x2,0 , two input
sequences w1 and w2 generating from (4.1) two corresponding state
trajectories x1 and x2 . Now consider input and output disturbance se-
quences w̃1 (j) = w1 (j) − w2 (j), and v1 (j) = h(x2 (j)) − h(x1 (j)) for
j ∈ I≥0 . Let the system generating the measurements for state estima-
tion be x(k) = x(k; x1,0 , w2 + w̃1 ), y = h(x) + v1 . Note that the system
generating the measurements has initial condition x1,0 , nominal input,
w = w2 , but disturbed or actual input w1 since w2 + w̃1 = w1 ; so we have
that x(k) = x1 (k) for k ∈ I≥0 . The output measurements are exactly
h(x2 ) because of the output disturbance. The state estimator is there-
fore based on nominal input w = w2 and output measurement h(x2 ).
Let the state estimator then have x2,0 as its prior. The information given
to the estimator is then consistent, and it produces x̂(k) = Ψk (x2,0 , w2 ,
h(x2 )) = x2 (k) for k ∈ I≥0 . If the estimator is RGAS, then (4.5) gives
for this system and estimator

    |x1 (k) − x2 (k)| ≤ αx (|x1,0 − x2,0 |, k) ⊕ max_{j∈I0:k−1} αw (|w̃1 (j)|, k − j − 1)
                                               ⊕ max_{j∈I0:k−1} αv (|v1 (j)|, k − j − 1)

and substituting the defined disturbances

    |x1 (k) − x2 (k)| ≤ αx (|x1,0 − x2,0 |, k)
                        ⊕ max_{j∈I0:k−1} αw (|w1 (j) − w2 (j)|, k − j − 1)
                        ⊕ max_{j∈I0:k−1} αv (|h(x1 (j)) − h(x2 (j))|, k − j − 1)

for all k ∈ I≥0 . Note that since x1,0 , x2,0 , w1 , w2 are arbitrary, the system
is i-IOSS. ■

Sontag and Wang (1997, Proposition 23) derived an earlier result of


this style but restricted to estimators in the class of observers evolving
in the same state space as x with output injection.
We shall find an i-IOSS Lyapunov function useful to establish the
estimator’s stability. We have the following definition.
Definition 4.7 (i-IOSS Lyapunov function). A function Λ : X × X → R≥0


is an i-IOSS Lyapunov function for the system (4.1) if there exist K∞ -
functions α1 , α2 , α3 and K-functions σw , σv such that

α1 (|x1 − x2 |) ≤ Λ(x1 , x2 ) ≤ α2 (|x1 − x2 |) (4.9)


Λ(f (x1 , w1 ), f (x2 , w2 )) ≤ Λ(x1 , x2 ) − α3 (|x1 − x2 |)
+ σw (|w1 − w2 |) (4.10)
+ σv (|h(x1 ) − h(x2 )|)

for all x1 , x2 ∈ X and w1 , w2 ∈ W.

The following converse theorem establishes that a system is i-IOSS


if and only if the system admits an i-IOSS Lyapunov function.

Theorem 4.8 (i-IOSS and Lyapunov function equivalence). A system


(4.1) is i-IOSS if and only if it admits an i-IOSS Lyapunov function.

The proof that an i-IOSS Lyapunov function implies i-IOSS is pro-


vided in (Allan and Rawlings, 2019, Proposition 5, Remark 6). The con-
verse implication is more involved and is provided in (Allan et al., 2020,
Theorem 8).
The last element that we require is system stabilizability. Most of
the literature on FIE and MHE has not stressed this requirement and
sometimes tacitly assumes an unnecessarily strong form of it by ex-
pressing the system as x + = f (x) + w, but we obtain sharper conclu-
sions by addressing it. We take the following definition of stabilizabil-
ity.

Definition 4.9 (Incremental Stabilizability with respect to stage cost


L(·)). A nonlinear system x + = f (x, u) is said to be incrementally
stabilizable with respect to stage cost L(·) if there exists K-function
α such that for every two initial conditions x1 , x2 ∈ X and control
sequence w1 ∈ W∞ , another control sequence w2 ∈ W∞ exists such
that
    Σ_{k=0}^{∞} L(x1 (k), x2 (k), w1 (k), w2 (k)) ≤ α(|x1 − x2 |)

With all of the basic concepts introduced, we can state our working
assumptions for the full-information state estimation problem.

Assumption 4.10 (Continuity). The functions f (·), h(·), ℓx (·), and ℓ(·)
are continuous, ℓx (0) = 0, and ℓ(0, 0) = 0. The sets X and W are closed.
Assumption 4.11 (Positive-definite stage cost). The stage cost ℓ(·) sat-
isfies
σw (|ω|) + σv (|ν|) ≤ ℓ(ω, ν) ≤ σ w (|ω|) + σ v (|ν|)

for all ω ∈ W, ν ∈ V for some K∞ -functions σ w and σ v , and the K∞ -


functions σw and σv come from (4.10) of the i-IOSS Lyapunov function.
Furthermore, we have that

σx (|χ − x 0 |) ≤ ℓx (χ − x 0 ) ≤ σ x (|χ − x 0 |)

for all χ, x 0 ∈ X for some K∞ -functions σx and σ x .

Assumption 4.12 (Stabilizability). The system (4.1) is stabilizable with


respect to the stage cost L(x1 , x2 , w1 , w2 ) := ℓ(w2 −w1 , h(x1 )−h(x2 )).

Assumption 4.13 (Detectability). The system (4.1) is i-IOSS.

Remark.

(a) Assumptions 4.10 and 4.11 guarantee that a solution to (4.3) exists
for all finite T ≥ 0 (Rawlings and Ji, 2012).

(b) From Theorem 4.8, Assumption 4.13 implies the existence of an


i-IOSS Lyapunov function satisfying (4.9)–(4.10).

(c) Notice that the stage cost is chosen to be compatible with the sys-
tem’s detectability properties in Assumption 4.11.

(d) A similar case can be made in regulation that one must choose the
regulator’s stage cost to be compatible with the system’s stabilizabil-
ity properties. We did not emphasize this issue in Chapter 2, and in-
stead allowed the stage cost to affect the MPC regulator’s feasibility set
XN . The consequence of choosing the stage cost inappropriately in the
zero-state MPC regulator would therefore be a catastrophic reduction
in the size of the feasibility set, with the worst case being XN = {0}.

(e) If we strengthen the detectability property to exponential detectabil-


ity, then the stage cost restriction is relaxed. For example, any positive
definite quadratic stage cost is compatible with exponential detectabil-
ity as discussed in Exercise 4.12.

(f) The stage cost also is chosen to be compatible with the system’s
stabilizability properties in Assumption 4.12.

(g) It is not strictly necessary to assume the upper bounds in Assump-


tion 4.11. From Assumption 4.10, ℓ(·) and ℓx (·) are continuous and
therefore have upper-bounding K∞ -functions (Rawlings and Risbeck,


2015, Proposition 19). But it is helpful to name these upper-bounding
functions here.

4.2.1 Nominal Estimator Stability

In this section we set w = 0, v = 0, and estimator stability in Definition


4.2 reduces to existence of a KL-function αx such that for all k ∈ I≥0 ,
x(0), x 0 ∈ X
|x(k) − x̂(k)| ≤ αx (|x(0) − x 0 | , k) (4.11)

We refer to this property as “nominal” stability. Since the main pur-


pose of state estimation is to deal with nonzero disturbances w, v,
one may wonder why we should bother analyzing nominal stability in
the first place. The motivation is to illustrate in this simple setting
a new analysis tool, termed a Q-function.3 This function takes the
place of a Lyapunov function in our estimator stability analysis. It has
the characteristics that we expect of a Lyapunov function, but it has
some additional features: two time arguments instead of one, and an
extra inequality involving the estimator’s prior and the system’s ini-
tial condition. The extra complexity seems to be required by the fact
that, unlike the zero-state regulator, the evolution of the estimate er-
ror cannot be expressed as a simple dynamical system. We introduce
the Q-function in this setting where the stability arguments can be pre-
sented in their entirety. We closely follow the development in Allan and
Rawlings (2019) in this section. Then the same tools introduced in this
section can be used in the next section to treat bounded disturbances
w, v, which is the case of most interest. The arguments for that case
become significantly longer and more detailed, so we will have to be
content to state the main results and point to the appropriate refer-
ences for the proofs. The results in that section mainly follow Allan
and Rawlings (2020); Allan (2020).
First we consider the estimation problem (4.3) on the infinite hori-
zon, i.e., in the limit T → ∞. For the zero disturbance case, the choice
χ(0) = x(0), and ω(j) = 0 for all j ∈ I0:T −1 is feasible and gives cost
ℓx (x(0) − x 0 ), which is independent of T . So we have the following
upper bound for the optimal FIE cost for all T ∈ I≥0

VT0 (x(0)) ≤ ℓx (x(0) − x 0 )


[Footnote 3: Allan and Rawlings (2019, 2020) introduce the name Q-function to commemorate
the seminal contributions of David Q. Mayne to control and estimation theory.]
and from (Keerthi and Gilbert, 1985, Theorem 2), a solution to the in-
finite horizon problem exists. If we consider the solution of a k-stage
problem, optimality of the infinite horizon problem gives

    V∞0 ≤ Vk0 + min_{ωk:∞} Σ_{i=k}^{∞} ℓ(ω(i), ν(i))        (4.12)

subject to

χ + = f (χ, ω) y = h(χ) + ν χ(k) = x̂(k|k)

The stabilizability assumption then provides an upper bound for the


minimization in (4.12) as follows. In the sum, the system generating
the data starts at x(k) and experiences zero input disturbance. The
estimator starts at x̂(k|k) and optimizes the input sequence to fit the
data. The definition of stabilizability and Assumption 4.12 gives

    Σ_{i=k}^{∞} L(x(i), χ(i), 0, ω(i)) ≤ α(|x(k) − x̂(k|k)|)

for L(x(i), χ(i), 0, ω(i)) = ℓ(ω(i), y(i)−h(χ(i))) = ℓ(ω(i), ν(i)). Us-


ing this bound in (4.12) then gives
    V∞0 ≤ Vk0 + α(|x(k) − x̂(k|k)|)

In previous versions of FIE analysis, we made use of the fact that the
optimal solution of the estimation problem at time k + 1 gives feasible,
but possibly suboptimal decision variables at time k. That argument
leads to the inequality

    V_k^0 ≤ V_{k+1}^0 − ℓ(ŵ(k|k + 1), v̂(k|k + 1))        (4.13)
 
which shows that the sequence (Vk0 )k≥0 is nondecreasing. Since it is
bounded above by ℓx (x(0) − x 0 ), it converges, and that implies that
ℓ(ŵ(k|k+1), v̂(k|k+1)) → 0 as k → ∞. The problem with this approach
is that it compares two different trajectories, and does not generalize
well to the bounded disturbance case where the infinite horizon prob-
lem is not bounded above. So we change course from previous analysis
and consider instead a single trajectory, but different times within the
trajectory by introducing partial sums
    V 0 (j|k) = ℓx (x̂(0|k) − x 0 ) + Σ_{i=0}^{j−1} ℓ(ŵ(i|k), v̂(i|k))
with j ≤ k ∈ I≥0 . Changing j rather than k is then straightforward

V 0 (j|k) = V 0 (j + 1|k) − ℓ(ŵ(j|k), v̂(j|k)) (4.14)

Note that we have an equality here, not even an inequality as arises in


(4.13) when comparing optimal costs at k and k + 1.
The Q-function. We now modify the optimal cost of the estimation
problem to create something that operates similarly to a Lyapunov
function in this context. First we flip the function so that it decreases
rather than increases with k. We define Y (j|k)
    Y (j|k) := V∞0 − V 0 (j|k)

for j ≤ k ∈ I≥0 . We know that V 0 (j|k) ≤ V 0 (k|k) ≤ V∞0


for all j ≤ k
because the objective function is a sum of positive stage costs. We can
also deduce that
    V∞0 ≤ V 0 (j|k) + α(|x(j) − x̂(j|k)|)

for all j ≤ k using the same argument as we used above with j = k.


These give the corresponding bounds for the flipped function Y (·)

    0 ≤ Y (j|k) ≤ α(|x(j) − x̂(j|k)|)

for all j ≤ k. Substituting (4.14) into the definition of Y (·) then gives a
cost decrease equality

Y (j + 1|k) = Y (j|k) − ℓ(ŵ(j|k), v̂(j|k))

for j ≤ k − 1.
The last step is to use the i-IOSS Lyapunov function implied by the
detectability Assumption 4.13. Applying (4.9)–(4.10) to the values x(j)
and x̂(j|k) gives

    α1 (|x(j) − x̂(j|k)|) ≤ Λ(x(j), x̂(j|k)) ≤ α2 (|x(j) − x̂(j|k)|)

    Λ(x(j + 1), x̂(j + 1|k)) ≤ Λ(x(j), x̂(j|k)) − α3 (|x(j) − x̂(j|k)|)
                              + σw (|ŵ(j|k)|) + σv (|ν̂(j|k)|)

for all j ≤ k. We define the Q-function as the sum of Λ(·) and Y (·)

Q(j|k) := Y (j|k) + Λ(x(j), x̂(j|k))

Substituting the bounds on Y (·) and Λ(·) into this definition gives pos-
itive upper and lower bounds on Q(·)

    α1 (|x(j) − x̂(j|k)|) ≤ Q(j|k) ≤ ᾱ2 (|x(j) − x̂(j|k)|)


with K∞ -function ᾱ2 := α + α2 , and the following descent condition

    Q(j + 1|k) ≤ Q(j|k) − α3 (|x(j) − x̂(j|k)|) + σw (|ŵ(j|k)|) + σv (|ν̂(j|k)|) − ℓ(ŵ(j|k), v̂(j|k))
               ≤ Q(j|k) − α3 (|x(j) − x̂(j|k)|)

where we used Assumption 4.11 to achieve the final inequality. Note


that choosing an appropriate stage cost in estimation is what allows
the decrease in cost due to optimization to overcome the effect of the
positive supply rate in the i-IOSS Lyapunov function.
The inequalities established for Q(j|k) make it well suited for sta-
bility analysis except for one remaining issue, which we resolve next.
The upper bound at j = 0 gives

    Q(0|k) ≤ ᾱ2 (|x(0) − x̂(0|k)|)        (4.15)

But to achieve nominal stability in (4.11) we require a bound that de-


pends on the distance of the initial state from the prior x 0 , not the
estimated initial state at time k, x̂(0|k). So we create that bound next.
Note from Assumption 4.11 and the previous discussion of the infinite
horizon problem

    σx (|x̂(0|k) − x 0 |) ≤ ℓx (x̂(0|k) − x 0 ) ≤ Vk0 ≤ V∞0 ≤ ℓx (x(0) − x 0 )

for all k ∈ I≥0 , and x(0), x 0 ∈ X. From Assumption 4.11, we also have
the upper bound

ℓx (x(0) − x 0 ) ≤ σ x (|x(0) − x 0 |)

Combining these gives



|x̂(0|k) − x 0 | ≤ σx−1 σ x (|x(0) − x 0 |)

Using the triangle inequality and this result gives

|x̂(0|k) − x(0)| ≤ |x̂(0|k) − x 0 | + |x(0) − x 0 | ≤ σ 0 (|x(0) − x 0 |)

with σ 0 (·) := (·) + σx−1 (σ x (·)). Substituting this result in (4.15) then
gives the desired bound

Q(0|k) ≤ α0 (|x(0) − x 0 |)

with K∞ -function α0 := ᾱ2 ◦ σ 0 .
Summarizing, we have established that FIE provides a Q-function
that meets the following definition.
Definition 4.14 (Q-function for estimation). A function Q(j|k) is a Q-


function for some state estimator if there exist K∞ -functions µ0 , µ1 ,
µ2 , µ3 such that

Q(0|k) ≤ µ0 (|x(0) − x 0 |) (4.16)


    µ1 (|x(j) − x̂(j|k)|) ≤ Q(j|k) ≤ µ2 (|x(j) − x̂(j|k)|)        (4.17)
    Q(j + 1|k) ≤ Q(j|k) − µ3 (|x(j) − x̂(j|k)|)        (4.18)

for all j ≤ k ∈ I≥0 for (4.16) and (4.17) and j ≤ k − 1 ∈ I≥0 for (4.18).
Next we establish a Q-function theorem for nominal stability (Allan
and Rawlings, 2019, Theorem 14).
Theorem 4.15 (Q-function theorem for global asymptotic stability). If
a state estimator admits a Q-function, then it is globally asymptotically
stable (GAS).

Proof. First combine (4.17) and (4.18) to obtain

Q(j + 1|k) ≤ Q(j|k) − µ3 (µ2−1 (Q(j|k)))

Next use the same standard construction shown in Appendix B, Theo-


rem B.15, to obtain a K∞ -function σ satisfying σ (s) < s for s > 0 and
σ (s) ≥ s − µ3 (µ2−1 (s)), which gives

Q(j + 1|k) ≤ σ (Q(j|k))

Applying this result recursively starting at j = 0 gives

Q(j|k) ≤ σ j (Q(0|k))

Combining this with (4.16) and (4.17) then gives for all j ≤ k

    |x(j) − x̂(j|k)| ≤ µ1−1 (σ j (µ0 (|x(0) − x 0 |))) := αx (|x(0) − x 0 |, j)

Note that αx (·) ∈ KL, and on choosing j = k, we have that

|x(k) − x̂(k|k)| ≤ αx (|x(0) − x 0 | , k)

for all k ∈ I≥0 , and the state estimator is GAS. ■

So applying this theorem establishes that FIE is globally asymptot-


ically stable for the case of zero input and output disturbances. We
summarize the result in the following theorem.
Theorem 4.16 (Stability of full information estimation). Let Assump-
tions 4.10–4.13 hold. Then full information estimation is GAS.
4.2.2 Robust Estimator Stability

The reason for increasing the abstraction level in the current presen-
tation is not to handle nominal stability. That simple problem can be
addressed with simple tools. The point is to address finally FIE with
bounded disturbances. We are now in a good position to accomplish
that. Let’s first recall what we concluded about the steady-state Kalman
filter (predictor) with bounded disturbances. We showed in Example 4.4
that the Kalman predictor is RGAS and that the estimate error satisfies
(4.5). So that result represents the gold standard of FIE for a nonlinear
system with bounded disturbances. We’ll see next how close we can
come to the same conclusion for nonlinear systems.
The system continuity and detectability conditions from the nom-
inal case are unchanged when treating the bounded disturbance case.
But the stage cost and stabilizability assumptions require modification.
We state the new conditions next.

Assumption 4.17 (Stage cost under disturbances). The stage cost ℓ(·)
satisfies

σw (2 |ω|) + σv (2 |ν|) ≤ ℓ(ω, ν) ≤ σ w (|ω|) + σ v (|ν|)

for all ω ∈ W, ν ∈ V, for some K∞ -functions σ w and σ v , and the K∞ -


functions σw and σv come from (4.10) of the i-IOSS Lyapunov function.
Furthermore, we have that

α2 (2 |χ − x 0 |) ≤ ℓx (χ − x 0 ) ≤ σ x (|χ − x 0 |)

for all χ, x 0 ∈ X, for some K∞ -function σ x , and the K∞ -function α2


comes from (4.9) of the i-IOSS Lyapunov function.

Assumption 4.18 (Stabilizability under disturbances). There exists K-


function γ such that for all finite sequences w ∈ Wk and v ∈ Vk and
any χ, x ∈ X, there exists ω ∈ W∞ such that the following holds for all
k≥0

    Σ_{i=0}^{∞} ℓ(ω(i), ν(i)) ≤ α(|χ − x|) + Σ_{i=0}^{k} γ(|(w(i), v(i))|)

in which

χ + = f (χ, ω) y = h(χ) + ν
 
    x + = f (x, w), y = h(x) + v   for i ∈ I0:k−1
    x + = f (x, 0), y = h(x)       for i ∈ Ik:∞
Remark.

(a) Note the introduction of the factor of two in the lower bound of
ℓ(·) in Assumption 4.17 compared to the nominal case, Assumption
4.11.

(b) Note the new compatibility restriction on the lower bound for ℓx (·)
in Assumption 4.17 compared to the nominal case, Assumption 4.11.

(c) In the stabilizability assumption note that the upper bound on the
infinite horizon cost grows linearly with time for the case of bounded
disturbances. It is anticipated that the full-information optimal cost
also increases without bound for this bounded disturbance case. The
divergence of the optimal cost presents one of the primary challenges
in the estimator stability analysis.

It also will prove insightful to break out a stronger case of detectabil-


ity, termed exponential detectability, defined as follows.

Definition 4.19 (Exponentially i-IOSS). The system x + = f (x, w), y =


h(x) is exponentially incrementally input/output-to-state stable (expo-
nentially i-IOSS) if there exist 0 ≤ λ < 1 and positive constants bx ,
bw , bv such that for all k ∈ I≥0 , all initial states x1 , x2 ∈ X, and all
disturbance sequences w1 , w2 ∈ W∞

    |x1 (k) − x2 (k)| ≤ bx |x1 − x2 | λ^k ⊕ max_{j∈I0:k−1} bw |∆w(j)| λ^{k−j−1}
                                          ⊕ max_{j∈I0:k−1} bv |∆y(j)| λ^{k−j−1}        (4.19)

where x1 (k) = x(k; x1 , w1 ), x2 (k) = x(k; x2 , w2 ), ∆w(k) = w1 (k) −


w2 (k), and ∆y(k) = h(x1 (k)) − h(x2 (k)).

Note that we have restricted the KL-functions of (asymptotic) de-


tectability to an exponential form. We shall see subsequently that this
stronger form of detectability makes the analysis of moving horizon es-
timation particularly straightforward. Note also that detectable linear
systems satisfy this property.
When we assume exponential detectability, we also achieve a stronger
form of stability, termed robust global exponential stability, defined in
convolution maximization form as follows.

Definition 4.20 (Robustly globally exponentially stable estimation). A


state estimator (Ψk )k≥0 is robustly globally exponentially stable (RGES)
if there exist 0 ≤ λ < 1 and positive constants ax , aw , av such that

    |x(k) − x̂(k)| ≤ ax |x(0) − x 0 | λ^k ⊕ max_{j∈I0:k−1} aw |w(j)| λ^{k−j−1}
                                         ⊕ max_{j∈I0:k−1} av |v(j)| λ^{k−j−1}        (4.20)

for all k ∈ I≥0 , x(0), x 0 ∈ X, and w ∈ W, v ∈ V.

It is often convenient to compress the notation and combine the


disturbances as d(j) := (w(j), v(j)), j ∈ I≥0 with D := W × V, and use
the following equivalent definition of RGES.

Proposition 4.21 (Equivalent definition of RGES). A state estimator (Ψk )k≥0


is robustly globally exponentially stable (RGES) if there exist 0 ≤ λ < 1
and positive constant ad such that

    |x(k) − x̂(k)| ≤ ax |x(0) − x 0 | λ^k ⊕ max_{j∈I0:k−1} ad |d(j)| λ^{k−j−1}        (4.21)

for all k ∈ I≥0 , x(0), x 0 ∈ X, and d ∈ D.

Proof of this proposition is discussed in Exercise 4.20.


Next we strengthen the asymptotic Assumptions 4.11–4.13 to their
exponential versions.

Assumption 4.22 (Power-law bounds for stage costs). There exist pos-
itive constants cℓ , cx , c̄ℓ , c̄x and σ ≥ 1 such that

    cℓ |(ω, ν)|^σ ≤ ℓ(ω, ν) ≤ c̄ℓ |(ω, ν)|^σ
    cx |χ − x 0 |^σ ≤ ℓx (χ − x 0 ) ≤ c̄x |χ − x 0 |^σ

for all ω ∈ W, ν ∈ V, and χ, x 0 ∈ X.

Assumption 4.23 (Exponential stabilizability). The system (4.1) is expo-
nentially incrementally stabilizable, i.e., there exists a positive constant
c > 0 such that for every two initial conditions x1, x2 ∈ X and input
sequence ω1 ∈ W∞, there exists ω2 ∈ W∞ such that

    ∑_{k≥0} ℓ(ω2(k) − ω1(k), h(x1(k)) − h(x2(k))) ≤ c |x1 − x2|^σ

where σ ≥ 1 comes from Assumption 4.22.

Assumption 4.24 (Exponential detectability). The system (4.1) is expo-


nentially i-IOSS.

We then have the following result for robust stability of FIE under
disturbances.

Theorem 4.25 (Robust stability of full information estimation).


(a) Let Assumptions 4.10, 4.13, 4.17, and 4.18 hold. Then full informa-
tion estimation is RGAS.

(b) Let Assumptions 4.10 and 4.22–4.24 hold. Then full information
estimation is RGES.

The proof for RGES is given in (Allan and Rawlings, 2020, Theorem
3.16). The considerably more involved proof for RGAS is given in (Allan,
2020, Theorem 5.18).
Theorem 4.25 is a reasonable resting place for the theory of full
information estimation. We can finally handle bounded disturbances
in a fairly clean theoretical development with reasonable assumptions
on the system’s detectability and stabilizability. If one is willing to
strengthen the detectability assumption to exponential detectability as
in Theorem 4.25(b), the theoretical development is reasonably com-
pact, and can be easily extended to MHE as we show subsequently.
Moreover, by strengthening the definitions of RGAS and RGES using
the convolution maximization form, we have the desirable and antici-
pated consequence that stability implies convergence of estimate error
given convergence of disturbances.

4.2.3 Interlude—Linear System Review

Given the many structural similarities between estimation and regu-


lation, the reader may wonder why the stability analysis of the full
information estimator presented in the previous sections looks rather
different than the zero-state regulator stability analysis presented in
Chapter 2.

State Estimation as Optimal Control of Estimate Error

To provide some insight into essential differences, as well as similari-


ties, between estimation and regulation, consider again the estimation
problem in the simplest possible setting with a linear time-invariant
model and Gaussian noise

x + = Ax + Gw w ∼ N(0, Q)
y = Cx + v v ∼ N(0, R) (4.22)

and random initial state x(0) ∼ N(x 0, P−(0)). In FIE, we define the
objective function

    VT(χ(0), ω) = (1/2) ( |χ(0) − x 0|²_{(P−(0))−1} + ∑_{i=0}^{T−1} ( |ω(i)|²_{Q−1} + |ν(i)|²_{R−1} ) )

subject to χ+ = Aχ + Gω, y = Cχ + ν. Denote the solution to this
optimization as

    (x̂(0|T), ŵT) = arg min_{χ(0),ω} VT(χ(0), ω)

and the trajectory of state estimates comes from the model x̂(i + 1|T) =
Ax̂(i|T) + Gŵ(i|T). We define estimate error as x̃(i|T) = x(i) − x̂(i|T)
for 0 ≤ i ≤ T − 1, T ≥ 1.
The simplest stability question is nominal stability, i.e., if noise-free
data are provided to the estimator, (w(i), v(i)) = 0 for all i ≥ 0 in
(4.22), is the estimate error asymptotically stable as T → ∞ for all x0 ?
We next make this statement precise. First we note that the noise-free
measurement satisfies y(i) − Cx̂(i|T) = Cx̃(i|T), 0 ≤ i ≤ T and the
initial condition term can be written in estimate error as x̂(0) − x(0) =
−(x̃(0) − a) in which a = x(0) − x 0. For the noise-free measurement
we can therefore rewrite the cost function as

    VT(a, x̃(0), w) = (1/2) ( |x̃(0) − a|²_{(P−(0))−1} + ∑_{i=0}^{T−1} ( |Cx̃(i)|²_{R−1} + |w(i)|²_{Q−1} ) )    (4.23)
in which we list explicitly the dependence of the cost function on pa-
rameter a. For estimation we solve

    min_{x̃(0),w} VT(a, x̃(0), w)    (4.24)

subject to x̃+ = Ax̃ + Gw. Now consider problem (4.24) as an opti-
mal control problem (OCP) using w as the manipulated variable and
minimizing an objective that measures size of estimate error x̃ and
control w. We denote the optimal solution as x̃0(0; a) and w0(a). Sub-
stituting these into the model equation gives optimal estimate error
x̃0(j|T; a), 0 ≤ j ≤ T, 0 ≤ T. Parameter a denotes how far x(0), the
system's initial state generating the measurement, is from x 0, the prior.
If we are lucky and a = 0, the optimal solution is (x̃0, w0) = 0, and we
achieve zero cost in VT0 and zero estimate error x̃0(j|T) at all time in

the trajectory 0 ≤ j ≤ T for all time T ≥ 1. The stability analysis in
estimation is to show that the origin for x̃ is asymptotically stable. In
other words, we wish to show there exists a KL function β such that
|x̃0(T; a)| ≤ β(|a|, T) for all T ∈ I≥0.
We note the following differences between standard regulation and
the estimation problem (4.24). First we see that (4.24) is slightly non-
standard because it contains an extra decision variable, the initial state,
and an extra term in the cost function, (4.23). Indeed, without this extra
term, the regulator could choose x̃(0) = 0 to zero the estimate error
immediately, choose w = 0, and achieve zero cost in VT0 (a) for all a.
The nonstandard regulator allows x̃(0) to be manipulated as a decision
variable, but penalizes its distance from a. Next we look at the stability
question.
The stability analysis is to show there exists a KL function β such that
|x̃0(T; a)| ≤ β(|a|, T) for all T ∈ I≥0. Here convergence is a question
about the terminal state in a sequence of different OCPs with increasing
horizon length T . That is also not the standard regulator convergence
question, which asks how the state trajectory evolves using the optimal
control law. In standard regulation, we inject the optimal first input
and ask whether we are successfully moving the system to the origin
as time increases. In estimation, we do not inject anything into the
system; we are provided more information as time increases and ask
whether our explanation of the data is improving (terminal estimate
error is decreasing) as time increases.
Because stability is framed around the behavior of the terminal
state, we would not choose backward dynamic programming (DP) to
solve (4.24), as in standard regulation. We do not seek the optimal first
control move as a function of a known initial state. Rather we seek
the optimal terminal state x̃0(T; a) as a function of the parameter a
appearing in the cost function. This problem is better handled by for-
ward DP as discussed in Sections 1.3.2 and 1.4.3 of Chapter 1 when
solving the full information state estimation problem. Exercise 4.16
discusses how to solve (4.24); we obtain the following recursion for the
optimal terminal state

    x̃0(k + 1; a) = (A − L̃(k)C) x̃0(k; a)    (4.25)

for k ≥ 0. The initial condition for the recursion is x̃0(0; a) = a. The
time-varying gains L̃(k) and associated cost matrices P−(k) required

are

    P−(k + 1) = GQG′ + AP−(k)A′ − AP−(k)C′(CP−(k)C′ + R)−1 CP−(k)A′    (4.26)
    L̃(k) = AP−(k)C′(CP−(k)C′ + R)−1    (4.27)

in which P−(0) is specified in the problem. As expected, these are
the standard estimator recursions developed in Chapter 1. Jazwinski
(1970, Theorem 7.4) follows an argument introduced by Deyst and
Price (1968), assumes controllability and observability, and tries to
establish stability for this more restrictive case by showing that
V(k, x̃) := (1/2) x̃′P(k)−1 x̃ is a Lyapunov function for (4.25). Notice that
this Lyapunov function candidate is not the optimal cost of (4.24) as
in a standard regulation problem. The optimal cost of (4.24), VT0 (a),
is an increasing function of T rather than a decreasing function of T
as required for a Lyapunov function. Although one can find Lyapunov
functions valid for estimation, they do not have the same simple con-
nection to optimal cost functions as in standard regulation problems,
even in the linear, unconstrained case. Stability arguments based in-
stead on properties of VT0 (a) are simpler and more easily adapted to
cover new situations arising in research problems.
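
For readers who want to see the recursion in action, the following Python
sketch iterates (4.25)–(4.27) for an arbitrary placeholder system (the matrices
are illustrative, not data from the text) and shows the optimal terminal
estimate error contracting to zero.

    import numpy as np

    # Placeholder detectable/stabilizable system; not data from the text.
    A = np.array([[1.0, 0.1], [0.0, 0.9]])
    G = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])
    Q = np.array([[0.1]])                 # process noise covariance
    R = np.array([[0.01]])                # measurement noise covariance
    P = np.eye(2)                         # P^-(0)

    a = np.array([1.0, -2.0])             # a = x(0) - x 0
    xe = a.copy()                         # optimal terminal estimate error

    for k in range(30):
        S = C @ P @ C.T + R
        L = A @ P @ C.T @ np.linalg.inv(S)            # gain (4.27)
        xe = (A - L @ C) @ xe                         # recursion (4.25)
        P = G @ Q @ G.T + A @ P @ A.T - L @ S @ L.T   # Riccati update (4.26)
        if (k + 1) % 10 == 0:
            print(k + 1, float(np.linalg.norm(xe)))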

Duality of Linear Estimation and Regulation

For linear systems, the estimate error x̃ in FIE and state x in regulation
to the origin display an interesting duality that we summarize briefly
here. Consider the following steady-state estimation and infinite hori-
zon regulation problems.
Estimator problem.

x(k + 1) = Ax(k) + Gw(k)


y(k) = Cx(k) + v(k)

    R > 0    Q > 0    (A, C) detectable    (A, G) stabilizable

    x̃(k + 1) = (A − L̃C) x̃(k)

Regulator problem.

x(k + 1) = Ax(k) + Bu(k)


y(k) = Cx(k)
    R > 0    Q > 0    (A, B) stabilizable    (A, C) detectable

    x(k + 1) = (A + BK) x(k)

    Duality variables:

        Regulator      Estimator
        A              A′
        B              C′
        C              G′
        k              l = N − k
        Π(k)           P−(l)
        Π(k − 1)       P−(l + 1)
        Π              P−
        Q              Q
        R              R
        Pf             P−(0)
        K              −L̃
        A + BK         (A − L̃C)′
        x              x̃

    Stability conditions:

        Regulator              Estimator
        R > 0, Q > 0           R > 0, Q > 0
        (A, B) stabilizable    (A, C) detectable
        (A, C) detectable      (A, G) stabilizable

    Table 4.2: Duality variables and stability conditions for linear quad-
    ratic regulation and least squares estimation.

In Appendix A, we derive the dual dynamic system following the ap-


proach in Callier and Desoer (1991), and obtain the duality variables in
regulation and estimation listed in Table 4.2.
We also have the following result connecting controllability of the
original system and observability of the dual system.

Lemma 4.26 (Duality of controllability and observability). (A, B) is con-


trollable (stabilizable) if and only if (A′ , B ′ ) is observable (detectable).

This result can be established directly using the Hautus lemma and
is left as an exercise. This lemma and the duality variables allow us to
translate stability conditions for infinite horizon regulation problems
into stability conditions for FIE problems, and vice versa. For example,
the following is a basic theorem covering convergence of Riccati equa-
tions in the form that is useful in establishing exponential stability of
regulation as discussed in Chapter 1.

Theorem 4.27 (Riccati iteration and regulator stability). Given (A, B)


stabilizable, (A, C) detectable, Q > 0, R > 0, Pf ≥ 0, and the discrete

Riccati equation

    Π(k − 1) = C′QC + A′Π(k)A − A′Π(k)B(B′Π(k)B + R)−1 B′Π(k)A,    k = N, . . . , 1
    Π(N) = Pf

Then

(a) There exists Π ≥ 0 such that for every Pf ≥ 0

    lim_{k→−∞} Π(k) = Π

and Π is the unique solution of the steady-state Riccati equation

Π = C ′ QC + A′ ΠA − A′ ΠB(B ′ ΠB + R)−1 B ′ ΠA

among the class of positive semidefinite matrices.

(b) The matrix A + BK, in which

K = −(B ′ ΠB + R)−1 B ′ ΠA

is a stable matrix.
Bertsekas (1987, pp.59–64) provides a proof for a slightly different
version of this theorem. Exercise 4.17 explores translating this theorem
into the form that is useful for establishing exponential convergence
of FIE.
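
The duality in Table 4.2 can also be checked numerically. The following
Python sketch (placeholder matrices, not from the text) iterates the regulator
Riccati equation of Theorem 4.27 with the dual data (A′, C′, G′) and confirms
that the limit coincides with the steady state of the estimator Riccati
equation (4.26). Because both iterations define the same map starting from the
zero matrix, agreement holds at every iteration, not just in the limit.

    import numpy as np

    A = np.array([[1.0, 0.1], [0.0, 0.9]])   # placeholder system matrices
    G = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])
    Qw = np.array([[0.1]])                   # process noise covariance
    R = np.array([[0.01]])                   # measurement noise covariance

    def regulator_riccati(A, B, C, Q, R, iters=500):
        # Pi = C'QC + A'Pi A - A'Pi B (B'Pi B + R)^{-1} B'Pi A   (Theorem 4.27)
        Pi = np.zeros((A.shape[0], A.shape[0]))
        for _ in range(iters):
            K = -np.linalg.solve(B.T @ Pi @ B + R, B.T @ Pi @ A)
            Pi = C.T @ Q @ C + A.T @ Pi @ A + A.T @ Pi @ B @ K
        return Pi

    def estimator_riccati(A, G, C, Q, R, iters=500):
        # Steady state of the estimator Riccati equation (4.26)
        P = np.zeros((A.shape[0], A.shape[0]))
        for _ in range(iters):
            L = A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
            P = G @ Q @ G.T + A @ P @ A.T - L @ C @ P @ A.T
        return P

    # Duality: replace (A, B, C) by (A', C', G') in the regulator iteration.
    print(np.allclose(regulator_riccati(A.T, C.T, G.T, Qw, R),
                      estimator_riccati(A, G, C, Qw, R)))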

4.3 Moving Horizon Estimation


As displayed in Figure 1.5 of Chapter 1, in MHE we consider only the
N most recent measurements, yN(T) = (y(T − N), y(T − N + 1), . . . ,
y(T − 1)). For T > N, the MHE objective function is given by

    V̂T(χ(T − N), ω) = ΓT−N(χ(T − N)) + ∑_{i=T−N}^{T−1} ℓ(ω(i), ν(i))

subject to χ+ = f(χ, ω), y = h(χ) + ν. The MHE problem is defined to
be

    P̂T(xT−N, yN(T)) := min_{χ(T−N),ω} V̂T(χ(T − N), ω)    (4.28)

in which ω = (ω(T − N), . . . , ω(T − 1)). The designer chooses the


prior weighting Γk (·) for k > 0. Until the data horizon is full, i.e.,
for times T ≤ N, we generally define the MHE problem to be the full
information problem.
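
As a concrete, if simplified, illustration of problem (4.28), the following
Python sketch assembles the MHE objective for a model with additive process
disturbance, x+ = f(x) + w, a quadratic stage cost, and a power-law prior
weighting, and solves it with a general-purpose optimizer. The model, weights,
and data here are placeholders, and a practical implementation would exploit
the problem structure rather than call a dense solver.

    import numpy as np
    from scipy.optimize import minimize

    def mhe_solve(f, h, y, x_prior, N, Qinv, Rinv, c_prior, sigma=2):
        # Solve (4.28) over the last N measurements y(0),...,y(N-1), with
        # additive-disturbance model chi+ = f(chi) + omega, prior weighting
        # Gamma(chi) = c_prior*|chi - x_prior|^sigma, and quadratic stage cost.
        n = x_prior.size

        def objective(z):
            chi, w = z[:n], z[n:].reshape(N, n)
            cost = c_prior * np.linalg.norm(chi - x_prior)**sigma
            x = chi
            for i in range(N):
                v = np.atleast_1d(y[i]) - np.atleast_1d(h(x))   # nu(i)
                cost += w[i] @ Qinv @ w[i] + v @ Rinv @ v
                x = f(x) + w[i]
            return cost

        z0 = np.concatenate([x_prior, np.zeros(N * n)])
        zopt = minimize(objective, z0, method="BFGS").x
        xhat = zopt[:n]                        # estimate of chi(T - N)
        for wi in zopt[n:].reshape(N, n):
            xhat = f(xhat) + wi                # roll forward to the terminal estimate
        return xhat

    # Placeholder linear example: two states, first state measured.
    A = np.array([[1.0, 0.1], [0.0, 0.95]])
    f = lambda x: A @ x
    h = lambda x: x[:1]
    rng = np.random.default_rng(0)
    x, ys = np.array([1.0, 0.0]), []
    for _ in range(10):
        ys.append(h(x) + 0.01 * rng.standard_normal(1))
        x = f(x) + 0.01 * rng.standard_normal(2)
    print(mhe_solve(f, h, ys, np.zeros(2), 10, 100 * np.eye(2), 100 * np.eye(1), 1.0))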

4.3.1 Zero Prior Weighting

Here we discount the early data completely and choose Γi (·) = 0 for
all i ≥ 0. Because it discounts the past data completely, this form of
MHE must be able to asymptotically reconstruct the state using only
the most recent N measurements. The first issue is establishing exis-
tence of the solution. Unlike the full information problem, in which the
positive definite initial penalty guarantees that the optimization takes
place over a bounded (compact) set, here there is zero initial penalty.
So we must restrict the system further than i-IOSS to ensure solution
existence. We show next that observability is sufficient for this pur-
pose.

Definition 4.28 (Observability). The system x + = f (x, w), y = h(x) is


observable if there exist finite No ∈ I≥1 , γw (·), γv (·) ∈ K such that for
every two initial states z1 and z2 , and any two disturbance sequences
w1 , w2 , and all k ≥ No
 
    |z1 − z2| ≤ γw(∥w1 − w2∥0:k−1) + γv(∥yz1,w1 − yz2,w2∥0:k−1)

Let Assumption 4.10 hold. Then the MHE objective function V̂T (χ(T −
N), ω) is a continuous function of its arguments because f (·) and h(·)
are continuous. We next show that V̂T (·) is an unbounded function of
its arguments, which establishes existence of the solution of the MHE
optimization problem. Let Assumption 4.11 hold. Then we have that

    V̂T(χ(T − N), ω) = ∑_{i=T−N}^{T−1} ℓ(ω(i), ν(i)) ≥ ∑_{i=T−N}^{T−1} ( σw(|ω(i)|) + σv(|ν(i)|) )    (4.29)

From observability we have that for N ≥ No

    |x(T − N) − χ(T − N)| ≤ γw(∥w − ω∥T−N:T−1) + γv(∥v − ν∥T−N:T−1)    (4.30)

Consider arbitrary but fixed values of time T , horizon length N ≥ No ,


and the system state and measurement sequence. Let the decision vari-
ables |(χ(T − N), ω)| → ∞. Then we have that either |χ(T − N)| → ∞
or |ω| → ∞. If |ω| → ∞, we have directly from (4.29) that V̂T →
∞. On the other hand, if |χ(T − N)| → ∞, then from (4.30), since

x(T − N), w and v are fixed, we have that either ∥ω∥T −N:T −1 → ∞
or ∥ν∥T −N:T −1 → ∞, which implies from (4.29) that V̂T → ∞. We con-
clude that V̂T (χ(T − N), ω) → ∞ if |(χ(T − N), ω)| → ∞. Therefore
the objective function is a continuous and unbounded function of its
arguments, and existence of the solution of the MHE problem can be es-
tablished from the Weierstrass theorem (Proposition A.7). The solution
does not have to be unique.
We show next that final-state observability is a less restrictive and
more natural system requirement for MHE with zero prior weighting to
provide stability and convergence.

Definition 4.29 (Final-state observability). The system x + = f (x, w),


y = h(x) is final-state observable (FSO) if there exist finite No ∈ I≥1,
γ̄w(·), γ̄v(·) ∈ K such that for every two initial states z1 and z2, and
any two disturbance sequences w1, w2, and all k ≥ No

    |x(k; z1, w1) − x(k; z2, w2)| ≤ γ̄w(∥w1 − w2∥0:k−1) + γ̄v(∥yz1,w1 − yz2,w2∥0:k−1)

Notice that FSO is not the same as observable. For sufficiently re-
stricted f (·), FSO is weaker than observable and stronger than i-IOSS
(detectable) as discussed in Exercise 4.14.
To ensure FSO, we restrict the system as follows.

Definition 4.30 (Globally K-continuous). A function f : X → Y is glob-


ally K-continuous if there exists function σ (·) ∈ K such that for all
x1 , x2 ∈ X
    |f(x1) − f(x2)| ≤ σ(|x1 − x2|)    (4.31)

We then have the following result.

Proposition 4.31 (Observable and global K-continuous imply FSO). An


observable system x + = f (x, w), y = h(x) with globally K-continuous
f (·) is final-state observable.

The proof of this proposition is discussed in Exercise 4.14. Con-


sider two equal disturbance sequences, w1 = w2 , and two equal mea-
surement sequences y1 = y2 . FSO implies that for every pair z1 and z2 ,
x(No ; z1 , w1 ) = x(No ; z2 , w1 ); we know the final states at time k = No
are equal. FSO does not imply that the initial states are equal as re-
quired when the system is observable. We can of course add the non-
negative term β(|z1 − z2 | , k) to the right-hand side of the FSO inequal-
ity and obtain the i-IOSS inequality, so FSO implies i-IOSS. Exercise 4.11

treats observable, FSO, and detectable for the linear time-invariant sys-
tem, which can be summarized compactly in terms of the eigenvalues
of the partitioned state transition matrix corresponding to the unob-
servable modes.

Definition 4.32 (RGAS estimation (observable case)). The estimate is


based on the noisy measurement y = h(x(x0 , w)) + v. The estimator is
RGAS (observable case) if there exist No ∈ I≥1 and function δ(·) ∈ K
such that the following holds for all x0 , x 0 ∈ X, w ∈ W, v ∈ V, and
k ≥ No

    |x(k; x0, w) − x(k; x̂(0|k), ŵk)| ≤ δ(∥(w, v)∥k−No:k−1)

Remark. Notice that the definition of RGAS estimation in the observ-


able case is silent about what happens to estimate error at early times,
k < No , while the estimator is collecting enough measurements to ob-
tain its first valid state estimate.

We have the following theorem for this estimator.

Theorem 4.33 (MHE is RGAS (observable case)). Consider an observable


system with globally K-continuous f (·), and measurement sequence
generated by (4.1) with bounded disturbances. Let Assumptions 4.10
and 4.11 hold. Then the MHE estimator using zero prior weighting and
N ≥ No is RGAS (observable case).

Proof. Consider the system to be at state x(k − N) at time k − N and


subject to disturbance sequence (wk , vk ). Due to system observability
and Assumption 4.10, the MHE problem has a solution for all k ≥ N ≥
No . Denote the estimator solution at such time k as initial state x̂(k −
N|k) and disturbance sequence (ŵk , v̂k ). We start by noting that the
optimal MHE cost satisfies the bounds
    ∑_{i=k−N}^{k−1} ℓ(ŵk(i), v̂k(i)) = V̂k0 ≤ ∑_{i=k−N}^{k−1} ℓ(w(i), v(i))

Using the upper and lower bounds in Assumption 4.11, σw, σv, σ̄w,
σ̄v, (B.1), and noting that max(|a|, |b|) ≤ |(a, b)| ≤ |a| + |b|, we can
convert these bounds into

    σ(∥(ŵk, v̂k)∥k−N:k−1) ≤ V̂k0 ≤ σ̄(∥(wk, vk)∥k−N:k−1)

where σ(·) := min(σw, σv)((·)/2) and σ̄ := 2N max(σ̄w, σ̄v). Note
that σ(·), σ̄(·) ∈ K. The system is FSO by Proposition 4.31 since the

system is observable and f (·) is globally K-continuous. Considering


x(k − N) and x̂(k − N|k) as two initial conditions and applying the FSO
bound gives
 
    |x(k) − x̂(k)| ≤ γ̄w(∥wk − ŵk∥k−N:k−1) + γ̄v(∥vk − v̂k∥k−N:k−1)

for some K-functions γ̄w, γ̄v. Again using |(a, b)| ≥ max(|a|, |b|) and
the triangle inequality, this bound can be rearranged into

    |x(k) − x̂(k)| ≤ γ̄x(∥(wk, vk)∥k−N:k−1) + γ̄x(∥(ŵk, v̂k)∥k−N:k−1)

where γ̄x(·) := 2 max(γ̄w, γ̄v)(2(·)). Note that γ̄x(·) ∈ K. Next apply
σ−1 to the V̂k0 inequality above to obtain

    ∥(ŵk, v̂k)∥k−N:k−1 ≤ σ−1 ◦ σ̄(∥(wk, vk)∥k−N:k−1)

and substitute this result into the previous inequality to obtain for all
k ≥ N ≥ No

    |x(k) − x̂(k)| ≤ δ(∥(wk, vk)∥k−N:k−1)

with δ := γ̄x + γ̄x ◦ σ−1 ◦ σ̄, which is also a K-function. We have


therefore established that MHE with zero prior weighting is RGAS (ob-
servable case). ■

Notice that unlike in FIE, the estimate error bound does not require
the initial error x(0) − x 0 since we have zero prior weighting and as
a result have assumed observability rather than detectability. Notice
also that RGAS implies estimate error converges to zero for convergent
disturbances. Finally, the K-functions σ and hence δ increase with N,
which shows that this analysis can likely be tightened to remove this
N dependence. See also the Notes discussion on this point.

4.3.2 Nonzero Prior Weighting

The two drawbacks of zero prior weighting are: the system had to be
assumed observable rather than detectable to ensure existence of the
solution to the MHE problem; and a large horizon N may be required
to obtain performance comparable to full information estimation. We
address these two disadvantages by using nonzero prior weighting. To
get started, we use forward DP, as we did in Chapter 1 for the uncon-
strained linear case, to decompose the FIE problem exactly into the MHE
problem (4.28) in which Γk (·) is chosen as arrival cost.

Definition 4.34 (Full information arrival cost). The full information ar-
rival cost is defined as

    ZT(p) = min_{χ(0),ω} VT(χ(0), ω)    (4.32)

subject to

    χ+ = f(χ, ω)        y = h(χ) + ν        χ(T; χ(0), ω) = p

We have the following equivalence.

Lemma 4.35 (MHE and FIE equivalence). The MHE problem (4.28) is
equivalent to the full information problem (4.3) for the choice Γk (·) =
Zk (·) for all k > N and N ≥ 1.

The proof is left as an exercise. This lemma is the essential insight


provided by the DP recursion. But notice that evaluating arrival cost in
(4.32) has the same computational complexity as solving a full infor-
mation problem. So next we generate an MHE problem that has simpler
computational requirements, but retains the excellent stability proper-
ties of full information estimation.

4.3.3 RGES of MHE under exponential assumptions

We consider the simplest case of MHE in which we penalize deviation


from x̂(k|k) with prior weighting that has power-law upper and lower
bounds with time-invariant parameters described by the following as-
sumption.

Assumption 4.36 (MHE prior weighting bounds). For all k ∈ I≥0 , Γk :


X × X → R≥0 is continuous and there exist constants c_Γ, c̄_Γ ≥ 0 such
that the following bounds hold uniformly in k for all χ, x̂(k|k) ∈ X

    c_Γ |χ − x̂(k|k)|^σ ≤ Γk(χ) ≤ c̄_Γ |χ − x̂(k|k)|^σ    (4.33)

in which σ ≥ 1 comes from Assumption 4.22.

So, when solving the MHE problem at time T , we bound the prior
weighting on the initial state at time T − N using the deviation from the
estimate x̂(T − N|T − N). Choosing a constant cΓ satisfying c_Γ ≤ cΓ ≤ c̄_Γ
and corresponding prior weighting Γk(χ) = cΓ |χ − x̂(k|k)|^σ would be
the simplest choice meeting this assumption.
We next establish that MHE is RGES under the exponential case as-
sumptions with this so-called filtering prior and constant prior weight-
ing bounds (Allan and Rawlings, 2020, Theorem 4.2).

Theorem 4.37 (MHE is RGES). Let Assumptions 4.10, 4.22–4.24, and


4.36 hold. Then there exists a horizon length N̄ such that MHE is RGES
for all N ≥ N̄.
Proof. Let e(k) := x(k) − x̂(k|k) and e0 := x(0) − x 0 to compress the
notation. Let any time k ∈ I≥0 be expressed as k = k0 + pN for k0 ∈
I0:N−1 and p ≥ 0. Since k0 ≤ N − 1, the horizon at time k = k0 is not yet
filled, and the MHE problem reduces to the FIE problem; we have from
Theorem 4.25(b) and (4.21) that
    |e(k0)| ≤ ax |e0| λ^{k0} ⊕ max_{j∈I0:k0−1} ad |d(k0 − j − 1)| λ^j

with 0 ≤ λ < 1. Now consider the time to be one horizon length later.
The MHE problem at this time has identical structure to the FIE problem,
but with different data: the initial prior x 0 is replaced by x̂(k0 |k0 ), the
bounds on ℓx (·) are replaced by the bounds on Γk (·), and the initial
and final times (0, k0 ) are replaced by (k0 , k0 + N). We therefore have
that
    |e(k0 + N)| ≤ aΓ |e(k0)| λ^N ⊕ max_{j∈I0:N−1} ad |d(k0 + N − j − 1)| λ^j

where the RGES constant ax is altered by the new data to a new constant
denoted aΓ > 0.⁴ Using the previous bound for |e(k0)| then gives

    |e(k0 + N)| ≤ |e0| ax (aΓ λ^{k0+N}) ⊕ aΓ λ^N max_{j∈I0:k0−1} ad |d(k0 − j − 1)| λ^j
                                         ⊕ max_{j∈I0:N−1} ad |d(k0 + N − j − 1)| λ^j

We next choose N large enough so that aΓ λ^N < 1. Choose N̄ ∈ I≥1 as
the smallest value such that aΓ λ^{N̄} < 1, and we restrict the horizon to
N ≥ N̄. Repeating this bounding argument gives for p ≥ 0

    |e(k0 + pN)| ≤ |e0| ax (aΓ^p λ^{k0+pN}) ⊕ (aΓ λ^N)^p max_{j∈I0:k0−1} ad |d(k0 − j − 1)| λ^j
                   ⊕ ⊕_{i=0}^{p−1} (aΓ λ^N)^i max_{j∈I0:N−1} ad |d(k0 + (p − i)N − j − 1)| λ^j
4 The constant ax is derived in the proof of Theorem 3.16 in Allan and Rawlings
(2020) and shown to be ax := [(c̄_x + c2 2^{σ−1}(1 + c̄_x/c_x))/c1]^{1/σ} where c1 ≤ c2 are
the constants in the power-law bounds for the exponential i-IOSS Lyapunov function
corresponding to Assumption 4.24, and c_x, c̄_x are from Assumption 4.22. The value
of aΓ is therefore given by replacing c_x and c̄_x in this expression with c_Γ and c̄_Γ,
respectively. Note that ax, aΓ ≥ 1 since c1 ≤ c2.

[Figure 4.1: Smoothing update. The MHE problem at time T uses the measurements
yT−N:T−1; the smoothing update uses the window yT−N−1:T−2, and the filtering
update uses yT−2N:T−N−1.]

Now let η := aΓ^{1/N} λ, and note that λ ≤ η < 1 by the choice of N.
Substituting η into the previous equation and noting that aΓ ≥ 1 gives
the bound

    |e(k0 + pN)| ≤ |e0| ax η^{k0+pN} ⊕ max_{j∈I0:k0−1} ad |d(k0 − j − 1)| η^{pN+j}
                   ⊕ ⊕_{i=0}^{p−1} max_{j∈I0:N−1} ad |d(k0 + (p − i)N − j − 1)| η^{iN+j}

Now substitute k = k0 + pN and note that the maximizations simplify


giving
    |e(k)| ≤ ax |e0| η^k ⊕ max_{j∈I0:k−1} ad |d(k − j − 1)| η^j

for all k ≥ 0 with N ≥ N̄, and MHE is RGES from Proposition 4.21. ■

Filtering versus smoothing update. The MHE approach discussed


to this point uses, at all time T > N, the MHE estimate x̂(T − N) and
prior weighting function ΓT −N (·), which may be regarded as our best
approximation of the arrival cost. We call this approach a “filtering
update” because the prior weight at time T is derived from the solution
of the MHE “filtering problem” at time T − N, i.e., the estimate of x̂(T −
N) := x̂(T − N|T − N) given measurements up to time T − N − 1. For

[Figure 4.2: Comparison of filtering and smoothing updates for the batch reactor
system: partial pressures pA and pB (atm) versus time (min) for the EKF, MHE with
filtering update, MHE with smoothing update, and the actual state. Second column
shows absolute estimate error.]

implementation, this choice requires storage of a window of N prior


filtering estimates to be used in the prior weighting functions as time
progresses.
Next we describe a “smoothing update” that can be used instead. As
depicted in Figure 4.1, in the smoothing update we wish to use x̂(T −
N|T − 1) (instead of x̂(T − N|T − N)) for the prior and wish to find
an appropriate prior weighting based on this choice. For the linear
unconstrained problem we can find an exact prior weighting that gives
an equivalence to the full information problem. See Rao, Rawlings, and
Lee (2001) and Rao (2000, pp.80–93) for a derivation of this equivalence,
with minor error corrections provided in the first edition of this text
(Rawlings and Mayne, 2009, p.292).
We illustrate with the following example why the smoothing update
may be useful in nonlinear models.

Example 4.38: Filtering and smoothing updates


Consider a constant-volume batch reactor in which the reaction 2A ⇋ B
takes place (Tenny and Rawlings, 2002). The system state x consists

of the partial pressures (pA, pB) that evolve according to

    dpA/dt = −2k1 pA² + 2k2 pB
    dpB/dt = k1 pA² − k2 pB

with k1 = 0.16 min−1 atm−1 and k2 = 0.0064 min−1 . The only mea-
surement is total pressure, y = pA + pB .
Starting from initial condition x = (3, 1), the system is measured
with sample time ∆ = 0.1 min. The model is exact and there are no
disturbances. Using a poor initial estimate x 0 = (0.1, 4.5), parameters

    Q = diag(10−4, 0.01)        R = 0.01        P = diag(1, 1)

and horizon N = 10, MHE is performed on the system using the filtering
and smoothing updates for the prior weighting. For comparison, the
EKF is also used. The resulting estimates are plotted in Figure 4.2.
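
A sketch of the simulation portion of this example follows (Python; the
integration tolerance is arbitrary, and the MHE and EKF solves themselves are
omitted). It discretizes the reactor model over the sample time and generates
the noise-free total pressure measurements.

    import numpy as np
    from scipy.integrate import solve_ivp

    k1, k2, Delta = 0.16, 0.0064, 0.1             # rate constants and sample time

    def batch_ode(t, p):
        pA, pB = p
        r = k1 * pA**2 - k2 * pB                  # rate of 2A <=> B
        return [-2.0 * r, r]

    def f(p):
        # Discrete-time model: integrate over one sample time Delta.
        return solve_ivp(batch_ode, (0.0, Delta), p, rtol=1e-8).y[:, -1]

    x = np.array([3.0, 1.0])                      # actual initial condition
    y = []
    for _ in range(200):                          # 20 min of exact measurements
        y.append(x[0] + x[1])                     # total pressure
        x = f(x)
    print(round(y[0], 3), round(y[-1], 3))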
In this simulation, MHE performs well with either update formula.
Due to the structure of the filtering update, every N = 10 time steps,
a poor state estimate is used as the prior, which leads to undesirable
periodic behavior in the estimated state. Due to the poor initial state es-
timate, the EKF produces negative pressure estimates, leading to large
estimate errors throughout the simulation. □

Summary remarks. The results presented in this section are repre-


sentative of what is currently known about MHE for bounded distur-
bances, but we expect that this analysis remains far from finished. Sev-
eral questions remain open.

• Is MHE RGAS if the system is asymptotically (rather than exponen-


tially) i-IOSS? What are the required compatibility conditions be-
tween the allowable stage costs and the i-IOSS condition to achieve
an RGAS MHE estimator?

• What are the best methods to update the MHE initial penalty, Γk (·)
to obtain an accurate estimator with a small horizon N for com-
putational efficiency?

• Is MHE with a smoothing update instead of a filtering update


RGAS? What stage costs are allowable to achieve RGAS of an MHE
estimator with a smoothing update?

4.4 Other Nonlinear State Estimators

State estimation for nonlinear systems has a long history, and moving
horizon estimation is a rather new approach to the problem. As with
model predictive control, the optimal estimation problem on which
moving horizon is based has a long history, but only the rather recent
advances in computing technology have enabled moving horizon esti-
mation to be considered as a viable option in online applications. It is
therefore worthwhile to compare moving horizon estimation to other
less computationally demanding nonlinear state estimators.

4.4.1 Particle Filtering

An extensive discussion and complete derivation of particle filtering


appeared in the first edition of the text (Rawlings and Mayne, 2009,
pp.301–355). This material is available electronically on the text’s web-
site. As with many sample-based procedures, however, it seems that all
of the available sampling strategies in particle filtering do run into the
“curse of dimensionality.” The low density of samples in a reasonably
large-dimensional space (say n ≥ 5) leads to inaccurate state estimates.
For this reason we omit further discussion of particle filtering in this
edition.
Feedback particle filtering has recently been suggested as an alter-
native to overcome many of the drawbacks of the particle filter (Yang,
Mehta, and Meyn, 2013). In feedback particle filtering, one uses the
measurements to influence the particle locations by solving an optimal
control problem for repositioning the particles to obtain an accurate
posterior distribution after measurement. Application examples and
a burgeoning literature on the theoretical properties of different algo-
rithms indicate that this technique may provide a valuable addition to
nonlinear estimation (Berntorp and Grover, 2018).

4.4.2 Extended Kalman Filtering

The extended Kalman filter (EKF) generates estimates for nonlinear sys-
tems by first linearizing the nonlinear system, and then applying the
linear Kalman filter equations to the linearized system. The approach
can be summarized in a recursion similar in structure to the Kalman

filter (Stengel, 1994, pp.387–388)

x̂ − (k + 1) = f (x̂(k), 0)
P − (k + 1) = A(k)P (k)A(k)′ + G(k)QG(k)′
x̂ − (0) = x 0 P − (0) = Q0

The mean and covariance after measurement are given by



x̂(k) = x̂−(k) + L(k)( y(k) − h(x̂−(k)) )
L(k) = P − (k)C(k)′ (R + C(k)P − (k)C(k)′ )−1
P (k) = P − (k) − L(k)C(k)P − (k)

with the following linearizations


    A(k) = ∂f(x, w)/∂x |(x̂(k),0)        G(k) = ∂f(x, w)/∂w |(x̂(k),0)        C(k) = ∂h(x)/∂x |x̂−(k)

The densities of w, v, and x0 are assumed to be normal. Many vari-


ations on this theme have been proposed, such as the iterated EKF
and the second-order EKF (Gelb, 1974, 190–192). Of the nonlinear fil-
tering methods, the EKF method has received the most attention due
to its relative simplicity and demonstrated effectiveness in handling
some nonlinear systems. Examples of implementations include esti-
mation for the production of silicon/germanium alloy films (Middle-
brooks and Rawlings, 2006), polymerization reactions (Prasad, Schley,
Russo, and Bequette, 2002), and fermentation processes (Gudi, Shah,
and Gray, 1994). The EKF is at best an ad hoc solution to a difficult
problem, however, and hence there exist many pitfalls to the practi-
cal implementation of EKFs (see, for example, (Wilson, Agarwal, and
Rippin, 1998)). These problems include the inability to accurately in-
corporate physical state constraints, and the naive use of linearization
of the nonlinear model.
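
For reference, a minimal EKF step consistent with the recursion above is
sketched below in Python. The Jacobians are approximated by finite differences
purely for illustration, the process noise is taken to enter additively (so
G(k) = I), and f denotes the noise-free model x ↦ f(x, 0); none of these
choices come from the text.

    import numpy as np

    def jac(fun, x, eps=1e-6):
        # Forward-difference Jacobian of fun at x (for illustration only).
        fx = np.atleast_1d(fun(x))
        J = np.zeros((fx.size, x.size))
        for i in range(x.size):
            dx = np.zeros_like(x)
            dx[i] = eps
            J[:, i] = (np.atleast_1d(fun(x + dx)) - fx) / eps
        return J

    def ekf_step(f, h, xhat, P, y, Q, R):
        # One EKF recursion: forecast with the nonlinear model, correct with
        # the linearization at the current estimate.
        A = jac(f, xhat)
        xminus = f(xhat)                              # x^-(k+1) = f(xhat(k), 0)
        Pminus = A @ P @ A.T + Q                      # P^-(k+1)
        C = jac(h, xminus)
        L = Pminus @ C.T @ np.linalg.inv(R + C @ Pminus @ C.T)
        xnew = xminus + L @ (np.atleast_1d(y) - np.atleast_1d(h(xminus)))
        Pnew = Pminus - L @ C @ Pminus
        return xnew, Pnew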
Until recently, few properties regarding the stability and conver-
gence of the EKF have been established. Recent research shows bounded
estimation error and exponential convergence for the continuous and
discrete EKF forms given observability, small initial estimation error,
small noise terms, and no model error (Reif, Günther, Yaz, and Unbe-
hauen, 1999; Reif and Unbehauen, 1999; Reif, Günther, Yaz, and Unbe-
hauen, 2000). Depending on the system, however, the bounds on initial
estimation error and noise terms may be unrealistic. Also, initial esti-
mation error may result in bounded estimate error but not exponential
convergence, as illustrated by Chaves and Sontag (2002).

Julier and Uhlmann (2004a) summarize the status of the EKF as


follows.

The extended Kalman filter is probably the most widely used


estimation algorithm for nonlinear systems. However, more
than 35 years of experience in the estimation community
has shown that it is difficult to implement, difficult to tune,
and only reliable for systems that are almost linear on the
time scale of the updates.

We seem to be making a transition from a previous era in which new


approaches to nonlinear filtering were criticized as overly complex be-
cause “the EKF works,” to a new era in which researchers are demon-
strating ever simpler examples in which the EKF fails completely. The
unscented Kalman filter is one of the methods developed specifically
to overcome the problems caused by the naive linearization used in the
EKF.

4.4.3 Unscented Kalman Filtering

The linearization of the nonlinear model at the current state estimate


may not accurately represent the dynamics of the nonlinear system be-
havior even for one sample time. In the EKF prediction step, the mean
propagates through the full nonlinear model, but the covariance prop-
agates through the linearization. The resulting error is sufficient to
throw off the correction step and the filter can diverge even with a per-
fect model. The unscented Kalman filter (UKF) avoids this linearization
at a single point by sampling the nonlinear response at several points.
The points are called sigma points, and their locations and weights are
chosen to satisfy the given starting mean and covariance (Julier and
Uhlmann, 2004a,b).5 Given x̂ and P , choose sample points, zi , and
weights, w i , such that
    x̂ = ∑_i wi zi        P = ∑_i wi (zi − x̂)(zi − x̂)′

Similarly, given w ∼ N(0, Q) and v ∼ N(0, R), choose sample points


ni for w and mi for v. Each of the sigma points is propagated forward
at each sample time using the nonlinear system model. The locations
5 Note that this idea is fundamentally different than the idea of particle filtering.

The sigma points are chosen deterministically, for example, as points on a selected
covariance contour ellipse or a simplex. The particle filtering points are chosen by
random sampling.

and weights of the transformed points then update the mean and co-
variance

zi (k + 1) = f (zi (k), ni (k))


ηi = h(zi ) + mi all i

From these we compute the forecast step


    x̂− = ∑_i wi zi        ŷ− = ∑_i wi ηi

    P− = ∑_i wi (zi − x̂−)(zi − x̂−)′

After measurement, the EKF correction step is applied after first ex-
pressing this step in terms of the covariances of the innovation and
state prediction. The output error is given as ỹ := y − ŷ−. We next
rewrite the Kalman filter update as

    x̂ = x̂− + L(y − ŷ−)

    L = E((x − x̂−)ỹ′) E(ỹỹ′)−1          [ = P−C′ (R + CP−C′)−1 ]

    P = P− − L E((x − x̂−)ỹ′)′           [ E((x − x̂−)ỹ′)′ = CP− ]

in which we approximate the two expectations with the sigma-point


samples
    E((x − x̂−)ỹ′) ≈ ∑_i wi (zi − x̂−)(ηi − ŷ−)′

    E(ỹỹ′) ≈ ∑_i wi (ηi − ŷ−)(ηi − ŷ−)′

See Julier, Uhlmann, and Durrant-Whyte (2000); Julier and Uhlmann


(2004a); van der Merwe, Doucet, de Freitas, and Wan (2000) for more
details on the algorithm. An added benefit of the UKF approach is that
the partial derivatives ∂f (x, w)/∂x, ∂h(x)/∂x are not required. See
also Nørgaard, Poulsen, and Ravn (2000) for other derivative-free non-
linear filters of comparable accuracy to the UKF. See Lefebvre, Bruyn-
inckx, and De Schutter (2002); Julier and Uhlmann (2002) for an inter-
pretation of the UKF as a use of statistical linear regression.
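
A minimal sketch of one UKF step follows in Python, using a simple symmetric
sigma-point set and additive process and measurement noise rather than sampling
separate noise points ni and mi as above; the weight choice is one common
option and is not taken from the references.

    import numpy as np

    def sigma_points(xhat, P):
        # Symmetric set of 2n+1 sigma points matching the mean xhat and
        # covariance P; a simple weight choice with zero weight on the center.
        n = xhat.size
        S = np.linalg.cholesky(n * P)
        pts = [xhat] + [xhat + S[:, i] for i in range(n)] \
                     + [xhat - S[:, i] for i in range(n)]
        w = np.full(2 * n + 1, 1.0 / (2 * n))
        w[0] = 0.0
        return np.array(pts), w

    def ukf_step(f, h, xhat, P, y, Q, R):
        # One UKF step with additive noise: propagate the sigma points through
        # the nonlinear model and measurement, then apply the correction
        # written in terms of the sampled covariances.
        Z, w = sigma_points(xhat, P)
        Zf = np.array([f(z) for z in Z])                       # z_i(k+1)
        Yf = np.array([np.atleast_1d(h(z)) for z in Zf])       # eta_i
        xm, ym = w @ Zf, w @ Yf                                # x^-, y^-
        Pm = sum(wi * np.outer(z - xm, z - xm) for wi, z in zip(w, Zf)) + Q
        Pxy = sum(wi * np.outer(z - xm, e - ym) for wi, z, e in zip(w, Zf, Yf))
        Pyy = sum(wi * np.outer(e - ym, e - ym) for wi, e in zip(w, Yf)) + R
        L = Pxy @ np.linalg.inv(Pyy)                           # gain
        return xm + L @ (np.atleast_1d(y) - ym), Pm - L @ Pyy @ L.T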
The UKF has been tested in a variety of simulation examples taken
from different application fields including aircraft attitude estimation,

tracking and ballistics, and communication systems. In the chemical


process control field, Romanenko and Castro (2004); Romanenko, San-
tos, and Afonso (2004) have compared the EKF and UKF on a strongly
nonlinear exothermic chemical reactor and a pH system. The reactor
has nonlinear dynamics and a linear measurement model, i.e., a sub-
set of states is measured. In this case, the UKF performs significantly
better than the EKF when the process noise is large. The pH system
has linear dynamics but a strongly nonlinear measurement, i.e., the pH
measurement. In this case, the authors show a modest improvement
in the UKF over the EKF.

4.4.4 EKF, UKF, and MHE Comparison

One nice feature enjoyed by the EKF and UKF formulations is the re-
cursive update equations. One-step recursions are computationally ef-
ficient, which may be critical in online applications with short sample
times. The MHE computational burden may be reduced by shorten-
ing the length of the moving horizon, N. But use of short horizons
may produce inaccurate estimates, especially after an unmodeled dis-
turbance. This unfortunate behavior is the result of the system’s non-
linearity. As we saw in Sections 1.4.3–1.4.4, for linear systems, the full
information problem and the MHE problem are identical to a one-step
recursion using the appropriate state penalty coming from the filtering
Riccati equation. Losing the equivalence of a one-step recursion to full
information or a finite moving horizon problem brings into question
whether the one-step recursion can provide equivalent estimator per-
formance. We show in the following example that the EKF and the UKF
do not provide estimator performance comparable to MHE.

Example 4.39: EKF, UKF, and MHE performance comparison


Consider the following set of reversible reactions taking place in a well-
stirred, isothermal, gas-phase batch reactor
    A ⇌ B + C   (rate constants k1, k−1)        2B ⇌ C   (rate constants k2, k−2)

The material balance for the reactor is


   
    d/dt [ cA ]   [ −1   0 ]
         [ cB ] = [  1  −2 ] [ k1 cA − k−1 cB cC ]
         [ cC ]   [  1   1 ] [ k2 cB² − k−2 cC   ]

    dx/dt = fc(x)

with states and measurement


    x = [ cA  cB  cC ]′        y = RT [ 1  1  1 ] x

in which cj denotes the concentration of species j in mol/L, R is the


gas constant, and T is the reactor temperature in K. The measurement
is the reactor pressure in atm, and we use the ideal gas law to model
the pressure. The model is nonlinear because of the two second-order
reactions. We model the system plus disturbances with the following
discrete time model

x + = f (x) + w
y = Cx + v

in which f is the solution of the ordinary differential equations (ODEs)


over the sample time, ∆, i.e., if s(t, x0 ) is the solution of dx/dt = fc (x)
with initial condition x(0) = x0 at t = 0, then f (x) = s(∆, x). The
state and measurement disturbances, w and v, are assumed to be zero-
mean independent normals with constant covariances Q and R. The
following parameter values are used in the simulations

    RT = 32.84 L · atm/mol

    ∆ = 0.25    k1 = 0.5    k−1 = 0.05    k2 = 0.2    k−2 = 0.01

    C = [ 1  1  1 ] RT    P(0) = (0.5)² I    Q = (0.001)² I    R = (0.25)²

    x 0 = [ 1  0  4 ]′        x(0) = [ 0.5  0.05  0 ]′

The prior density for the initial state, N(x 0 , P (0)), is deliberately cho-
sen to poorly represent the actual initial state to model a large initial
disturbance to the system. We wish to examine how the different esti-
mators recover from this large unmodeled disturbance.
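
A sketch of the simulation setup in Python follows; it integrates the material
balance over the sample time to form f(x) and generates the noisy pressure
measurements with the covariances listed above. The estimator computations
themselves (EKF, UKF, MHE) are omitted, and the integration tolerance is
arbitrary.

    import numpy as np
    from scipy.integrate import solve_ivp

    k1, km1, k2, km2 = 0.5, 0.05, 0.2, 0.01
    RT, Delta = 32.84, 0.25
    C = RT * np.ones((1, 3))                      # y = RT(cA + cB + cC) + v

    def fc(t, c):
        cA, cB, cC = c
        r1 = k1 * cA - km1 * cB * cC              # A <=> B + C
        r2 = k2 * cB**2 - km2 * cC                # 2B <=> C
        return [-r1, r1 - 2.0 * r2, r1 + r2]

    def f(x):
        # Discrete-time model: integrate the ODEs over one sample time.
        return solve_ivp(fc, (0.0, Delta), x, rtol=1e-8).y[:, -1]

    rng = np.random.default_rng(1)
    Q, R = (0.001)**2 * np.eye(3), (0.25)**2
    x = np.array([0.5, 0.05, 0.0])                # actual (poorly known) x(0)
    ys = []
    for _ in range(120):                          # 30 min of measurements
        ys.append((C @ x).item() + np.sqrt(R) * rng.standard_normal())
        x = f(x) + rng.multivariate_normal(np.zeros(3), Q)
    print(ys[0], ys[-1])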

Solution
Figure 4.3 (top) shows a typical EKF performance for these conditions.
Note that the EKF cannot reconstruct the state for this system and that
the estimates converge to incorrect steady states displaying negative
concentrations of A and B. For some realizations of the noise sequences,
the EKF may converge to the correct steady state. Even for these cases,

[Figure 4.3: Evolution of the state (solid line) and EKF state estimate (dashed
line) versus time. Top plot shows negative concentration estimates with the
standard EKF. Bottom plot shows large estimate errors and slow convergence with
the clipped EKF.]

[Figure 4.4: Evolution of the state (solid line) and UKF state estimate (dashed
line) versus time. Top plot shows negative concentration estimates with the
standard UKF. Bottom plot shows similar problems even if constraint scaling is
applied.]

[Figure 4.5: Evolution of the state (solid line) and MHE state estimate (dashed
line) versus time.]

however, negative concentration estimates still occur during the tran-


sient, which correspond to physically impossible states. Figure 4.3 (bot-
tom) presents typical results for the clipped EKF, in which negative val-
ues of the filtered estimates are set to zero. Note that although the
estimates converge to the system states, this estimator gives pressure
estimates that are two orders of magnitude larger than the measured
pressure before convergence is achieved.
The standard UKF achieves results similar to the EKF as shown in
Figure 4.4 (top). Vachhani, Narasimhan, and Rengaswamy (2006) have
proposed a modification to the UKF to handle constrained systems. In
this approach, the sigma points that violate the constraints are scaled
back to the feasible region boundaries and the sigma-point weights are
modified accordingly. If this constrained version of the UKF is applied
to this case study, the estimates do not significantly improve as shown
in Figure 4.4 (bottom). The UKF formulations used here are based on
the algorithm presented by Vachhani et al. (2006, Sections 3 and 4)
with the tuning parameter κ set to κ = 1. Adjusting this parameter

using other suggestions from the literature (Julier and Uhlmann, 1997;
Qu and Hahn, 2009; Kandepu, Imsland, and Foss, 2008) and trial and
error, does not substantially improve the UKF estimator performance.
Better performance is obtained in this example if the sigma points
that violate the constraints are simply saturated rather than rescaled
to the feasible region boundaries. But, this form of clipping still does
not prevent the occurrence of negative concentrations in this example.
Negative concentration estimates are not avoided by either scaling or
clipping of the sigma points. As a solution to this problem, the use
of constrained optimization for the sigma points is proposed (Vach-
hani et al., 2006; Teixeira, Tôrres, Aguirre, and Bernstein, 2008). If one
is willing to perform online optimization, however, MHE with a short
horizon is likely to provide more accurate estimates at similar computa-
tional cost compared to approaches based on optimizing the locations
of the sigma points.
The authors have only recently become aware of yet another ap-
proach to handling constraints in the UKF that does work well on this
example (Kolås, Foss, and Schei, 2009). It remains to be seen whether
further examples can be constructed that this approach cannot ad-
dress.
Finally, Figure 4.5 presents typical results of applying constrained
MHE to this example. For this simulation we choose N = 10 and the
smoothing update for the arrival cost approximation. Note that MHE
recovers well from the poor initial prior. Comparable performance is
obtained if the filtering update is used instead of the smoothing update
to approximate the arrival cost. The MHE estimates are also insensitive
to the choice of horizon length N for this example. □

The EKF, UKF, and all one-step recursive estimation methods, suffer
from the “short horizon syndrome” by design. One can try to reduce
the harmful effects of a short horizon through tuning various other
parameters in the estimator, but the basic problem remains. Large
initial state errors lead to inaccurate estimation and potential estimator
divergence. The one-step recursions such as the EKF and UKF can be
viewed as one extreme in the choice between speed and accuracy in
that only a single measurement is considered at each sample. That is
similar to an MHE problem in which the user chooses N = 1. Situations
in which N = 1 lead to poor MHE performance often lead to unreliable
EKF and UKF performance as well.

4.5 On combining MHE and MPC


Estimating the state of a system is an interesting problem in its own
right, with many important applications having no connection to feed-
back control. But in some applications the goal of state estimation is
indeed to provide a state feedback controller with a good estimate of
the system state based on the available measurements. We close this
chapter with a look at the properties of such control systems consist-
ing of a moving horizon estimator that provides the state estimate to
a model predictive controller.
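
Before turning to the analysis, it may help to see where the two pieces sit in
a closed-loop computation. The following Python skeleton is only a sketch:
plant_step, measure, solve_mhe, and solve_mpc are hypothetical user-supplied
routines, not functions defined in this text.

    import numpy as np
    from collections import deque

    def mhe_mpc_loop(plant_step, measure, solve_mhe, solve_mpc, x0, xbar0, N, nsteps):
        # Closed-loop skeleton: at each sample, append the new measurement to a
        # moving window, estimate xhat with MHE, then inject u = kappa_N(xhat).
        x = np.asarray(x0, dtype=float)
        xhat = np.asarray(xbar0, dtype=float)
        ys, us, history = deque(maxlen=N), deque(maxlen=N), []
        for k in range(nsteps):
            ys.append(measure(x))                        # y(k) = h(x(k)) + v(k)
            xhat = solve_mhe(list(ys), list(us), xhat)   # xhat(k) from MHE
            u = solve_mpc(xhat)                          # u(k) = kappa_N(xhat(k))
            us.append(u)
            history.append((x.copy(), xhat.copy(), u))
            x = plant_step(x, u)                         # x(k+1) = f(x(k), u(k), w(k))
        return history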
What’s desirable. Consider the evolution of the system x + = f (x, u,
w) and its measurement y = h(x) + v when taking control using MPC
based on the state estimate

x + = f (x, κN (x̂), w) y = h(x) + v

with f : Z × W → Rn , h : Rn → Rp , in which w ∈ W is the process


disturbance and v ∈ V is the measurement disturbance, u = κN (x̂)
is the control from the MPC regulator, and x̂ is generated by the MHE
estimator. We assume, as we have through the text, that f (·) and h(·)
are continuous functions. Again we denote estimate error by e := x − x̂,
which gives for the state evolution

x + = f (x, κN (x − e), w) y = h(x) + v (4.34)

The obvious difficulty with analyzing the effect of estimate error is the
coupling of estimation and control. Unlike the problem studied earlier
in the chapter, where x + = f (x, w), we now have estimate error also
influencing state evolution. This coupling precludes obtaining the sim-
ple bounds on |e(k)| in terms of (e(0), w, v) as we did in the previous
sections.
What’s possible. Here we lower our sights from the analysis of the
fully coupled problem and consider only the effect of bounded esti-
mate error on the combined estimation/regulation problem. To make
this precise, consider the following definition of an incrementally, uni-
formly input/output-to-state stable (i-UIOSS) system.

Definition 4.40 (i-UIOSS). The system

x + = f (x, u, w) y = h(x)

is incrementally uniformly input/output-to-state stable (i-UIOSS) if there


exist functions α(·) ∈ KL and γw (·), γv (·) ∈ K such that for any

two initial states z1 and z2 , any input sequence u, and any two distur-
bance sequences w1 and w2 generating state sequences x1 (z1 , u, w1 )
and x2 (z2 , u, w2 ), the following holds for all k ∈ I≥0

    |x(k; z1, u, w1) − x(k; z2, u, w2)| ≤ α(|z1 − z2|, k) ⊕
                γw(∥w1 − w2∥0:k−1) ⊕ γv(∥h(x1) − h(x2)∥0:k−1)    (4.35)

Notice that the bound is uniform in the sense that it is independent


of the input sequence u generated by a controller. See Cai and Teel
(2008, Definition 3.4) for similar definitions. Exercise 4.15 discusses
how to establish that a detectable linear system x + = Ax + Bu + Gw,
y = Cx is i-UIOSS.
Given this strong form of detectability, we assume that we can derive
an error bound of the form

Assumption 4.41 (Bounded estimate error). There exists δ > 0 and


β(·) ∈ KL and σ (·) ∈ K such that for all ∥(w, v)∥ ≤ δ and for all
k ≥ 0 the following holds

|e(k)| ≤ β(|e(0)| , k) + σ (∥(w, v)∥)

Next we note that the evolution of the state in the form of (4.34)
is not a compelling starting point for analysis because the estimate
error perturbation appears inside a possibly discontinuous function,
κN (·) (recall Example 2.8). Therefore, as in (Roset, Heemels, Lazar, and
Nijmeijer, 2008), we instead express the equivalent evolution, but in
terms of the state estimate as

x̂ + = f (x̂ + e, κN (x̂), w) − e+ y = h(x̂ + e) + v

which is more convenient because the estimate error appears inside


continuous functions f (·) and h(·).
We require that the system not leave an invariant set due to the
disturbance.

Definition 4.42 (Robust positive invariance). A set X ⊆ Rn is robustly


positive invariant with respect to a difference inclusion x + ∈ f (x, d)
if there exists some δ > 0 such that f (x, d) ⊆ X for all x ∈ X and all
disturbance sequences d satisfying ∥d∥ ≤ δ.

So, we define robust asymptotic stability as input-to-state stability


on a robust positive invariant set.

Definition 4.43 (Robust asymptotic stability). The origin of a perturbed


difference inclusion x + ∈ f (x, d) is RAS in X if there exists some
δ > 0 such that for all disturbance sequences d satisfying ∥d∥ ≤ δ
we have both that X is robustly positive invariant and that there exist
β(·) ∈ KL and γ(·) ∈ K such that for each x ∈ X, we have for all
k ∈ I≥0 that all solutions φ(k; x, d) satisfy

    |φ(k; x, d)| ≤ β(|x|, k) + γ(∥d∥)    (4.36)

To establish input-to-state stability, we define an ISS Lyapunov func-


tion for a difference inclusion, similar to an ISS Lyapunov function de-
fined in Jiang and Wang (2001); Lazar, Heemels, and Teel (2013). See
also Definition B.45 in Appendix B.

Definition 4.44 (ISS Lyapunov function). V (·) is an ISS Lyapunov func-


tion in the robust positive invariant set X for the difference inclu-
sion x + ∈ f (x, d) if there exists some δ > 0, functions α1 (·), α2 (·),
α3 (·) ∈ K∞ , and function σ (·) ∈ K such that for all x ∈ X and
∥d∥ ≤ δ

    α1(|x|) ≤ V(x) ≤ α2(|x|)    (4.37)

    sup_{x+∈f(x,d)} V(x+) ≤ V(x) − α3(|x|) + σ(|d|)    (4.38)

The value of an ISS Lyapunov function is analogous to having a Lya-


punov function in standard stability analysis: it allows us to conclude
input-to-state stability and therefore robust asymptotic stability. The
following result is therefore highly useful in robustness analysis.

Proposition 4.45 (ISS Lyapunov stability theorem). If a difference inclu-


sion x + ∈ f (x, d) admits an ISS Lyapunov function in a robust positive
invariant set X for all ∥d∥ ≤ δ for some δ > 0, then the origin is RAS in
X for all ∥d∥ ≤ δ.

The proof of this proposition follows Jiang and Wang (2001) as mod-
ified for a difference inclusion on a robust positive invariant set in Al-
lan, Bates, Risbeck, and Rawlings (2017, Proposition 19).
Combined MHE/MPC is RAS. Our strategy now is to establish that
VN0 (x) is an ISS Lyapunov function for the combined MHE/MPC system
subject to process and measurement disturbances on a robust positive
invariant set. We have already established the upper and lower bound-
ing inequalities
α1 (|x|) ≤ VN0 (x) ≤ α2 (|x|)

[Figure 4.6: Although the nominal trajectory from x̂ may terminate on the
boundary of Xf, the three perturbed trajectories, including the one from x̂+,
terminate in Xf. After Allan et al. (2017).]

So we require only

    sup_{x+∈f(x,d)} VN0(x+) ≤ VN0(x) − α3(|x|) + σ(|d|)

with disturbance d defined here as d := (e, w, e+ ). That plus robust


positive invariance establishes that the controlled system is RAS.
Figure 4.6 gives the picture of the argument we are going to make.
We have that x̂ + = f (x̂ + e, κN (x̂), w) − e+ and x + = f (x, κN (x̂), w).
We create the standard candidate input sequence by dropping the first
input and applying the terminal control law to the terminal state, i.e.,
ũ = (u0(1; x̂), . . . , u0(N − 1; x̂), κf(x0(N; x̂))). We then compute the dif-
ference in cost of trajectories starting at f(x̂, κN(x̂), 0) and x̂+ using
the same input sequence ũ. We choose the terminal region to be a
sublevel set of the terminal cost, Xf = levτ Vf, τ > 0. Note that ũ is
feasible for both initial states, i.e., both trajectories terminate in Xf , if
|(e, w, e+ )| is small enough.
As in Chapter 3, we make use of Proposition 3.4 to bound the size
of the change to a continuous function (Allan et al., 2017, Proposition
20). Since VN (x, u) is continuous, Proposition 3.4 gives

    VN(x̂+, ũ) − VN(f(x̂, κN(x̂), 0), ũ) ≤ σV(|x̂+ − f(x̂, κN(x̂), 0)|)

with σV (·) ∈ K. Note that we are not using the possibly discontinuous
VN0 (x) here. Since f (x, u, w) is also continuous
    |x̂+ − f(x̂, κN(x̂), 0)| = |f(x̂ + e, κN(x̂), w) − e+ − f(x̂, κN(x̂), 0)|
                            ≤ |f(x̂ + e, κN(x̂), w) − f(x̂, κN(x̂), 0)| + |e+|
                            ≤ σf(|(e, w)|) + |e+|
                            ≤ σ̃f(|d|)

with d := (e, w, e+) and σ̃f(·) ∈ K. Therefore

    VN(x̂+, ũ) − VN(f(x̂, κN(x̂), 0), ũ) ≤ σV ◦ σ̃f(|d|) := σ(|d|)
    VN(x̂+, ũ) ≤ VN(f(x̂, κN(x̂), 0), ũ) + σ(|d|)

with σ(·) ∈ K. Note that for the candidate sequence, VN(f(x̂, κN(x̂), 0), ũ) ≤
VN0(x̂) − ℓ(x̂, κN(x̂)), so we have that

    VN(f(x̂, κN(x̂), 0), ũ) ≤ VN0(x̂) − α1(|x̂|)

since α1(|x|) ≤ ℓ(x, κN(x)) for all x. Therefore, we finally have

    VN(x̂+, ũ) ≤ VN0(x̂) − α1(|x̂|) + σ(|d|)
    VN0(x̂+) ≤ VN0(x̂) − α1(|x̂|) + σ(|d|)    (4.39)

and we have established that VN0 (·) satisfies the inequality of an ISS-
Lyapunov function. This analysis leads to the following main result.
Theorem 4.46 (Combined MHE/MPC is RAS). For the MPC regulator,
let the standard Assumptions 2.2, 2.3, and 2.14 hold, and choose Xf =
levτ Vf for some τ > 0. For the moving horizon estimator, let Assump-
tion 4.41 hold. Then for every ρ > 0 there exists δ > 0 such that if
∥d∥ ≤ δ, the origin is RAS for the system x̂ + = f (x̂ + e, κN (x̂), w) − e+ ,
y = h(x̂ + e) + v, in the set Xρ = levρ Vf .
A complete proof of this theorem, for the more general case of sub-
optimal MPC, is given in Allan et al. (2017, Theorem 21). The proof
proceeds by first showing that Xρ is robustly positive invariant for all
ρ > 0. That argument is similar to the one presented in Chapter 3 be-
fore Proposition 3.5. The proof then establishes that inequality (4.39)
holds for all x̂ ∈ Xρ . Proposition 4.45 is then invoked to establish that
the origin is RAS.
Notice that neither VN0 (·) nor κN (·) need be continuous for this com-
bination of MHE and MPC to be inherently robust. Since x = x̂ + e, The-
orem 4.46 also gives robust asymptotic stability of the evolution of x
in addition to x̂ for the closed-loop system with bounded disturbances.

[Figure 4.7: Closed-loop performance of combined nonlinear MHE/MPC with no
disturbances. First column shows system states (concentration c and temperature
T), and second column shows estimation error x − x̂. Dashed line shows
concentration setpoint. Vertical lines indicate times of setpoint changes.]

Example 4.47: Combined MHE/MPC


Consider the nonlinear reactor system from Example 1.11 with sample
time ∆ = 0.5 min and height h and inlet flow F fixed to their steady-
state values. The resulting system has two states (concentration c and
temperature T ) and one input (cooling temperature Tc ). The only mea-
sured output is temperature, which means the reactor concentration
must be estimated via MHE.
To illustrate the performance of combined MHE/MPC, closed-loop
control to a changing setpoint is simulated. Figure 4.7 shows the chang-
ing states x and estimate errors x − x̂. Note that each setpoint change
leads to a temporary increase in estimate error, which eventually de-
cays back to zero. Note that zero prior weighting is used in the MHE
formulation.
To illustrate the response to disturbances, the simulation is repeated
for varying disturbance sizes. The system itself is subject to a distur-
bance w, which adds to the evolution of concentration, while the tem-
perature measurement is subject to noise v.

[Figure 4.8: Closed-loop performance of combined nonlinear MHE/MPC for varying
disturbance size, shown as phase plots of temperature T versus concentration c
for ∥w∥ up to 0.025 and ∥v∥ up to 10. The system is controlled between two
steady states.]

Figure 4.8 shows a phase


plot of system evolution subject to the same setpoint changes as be-
fore. As the disturbances become larger, the system deviates further
from its setpoint. Note that the same MHE objective function (with zero
prior weight) is used for all cases. □

4.6 Notes
State estimation is a fundamental topic appearing in many branches of
science and engineering, and has a large literature. A nice and brief
annotated bibliography describing the early contributions to optimal
state estimation of the linear Gaussian system is provided by Åström
(1970, pp. 252-255). Kailath (1974) provides a comprehensive and his-
torical review of linear filtering theory including the historical devel-
opment of Wiener-Kolmogorov theory for filtering and prediction that
preceded Kalman filtering (Wiener, 1949; Kolmogorov, 1941).
Jazwinski (1970) provides an early and comprehensive treatment of
the optimal stochastic state estimation problem for linear and nonlin-

ear systems. As mentioned in Section 4.2.3, Jazwinski (1970) follows


Deyst and Price (1968) and proposes V (k, x) = (1/2)x ′ P (k)−1 x as a
Lyapunov function candidate for the linear controllable and observ-
able time-varying system. Note that the estimate error dynamic sys-
tem is time varying even if the model is time invariant because the
optimal estimator gains are time varying. This choice of Lyapunov
function has been used to establish estimator stability in many subse-
quent textbooks (Stengel, 1994, pp.474-475). The most complete treat-
ment of the linear problem in the literature seems to be (Anderson and
Moore, 1981), which assumes uniform stabilizability and detectabil-
ity for the time-varying system and establishes exponential stability.
Kailath (1974, p.152) remarks that the known proofs that the optimal
filter is stable “are somewhat difficult, and it is significant that only a
small fraction of the vast literature on the Kalman filter deals with this
problem.”
For establishing stability of the steady-state optimal linear estima-
tor, simpler arguments suffice because the estimate error equation is
time invariant. Establishing duality with the optimal regulator is a fa-
vorite technique for establishing estimator stability in this case. See,
for example, Kwakernaak and Sivan (1972, Theorem 4.11) for a general
steady-state stability theorem for the linear Gaussian case.
Many of the full information and MHE results in this chapter are
motivated by early results in Rao (2000) and Rao, Rawlings, and Mayne
(2003). The full information analysis given here is more general be-
cause (i) we assume nonlinear detectability rather than nonlinear ob-
servability, and (ii) we establish asymptotic stability under process and
measurement disturbances, which were neglected in previous analysis.
Muske, Rawlings, and Lee (1993) and Meadows, Muske, and Rawlings
(1993) apparently were the first to use the increasing property of the
optimal cost to establish classical (not KL) asymptotic stability for full
information estimation for linear models with constraints. Robertson
and Lee (2002) present the interesting statistical interpretation of MHE
for the constrained linear system. Michalska and Mayne (1995) estab-
lish stability of moving horizon estimation with zero prior weighting
for the continuous time nonlinear system. Alessandri, Baglietto, and
Battistelli (2008) also provide a stability analysis of MHE with an ob-
servability assumption and quadratic stage cost.
Rawlings and Ji (2012) streamlined the presentation of the full infor-
mation problem for the case of convergent disturbances, and pointed
to MHE of bounded disturbances, and suboptimal MHE as two signifi-

cant open research problems. Next Ji, Rawlings, Hu, Wynn, and Diehl
(2016); Hu, Xie, and You (2015) provided the first analysis of full infor-
mation estimation for bounded disturbances by introducing a max term
in the estimation objective function, and assuming stronger forms of
the i-IOSS detectability condition. This reformulation did provide RAS
of full information estimation with bounded disturbances, but had the
unfortunate side effect of removing convergent estimate error for con-
vergent disturbances.
In a major step forward, Müller (2017) examined MHE with bounded
disturbances for similarly restrictive i-IOSS conditions, and established
bounds on arrival cost penalty and horizon length that provide both
RAS for bounded disturbances and convergence of estimate error for
convergent disturbances. Hu (2017) generalized the detectability con-
ditions in Ji et al. (2016) and treated both full information with the max
term and MHE estimation. At this stage of development, all the bounds
for robust stability became worse with increasing horizon length, which
seems problematic since the use of more measurements should im-
prove estimation. In another significant step, Knüfer and Müller (2018)
next introduced a fading memory formulation of FIE and MHE for expo-
nentially i-IOSS systems whose bounds improved with horizon length.
But this formulation required that the stage cost satisfy the triangle
inequality, which excludes the quadratic penalty commonly used in es-
timation, especially for linear systems.
As described in detail throughout the chapter, Allan (2020) intro-
duced explicit stabilizability assumptions into the analysis and estab-
lished a converse theorem for i-IOSS. He then showed for general stage
costs that FIE is RGAS for (asymptotic) i-IOSS systems, thus removing
the exponential part of the assumption, and that MHE is RGES for ex-
ponentially i-IOSS systems. As mentioned in the chapter, whether MHE
is RGAS for (asymptotic) i-IOSS systems remains an open question. Fi-
nally, numerous application papers using MHE have appeared in the
last several years indicating a growing interest in this approach to state
estimation.
For the case of output feedback, there are of course alternatives
to simply combining independently designed MHE estimators and MPC
regulators as briefly analyzed in Section 4.5. Recently Copp and Hes-
panha (2017) propose solving instead a single min-max optimization
for simultaneous estimation and control. Because of the excellent re-
sultant closed-loop properties, this class of approaches certainly war-
rants further attention and development.

4.7 Exercises

Exercise 4.1: Input-to-state stability and convergence


Assume the nonlinear system
x + = f (x, u)
is input-to-state stable (ISS) so that for all x0 ∈ Rn , input sequences u, and k ≥ 0
|x(k; x0 , u)| ≤ β(|x0 | , k) + γ(∥u∥)
in which x(k; x0 , u) is the solution to the system equation at time k starting at state
x0 using input sequence u, and γ ∈ K and β ∈ KL.
(a) Show that the ISS property also implies
|x(k; x0 , u)| ≤ β(|x0 | , k) + γ(∥u∥0:k−1 )
in which ∥u∥a:b = max{|u(j)| : a ≤ j ≤ b}.

(b) Show that the ISS property implies the “converging-input converging-state” prop-
erty (Jiang and Wang, 2001), (Sontag, 1998, p. 330), i.e., show that if the system
is ISS, then u(k) → 0 implies x(k) → 0.

Exercise 4.2: Output-to-state stability and convergence


Assume the nonlinear system
x + = f (x) y = h(x)
is output-to-state stable (OSS) so that for all x0 ∈ Rn and k ≥ 0
|x(k; x0 )| ≤ β(|x0 | , k) + γ(∥y∥0:k )
in which x(k; x0 ) is the solution to the system equation at time k starting at state x0 ,
and γ ∈ K and β ∈ KL.
Show that the OSS property implies the “converging-output converging-state” prop-
erty (Sontag and Wang, 1997, p. 281) i.e., show that if the system is OSS, then y(k) → 0
implies x(k) → 0.

Exercise 4.3: i-IOSS and convergence


Establish that if system
x + = f (x, w) y = g(x)
is i-IOSS, and w1 (k) → w2 (k) and y1 (k) → y2 (k) as k → ∞, then
x(k; z1 , w1 ) → x(k; z2 , w2 ) as k → ∞ for all z1 , z2

Exercise 4.4: Observability and detectability of linear time-invariant sys-


tems and OSS
Consider the linear time-invariant system
x + = Ax y = Cx
(a) Show that if the system is observable, then the system is OSS.

(b) Show that the system is detectable if and only if the system is OSS.

Exercise 4.5: Observability and detectability of linear time-invariant system


and IOSS
Consider the linear time-invariant system with input

x + = Ax + Gw y = Cx

(a) Show that if the system is observable, then the system is IOSS.

(b) Show that the system is detectable if and only if the system is IOSS.

Exercise 4.6: Max or sum?


To facilitate complicated arguments involving K and KL functions, it is often conve-
nient to interchange sum and max operations. First some suggestive notation: let the
max operator over scalars be denoted with the ⊕ symbol so that

a ⊕ b := max(a, b)

(a) Show that the ⊕ operator is commutative and associative, i.e., a ⊕ b = b ⊕ a and
(a ⊕ b) ⊕ c = a ⊕ (b ⊕ c) for all a, b, c, so that the following operation is well
defined and the order of operation is inconsequential
a1 ⊕ a2 ⊕ a3 ⊕ · · · ⊕ an := ⊕_{i=1}^{n} ai

(b) Find scalars d and e such that for all a, b ≥ 0, the following holds

d(a + b) ≤ a ⊕ b ≤ e(a + b)

(c) Find scalars d and e such that for all a, b ≥ 0, the following holds

d(a ⊕ b) ≤ a + b ≤ e(a ⊕ b)

(d) Generalize the previous result to the n-term sum; find dn , en , d̄n , ēn such that
the following holds for all ai ≥ 0, i = 1, 2, . . . , n

    dn ∑_{i=1}^{n} ai ≤ ⊕_{i=1}^{n} ai ≤ en ∑_{i=1}^{n} ai
    d̄n ⊕_{i=1}^{n} ai ≤ ∑_{i=1}^{n} ai ≤ ēn ⊕_{i=1}^{n} ai

(e) Show that establishing convergence (divergence) in sum or max is equivalent,


i.e., consider the time sequences (s(k))k≥0 and (s̄(k))k≥0 defined by

    s(k) = ∑_{i=1}^{n} ai (k)        s̄(k) = ⊕_{i=1}^{n} ai (k)

Show that

    lim_{k→∞} s(k) = 0 (∞) if and only if lim_{k→∞} s̄(k) = 0 (∞)

Exercise 4.7: Where did my constants go?


Once K and KL functions appear, we may save some algebra by switching from the
sum to the max.
In the following, let γ(·) be any K function and ai ∈ R≥0 , i ∈ I1:n .
(a) If you choose to work with sum, derive the following bounding inequalities
(Rawlings and Ji, 2012)

γ(a1 + a2 + · · · + an ) ≤ γ(na1 ) + γ(na2 ) + · · · + γ(nan )


 
γ(a1 + a2 + · · · + an ) ≥ (1/n) (γ(a1 ) + γ(a2 ) + · · · + γ(an ))

(b) If you choose to work with max instead, derive instead the following simpler
result
γ(a1 ⊕ a2 ⊕ · · · ⊕ an ) = γ(a1 ) ⊕ γ(a2 ) ⊕ · · · ⊕ γ(an )
Notice that you have an equality rather than an inequality, which leads to tighter
bounds.

Exercise 4.8: Linear systems and incremental stability


Show that for a linear time-invariant system, i-ISS (i-OSS, i-IOSS) is equivalent to ISS
(OSS, IOSS).

Exercise 4.9: Nonlinear observability and Lipschitz continuity implies i-OSS


Consider the following definition of observability for nonlinear systems in which f and
h are Lipschitz continuous. A system

x + = f (x) y = h(x)

is observable if there exists No ∈ I≥1 and K function γ such that


∑_{k=0}^{No −1} |y(k; x1 ) − y(k; x2 )| ≥ γ(|x1 − x2 |)        (4.40)

holds for all x1 , x2 ∈ Rn . This definition was used by Rao et al. (2003) in showing
stability of nonlinear MHE to initial condition error under zero state and measurement
disturbances.
(a) Show that this form of nonlinear observability implies i-OSS.

(b) Show that i-OSS does not imply this form of nonlinear observability and, there-
fore, i-OSS is a weaker assumption.
The i-OSS concept generalizes the linear system concept of detectability to nonlinear
systems.

Exercise 4.10: Equivalance of detectability and IOSS for continuous time,


linear, time-invariant system
Consider the continuous time, linear, time-invariant system with input

ẋ = Ax + Bu y = Cx

Show that the system is detectable if and only if the system is IOSS.

Exercise 4.11: Observable, FSO, and detectable for linear systems


Consider the linear time-invariant system
x + = Ax y = Cx
and its observability canonical form. What conditions must the system satisfy to be
(a) observable?

(b) final-state observable (FSO)?

(c) detectable?

Exercise 4.12: Exponential detectability and compatibility of stage cost


We commented in the text that working with exponential detectability lessens the
requirement for stage-cost compatibility in Assumption 4.11 that is necessary with
(asymptotic) detectability. To see why, consider the noise-free case and assume sys-
tem 4.1 is exponentially i-IOSS. Without loss of generality, the exponential i-IOSS Lya-
punov function can then be chosen quadratic (Allan, 2020, Corollary 2.15). The descent
condition is then given by

Λ(f (x1 , w1 ), f (x2 , w2 )) ≤ Λ(x1 , x2 ) − a3 |x1 − x2 |2 + aw |w1 − w2 |2 + av |h(x1 ) − h(x2 )|2
which holds for all x1 , x2 ∈ X and w1 , w2 ∈ W. Assume we have chosen the usual
least-squares stage cost
ℓ(w, v) = |w|2Qw−1 + |v|2Rv−1
where Qw , Rv > 0 are estimates of the variances of process and measurement dis-
turbances w, v, respectively. The standard descent condition in the noise-free case
is
Y (j + 1|k) = Y (j|k) − ℓ(ŵ(j|k), v̂(j|k))
            ≤ Y (j|k) − σ (Qw−1 ) |ŵ(j|k)|2 − σ (Rv−1 ) |v̂(j|k)|2
where σ (A) is the smallest singular value of matrix A. Define Q(j|k) := Y (j|k) +
Λ(x(j), x̂(j|k)), and to establish estimator stability we need to show that the Q-function
has a descent condition
Q(j + 1|k) ≤ Q(j|k) − c3 |x(j) − x̂(j|k)|2
for some c3 > 0.
But how can we have a descent condition when we have not assumed any rela-
tionship between matrices Qw , Rv in the stage cost and constants aw and av in the
system’s detectability condition?
Hint: consider what you are asked to show in Exercise B.3(b) about the converse
theorem for exponential stability. Use a similar idea here.

Exercise 4.13: Convergent disturbances


Prove Proposition 4.3, i.e., show that if an estimator is RGAS and (w(k), v(k)) → 0
as k → ∞, then x̂(k) → x(k) as k → ∞. Hint: in the definition of RGAS, break the
maximization over interval [0 : k − 1] into maximization over two intervals [0 : M −
1] ∪ [M : k − 1] and choose M to control the size of each maximization.

Exercise 4.14: Observabilty plus K-continuity imply FSO


Prove Proposition 4.31. Hint: first try replacing global K-continuity with the stronger
assumption of global Lipschitz continuity to get a feel for the argument.

Exercise 4.15: Detectable linear time-invariant system and i-UIOSS


Show that the detectable linear time-invariant system x + = Ax + Bu + Gw, y = Cx is
i-UIOSS from Definition 4.40.

Exercise 4.16: Dynamic programming recursion for Kalman predictor


In the Kalman predictor, we use forward DP to solve at stage k

min_{x,w}  ℓ(x, w) + Vk− (x)    s.t.  z = Ax + w

in which x is the state at the current stage and z is the state at the next stage. The
stage cost and arrival cost are given by
ℓ(x, w) = (1/2)( |y(k) − Cx|2R−1 + w ′ Q−1 w )        Vk− (x) = (1/2) |x − x̂ − (k)|2(P − (k))−1

and we wish to find the value function V 0 (z), which we denote Vk+1− (z) in the Kalman
predictor estimation problem.

(a) Combine the two x terms to obtain


 
min_{x,w}  (1/2)( w ′ Q−1 w + (x − x̂(k))′ P (k)−1 (x − x̂(k)) )    s.t.  z = Ax + w

and, using the third part of Example 1.1, show

P (k) = P − (k) − P − (k)C ′ (CP − (k)C ′ + R)−1 CP − (k)


L(k) = P − (k)C ′ (CP − (k)C ′ + R)−1
x̂(k) = x̂ − (k) + L(k)(y(k) − C x̂ − (k))

(b) Add the w term and use the inverse form in Exercise 1.18 to show the optimal
cost is given by

V 0 (z) = (1/2)(z − x̂ − (k + 1))′ (P − (k + 1))−1 (z − x̂ − (k + 1))
x̂ − (k + 1) = Ax̂(k)
P − (k + 1) = AP (k)A′ + Q

Substitute the results for x̂(k) and P (k) above and show

Vk+1− (z) = (1/2)(z − x̂ − (k + 1))′ (P − (k + 1))−1 (z − x̂ − (k + 1))
P − (k + 1) = Q + AP − (k)A′ − AP − (k)C ′ (CP − (k)C ′ + R)−1 CP − (k)A′
x̂ − (k + 1) = Ax̂ − (k) + L̃(k)(y(k) − C x̂ − (k))
L̃(k) = AP − (k)C ′ (CP − (k)C ′ + R)−1

(c) Compare and contrast this form of the estimation problem to the one given in
Exercise 1.29 that describes the Kalman filter.

Exercise 4.17: Duality, cost to go, and covariance


Using the duality variables of Table 4.2, translate Theorem 4.27 into the version that is
relevant to the state estimation problem.

Exercise 4.18: Estimator convergence for (A, G) not stabilizable


What happens to the stability of the optimal estimator if we violate the condition

(A, G) stabilizable

(a) Is the steady-state Kalman filter a stable estimator? Is the full information esti-
mator a stable estimator? Are these two answers contradictory? Work out the
results for the case A = 1, G = 0, C = 1, P − (0) = 1, Q = 1, R = 1.
Hint: you may want to consult de Souza, Gevers, and Goodwin (1986).

(b) Can this phenomenon happen in the LQ regulator? Provide the interpretation
of the time-varying regulator that corresponds to the time-varying filter given
above. Does this make sense as a regulation problem?

Exercise 4.19: Exponential stability of the Kalman predictor


Establish that the Kalman predictor defined in Section 4.2.3 is a globally exponentially
stable estimator. What is the corresponding linear quadratic regulator?

Exercise 4.20: Equivalent definition of RGES


Prove Proposition 4.21.
Hint: Consider arbitrary w ∈ Rg , v ∈ Rp . Show that
1. For every aw , av > 0, there exists ad > 0 such that aw |w|⊕av |v| ≤ ad |(w, v)|;
2. For every ad > 0, there exist aw , av > 0 such that ad |(w, v)| ≤ aw |w|⊕av |v|.
Bibliography

A. Alessandri, M. Baglietto, and G. Battistelli. Moving-horizon state estimation


for nonlinear discrete-time systems: New stability results and approxima-
tion schemes. Automatica, 44(7):1753–1765, 2008.

D. A. Allan. A Lyapunov-like Function for Analysis of Model Predictive Con-


trol and Moving Horizon Estimation. PhD thesis, University of Wisconsin–
Madison, August 2020.

D. A. Allan and J. B. Rawlings. A Lyapunov-like function for full information es-


timation. In American Control Conference, pages 4497–4502, Philadelphia,
PA, July 10–12, 2019.

D. A. Allan and J. B. Rawlings. Robust stability of full information estimation.


SIAM J. Cont. Opt., 2020. Submitted 4/16/2020.

D. A. Allan, C. N. Bates, M. J. Risbeck, and J. B. Rawlings. On the inherent


robustness of optimal and suboptimal nonlinear MPC. Sys. Cont. Let., 106:
68–78, August 2017.

D. A. Allan, J. B. Rawlings, and A. R. Teel. Nonlinear detectability and incre-


mental input/output-to-state stability. Technical Report 2020–01, TWCCC
Technical Report, July 2020.

B. D. O. Anderson and J. B. Moore. Detectability and stabilizability of time-


varying discrete-time linear systems. SIAM J. Cont. Opt., 19(1):20–32, 1981.

K. J. Åström. Introduction to Stochastic Control Theory. Academic Press, San


Diego, California, 1970.

K. Berntorp and P. Grover. Feedback particle filter with data-driven gain-


function approximation. IEEE Trans. Aero. Elec. Sys., 54(5):2118–2130, 2018.

D. P. Bertsekas. Dynamic Programming. Prentice-Hall, Inc., Englewood Cliffs,


New Jersey, 1987.

C. Cai and A. R. Teel. Input–output-to-state stability for discrete-time systems.


Automatica, 44(2):326 – 336, 2008.

F. M. Callier and C. A. Desoer. Linear System Theory. Springer-Verlag, New


York, 1991.


M. Chaves and E. D. Sontag. State-estimators for chemical reaction networks of


Feinberg-Horn-Jackson zero deficiency type. Eur. J. Control, 8(4):343–359,
2002.

D. A. Copp and J. P. Hespanha. Simultaneous nonlinear model predictive con-


trol and state estimation. Automatica, 77:143–154, 2017.

C. E. de Souza, M. R. Gevers, and G. C. Goodwin. Riccati equation in optimal fil-


tering of nonstabilizable systems having singular state transition matrices.
IEEE Trans. Auto. Cont., 31(9):831–838, September 1986.

J. Deyst and C. Price. Conditions for asymptotic stability of the discrete


minimum-variance linear estimator. IEEE Trans. Auto. Cont., 13(6):702–705,
Dec 1968.

A. Gelb, editor. Applied Optimal Estimation. The M.I.T. Press, Cambridge, Mas-
sachusetts, 1974.

R. Gudi, S. Shah, and M. Gray. Multirate state and parameter estimation in an


antibiotic fermentation with delayed measurements. Biotech. Bioeng., 44:
1271–1278, 1994.

R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press,


1985.

W. Hu. Robust Stability of Optimization-based State Estimation Under Bounded


Disturbances. ArXiv e-prints, Feb 2017.

W. Hu, L. Xie, and K. You. Optimization-based state estimation under bounded


disturbances. In 2015 54th IEEE Conference on Decision and Control CDC,
pages 6597–6602, Dec 2015.

A. H. Jazwinski. Stochastic Processes and Filtering Theory. Academic Press,


New York, 1970.

L. Ji, J. B. Rawlings, W. Hu, A. Wynn, and M. Diehl. Robust stability of moving


horizon estimation under bounded disturbances. IEEE Trans. Auto. Cont.,
61(11):3509–3514, November 2016.

Z.-P. Jiang and Y. Wang. Input-to-state stability for discrete-time nonlinear


systems. Automatica, 37:857–869, 2001.

S. J. Julier and J. K. Uhlmann. A new extension of the Kalman filter to nonlinear


systems. In International Symposium Aerospace/Defense Sensing, Simula-
tion and Controls, pages 182–193, 1997.

S. J. Julier and J. K. Uhlmann. Author’s reply. IEEE Trans. Auto. Cont., 47(8):
1408–1409, August 2002.

S. J. Julier and J. K. Uhlmann. Unscented filtering and nonlinear estimation.


Proc. IEEE, 92(3):401–422, March 2004a.

S. J. Julier and J. K. Uhlmann. Corrections to unscented filtering and nonlinear


estimation. Proc. IEEE, 92(12):1958, December 2004b.

S. J. Julier, J. K. Uhlmann, and H. F. Durrant-Whyte. A new method for the non-


linear transformation of means and covariances in filters and estimators.
IEEE Trans. Auto. Cont., 45(3):477–482, March 2000.

T. Kailath. A view of three decades of linear filtering theory. IEEE Trans. Inform.
Theory, IT-20(2):146–181, March 1974.

R. Kandepu, L. Imsland, and B. A. Foss. Constrained state estimation using the


unscented kalman filter. In Procedings of the 16th Mediterranean Conference
on Control and Automation, pages 1453–1458, Ajaccio, France, June 2008.

S. S. Keerthi and E. G. Gilbert. An existence theorem for discrete-time infinite-


horizon optimal control problems. IEEE Trans. Auto. Cont., 30(9):907–909,
September 1985.

S. Knüfer and M. A. Müller. Robust global exponential stability for moving


horizon estimation. In 2018 IEEE Conference on Decision and Control (CDC),
pages 3477–3482, Dec 2018.

S. Kolås, B. A. Foss, and T. S. Schei. Constrained nonlinear state estimation


based on the UKF approach. Comput. Chem. Eng., 33(8):1386–1401, 2009.

A. N. Kolmogorov. Interpolation and extrapolation of stationary random se-


quences. Bull. Moscow Univ., USSR, Ser. Math. 5, 1941.

H. Kwakernaak and R. Sivan. Linear Optimal Control Systems. John Wiley and
Sons, New York, 1972.

M. Lazar, W. P. M. H. Heemels, and A. R. Teel. Further input-to-state stability


subtleties for discrete-time systems. Automatic Control, IEEE Transactions
on, 58(6):1609–1613, June 2013.

T. Lefebvre, H. Bruyninckx, and J. De Schutter. Comment on “A new method


for the nonlinear transformation of means and covariances in filters and
estimators”. IEEE Trans. Auto. Cont., 47(8):1406–1408, August 2002.

E. S. Meadows, K. R. Muske, and J. B. Rawlings. Constrained state estimation


and discontinuous feedback in model predictive control. In Proceedings of
the 1993 European Control Conference, pages 2308–2312, 1993.

H. Michalska and D. Q. Mayne. Moving horizon observers and observer-based


control. IEEE Trans. Auto. Cont., 40(6):995–1006, 1995.

S. A. Middlebrooks and J. B. Rawlings. State estimation approach for determin-


ing composition and growth rate of Si1−x Gex chemical vapor deposition uti-
lizing real-time ellipsometric measurements. Applied Opt., 45:7043–7055,
2006.

M. A. Müller. Nonlinear moving horizon estimation in the presence of bounded


disturbances. Automatica, 79:306–314, 2017.

K. R. Muske, J. B. Rawlings, and J. H. Lee. Receding horizon recursive state


estimation. In Proceedings of the 1993 American Control Conference, pages
900–904, June 1993.

M. Nørgaard, N. K. Poulsen, and O. Ravn. New developments in state estimation


for nonlinear systems. Automatica, 36:1627–1638, 2000.

V. Prasad, M. Schley, L. P. Russo, and B. W. Bequette. Product property and


production rate control of styrene polymerization. J. Proc. Cont., 12(3):353–
372, 2002.

C. C. Qu and J. Hahn. Computation of arrival cost for moving horizon estima-


tion via unscented Kalman filtering. J. Proc. Cont., 19(2):358–363, 2009.

C. V. Rao. Moving Horizon Strategies for the Constrained Monitoring and Con-
trol of Nonlinear Discrete-Time Systems. PhD thesis, University of Wisconsin–
Madison, 2000.

C. V. Rao, J. B. Rawlings, and J. H. Lee. Constrained linear state estimation – a


moving horizon approach. Automatica, 37(10):1619–1628, 2001.

C. V. Rao, J. B. Rawlings, and D. Q. Mayne. Constrained state estimation for


nonlinear discrete-time systems: stability and moving horizon approxima-
tions. IEEE Trans. Auto. Cont., 48(2):246–258, February 2003.

J. B. Rawlings and L. Ji. Optimization-based state estimation: Current status


and some new results. J. Proc. Cont., 22:1439–1444, 2012.

J. B. Rawlings and D. Q. Mayne. Model Predictive Control: Theory and Design.


Nob Hill Publishing, Madison, WI, 2009. 668 pages, ISBN 978-0-9759377-0-9.

J. B. Rawlings and M. J. Risbeck. On the equivalence between statements with


epsilon-delta and K-functions. Technical Report 2015–01, TWCCC Technical
Report, December 2015.

K. Reif and R. Unbehauen. The extended Kalman filter as an exponential ob-


server for nonlinear systems. IEEE Trans. Signal Process., 47(8):2324–2328,
August 1999.

K. Reif, S. Günther, E. Yaz, and R. Unbehauen. Stochastic stability of the


discrete-time extended Kalman filter. IEEE Trans. Auto. Cont., 44(4):714–
728, April 1999.

K. Reif, S. Günther, E. Yaz, and R. Unbehauen. Stochastic stability of the


continuous-time extended Kalman filter. IEE Proceedings-Control Theory
and Applications, 147(1):45–52, January 2000.

D. G. Robertson and J. H. Lee. On the use of constraints in least squares esti-


mation and control. Automatica, 38(7):1113–1124, 2002.

A. Romanenko and J. A. A. M. Castro. The unscented filter as an alternative


to the EKF for nonlinear state estimation: a simulation case study. Comput.
Chem. Eng., 28(3):347–355, March 15 2004.

A. Romanenko, L. O. Santos, and P. A. F. N. A. Afonso. Unscented Kalman


filtering of a simulated pH system. Ind. Eng. Chem. Res., 43:7531–7538,
2004.

B. J. P. Roset, W. P. M. H. Heemels, M. Lazar, and H. Nijmeijer. On robustness of


constrained discrete-time systems to state measurement errors. Automat-
ica, 44(4):1161 – 1165, 2008.

E. D. Sontag. Mathematical Control Theory. Springer-Verlag, New York, second


edition, 1998.

E. D. Sontag and Y. Wang. Output-to-state stability and detectability of nonlin-


ear systems. Sys. Cont. Let., 29:279–290, 1997.

R. F. Stengel. Optimal Control and Estimation. Dover Publications, Inc., 1994.

B. O. S. Teixeira, L. A. B. Tôrres, L. A. Aguirre, and D. S. Bernstein. Unscented


filtering for interval-constrained nonlinear systems. In Proceedings of the
47th IEEE Conference on Decision and Control, pages 5116–5121, Cancun,
Mexico, December 9-11 2008.

M. J. Tenny and J. B. Rawlings. Efficient moving horizon estimation and non-


linear model predictive control. In Proceedings of the American Control
Conference, pages 4475–4480, Anchorage, Alaska, May 2002.

P. Vachhani, S. Narasimhan, and R. Rengaswamy. Robust and reliable estima-


tion via unscented recursive nonlinear dynamic data reconciliation. J. Proc.
Cont., 16(10):1075–1086, December 2006.

R. van der Merwe, A. Doucet, N. de Freitas, and E. Wan. The unscented parti-
cle filter. Technical Report CUED/F-INFENG/TR 380, Cambridge University
Engineering Department, August 2000.

N. Wiener. The Extrapolation, Interpolation, and Smoothing of Stationary Time


Series with Engineering Applications. Wiley, New York, 1949. Originally
issued as a classified MIT Rad. Lab. Report in February 1942.

D. I. Wilson, M. Agarwal, and D. W. T. Rippin. Experiences implementing the


extended Kalman filter on an industrial batch reactor. Comput. Chem. Eng.,
22(11):1653–1672, 1998.

T. Yang, P. G. Mehta, and S. P. Meyn. Feedback particle filter. IEEE Trans. Auto.
Cont., 58(10):2465–2480, 2013.
5
Output Model Predictive Control

5.1 Introduction
In Chapter 2 we show how model predictive control (MPC) may be em-
ployed to control a deterministic system, that is, a system in which there
are no uncertainties and the state is known. In Chapter 3 we show how
to control an uncertain system in which uncertainties are present but
the state is known. Here we address the problem of MPC of an un-
certain system in which the state is not fully known. We assume that
there are outputs available that may be used to estimate the state as
shown in Chapter 4. These outputs are used by the model predictive
controller to generate control actions; hence the name output MPC.
The state is not known, but a noisy measurement y(t) of the state
is available at each time t. Since the state x is not known, it is re-
placed by a hyperstate p that summarizes all prior information (previ-
ous inputs and outputs and the prior distribution of the initial state)
and that has the “state” property: future values of p can be deter-
mined from the current value of p, and current and future inputs
and outputs. Usually p(t) is the conditional density of x(t) given
the prior density p(0) of x(0), and the current available “information”

I(t) := y(0), y(1), . . . , y(t − 1), u(0), u(1), . . . , u(t − 1) .
For the purpose of control, future hyperstates have to be predicted
since future noisy measurements of the state are not known. So the
hyperstate satisfies an uncertain difference equation of the form

p + = φ(p, u, ψ) (5.1)

where (ψ(t))t∈I≥0 is a sequence of random variables. The problem of


controlling a system with unknown state x is transformed into the
problem of controlling an uncertain system with known state p. For

333

example, if the underlying system is described by

x + = Ax + Bu + w
y = Cx + ν

where (w(t))t∈I≥0 and (ν(t))t∈I≥0 are sequences of zero-mean, normal,


independent random variables with variances Σw and Σν , respectively,
and if the prior density p(0) of x(0) is normal with density n(x̄0 , Σ0 ),
then, as is well known, p(t) is the normal density n(x̂(t), Σ(t)) so that
the hyperstate p(t) is finitely parameterized by (x̂(t), Σ(t)). Hence the
evolution equation for p(t) may be replaced by the simpler evolution
equation for (x̂, Σ), that is by

x̂(t + 1) = Ax̂(t) + Bu + L(t)ψ(t) (5.2)


Σ(t + 1) = Φ(Σ(t)) (5.3)

in which

Φ(Σ) := AΣA′ − AΣC ′ (CΣC ′ + Σν )−1 CΣA′ + Σw
ψ(t) := y(t) − C x̂(t) = C x̃(t) + ν(t)
x̃(t) := x(t) − x̂(t)

The initial conditions for (5.2) and (5.3) are

x̂(0) = x̄0 Σ(0) = Σ0

These are, of course, the celebrated Kalman filter equations derived


in Chapter 1. The random variables x̃ and ψ have the following densities: x̃(t) ∼ n(0, Σ(t)) and ψ(t) ∼ n(0, Σν + CΣ(t)C ′ ). The finite
dimensional equations (5.2) and (5.3) replace the difference equation
(5.1) for the hyperstate p that is a conditional density and, therefore,
infinite dimensional in general. The sequence (ψ(t))t∈I≥0 is known as
the innovations sequence; ψ(t) is the “new” information contained in
y(t).
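
For the linear Gaussian case just described, the hyperstate recursion (5.2) and (5.3) is finite dimensional and can be written out directly. The following minimal NumPy sketch propagates (x̂, Σ) over one sample time; the matrices A, B, C, Sw, Sv are the model and noise covariance matrices of the system above.

import numpy as np

def hyperstate_update(xhat, Sigma, u, y, A, B, C, Sw, Sv):
    psi = y - C @ xhat                       # innovation psi(t) = y(t) - C xhat(t)
    S = C @ Sigma @ C.T + Sv                 # innovation covariance
    L = A @ Sigma @ C.T @ np.linalg.inv(S)   # gain L(t)
    # Equations (5.2) and (5.3)
    xhat_next = A @ xhat + B @ u + L @ psi
    Sigma_next = A @ Sigma @ A.T - A @ Sigma @ C.T @ np.linalg.inv(S) @ C @ Sigma @ A.T + Sw
    return xhat_next, Sigma_next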
Output control, in general, requires control of the hyperstate p that
may be computed with difficulty, since p satisfies a complex evolution
equation p + = φ(p, u, ψ) where ψ is a random disturbance. Control-
ling p is a problem of the same type as that considered in Chapter 3,
but considerably more complex since the function p(·) is infinite di-
mensional. Because of the complexity of the evolution equation for p,
a simpler procedure is often adopted. Assuming that the state x is
known, a stabilizing controller u = κ(x) is designed. An observer or

filter yielding an estimate x̂ of the state is then separately designed


and the control u = κ(x̂) is applied to the plant. Indeed, this form of
control is actually optimal for the linear quadratic Gaussian (LQG) op-
timal control problem considered in Chapter 1, but is not necessarily
optimal and stabilizing when the system is nonlinear and constrained.
We propose a variant of this procedure, modified to cope with state and
control constraints.
The state estimate x̂ satisfies an uncertain difference equation with
an additive disturbance of the same type as that considered in Chapter
3. Hence we employ tube MPC, similar to that employed in Chapter
3, to obtain a nominal trajectory satisfying tightened constraints. We
then construct a tube that has as its center the nominal trajectory, and
which includes every possible realization of x̂ = (x̂(t))t∈I≥0 . We then
construct a second tube that includes the first tube in its interior, and
is such that every possible realization of the sequence x = (x(t))t∈I≥0
lies in its interior. The tightened constraints are chosen to ensure every
possible realization of x = (x(t))t∈I≥0 does not transgress the original
constraints. An advantage of the method presented here is that its
online complexity is comparable to that of conventional MPC.
As in Chapter 3, a caveat is necessary. Because of the inherent com-
plexity of output MPC, different compromises between simplicity and
efficiency are possible. For this reason, output MPC remains an active
research area and alternative methods, available or yet to be developed,
may be preferred.

5.2 A Method for Output MPC


Suppose the system to be controlled is described by

x + = Ax + Bu + w
y = Cx + ν

The state and control are required to satisfy the constraints x(t) ∈ X
and u(t) ∈ U for all t, and the disturbance is assumed to lie in the
compact set W. It is assumed that the origin lies in the interior of the
sets X, U, and W. The state estimator (x̂, Σ) evolves, as shown in the
sequel, according to

x̂ + = φ(x̂, u, ψ) (5.4)
Σ+ = Φ(Σ) (5.5)

Figure 5.1: State estimator tube. The solid line x̂(t) is the center of the tube (initial cross section {x̂(0)} ⊕ Σ), and the dashed line is a sample trajectory of x(t).

in which ψ is a random variable in the stochastic case, and a bounded


disturbance taking values in Ψ when w and ν are bounded. In the latter
case, x ∈ {x̂} ⊕ Σ implies x + ∈ {x̂ + } ⊕ Σ+ for all ψ ∈ Ψ.
As illustrated in Figure 5.1, the evolution equations generate a tube,
which is the set sequence ({x̂(t)} ⊕ Σ(t))t∈I≥0 ; at time t the center of
the tube is x̂(t) and the “cross section” is Σ(t). When the disturbances
are bounded, which is the only case we consider in the sequel, all possi-
ble realizations of the state trajectory (x(t)) lie in the set {x̂(t)} ⊕ Σ(t)
for all t; the dashed line is a sample trajectory of x(t).
From (5.4), the estimator trajectory (x̂(t))t∈I≥0 is influenced both by
the control that is applied and by the disturbance sequence (ψ(t))t∈I≥0 .
If the trajectory were influenced only by the control, we could choose
the control to satisfy the control constraints, and to cause the estima-
tor tube to lie in a region such that the state constraints are satisfied by
all possible realizations of the state trajectory. Hence the output MPC
problem would reduce to a conventional MPC problem with modified
constraints in which the state is x̂, rather than x. The new state con-
straint is x̂ ∈ X̂ where X̂ is chosen to ensure that x̂ ∈ X̂ implies x ∈ X
and, therefore, satisfies X̂ ⊆ X ⊖ Σ if Σ does not vary with time t.
But the estimator state x̂(t) is influenced by the disturbance ψ (see
(5.4)), so it cannot be precisely controlled. The problem of controlling
the system described by (5.4) is the same type of problem studied in

Figure 5.2: The system with disturbance. The state estimate lies in the inner tube (initial cross section {x̄(0)} ⊕ S), and the state lies in the outer tube (initial cross section {x̄(0)} ⊕ Γ).

Chapter 3, where the system was described by x + = f (x, u, w) with


the estimator state x̂, which is accessible, replacing the state x. Hence
we may use the techniques presented in Chapter 3 to choose a control
that forces x̂ to lie in another tube ({x̄(t)} ⊕ S(t))t∈I≥0 where the set
sequence (S(t))t∈I≥0 that defines the cross section of the tube is pre-
computed. The sequence (x̄(t))t∈I≥0 that defines the center of the tube
is the state trajectory of the nominal (deterministic) system defined by

x̄ + = φ(x̄, ū, 0) (5.6)

the nominal version of (5.4). Thus we get two tubes, one embedded in
the other. At time t the estimator state x̂(t) lies in the set {x̄(t)}⊕S(t),
and x(t) lies in the set {x̂(t)} ⊕ Σ(t), so that for all t

x(t) ∈ {x̄(t)} ⊕ Γ(t) Γ(t) := Σ(t) ⊕ S(t)

Figure 5.2 shows the tube ({x̄(t)} ⊕ S(t)), in which the trajectory (x̂(t))
lies, and the tube ({x̄(t)} ⊕ Γ(t)), in which the state trajectory (x(t))
lies.

5.3 Linear Constrained Systems: Time-Invariant Case


5.3.1 Introduction

We consider the following uncertain linear time-invariant system

x + = Ax + Bu + w
y = Cx + ν (5.7)

in which x ∈ Rn is the current state, u ∈ Rm is the current control


action, x + is the successor state, w ∈ Rn is an unknown state distur-
bance, y ∈ Rp is the current measured output, ν ∈ Rp is an unknown
output disturbance, the pair (A, B) is assumed to be controllable, and
the pair (A, C) observable. The state and additive disturbances w and
ν are known only to the extent that they lie, respectively, in the C-
sets1 W ⊆ Rn and N ⊆ Rp . Let φ(i; x(0), u, w) denote the solution
of (5.7) at time i if the initial state at time 0 is x(0), and the control
and disturbance sequences are, respectively, u := (u(0), u(1), . . .) and
w := (w(0), w(1), . . .). The system (5.7) is subject to the following set
of hard state and control constraints

x∈X u∈U (5.8)

in which X ⊆ Rn and U ⊆ Rm are polyhedral and polytopic sets respec-


tively; both sets contain the origin as an interior point.

5.3.2 State Estimator

To estimate the state a Kalman filter or Luenberger observer is em-


ployed

x̂ + = Ax̂ + Bu + L(y − ŷ)


ŷ = C x̂ (5.9)

in which x̂ ∈ Rn is the current observer state (state estimate), u ∈ Rm


is the current control action, x̂ + is the successor state of the observer
system, ŷ ∈ Rp is the current observer output, and L ∈ Rn×p . The
output injection matrix L is chosen to satisfy ρ(AL ) < 1 where AL :=
A − LC.
The estimated state x̂ therefore satisfies the following uncertain
difference equation

x̂ + = Ax̂ + Bu + L(C x̃ + ν)

The state estimation error x̃ is defined by x̃ := x − x̂ so that x = x̂ + x̃.
Since x + = Ax + Bu + w, the state estimation error x̃ satisfies

x̃ + = AL x̃ + w̃        w̃ := w − Lν        (5.10)

Because w and ν are bounded, so is w̃; in fact, w̃ takes values in the
C-set W̄ defined by

W̄ := W ⊕ (−LN)

We recall the following standard definitions (Blanchini, 1999).

Definition 5.1 (Positive invariance; robust positive invariance). A set


Ω ⊆ Rn is positive invariant for the system x + = f (x) and the con-
straint set X if Ω ⊆ X and f (x) ∈ Ω, ∀x ∈ Ω.
A set Ω ⊆ Rn is robust positive invariant for the system x + = f (x, w)
and the constraint set (X, W) if Ω ⊆ X and f (x, w) ∈ Ω, ∀w ∈ W,
∀x ∈ Ω.

Since ρ(AL ) < 1 and W̄ is compact, there exists, as shown in Kolmanovsky and Gilbert (1998), Theorem 4.1, a robust positive invariant set Σ ⊆ Rn satisfying

AL Σ ⊕ W̄ = Σ        (5.11)

Hence, for all x̃ ∈ Σ, x̃ + = AL x̃ + w̃ ∈ Σ for all w̃ ∈ W̄; the term robust
in the description of Σ refers to this property. In fact, Σ is the minimal robust positive invariant set for x̃ + = AL x̃ + w̃, w̃ ∈ W̄, i.e., a set
that is a subset of all robust positive invariant sets. There exist tech-
niques (Raković, Kerrigan, Kouramas, and Mayne, 2005) for obtaining,
for every ϵ > 0, a polytopic, nonminimal, robust, positive invariant set
Σ0 that satisfies dH (Σ, Σ0 ) ≤ ϵ where dH (·, ·) is the Hausdorff metric.
However, it is not necessary to compute the set Σ or Σ0 as shown in
Chapter 3. An immediate consequence of (5.11) is the following.

Proposition 5.2 (Proximity of state and state estimate). If the initial


system and observer states, x(0) and x̂(0) respectively, satisfy x(0) ∈
{x̂(0)} ⊕ Σ, then x(i) ∈ {x̂(i)} ⊕ Σ for all i ∈ I≥0 , and all admissible
disturbance sequences w and ν.

The assumption that x̃(i) ∈ Σ for all i is a steady-state assumption;
if x̃(0) ∈ Σ, then x̃(i) ∈ Σ for all i. If, on the other hand, x̃(0) ∈ Σ(0)
where Σ(0) ⊇ Σ, then it is possible to show that x̃(i) ∈ Σ(i) for all
i ∈ I≥0 where Σ(i) → Σ in the Hausdorff metric as i → ∞; the sequence
(Σ(i)) satisfies Σ(0) ⊇ Σ(1) ⊇ Σ(2) ⊇ · · · ⊇ Σ. Hence, it is reasonable

to assume that if the estimator has been running for a “long” time, it
is in steady state.
Hence we have obtained a state estimator, with “state” (x̂, Σ) satis-
fying

x̂ + = Ax̂ + Bu + L(y − ŷ)        (5.12)
Σ+ = Σ

and x(i) ∈ {x̂(i)} ⊕ Σ for all i ∈ I≥0 , thus meeting the requirements
specified in Section 5.2. Knowing this, our remaining task is to control
x̂(i) so that the resultant closed-loop system is stable and satisfies all
constraints.
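
A minimal sketch of this estimator is given below. The first function implements the observer update (5.9); the second returns a crude componentwise outer bound on the estimate error set Σ from x̃ + = AL x̃ + w̃, under the simplifying assumption, made only for this sketch, that W and N are boxes with componentwise bounds w_max and nu_max.

import numpy as np

def observer_step(xhat, u, y, A, B, C, L):
    # Luenberger observer (5.9): xhat+ = A xhat + B u + L (y - C xhat)
    return A @ xhat + B @ u + L @ (y - C @ xhat)

def estimate_error_bound(A, C, L, w_max, nu_max, N=200):
    # Componentwise outer bound on xtilde from xtilde+ = A_L xtilde + wtilde,
    # wtilde = w - L nu, assuming box-bounded disturbances (an assumption of this sketch).
    AL = A - L @ C
    assert np.max(np.abs(np.linalg.eigvals(AL))) < 1.0, "rho(A_L) < 1 is required"
    wbar_max = w_max + np.abs(L) @ nu_max      # |w - L nu| <= wbar_max componentwise
    bound = np.zeros_like(w_max, dtype=float)
    M = np.eye(A.shape[0])
    for _ in range(N):                          # truncated series sum_j |A_L^j| wbar_max
        bound += np.abs(M) @ wbar_max
        M = AL @ M
    return bound  # a rigorous bound scales a finite truncation by (1 - alpha)^{-1}; see Section 5.3.5

Only the observer update is needed online; the bound is a design-time check that {x̂} ⊕ Σ remains inside X.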

5.3.3 Controlling x̂

Since x̃(i) ∈ Σ for all i, we seek a method for controlling the observer
state x̂(i) in such a way that x(i) = x̂(i) + x̃(i) satisfies the state
constraint x(i) ∈ X for all i. The state constraint x(i) ∈ X will be
satisfied if we control the estimator state to satisfy x̂(i) ∈ X ⊖ Σ for all
i. The estimator state satisfies (5.12) which can be written in the form

x̂ + = Ax̂ + Bu + δ (5.13)

where the disturbance δ is defined by

δ := L(y − ŷ) = L(C x̃ + ν)

and, therefore, always lies in the C-set ∆ defined by

∆ := L(C Σ ⊕ N)

The problem of controlling x̂ is, therefore, the same as that of control-


ling an uncertain system with known state. This problem was exten-
sively discussed in Chapter 3. We can therefore use the approach of
Chapter 3 here with x̂ replacing x, δ replacing w, X ⊖ Σ replacing X
and ∆ replacing W.
To control (5.13) we use, as in Chapter 3, a combination of open-loop
and feedback control, i.e., we choose the control u as follows

u = ū + Ke e := x̂ − x̄ (5.14)

where x̄ is the state of a nominal (deterministic) system that we shall


shortly specify; ū is the feedforward component of the control u, and

Ke is the feedback component. The matrix K is chosen to satisfy


ρ(AK ) < 1 where AK := A + BK. The feedforward component ū of
the control u generates, as we show subsequently, a trajectory (x̄(i)),
which is the center of the tube in which the state estimator trajectory
(x̂(i)) lies. The feedback component Ke attempts to steer the trajec-
tory (x̂(i)) of the state estimate toward the center of the tube, and
thereby controls the cross section of the tube. The controller is dy-
namic since it incorporates the nominal dynamic system.
With this control, x̂ satisfies the following difference equation

x̂ + = Ax̂ + B ū + BKe + δ δ∈∆ (5.15)

The nominal (deterministic) system describing the evolution of x̄ is


obtained by neglecting the disturbances BKe and δ in (5.15) yielding

x̄ + = Ax̄ + B ū

The deviation e = x̂ − x̄ between the state x̂ of the estimator and the


state x̄ of the nominal system satisfies

e + = AK e + δ AK := A + BK (5.16)

The feedforward component ū of the control u generates the trajectory


(x̄(i)), which is the center of the tube in which the state estimator
trajectory (x̂(i)) lies. Because ∆ is a C-set and ρ(AK ) < 1, there exists
a robust positive invariant C-set S satisfying

AK S ⊕ ∆ = S

An immediate consequence is the following.

Proposition 5.3 (Proximity of state estimate and nominal state). If the


initial states of the estimator and nominal system, x̂(0) and x̄(0) re-
spectively, satisfy x̂(0) ∈ {x̄(0)} ⊕ S, then x̂(i) ∈ {x̄(i)} ⊕ S and
u(i) ∈ {ū(i)} ⊕ KS for all i ∈ I≥0 , and all admissible disturbance se-
quences w and ν.

It follows from Proposition 5.3 that the state estimator trajectory


x̂ remains in the tube ({x̄(i)} ⊕ S)i∈I≥0 and the control trajectory ū re-
mains in the tube ({ū(i)} ⊕ KS)i∈I≥0 provided that e(0) ∈ S. Hence,
from Propositions 5.2 and 5.3, the state trajectory x lies in the tube
({x̄(i)} ⊕ Γ)i∈I≥0 where Γ := S ⊕ Σ provided that x̃(0) = x(0) − x̂(0) ∈ Σ
and e(0) ∈ S. This information may be used to construct a robust out-
put feedback model predictive controller using the procedures outlined

in Chapter 3 for robust state feedback MPC of systems; the major dif-
ference is that we now control the estimator state x̂ and use the fact
that the actual state x lies in {x̂} ⊕ Σ.

5.3.4 Output MPC

Model predictive controllers now can be constructed as described in


Chapter 3, which dealt with robust control when the state was known.
There is an obvious difference in that we now are concerned with con-
trolling x̂ whereas, in Chapter 3, our concern was control of x. We
describe here the appropriate modification of the simple model predic-
tive controller presented in Section 3.5.2. We adopt the same procedure
of defining a nominal optimal control problem with tighter constraints
than in the original problem. The solution to this problem defines the
center of a tube in which solutions to the original system lie, and the
tighter constraints in the nominal problem ensure that the original con-
straints are satisfied by the actual system.
The nominal system is described by

x̄ + = Ax̄ + B ū (5.17)

The nominal optimal control problem is the minimization of the cost


function V̄N (x̄, ū) with
N−1
X
V̄N (x̄, ū) := ℓ(x̄(k), ū(k)) + Vf (x̄(N)) (5.18)
k=0

subject to satisfaction by the state and control sequences of (5.17) and


the tighter constraints

x̄(i) ∈ X̄ ⊆ X ⊖ Γ Γ := S ⊕ Σ (5.19)
ū(i) ∈ Ū ⊆ U ⊖ KS (5.20)

as well as a terminal constraint x̄(N) ∈ X̄f ⊆ X̄. Notice that Γ appears


in (5.19) whereas S, the set in which e = x̂− x̄ lies, appears in (5.20); this
differs from the case studied in Chapter 3 where the same set appears
in both equations. The sets W and N are assumed to be sufficiently
small to ensure satisfaction of the following condition.

Assumption 5.4 (Constraint bounds). Γ = S ⊕ Σ ⊆ X and KS ⊆ U.

If Assumption 5.4 holds, the sets on the right-hand side of (5.19)


and (5.20) are not empty; it can be seen from their definitions that the

sets Σ and S tend to the set {0} as W and N tend to the set {0} in the
sense that dH (W, {0}) → 0 and dH (N, {0}) → 0.
It follows from Propositions 5.2 and 5.3, if Assumption 5.4 holds,
that satisfaction of the constraints (5.19) and (5.20) by the nominal sys-
tem ensures satisfaction of the constraints (5.8) by the original system.
The nominal optimal control problem is, therefore
PN (x̄) : V̄N0 (x̄) = min{V̄N (x̄, ū) | ū ∈ ŪN (x̄)}

in which the constraint set ŪN (x̄) is defined by

ŪN (x̄) := {ū | ū(k) ∈ Ū and φ̄(k; x̄, ū) ∈ X̄ ∀k ∈ {0, 1, . . . , N − 1},
φ̄(N; x̄, ū) ∈ X̄f } (5.21)

In (5.21), X̄f ⊆ X̄ is the terminal constraint set, and φ̄(k; x̄, ū) denotes
the solution of x̄ + = Ax̄ + B ū at time k if the initial state at time 0 is x̄
and the control sequence is ū = (ū(0), ū(1), . . . , ū(N − 1)). The termi-
nal constraint, which is not desirable in process control applications,
may be omitted, as shown in Chapter 2, if the set of admissible initial
states is suitably restricted. Let ū0 (x̄) denote the minimizing control
sequence; the stage cost ℓ(·) is chosen to ensure uniqueness of ū0 (x̄).
The implicit model predictive control law for the nominal system is
κ̄N (·) defined by
κ̄N (x̄) := ū0 (0; x̄)
where ū0 (0; x̄) is the first element in the sequence ū0 (x̄). The domain
of V̄N0 (·) and ū0 (·), and, hence, of κ̄N (·), is X̄N defined by
X̄N := {x̄ ∈ X̄ | ŪN (x̄) ≠ ∅} (5.22)
X̄N is the set of initial states x̄ that can be steered to X̄f by an admis-
sible control ū that satisfies the state and control constraints, (5.19)
and (5.20), and the terminal constraint. From (5.14), the implicit con-
trol law for the state estimator x̂ + = Ax̂ + Bu + δ is κN (·) defined
by
κN (x̂, x̄) := κ̄N (x̄) + K(x̂ − x̄)
The controlled composite system with state (x̂, x̄) satisfies
x̂ + = Ax̂ + BκN (x̂, x̄) + δ (5.23)
x̄ + = Ax̄ + B κ̄N (x̄) (5.24)
with initial state (x̂(0), x̄(0)) satisfying x̂(0) ∈ {x̄(0)} ⊕ S, x̄(0) ∈
X̄N . These constraints are satisfied if x̄(0) = x̂(0) ∈ X̄N . The control
algorithm may be formally stated as follows.

Algorithm 5.5 (Robust control algorithm (linear constrained systems)).


First set i = 0, set x̂ = x̂(0), and set x̄ = x̂. Then repeat
1. At time i, solve the nominal optimal control problem P̄N (x̄) to ob-
tain the current nominal control action ū = κ̄N (x̄) and the control
u = ū + K(x̂ − x̄).

2. Apply the control u to the system being controlled.

3. Compute the successor state estimate x̂ + and the successor state of


the nominal system x̄ +

x̂ + = Ax̂ + Bu + L(y − C x̂) x̄ + = Ax̄ + B ū

4. Set (x̂, x̄) = (x̂ + , x̄ + ), set i = i + 1.
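
A minimal Python sketch of one iteration of Algorithm 5.5 follows. The function solve_nominal_ocp is a placeholder, not specified here, for a solver of the nominal problem P̄N (x̄) that returns the first nominal input κ̄N (x̄).

def output_mpc_step(xhat, xbar, y, A, B, C, K, L, solve_nominal_ocp):
    # Step 1: nominal control action and the applied control u = ubar + K(xhat - xbar)
    ubar = solve_nominal_ocp(xbar)
    u = ubar + K @ (xhat - xbar)
    # Step 2: u is applied to the plant; y is the resulting measurement
    # Step 3: successor state estimate and successor nominal state
    xhat_next = A @ xhat + B @ u + L @ (y - C @ xhat)
    xbar_next = A @ xbar + B @ ubar
    # Step 4: advance (xhat, xbar) and repeat
    return u, xhat_next, xbar_next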

If the terminal cost Vf (·) and terminal constraint set X̄f satisfy the
stability Assumption 2.14, and if Assumption 5.4 is satisfied, the value
function V̄N0 (·) satisfies

V̄N0 (x̄) ≥ ℓ(x̄, κ̄N (x̄)) ∀x̄ ∈ X̄N


V̄N0 (x̄) ≤ Vf (x̄) ∀x̄ ∈ X̄N
V̄N0 (f (x̄, κ̄N (x̄))) ≤ V̄N0 (x̄) − ℓ(x̄, κ̄N (x̄)) ∀x̄ ∈ X̄N

in which ∆V̄N0 (x̄) := V̄N0 (f (x̄, κ̄N (x̄))) − V̄N0 (x̄).


As shown in Section 3.5.3, if, in addition to Assumption 5.4

1. the stability Assumption 2.14 is satisfied,

2. ℓ(x̄, ū) = (1/2)(|x̄|2Q + |ū|2R ) where Q and R are positive definite,

3. Vf (x̄) = (1/2) |x̄|2Pf where Pf is positive definite, and

4. X̄N is a C-set,

then there exist positive constants c1 and c2 such that

V̄N0 (x̄) ≥ c1 |x̄|2        ∀x̄ ∈ X̄N
V̄N0 (x̄) ≤ c2 |x̄|2        ∀x̄ ∈ X̄N
V̄N0 (f (x̄, κ̄N (x̄))) ≤ V̄N0 (x̄) − c1 |x̄|2        ∀x̄ ∈ X̄N

It follows from Chapter 2 that the origin is exponentially stable for the
nominal system x̄ + = Ax̄ + B κ̄N (x̄) with a region of attraction X̄N so
that there exists a c > 0 and a γ ∈ (0, 1) such that

|x̄(i)| ≤ c |x̄(0)| γ^i

for all x̄(0) ∈ X̄N , all i ∈ I≥0 . Also x̄(i) ∈ X̄N for all i ∈ I≥0 if
x̄(0) ∈ X̄N so that problem PN (x̄(i)) is always feasible. Because the
state x̂(i) of the state estimator always lies in {x̄(i)} ⊕ S, and the state
x(i) of the system being controlled always lies in {x̄(i)} ⊕ Γ, it fol-
lows that x̂(i) converges robustly and exponentially fast to S, and x(i)
converges robustly and exponentially fast to Γ. We are now in a posi-
tion to establish exponential stability of A := S × {0} with a region of
attraction (X̄N ⊕ S) × X̄N for the composite system (5.23) and (5.24).

Proposition 5.6 (Exponential stability of output MPC). The set A :=


S × {0} is exponentially stable with a region of attraction (X̄N ⊕ S) × X̄N
for the composite system (5.23) and (5.24).

Proof. Let φ := (x̂, x̄) denote the state of the composite system. Then
|φ|A is defined by
|φ|A = |x̂|S + |x̄|
where |x̂|S := d(x̂, S). But x̂ ∈ {x̄} ⊕ S implies x̂ = x̄ + e for some e ∈ S
so that
|x̂|S = d(x̂, S) = d(x̄ + e, S) ≤ d(x̄ + e, e) = |x̄|
since e ∈ S. Hence |φ|A ≤ 2 |x̄| so that

|φ(i)|A ≤ 2 |x̄(i)| ≤ 2c |x̄(0)| γ^i ≤ 2c |φ(0)| γ^i

for all φ(0) ∈ (X̄N ⊕ S) × X̄N . Since for all x̄(0) ∈ X̄N , x̄(i) ∈ X̄ and
ū(i) ∈ Ū, it follows that x̂(i) ∈ {x̄(i)} ⊕ S, x(i) ∈ X, and u(i) ∈ U for
all i ∈ I≥0 . Thus A := S × {0} is exponentially stable with a region of
attraction (X̄N ⊕S)× X̄N for the composite system (5.23) and (5.24). ■

It follows from Proposition 5.6 that x(i), which lies in the set {x̄(i)}⊕
Γ, Γ := S ⊕ Σ, converges to the set Γ. In fact x(i) converges to a set
that is, in general, smaller than Γ since Γ is a conservative bound on
x̃(i) + e(i). We determine this smaller set as follows. Let φ := (x̃, e)
and let ψ := (w, ν); φ is the state of the two error systems and ψ is a
bounded disturbance lying in a C-set Ψ := W × N. Then, from (5.10) and
(5.16), the state φ evolves according to
φ+ = Ãφ + B̃ψ        (5.25)

where

    Ã := [ AL   0  ]        B̃ := [ I   −L ]
         [ LC   AK ]              [ 0    L ]

Because ρ(AL ) < 1 and ρ(AK ) < 1, it follows that ρ(Ã) < 1. Since
ρ(Ã) < 1 and Ψ is compact, there exists a robust positive invariant set
Φ ⊆ Rn × Rn for (5.25) satisfying

ÃΦ ⊕ B̃Ψ = Φ

Hence φ(i) ∈ Φ for all i ∈ I≥0 if φ(0) ∈ Φ. Since x(i) = x̄(i) +
e(i) + x̃(i), it follows that x(i) ∈ {x̄(i)} ⊕ HΦ, H := [In In ], for all
i ∈ I≥0 provided that x(0), x̂(0), and x̄(0) satisfy (x̃(0), e(0)) ∈ Φ
where x̃(0) = x(0) − x̂(0) and e(0) = x̂(0) − x̄(0). If these initial
conditions are satisfied, x(i) converges robustly and exponentially fast
to the set HΦ.
The remaining robust controllers presented in Section 3.5 may be
similarly modified to obtain a robust output model predictive con-
troller.

5.3.5 Computing the Tightened Constraints

The analysis above shows the tightened state and control constraint
sets X̄ and Ū for the nominal optimal control problem can, in principle,
be computed using set algebra. Polyhedral set computations are not
robust, however, and usually are limited to sets in Rn with n ≤ 15. So
we present here an alternative method for computing tightened con-
straints, similar to that described in 3.5.3.
We next show how to obtain a conservative approximation to X̄ ⊆
X ⊖ Γ, Γ = S ⊕ Σ. Suppose c ′ x ≤ d is one of the constraints defining X.
Since e = x̂ − x̄, which lies in S, and x̃ = x − x̂, which lies in Σ, satisfy
e+ = AK e + LC x̃ + Lν and x̃ + = AL x̃ + w − Lν, the constraint c ′ x ≤ d
(one of the constraints defining X) has as its corresponding constraint in X̄
c ′ x ≤ d − φX̄∞, in which

    φX̄∞ = max{c ′ e | e ∈ S} + max{c ′ x̃ | x̃ ∈ Σ}
         = max_{(w(i),ν(i))} ∑_{j=0}^{∞} c ′ AK^j (LC x̃(j) + Lν(j)) + max_{(w(i),ν(i))} ∑_{j=0}^{∞} c ′ AL^j (w(j) − Lν(j))

in which x̃(j) = ∑_{i=0}^{j−1} AL^i (w(i) − Lν(i)). The maximizations are subject
to the constraints w(i) ∈ W and ν(i) ∈ N for all i ∈ I≥0 . Because

maximization over an infinite horizon is impractical, we determine, as
in 3.5.3, a horizon N ∈ I≥0 and an α ∈ (0, 1) such that AK^N W ⊂ αW
and AL^N N ⊂ αN, and define the constraint in X̄ corresponding to the
constraint c ′ x ≤ d in X to be c ′ x ≤ d − (1 − α)−1 φX̄N with

    φX̄N = max_{(w(i),ν(i))} ∑_{j=0}^{N−1} c ′ AK^j (LC x̃(j) + Lν(j)) + max_{(w(i),ν(i))} ∑_{j=0}^{N−1} c ′ AL^j (w(j) − Lν(j))

The tightened constraints yielding a conservative approximation to Ū :=


U ⊖ KS may be similarly computed. The constraint c ′ u ≤ d, one of the
constraints defining U, should be replaced by c ′ u ≤ d − (1 − α)−1 φŪN
with
    φŪN = max{c ′ e | e ∈ KS} = max_{(w(i),ν(i))} ∑_{j=0}^{N−1} c ′ KAK^j (LC x̃(j) + Lν(j))


The maximizations for computing φX̄N and φŪN are subject to the con-
straints w(i) ∈ W and ν(i) ∈ N for all i ∈ I≥0 .
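
When W and N are boxes, the maximizations above decompose term by term and each reduces to a weighted 1-norm; the following Python sketch computes φX̄N for a single state constraint c ′ x ≤ d under that box assumption (for general polytopes each term becomes a small linear program). The constraint row c is a 1-D NumPy array, and the result is used as c ′ x ≤ d − (1 − α)−1 φX̄N.

import numpy as np

def phi_N_state(c, A, B, C, K, L, N, w_max, nu_max):
    # Margin phi_N for one state constraint c'x <= d, assuming box bounds
    # |w_i| <= w_max_i and |nu_i| <= nu_max_i (an assumption of this sketch).
    AK, AL = A + B @ K, A - L @ C
    n = A.shape[0]
    # First maximization: sum_j c' AK^j (L C xtilde(j) + L nu(j)) with
    # xtilde(j) = sum_{i<j} AL^i (w(i) - L nu(i)); collect the coefficient of
    # each w(i) and nu(i), then evaluate the box support function.
    coeff_w = np.zeros((N, n))
    coeff_nu = np.zeros((N, nu_max.size))
    AKj = np.eye(n)
    for j in range(N):
        cj = c @ AKj                     # row vector c' AK^j
        coeff_nu[j] += cj @ L            # direct L nu(j) term
        ALi = np.eye(n)
        for i in range(j):
            coeff_w[i] += cj @ L @ C @ ALi
            coeff_nu[i] -= cj @ L @ C @ ALi @ L
            ALi = AL @ ALi
        AKj = AK @ AKj
    term1 = sum(np.abs(coeff_w[i]) @ w_max + np.abs(coeff_nu[i]) @ nu_max
                for i in range(N))
    # Second maximization: sum_j c' AL^j (w(j) - L nu(j))
    term2, ALj = 0.0, np.eye(n)
    for j in range(N):
        cj = c @ ALj
        term2 += np.abs(cj) @ w_max + np.abs(cj @ L) @ nu_max
        ALj = AL @ ALj
    return term1 + term2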

5.4 Linear Constrained Systems: Time-Varying Case


The time-invariant case corresponds to the “steady-state” situation in
which the sets S(t) and Σ(t) have settled down to their steady-state
values S and Σ, respectively. As a result the constraint sets X̄ and Ū
are also time invariant. When the state is accessible, the constraint
x ∈ X̄(i) := X ⊖ S(i) is less conservative than x ∈ X̄ = X ⊖ S, in which
S = S(∞). This relaxation of the constraint may be useful in some
applications. The version of tube-based MPC employed here is such
that S(t + 1) ⊃ S(t) for all t so that S(t) converges to S(∞) as t → ∞.
In other versions of tube-based MPC, in which PN (x) rather PN (x̄) is
solved online, S(t) is reset to the empty set so that advantage in using
S(t) rather than S(∞) is larger. On the other hand, the state estimation
set Σ(t) may increase or decrease with t depending on prior informa-
tion. The time-varying version of tube-based MPC is fully discussed in
Mayne, Raković, Findeisen, and Allgöwer (2009).

5.5 Offset-Free MPC


Offset-free MPC was introduced in Chapters 1 and 2 in a deterministic
context; see also Pannocchia and Rawlings (2003). Suppose the system

Set    Definition                                                      Membership
X      state constraint set                                            x ∈ X
U      input constraint set                                            u ∈ U
Wx     state disturbance set                                           wx ∈ Wx
Wd     integrating disturbance set                                     wd ∈ Wd
W      total state disturbance set, Wx × Wd                            w ∈ W
N      measurement error set                                           ν ∈ N
W̃      estimate error disturbance set, W ⊕ (−LN)                       w̃ ∈ W̃
Φ      total estimate error disturbance set, Φ = ÃL Φ ⊕ W̃              φ̃ ∈ Φ
Σx     state estimate error disturbance set, [In 0]Φ                   x̃ ∈ Σx
Σd     integrating disturbance estimate error set, [0 Ip ]Φ            d̃ ∈ Σd
∆      innovation set, L(C̃Φ ⊕ N)                                       Lỹ ∈ ∆
∆x     set containing state component of innovation, Lx (C̃Φ ⊕ N)       Lx ỹ ∈ ∆x
∆d     set containing integrating disturbance component of
       innovation, Ld (C̃Φ ⊕ N)                                         Ld ỹ ∈ ∆d
S      nominal state tracking error invariance set, AK S ⊕ ∆x = S      e ∈ S, x̂ ∈ {x̄} + S
Γ      state tracking error invariance set, S + Σx                     x ∈ {x̄} + Γ
Ū      nominal input constraint set, Ū = U ⊖ KS                        ū ∈ Ū
X̄      nominal state constraint set, X̄ = X ⊖ Γ                         x̄ ∈ X̄

Table 5.1: Summary of the sets and variables used in output MPC.

to be controlled is described by

x + = Ax + Bd d + Bu + wx
y = Cx + Cd d + ν
r = Hy re = r − r̄

in which wx and ν are unknown bounded disturbances taking values,


respectively, in the compact sets Wx and N containing the origin in
their interiors. In the following discussion, y = Cx + Cd d is the output
of the system being controlled, r is the controlled variable, and r̄ is
its setpoint. The variable r̃ is the tracking error that we wish to mini-
mize. We assume d is nearly constant but drifts slowly, and model its

behavior by
d+ = d + wd
where wd is a bounded disturbance taking values in the compact set
Wd ; in practice d is bounded, although this is not implied by our model.
We assume that x ∈ Rn , d ∈ Rp , u ∈ Rm , y ∈ Rr , and e ∈ Rq , q ≤ r ,
and that the system to be controlled is subject to the usual state and
control constraints
x∈X u∈U
We assume X is polyhedral and U is polytopic.
Given the many sets that are required to specify the output feedback
case we are about to develop, Table 5.1 may serve as a reference for the
sets defined in the chapter and the variables that are members of these
sets.

5.5.1 Estimation

Since both x and d are unknown, it is necessary to estimate them.


For estimation purposes, it is convenient to work with the composite
system whose state is φ := (x, d). This system may be described more
compactly by

φ+ = Ã φ + B̃ u + w
y = C̃ φ + ν

in which w := (wx, wd) takes values in W = Wx × Wd and

Ã := [A  Bd; 0  I]        B̃ := [B; 0]        C̃ := [C  Cd]

A necessary and sufficient condition for the detectability of (Ã, C̃) is given in Lemma 1.8. A sufficient condition is detectability of (A, C), coupled with invertibility of Cd. If (Ã, C̃) is detectable, the state may be estimated using the
time-invariant observer or filter described by

φ̂+ = Ã φ̂ + B̃ u + δ        δ := L(y − C̃ φ̂)

in which L is such that ρ(ÃL) < 1 where ÃL := Ã − LC̃. Clearly δ = Lỹ where ỹ = C̃ φ̃ + ν. The estimation error φ̃ := φ − φ̂ satisfies

φ̃+ = Ã φ̃ + w − L(C̃ φ̃ + ν)

or, in simpler form

φ̃+ = ÃL φ̃ + w̃        w̃ := w − Lν

Clearly w̃ = w − Lν takes values in the compact set W̃ defined by

W̃ := W ⊕ (−LN)
If w and ν are zero, φ̃ decays to zero exponentially fast so that x̂ → x and d̂ → d exponentially fast. Since ρ(ÃL) < 1 and W̃ is compact, there exists a robust positive invariant set Φ for φ̃+ = ÃL φ̃ + w̃, w̃ ∈ W̃, satisfying

Φ = ÃL Φ ⊕ W̃

Hence φ̃(i) ∈ Φ for all i ∈ I≥0 if φ̃(0) ∈ Φ. Since φ̃ = (x̃, d̃) ∈ Rn × Rp where x̃ := x − x̂ and d̃ := d − d̂, we define the sets Σx and Σd as follows

Σx := [In  0] Φ        Σd := [0  Ip] Φ

It follows that x̃(i) ∈ Σx and d̃(i) ∈ Σd so that x(i) ∈ {x̂(i)} ⊕ Σx and d(i) ∈ {d̂(i)} ⊕ Σd for all i ∈ I≥0 if φ̃(0) = (x̃(0), d̃(0)) ∈ Φ. That φ̃(0) ∈ Φ is a steady-state assumption.
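For readers who wish to experiment with this estimator, the following short sketch (Python/NumPy/SciPy) builds the composite matrices Ã, B̃, C̃, checks detectability of (Ã, C̃) with a PBH test, and computes a stabilizing gain L as a steady-state Kalman predictor gain. The covariances Qw and Rv are tuning parameters introduced here only for illustration; they are not quantities defined in the text.

import numpy as np
from scipy.linalg import solve_discrete_are

def augmented_matrices(A, B, Bd, C, Cd):
    # composite matrices for the state phi = (x, d)
    n, m = B.shape
    p, nd = Cd.shape
    Atil = np.block([[A, Bd], [np.zeros((nd, n)), np.eye(nd)]])
    Btil = np.vstack([B, np.zeros((nd, m))])
    Ctil = np.hstack([C, Cd])
    return Atil, Btil, Ctil

def is_detectable(Atil, Ctil, tol=1e-9):
    # PBH test: every eigenvalue on or outside the unit circle must be observable
    n = Atil.shape[0]
    for lam in np.linalg.eigvals(Atil):
        if abs(lam) >= 1 - tol:
            M = np.vstack([lam * np.eye(n) - Atil, Ctil])
            if np.linalg.matrix_rank(M, tol) < n:
                return False
    return True

def observer_gain(Atil, Ctil, Qw, Rv):
    # steady-state Kalman predictor gain; rho(Atil - L Ctil) < 1 under the
    # usual detectability/stabilizability conditions
    P = solve_discrete_are(Atil.T, Ctil.T, Qw, Rv)
    return Atil @ P @ Ctil.T @ np.linalg.inv(Ctil @ P @ Ctil.T + Rv)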

5.5.2 Control

The estimation problem has a solution similar to previous solutions.


The control problem is more difficult. As before, we control the esti-
mator state, making allowance for state estimation error. The estimator
state φ̂ satisfies the difference equation
φ̂+ = Ã φ̂ + B̃ u + δ

where the disturbance δ is defined by

δ := Lỹ = L(C̃ φ̃ + ν)

The disturbance δ = (δx, δd) lies in the C-set ∆ defined by

∆ := L(C̃ Φ ⊕ N)

where the set Φ is defined in Section 5.5.1. The system φ̂+ = Ã φ̂ + B̃ u + δ is not stabilizable, however, so we examine the subsystems with states x̂ and d̂

x̂+ = Ax̂ + Bd d̂ + Bu + δx
d̂+ = d̂ + δd

where the disturbances δx and δd are components of δ = (δx, δd) and are defined by

δx := Lx ỹ = Lx (C̃ φ̃ + ν)        δd := Ld ỹ = Ld (C̃ φ̃ + ν)

The matrices Lx and Ld are the corresponding components of L. The disturbances δx and δd lie in the C-sets ∆x and ∆d defined by

∆x := [In  0] ∆ = Lx [C̃ Φ ⊕ N]        ∆d := [0  Ip] ∆ = Ld [C̃ Φ ⊕ N]

We assume that (A, B) is a stabilizable pair so the tube methodology


may be employed to control x̂. The system d̂+ = d̂ + δd is uncon-
trollable. The central trajectory is therefore chosen to be the nominal
version of the difference equation for (x̂, d̂) and is described by

x̄ + = Ax̄ + Bd d̂ + B ū
d̄+ = d̄

in which the initial state is (x̂, d̂). We obtain ū = κ̄N (x̄, d̄, r̄ ) by solving
a nominal optimal control problem defined later and set u = ū + Ke,
e := x̂ − x̄ where K is chosen so that ρ(AK ) < 1, AK := A + BK; this
is possible since (A, B) is assumed to be stabilizable. It follows that
e := x̂ − x̄ satisfies the difference equation

e+ = AK e + δx δx ∈ ∆x

Because ∆x is compact and ρ(AK ) < 1, there exists a robust positive


invariant set S for e+ = AK e + δx , δx ∈ ∆x satisfying

AK S ⊕ ∆x = S

Hence e(i) ∈ S for all i ∈ I≥0 if e(0) ∈ S. So, as in Proposition 5.3, the
states and controls of the estimator and nominal system satisfy x̂(i) ∈
{x̄(i)} ⊕ S and u(i) ∈ {ū(i)} ⊕ KS for all i ∈ I≥0 if the initial states
x̂(0) and x̄(0) satisfy x̂(0) ∈ {x̄(0)} ⊕ S. Using the fact established previously that x̃(i) ∈ Σx for all i, we can also conclude that x(i) = x̄(i) + e(i) + x̃(i) ∈ {x̄(i)} ⊕ Γ and that u(i) = ū(i) + Ke(i) ∈ {ū(i)} ⊕ KS for all i where Γ := S ⊕ Σx provided, of course, that φ(0) ∈ {φ̂(0)} ⊕ Φ and x̂(0) ∈ {x̄(0)} ⊕ S. These conditions are equivalent to φ̃(0) ∈ Φ and e(0) ∈ S where, for all i, e(i) := x̂(i) − x̄(i). Hence x(i) lies in X and u(i) lies in U if x̄(i) ∈ X̄ := X ⊖ Γ and ū(i) ∈ Ū := U ⊖ KS.
Thus x̂(i) and x(i) evolve in known neighborhoods of the central
state x̄(i) that we can control. Although we know that the uncontrol-
lable state d(i) lies in the set {d̂(i)} ⊕ iΣd for all i, the evolution of d̂(i)

is an uncontrollable random walk and is, therefore, unbounded. If the


initial value of d̂ at time 0 is d̂0 , then d̂(i) lies in the set {d̂0 } ⊕ iΣd that
increases without bound as i increases. This behavior is a defect in our
model for the disturbance d; the model is useful for estimation pur-
poses, but is unrealistic in permitting unbounded values for d. Hence
we assume in the sequel that d evolves in a compact C−set Xd . We can
modify the observer to ensure that d̂ lies in Xd , but find it simpler to
observe that if d lies in Xd , d̂ must lie in Xd ⊕ Σd .
Target Calculation. Our first task is to determine the target state x̄s
and associated control ūs ; we require our estimate of the tracking error
re = r − r̄ to be zero in the absence of any disturbances. We follow
the procedure outlined in Pannocchia and Rawlings (2003). Since our
estimate of the measurement noise ν is 0 and since our best estimate
of d when the target state is reached is d̂, we require

r̂ − r̄ = H(C x̄s + Cd d̂) − r̄ = 0

We also require the target state to be an equilibrium state satisfying,


therefore, x̄s = Ax̄s + Bd d̂ + B ūs for some control ūs . Given (d̂, r̄ ), the
target equilibrium pair (x̄s , ūs )(d̂, r̄ ) is computed as follows

(x̄s , ūs )(d̂, r̄ ) = arg min_{x̄,ū} {L(x̄, ū) | x̄ = Ax̄ + Bd d̂ + B ū, H(C x̄ + Cd d̂) = r̄ , x̄ ∈ X̄, ū ∈ Ū}

where L(·) is an appropriate cost function; e.g., L(x̄, ū) = (1/2)|ū|²_R̄. The equality constraints in this optimization problem can be satisfied if the matrix [I − A  −B; HC  0] has full rank. As the notation indicates, the
target equilibrium pair (x̄s , ūs )(d̂, r̄ ) is not constant, but varies with
the estimate of the disturbance state d.
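As an illustration, the target calculation is a small QP. The following sketch (Python/cvxpy) computes (x̄s, ūs)(d̂, r̄) for the cost L(x̄, ū) = (1/2)|ū|²_R̄; the polyhedral constraints x̄ ∈ X̄ and ū ∈ Ū are omitted only to keep the sketch short, and all matrices are placeholders supplied by the user.

import cvxpy as cp

def target_pair(A, B, Bd, C, Cd, H, Rbar, dhat, rbar):
    # target equilibrium pair (xbars, ubars) for a given disturbance estimate dhat
    # and setpoint rbar; state and input constraints omitted for brevity
    n, m = B.shape
    xbar = cp.Variable(n)
    ubar = cp.Variable(m)
    cost = 0.5 * cp.quad_form(ubar, Rbar)
    cons = [xbar == A @ xbar + Bd @ dhat + B @ ubar,
            H @ (C @ xbar + Cd @ dhat) == rbar]
    cp.Problem(cp.Minimize(cost), cons).solve()
    return xbar.value, ubar.value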
MPC algorithm. The control objective is to steer the central state x̄
to a small neighborhood of the target state x̄s (d̂, r̄ ) while satisfying
the state and control constraints x ∈ X and u ∈ U. It is desirable
that x̄(i) converges to x̄s (d̂, r̄ ) if d̂ remains constant, in which case
x(i) converges to the set {x̄s (d̂, r̄ )} ⊕ Γ. We are now in a position to
specify the optimal control problem whose solution yields ū = κ̄N (x̄, d̂, r̄ ) and, hence, u = ū + K(x̂ − x̄). To achieve this objective, we define
the deterministic optimal control problem

P̄N (x̄, d̂, r̄ ) : VN0 (x̄, d̂, r̄ ) := min{VN (x̄, d̂, r̄ , ū) | ū ∈ ŪN (x̄, d̂, r̄ )}


in which the cost VN (·) and the constraint set ŪN (x̄, d̂, r̄ ) are defined
by

VN (x̄, d̂, r̄, ū) := Σ_{i=0}^{N−1} ℓ(x̄(i) − x̄s (d̂, r̄ ), ū(i) − ūs (d̂, r̄ )) + Vf (x̄(N), x̄s (d̂, r̄ ))

ŪN (x̄, d̂, r̄ ) := {ū | x̄(i) ∈ X̄, ū(i) ∈ Ū ∀i ∈ I0:N−1 , x̄(N) ∈ X̄f (x̄s (d̂, r̄ ))}

and, for each i, x̄(i) = φ̄(i; x̄, d̂, ū), the solution of x̄ + = Ax̄ + Bd d̂ + B ū
when the initial state is x̄, the control sequence is ū, and the disturbance
d̂ is constant, i.e., satisfies the nominal difference equation d̂+ = d̂. The
set of feasible (x̄, d̂, r̄ ) and the set of feasible states x̄ for P̄N (x̄, d̂, r̄ )
are defined by

F̄N := {(x̄, d̂, r̄ ) | ŪN (x̄, d̂, r̄ ) ≠ ∅}        X̄N (d̂, r̄ ) := {x̄ | (x̄, d̂, r̄ ) ∈ F̄N }

The terminal cost is zero when the terminal state is equal to the target
state. The solution to P̄N (x̄, d̂, r̄ ) is

ū0 (x̄, d̂, r̄ ) = {ū0 (0; x̄, d̂, r̄ ), ū0 (1; x̄, d̂, r̄ ), . . . , ū0 (N − 1; x̄, d̂, r̄ )}

and the implicit model control law κ̄N (·) is defined by

κ̄N (x̄, d̂, r̄ ) := ū0 (0; x̄, d̂, r̄ )

where ū0 (0; x̄, d̂, r̄ ) is the first element in the sequence ū0 (x̄, d̂, r̄ ). The
control u applied to the plant and the observer is u = κN (x̂, x̄, d̂, r̄ )
where κN (·) is defined by

κN (x̂, x̄, d̂, r̄ ) := κ̄N (x̄, d̂, r̄ ) + K(x̂ − x̄)

Although the optimal control problem P̄N (x̄, d̂, r̄ ) is deterministic, d̂ is


random, so that the sequence (x̄(i)), which satisfies x̄ + = Ax̄ + Bd d̂ +
B κ̄N (x̄, d̂, r̄ ), is random, unlike the case discussed in Chapter 3. The
control algorithm may now be formally stated.

Algorithm 5.7 (Robust control algorithm (offset-free MPC)).


1. At time 0, set i = 0, set φ̂ = φ̂(0) (φ̂ = (x̂, d̂)), and set x̄ = x̂.

2. At time i, solve the “nominal” optimal control problem P̄N (x̄, d̂, r̄ )
to obtain the current “nominal” control action ū = κ̄N (x̄, d̂, r̄ ) and the
control action u = ū + K(x̂ − x̄).

3. If P̄N (x̄, d̂, r̄ ) is infeasible, adopt safety/recovery procedure.

4. Apply the control u to the system being controlled.


5. Compute the successor state estimate φ̂+ = Ã φ̂ + B̃ u + L(y − C̃ φ̂).

6. Compute the successor state x̄ + = Ax̄ + Bd d̂ + B ū of the nominal


system.

7. Set (φ̂, x̄) = (φ̂+ , x̄ + ), set i = i + 1.

In normal operation, Step 3 is not activated; Propositions 5.2 and 5.3 ensure that the constraints x̂ ∈ {x̄} ⊕ S and u ∈ {ū} ⊕ KS are satisfied. If an unanticipated event occurs and Step 3 is activated, the controller can be reinitialized by setting ū = κ̄N (x̂, d̂, r̄ ), setting u = ū, and relaxing constraints if necessary.
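A schematic implementation of one cycle of Algorithm 5.7 is sketched below (Python/cvxpy). For brevity the terminal pair (Vf, X̄f) is replaced by the terminal equality constraint x̄(N) = x̄s, which is one common simplification rather than the choice made above; the tightened sets X̄ and Ū are assumed to be given as polyhedra Gx x ≤ gx and Gu u ≤ gu, and the gains K and the target pair (x̄s, ūs) as well as all model matrices are placeholders. The observer update (Step 5) is performed separately with the gain L from Section 5.5.1.

import cvxpy as cp

def offset_free_mpc_step(A, B, Bd, K, Q, R, N,
                         Gx, gx, Gu, gu, xbar, dhat, xhat, xbars, ubars):
    # one cycle of Steps 2, 4, and 6 of Algorithm 5.7 (nominal problem, plant
    # input, nominal state update); terminal constraint simplified to equality
    n, m = B.shape
    X = cp.Variable((n, N + 1))
    U = cp.Variable((m, N))
    cost = 0
    cons = [X[:, 0] == xbar]
    for i in range(N):
        cost += 0.5 * (cp.quad_form(X[:, i] - xbars, Q)
                       + cp.quad_form(U[:, i] - ubars, R))
        cons += [X[:, i + 1] == A @ X[:, i] + Bd @ dhat + B @ U[:, i],
                 Gx @ X[:, i] <= gx, Gu @ U[:, i] <= gu]
    cons += [X[:, N] == xbars]                      # simplified terminal condition
    cp.Problem(cp.Minimize(cost), cons).solve()
    ubar0 = U.value[:, 0]
    u = ubar0 + K @ (xhat - xbar)                   # control applied to the plant
    xbar_next = A @ xbar + Bd @ dhat + B @ ubar0    # nominal state update
    return u, xbar_next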

5.5.3 Convergence Analysis

We give here an informal discussion of the stability properties of the


controller. The controller described above is motivated by the follow-
ing consideration: nominal MPC is able to handle “slow” uncertainties
such as the drift of a target point. “Fast” uncertainties, however, are
better handled by the tube controller that generates, using MPC, a suit-
able central trajectory and a “fast” ancillary controller to steer trajec-
tories of the uncertain system toward the central trajectory. As shown
above, the controller ensures that x(i) ∈ {x̄(i)} ⊕ Γ for all i; its suc-
cess therefore depends on the ability of the controlled nominal system
x̄ + = Ax̄ + Bd d̂ + B κ̄N (x̄, d̂, r̄ ), ū = κ̄N (x̄, d̂, r̄ ), to track the target
x̄s (d̂, r̄ ) that varies as d̂ evolves.
Assuming that the standard stability assumptions are satisfied for
the nominal optimal control problem P̄N (x̄, d̂, r̄ ) defined above, we
have
VN0 (x̄, d̂, r̄ ) ≥ c1 |x̄ − x̄s (d̂, r̄ )|²
VN0 (x̄, d̂, r̄ ) ≤ c2 |x̄ − x̄s (d̂, r̄ )|²
VN0 (x̄+, d̂, r̄ ) ≤ VN0 (x̄, d̂, r̄ ) − c1 |x̄ − x̄s (d̂, r̄ )|²

with x̄ + = Ax̄ + Bd d̂ + B κ̄N (x̄, d̂, r̄ ), for all (x̄, d̂, r̄ ) ∈ F̄N . The first and
last inequalities follow from our assumptions; we assume the existence
of the upper bound in the second inequality. The inequalities hold for
all (x̄, d̂, r̄ ) ∈ F̄N . Note that the last inequality does NOT ensure VN0 (x̄+, d̂+, r̄ ) ≤ VN0 (x̄, d̂, r̄ ) − c1 |x̄ − x̄s (d̂, r̄ )|² with x̄+ = Ax̄ + Bd d̂ + B κ̄N (x̄, d̂, r̄ ) and d̂+ := d̂ + δd . The perturbation due to δd has to be taken into account when analyzing stability.
Constant d̂. If d̂ remains constant, x̄s (d̂, r̄ ) is exponentially stable for
x̄ + = Ax̄ + Bd d̂ + B κ̄N (x̄, d̂, r̄ ) with a region of attraction X̄N (d̂, r̄ ). It
can be shown, as in the proof of Proposition 5.6, that the set A(d̂, r̄ ) :=
({x̄s (d̂, r̄ )} ⊕ S) × {x̄s (d̂, r̄ )} is exponentially stable for the composite
system x̂ + = Ax̂ + Bd d̂ + BκN (x̂, x̄, d̂, r̄ ) + δx , x̄ + = Ax̄ + Bd d̂ + B κ̄N (x̄,
d̂), δx ∈ ∆x , with a region of attraction (X̄N (d̂, r̄ ) ⊕ S) × X̄N (d̂, r̄ ).
Hence x(i) ∈ {x̄(i)} ⊕ Γ tends to the set {x̄s (d̂, r̄ )} ⊕ Γ as i → ∞. If, in
addition, W = {0} and N = {0}, then ∆ = {0} and Γ = Σ = S = {0} so that x(i) → x̄s (d, r̄ ) and r̃(i) → 0 as i → ∞.
Slowly varying d̂. If d̂ is varying, the descent property of VN0 (·) is modified and it is necessary to obtain an upper bound for VN0 (Ax̄ + Bd (d̂ + δd ) + B κ̄N (x̄, d̂, r̄ ), d̂ + δd , r̄ ). We make use of Proposition 3.4 in Chapter 3. If X̄N is compact and if (d̂, r̄ ) ↦ x̄s (d̂, r̄ ) and (d̂, r̄ ) ↦ ūs (d̂, r̄ ) are both continuous in some compact domain, then, since VN (·) is then continuous in a compact domain A, it follows from the properties of VN0 (·) and Proposition 3.4 that there exists a K∞ function α(·) such that

VN0 (x̄, d̂, r̄ ) ≥ c1 |x̄ − x̄s (d̂, r̄ )|²
VN0 (x̄, d̂, r̄ ) ≤ c2 |x̄ − x̄s (d̂, r̄ )|²
VN0 (x̄+, d̂+, r̄ ) ≤ VN0 (x̄, d̂, r̄ ) − c1 |x̄ − x̄s (d̂, r̄ )|² + α(δd )

for all (x̄, d̂, δd , r̄ ) ∈ V ; here (x̄, d̂)+ := (x̄+, d̂+), x̄+ = Ax̄ + Bd (d̂ + δd ) + B κ̄N (x̄, d̂, r̄ ) and d̂+ = d̂ + δd . A suitable choice for A is V × D × {r̄ } × UN with V the closure of lev_a VN0 (·) for some a > 0, and D a
compact set containing d and d̂. It follows that there exists a γ ∈ (0, 1)
such that
VN0 ((x̄, d̂)+ , r̄ ) ≤ γVN0 (x̄, d̂, r̄ ) + α(δd )
with γ = 1 − c1 /c2 ∈ (0, 1). Assuming that PN (x̄, d̂, r̄ ) is recursively
feasible
VN0 (x̄(i), d̂(i), r̄ ) ≤ γ i VN0 (x̄(0), d̂(0), r̄ ) + α(δd )(1 − γ i )/(1 − γ)

in which x̄(0) = x(0) and d̂(0) = d(0). It then follows from the last inequality and the bounds on VN0 (·) that

|x̄(i) − x̄s (d̂(i), r̄ )| ≤ γ^{i/2} (c2 /c1 )^{1/2} |x̄(0) − x̄s (d̂(0), r̄ )| + c(i)

with c(i) := [α(δd )(1 − γ^i )/(1 − γ)]^{1/2} so that c(i) → c := [α(δd )/(1 − γ)]^{1/2} and |x̄(i) − x̄s (d̂(i), r̄ )| → c as i → ∞. Here we have made use of the fact that (a + b)^{1/2} ≤ a^{1/2} + b^{1/2}.
Let C ⊂ Rn denote the set {x | |x| ≤ c}. Then x̄(i) → {x̄s (d̂(i),
r̄ )}⊕C, x̂(i) → {x̄s (d̂(i), r̄ )}⊕C ⊕S and x(i) → {x̄s (d̂(i), r̄ )}⊕C ⊕S ⊕Σ
as i → ∞. Since c(i) = [α(δd )(1 − γ i )/(1 − γ)]1/2 → 0 as δd → 0,
it follows that x̄(i) → x̄s (d̂(i), r̄ ) as i → ∞. The sizes of S and Σ
are dictated by the process and measurement disturbances, w and ν
respectively.
Recursive feasibility. The result that x(i) → {x̄s (d̂(i), r̄ )} ⊕ C ⊕ Γ ,
Γ := S⊕Σ, is useful because it gives an asymptotic bound on the tracking
error. But it does depend on the recursive feasibility of the optimal
control problem PN (·), which does not necessarily hold because of the
variation of d̂ with time. Tracking of a random reference signal has
been considered in the literature, but not in the context of output MPC.
We show next that PN (·) is recursively feasible and that the tracking
error remains bounded if the estimate d̂ of the disturbance d varies
sufficiently slowly—that is if δd in the difference equation d̂+ = d̂+δd is
sufficiently small. This can be ensured by design of the state estimator.
To establish recursive feasibility, assume that the current “state” is
(x̄, d̂, r̄ ) and x̄ ∈ X̄(d̂, r̄ ). In other words, we assume P̄N (x̄, d̂, r̄ ) is
feasible and x̄N := φ̄(N; x̄, κ̄N (x̄, d̂, r̄ )) ∈ Xf (x̄s (d̂, r̄ )). If the usual
stability conditions are satisfied, problem PN (x̄ + , d̂, r̄ ) is also feasible
so that x̄ + = Ax̄ + Bd d̂ + B κ̄N (x̄, d̂, r̄ ) ∈ X̄N (d̂, r̄ ). But d̂+ = d̂ + δd
so that PN (x̄ + , d̂+ , r̄ ) is not necessarily feasible since x̄N , which lies in
Xf (x̄s (d̂, r̄ )), does not necessarily lie in Xf (x̄s (d̂+ , r̄ )). Let the terminal
set Xf (x̄s (d̂, r̄ )) := {x | Vf (x − x̄s (d̂, r̄ )) ≤ c}. If the usual stability
conditions are satisfied, for each x̄N ∈ Xf (x̄s (d̂, r̄ )), there exists a u = κf (x̄N ) that steers x̄N to a state x̄N+ in {x | Vf (x − x̄s (d̂, r̄ )) ≤ e}, e < c. Consequently, there exists a feasible control sequence ũ(x̄) ∈ ŪN (x̄, d̂, r̄ ) that steers x̄+ to a state x̄N+ ∈ {x | Vf (x − x̄s (d̂, r̄ )) ≤ e}. If the map d̂ ↦ x̄s (d̂, r̄ ) is uniformly continuous, there exists a constant a > 0 such that |δd | ≤ a implies that x̄N+ lies also in Xf (x̄s (d̂+, r̄ )) = {x | Vf (x − x̄s (d̂+, r̄ )) ≤ c}. Thus the control sequence ũ(x̄) also steers x̄+ to the set Xf (x̄s (d̂+, r̄ )) and hence lies in ŪN (x̄, d̂+, r̄ ). Hence

problem P̄N (x̄+, d̂+, r̄ ) is feasible so that P̄N is recursively feasible if sup_{i∈I0:∞} |δd (i)| ≤ a.

Computing the tightened constraints. The first step in the control


algorithm requires solution of the problem PN (x̄, d̂, r̄ ), in which the
state and control constraints are, respectively, x̄ ∈ X̄ and ū ∈ Ū. Since
the sets X̄ and Ū are difficult to compute, we replace them by tightened
versions of the original constraints as described in Section 5.3.5.

Summarizing, if the usual stability assumptions are satisfied, if d̂(i) remains in a compact set Xd̂ for all i, if the map d̂ ↦ x̄s (d̂, r̄ ) is continuous in Xd̂ , if ℓ(·) and Vf (·) are quadratic and positive definite, and |δd (i)| ≤ a for all i, then the asymptotic error x(i) − x̄s (d̂(i), r̄ ) lies in the compact set C ⊕ Γ (Γ = S ⊕ Σ) that converges to the set {0} as the sets W and N that bound the disturbances converge to the zero set {0}. Similarly, the tracking error r − r̄ is also bounded and converges to 0 as W and N converge to the zero set {0}.

5.6 Nonlinear Constrained Systems


When the system being controlled is nonlinear, the state can be esti-
mated using moving horizon estimation (MHE), as described in Chap-
ter 4. But establishing stability of nonlinear output MPC that employs
MHE does not appear to have received much attention, with one impor-
tant exception mentioned in Section 5.7.

5.7 Notes
The problem of output feedback control has been extensively discussed
in the general control literature. For linear systems, it is well known
that a stabilizing state feedback controller and an observer may be sep-
arately designed and combined to give a stabilizing output feedback
controller (the separation principle). For nonlinear systems, Teel and
Praly (1994) show that global stabilizability and complete uniform ob-
servability are sufficient to guarantee semiglobal stabilizability when
a dynamic observer is used, and provide useful references to related
work on this topic.
Although output MPC, in which nominal MPC is combined with a
separately designed observer, is widely used in industry since the state

is seldom available, it has received relatively little attention in the liter-


ature because of the inherent difficulty in establishing asymptotic sta-
bility. An extra complexity in MPC is the presence of hard constraints.
A useful survey, more comprehensive than these notes, is provided in
Findeisen, Imsland, Allgöwer, and Foss (2003). Earlier Michalska and
Mayne (1995) show for deterministic systems that for any subset of
the region of attraction of the full state feedback system, there exists
a sampling time and convergence rate for the observer such that the
subset also lies in the region of attraction of the output feedback sys-
tem. A more sophisticated analysis in Imsland, Findeisen, Allgöwer,
and Foss (2003) using continuous time MPC shows that the region of
attraction and rate of convergence of the output feedback system can
approach that of the state feedback system as observer gain increases.
We consider systems with input disturbances and noisy state mea-
surement, and employ the “tube” methodology that has its roots in the
work of Bertsekas and Rhodes (1971), and Glover and Schweppe (1971)
on constrained discrete time systems subject to bounded disturbances.
Reachability of a “target set” and a “target tube” are considered in these
papers. These concepts were substantially developed in the context
of continuous time systems in Khurzhanski and Valyi (1997); Aubin
(1991); Kurzhanski and Filippova (1993).
The theory for discrete time systems is considerably simpler; a mod-
ern tube-based theory for optimal control of discrete time uncertain
systems with imperfect state measurement appears in Moitié, Quin-
campoix, and Veliov (2002). As in this chapter, they regard a set X of
states x that are consistent with past measurements as the “state” of
the optimal control problem. The set X satisfies an uncertain “full in-
formation” difference equation of the form X + = f ∗ (X, u, W, v) so the
output feedback optimal control problem reduces to robust control of
an uncertain system with known state X.
The optimal control problem remains difficult because the state X,
a subset of Rn , is difficult to obtain numerically and determination
of a control law as a function of (X, t) prohibitive. In Mayne, Raković,
Findeisen, and Allgöwer (2006); Mayne et al. (2009) the output feedback
problem is simplified considerably by replacing X(t) by a simple outer
approximation {x̂(t)} ⊕ Σx in the time-invariant case, and by {x̂(t)} ⊕
Σx (t) in the time-varying case. The set Σx , or the sequence (Σx (t)), may
be precomputed so the difficult evolution equation for X is replaced
by a simple evolution equation for x̂; in the linear case, the Luenberger
observer or Kalman filter describes the evolution of x̂. The output

feedback control problem reduces to control of an uncertain system


with known state x̂.
Artstein and Raković (2008) provide an interesting extension of the
invariant sets given in (5.11) to the nonlinear case x + ∈ F (x) + V when
F (·) is a contraction mapping and V is compact.
While the tube approach may be successfully employed for output
MPC when the system being controlled is linear, there seems to be no
literature on combining moving horizon estimation (MHE) with MPC
when the system being controlled is nonlinear, except for the paper
Copp and Hespanha (2014). The novel proposal in this paper is to re-
place separate solutions of the control and estimation problems by a
single min-max problem in which the cost is, unusually, over the inter-
val (−∞, ∞) or [−T , T ], and combines the cost of both estimation and
control. The authors also propose an efficient interior point algorithm
for solving the complex min-max problem.
The output MPC problem involves tracking of a possibly random ref-
erence, a problem that has extra difficulty when zero offset is required.
There is a growing literature dealing with tracking random references
not necessarily in the context of output MPC. Examples of papers deal-
ing with this topic are Limon, Alvarado, Alamo, and Camacho (2008);
Ferramosca, Limon, Alvarado, Alamo, and Camacho (2009); Falugi and
Mayne (2013).

5.8 Exercises

Exercise 5.1: Hausdorff distance between a set and a subset


Show that dH (A, B) = max a∈A d(a, B) if A and B are two compact subsets of Rn satis-
fying B ⊆ A.

Exercise 5.2: Hausdorff distance between sets A ⊕ B and B


Show that dH (A ⊕ B, A) ≤ |B| if A and B are two compact subsets of Rn satisfying
0 ∈ B in which |B| := max b {|b| | b ∈ B}.

Exercise 5.3: Hausdorff distance between sets {z} ⊕ B and A


Show that dH ({z} ⊕ B, A) ≤ |z| + dH (A, B) if A and B are two compact sets in Rn .

Exercise 5.4: Hausdorff distance between sets {z} ⊕ A and A


Show that dH ({z} ⊕ A, A) = |z| if z is a point and A is a compact set in Rn .

Exercise 5.5: Hausdorff distance between sets A ⊕ C and B ⊕ C


Show that dH (A ⊕ C, B ⊕ C) ≤ dH (A, B) if A, B, and C are compact subsets of Rn .

Exercise 5.6: Hausdorff distance between sets FA and FB


Let A and B be two compact sets in Rn , and let F ∈ Rn×n . Show that

dH (F A, F B) ≤ |F | dH (A, B)

in which |F | is the induced norm of F satisfying |F x| ≤ |F | |x| and |x| := d(x, 0).

Exercise 5.7: Linear combination of sets; λ1 W ⊕ λ2 W = (λ1 + λ2 )W


If W is a convex set, show that λ1 W ⊕ λ2 W = (λ1 + λ2 )W for any λ1 , λ2 ∈ R≥0 . Hence
show W ⊕ λW ⊕ λ2 W ⊕ · · · = (1 − λ)−1 W if λ ∈ [0, 1).

Exercise 5.8: Hausdorff distance between the sets Φ(i) and Φ


Show that there exist c > 0 and γ ∈ (0, 1) such that

dH (Φ(i), Φ) ≤ cdH (Φ(0), Φ)γ i

in which
Φ(i) = Ã Φ(i − 1) ⊕ B̃ Ψ
Φ = Ã Φ ⊕ B̃ Ψ

and Ã is a stable matrix (ρ(Ã) < 1).
Bibliography

Z. Artstein and S. V. Raković. Feedback and invariance under uncertainty via


set-iterates. Automatica, 44(2):520–525, February 2008.

J. P. Aubin. Viability Theory. Systems & Control: Foundations & Applications.


Birkhauser, Boston, Basel, Berlin, 1991.

D. P. Bertsekas and I. B. Rhodes. Recursive state estimation for a set-


membership description of uncertainty. IEEE Trans. Auto. Cont., 16:117–
128, 1971.

F. Blanchini. Set invariance in control. Automatica, 35:1747–1767, 1999.

D. A. Copp and J. P. Hespanha. Nonlinear output-feedback model predictive


control with moving horizon estimation. In Proceedings of the 53rd Confer-
ence on Decision and Control, December 2014.

P. Falugi and D. Q. Mayne. Model predictive control for tracking random ref-
erences. In Proceedings of the 2013 European Control Conference, pages
518–523, 2013.

A. Ferramosca, D. Limon, I. Alvarado, T. Alamo, and E. F. Camacho. MPC for


tracking of constrained nonlinear systems. In Proceedings of the 48th Con-
ference on Decision and Control, and the 28th Chinese Control Conference,
pages 7978–7983, 2009.

R. Findeisen, L. Imsland, F. Allgöwer, and B. A. Foss. State and output feedback


nonlinear model predictive control: An overview. Eur. J. Control, 9(2–3):190–
206, 2003.

J. D. Glover and F. C. Schweppe. Control of linear dynamic systems with set


constrained disturbances. IEEE Trans. Auto. Cont., 16:411–423, 1971.

L. Imsland, R. Findeisen, F. Allgöwer, and B. A. Foss. A note on stability, ro-


bustness and performance of output feedback nonlinear model predictive
control. J. Proc. Cont., 13:633–644, 2003.

A. B. Khurzhanski and I. Valyi. Ellipsoidal-valued dynamics for estimation and


control. Systems & Control: Foundations & Applications. Birkhauser, Boston,
Basel, Berlin, 1997.

I. Kolmanovsky and E. G. Gilbert. Theory and computation of disturbance


invariant sets for discrete-time linear systems. Math. Probl. Eng., 4(4):317–
367, 1998.


A. B. Kurzhanski and T. F. Filippova. On the theory of trajectory tubes: A


mathematical formalism for uncertain dynamics, viability and control. In
A. B. Kurzhanski, editor, Advances in Nonlinear Dynamics and Control: A
Report from Russia, volume 17 of PSCT, pages 122–188. Birkhauser, Boston,
Basel, Berlin, 1993.

D. Limon, I. Alvarado, T. Alamo, and E. F. Camacho. MPC for tracking piece-


wise constant references for constrained linear systems. Automatica, pages
2382–2387, 2008.

D. Q. Mayne, S. V. Raković, R. Findeisen, and F. Allgöwer. Robust output feed-


back model predictive control of constrained linear systems. Automatica,
42(7):1217–1222, July 2006.

D. Q. Mayne, S. V. Raković, R. Findeisen, and F. Allgöwer. Robust output feed-


back model predictive control of constrained linear systems: Time varying
case. Automatica, 45(9):2082–2087, September 2009.

H. Michalska and D. Q. Mayne. Moving horizon observers and observer-based


control. IEEE Trans. Auto. Cont., 40(6):995–1006, 1995.

R. Moitié, M. Quincampoix, and V. M. Veliov. Optimal control of discrete-time


uncertain systems with imperfect measurement. IEEE Trans. Auto. Cont., 47
(11):1909–1914, November 2002.

G. Pannocchia and J. B. Rawlings. Disturbance models for offset-free MPC


control. AIChE J., 49(2):426–437, 2003.

S. V. Raković, E. C. Kerrigan, K. I. Kouramas, and D. Q. Mayne. Invariant ap-


proximations of the minimal robustly positively invariant sets. IEEE Trans.
Auto. Cont., 50(3):406–410, 2005.

A. R. Teel and L. Praly. Global stabilizability and observability implies


semiglobal stabilizability by output feedback. Sys. Cont. Let., 22:313–325,
1994.
6
Distributed Model Predictive Control

6.1 Introduction and Preliminary Results


In many large-scale control applications, it becomes convenient to break
the large plantwide problem into a set of smaller and simpler subprob-
lems in which the local inputs are used to regulate the local outputs.
The overall plantwide control is then accomplished by the composite
behavior of the interacting, local controllers. There are many ways to
design the local controllers, some of which produce guaranteed prop-
erties of the overall plantwide system. We consider four control ap-
proaches in this chapter: decentralized, noncooperative, cooperative,
and centralized control. The first three methods require the local con-
trollers to optimize over only their local inputs. Their computational
requirements are identical. The communication overhead is different,
however. Decentralized control requires no communication between
subsystems. Noncooperative and cooperative control require the input
sequences and the current states or state estimates for all the other
local subsystems. Centralized control solves the large, complex plant-
wide optimization over all the inputs. Communication is not a relevant
property for centralized control because all information is available
in the single plantwide controller. We use centralized control in this
chapter to provide a benchmark of comparison for the distributed con-
trollers.
We have established the basic properties of centralized MPC, both
with and without state estimation, in Chapters 2, 3, and 5. In this
chapter, we analyze some basic properties of the three distributed
approaches: decentralized, noncooperative, and cooperative MPC. We
show that the conditions required for closed-loop stability of decentral-
ized control and noncooperative control are often violated for coupled
multivariable systems under reasonable decompositions into subsys-
tems. For ensuring closed-loop stability of a wide class of plantwide


models and decomposition choices, cooperative control emerges as


the most attractive option for distributed MPC. We then establish the
closed-loop properties of cooperative MPC for unconstrained and con-
strained linear systems with and without state estimation. We also dis-
cuss current challenges facing this method, such as input constraints
that are coupled between subsystems.
In our development of distributed MPC, we require some basic re-
sults on two topics: how to organize and solve the linear algebra of
linear MPC, and how to ensure stability when using suboptimal MPC.
We cover these two topics in the next sections, and then turn to the
distributed MPC approaches.

6.1.1 Least Squares Solution

In comparing various forms of linear distributed MPC it proves conve-


nient to see the MPC quadratic program for the sequence of states and
inputs as a single large linear algebra problem. To develop this linear
algebra problem, we consider first the unconstrained linear quadratic
(LQ) problem of Chapter 1, which we solved efficiently with dynamic
programming (DP) in Section 1.3.3
V (x(0), u) = (1/2) Σ_{k=0}^{N−1} (x(k)′Qx(k) + u(k)′Ru(k)) + (1/2)x(N)′Pf x(N)

subject to

x+ = Ax + Bu
In this section, we first take the direct but brute-force approach to find-
ing the optimal control law. We write the model solution as
      
[x(1); x(2); . . . ; x(N)] = [A; A²; . . . ; A^N] x(0) + [B  0  · · ·  0;  AB  B  · · ·  0;  . . . ;  A^{N−1}B  A^{N−2}B  · · ·  B] [u(0); u(1); . . . ; u(N − 1)]        (6.1)

in which the stacked matrix multiplying x(0) is denoted A and the block lower-triangular matrix multiplying the input sequence is denoted B, or, using the input and state sequences,

x = Ax(0) + Bu

The objective function can be expressed as



V (x(0), u) = (1/2)(x ′ (0)Qx(0) + x′ Qx + u′ Ru)

in which

Q = diag(Q, Q, . . . , Pf ) ∈ R^{Nn×Nn}
R = diag(R, R, . . . , R) ∈ R^{Nm×Nm}        (6.2)

Eliminating the state sequence. Substituting the model into the ob-
jective function and eliminating the state sequence gives a quadratic
function of u

V (x(0), u) = (1/2)x ′ (0)(Q + A′ QA)x(0) + u′ (B′ QA)x(0)+


(1/2)u′ (B′ QB + R)u (6.3)

and the optimal solution for the entire set of inputs is obtained in one
shot
u0 (x(0)) = −(B′ QB + R)−1 B′ QA x(0)
and the optimal cost is
   
V 0 (x(0)) = (1/2) x ′ (0) [Q + A′ QA − A′ QB(B′ QB + R)^{−1} B′ QA] x(0)

If used explicitly, this procedure for computing u0 would be inefficient


because B′ QB + R is an (mN × mN) matrix. Notice that in the DP
formulation one has to invert instead an (m×m) matrix N times, which
is computationally less expensive.1 Notice also that unlike DP, the least
squares approach provides all input moves as a function of the initial
state, x(0). The gain for the control law comes from the first input
move in the sequence
K(0) = − [Im  0  · · ·  0] (B′ QB + R)^{−1} B′ QA

It is not immediately clear that the K(0) and V 0 given above from the
least squares approach are equivalent to the result from the Riccati
iteration, (1.10)–(1.14) of Chapter 1, but since we have solved the same
optimization problem, the two results are the same.2
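The dense least squares solution above is easy to assemble directly. The following brief sketch (Python/NumPy/SciPy) builds A and B of (6.1) and Q, R of (6.2), and extracts K(0) from the batch solution; it is included only as an illustration, all matrices are placeholders, and one can check numerically that the resulting K(0) matches the gain produced by the Riccati iteration of Chapter 1.

import numpy as np
from scipy.linalg import block_diag

def dense_lqr_gain(A, B, Q, R, Pf, N):
    # brute-force batch solution of the unconstrained LQ problem
    n, m = B.shape
    calA = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(N)])
    calB = np.zeros((N * n, N * m))
    for i in range(N):              # block row i corresponds to x(i+1)
        for j in range(i + 1):      # contribution of u(j)
            calB[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - j) @ B
    calQ = block_diag(*([Q] * (N - 1) + [Pf]))
    calR = block_diag(*([R] * N))
    # u0(x0) = -(B'QB + R)^{-1} B'QA x0 ; K(0) is the first m rows
    Kall = -np.linalg.solve(calB.T @ calQ @ calB + calR, calB.T @ calQ @ calA)
    return Kall[:m, :]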
1 Would you prefer to invert by hand 100 (1 × 1) matrices or a single (100 × 100) dense matrix?
2 Establishing this result directly is an exercise in using the partitioned matrix inversion formula. The next section provides another way to show they are equivalent.

Retaining the state sequence. In this section we set up the least squares problem again, but with an eye toward improving its efficiency. Retaining the state sequence and adjoining the model equations as equality constraints is a central idea in optimal control and is described


in standard texts (Bryson and Ho, 1975, p. 44). We apply this standard
approach here. Wright (1997) provides a discussion of this problem in
the linear model MPC context and the extensions required for the quad-
ratic programming problem when there are inequality constraints on
the states and inputs. Including the state with the input in the sequence
of unknowns, we define the enlarged vector z to be
 
z = (u(0), x(1), u(1), x(2), . . . , u(N − 1), x(N))
The objective function is

min_u (1/2)(x ′ (0)Qx(0) + z′ Hz)

in which

H = diag(R, Q, R, Q, · · · , R, Pf )
The constraints are
Dz = d
in which
   
D = − [B  −I;  A  B  −I;  . . . ;  A  B  −I]        d = (Ax(0), 0, . . . , 0)
We now substitute these results into (1.57) and obtain the linear algebra
problem
    
[ H   −D′ ] [ z ]     [  0 ]
[ −D    0 ] [ λ ]  =  [ −d ]

in which λ := (λ(1), λ(2), . . . , λ(N)) is the vector of Lagrange multipliers for the model equality constraints; the right-hand side depends on x(0) only through d. Written out componentwise for the unknown ordering (u(0), x(1), u(1), x(2), . . . , u(N − 1), x(N), λ(1), . . . , λ(N)), the coefficient matrix couples variables that are far apart in the ordering and has no useful band structure.

Method                        FLOPs

dynamic programming (DP)      Nm³
dense least squares           N³m³
banded least squares          N(2n + m)(3n + m)²

Table 6.1: Computational cost of solving finite horizon LQR problem.

This equation is rather cumbersome, but if we reorder the unknown


vector to put the Lagrange multiplier together with the state and input
from the same time index, and reorder the equations, we obtain the
following banded matrix problem
(6.4): with the unknowns ordered as (u(0), λ(1), x(1), u(1), λ(2), x(2), . . . , u(N − 1), λ(N), x(N)) and the equations grouped by stage, the block rows are

Ru(k) + B′λ(k + 1) = 0
Ax(k) + Bu(k) − x(k + 1) = 0
−λ(k + 1) + Qx(k + 1) + A′λ(k + 2) = 0        k = 0, 1, . . . , N − 1

with the known term Ax(0) moved to the right-hand side of the model equation at k = 0, with Pf in place of Q at the final stage, and with no A′λ(N + 1) term in the final row; the right-hand side of the stacked system is therefore (0, −Ax(0), 0, . . . , 0).

The banded structure allows a more efficient solution procedure.


The floating operation (FLOP) count for the factorization of a banded
matrix is O(LM 2 ) in which L is the dimension of the matrix and M is the
bandwidth. This compares to the regular FLOP count of O(L3 ) for the
factorization of a regular dense matrix. The bandwidth of the matrix in
(6.4) is 3n+m and the dimension of the matrix is N(2n+m). Therefore
the FLOP count for solving this equation is O(N(2n + m)(3n + m)2 ).
Notice that this approach reduces the N 3 dependence of the previous
MPC solution method. That is the computational advantage provided
by these adjoint methods for treating the model constraints. Table 6.1
summarizes the computational cost of the three approaches for the
linear quadratic regulator (LQR) problem. As shown in the table, DP
is highly efficient. When we add input and state inequality constraints
to the control problem and the state dimension is large, however, we
cannot conveniently apply DP. The dense least squares computational
cost is high if we wish to compute a large number of moves in the
horizon. Note the cost of dense least squares scales with the third

power of horizon length N. As we have discussed in Chapter 2, con-


siderations of control theory favor large N. Another factor increasing
the computational cost is the trend in industrial MPC implementations
to larger multivariable control problems with more states and inputs,
i.e., larger m and n. Therefore, the adjoint approach using banded
least squares method becomes important for industrial applications in
which the problems are large and a solid theoretical foundation for the
control method is desirable.
We might obtain more efficiency than the banded structure if we
view (6.4) as a block tridiagonal matrix and use the method provided
by Golub and Van Loan (1996, p. 174). The final fine tuning of the
solution method for this class of problems is a topic of current research,
but the important point is that, whatever final procedure is selected,
the computational cost will be linear in N as in DP instead of cubic in
N as in dense least squares.
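To make the structure of (6.4) concrete, the following sketch (Python/SciPy) assembles the stagewise block rows for the reordered unknowns and solves the system with a general sparse factorization. A dedicated banded or block tridiagonal factorization, which this sketch does not implement, is what achieves the O(N(2n + m)(3n + m)²) operation count quoted in Table 6.1; all matrices are placeholders.

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def solve_kkt_banded(A, B, Q, R, Pf, x0, N):
    # assemble and solve (6.4); returns the first control move u(0)
    n, m = B.shape
    nz = N * (2 * n + m)
    M = sp.lil_matrix((nz, nz))
    rhs = np.zeros(nz)
    iu = lambda k: k * (2 * n + m)              # index of u(k)
    il = lambda k: k * (2 * n + m) + m          # index of lam(k+1)
    ix = lambda k: k * (2 * n + m) + m + n      # index of x(k+1)
    for k in range(N):
        # stationarity wrt u(k):  R u(k) + B' lam(k+1) = 0
        M[iu(k):iu(k)+m, iu(k):iu(k)+m] = R
        M[iu(k):iu(k)+m, il(k):il(k)+n] = B.T
        # model:  A x(k) + B u(k) - x(k+1) = 0   (A x(0) moves to the rhs)
        M[il(k):il(k)+n, iu(k):iu(k)+m] = B
        M[il(k):il(k)+n, ix(k):ix(k)+n] = -np.eye(n)
        if k == 0:
            rhs[il(0):il(0)+n] = -A @ x0
        else:
            M[il(k):il(k)+n, ix(k-1):ix(k-1)+n] = A
        # stationarity wrt x(k+1):  Q x(k+1) - lam(k+1) + A' lam(k+2) = 0,
        # with Pf in place of Q and no lam(k+2) term at the final stage
        M[ix(k):ix(k)+n, il(k):il(k)+n] = -np.eye(n)
        if k < N - 1:
            M[ix(k):ix(k)+n, ix(k):ix(k)+n] = Q
            M[ix(k):ix(k)+n, il(k+1):il(k+1)+n] = A.T
        else:
            M[ix(k):ix(k)+n, ix(k):ix(k)+n] = Pf
    z = spsolve(M.tocsr(), rhs)
    return z[:m]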
Furthermore, if we wish to see the connection to the DP solution, we
can proceed as follows. Substitute Π(N) = Pf as in (1.11) of Chapter 1
and consider the last three-equation block of the matrix appearing in
(6.4)
 
Ru(N − 1) + B′λ(N) = 0
Ax(N − 1) + Bu(N − 1) − x(N) = 0
−λ(N) + Π(N)x(N) = 0
We can eliminate this small set of equations and solve for u(N − 1),
λ(N), x(N) in terms of x(N − 1), resulting in
   
u(N − 1) = −(B′Π(N)B + R)^{−1} B′Π(N)A x(N − 1)
λ(N) = Π(N)(I − B(B′Π(N)B + R)^{−1} B′Π(N))A x(N − 1)
x(N) = (I − B(B′Π(N)B + R)^{−1} B′Π(N))A x(N − 1)

Notice that in terms of the Riccati matrix, we also have the relationship

A′ λ(N) = Π(N − 1)x(N − 1) − Qx(N − 1)

We then proceed to the next to last block of three equations


 
Ru(N − 2) + B′λ(N − 1) = 0
Ax(N − 2) + Bu(N − 2) − x(N − 1) = 0
−λ(N − 1) + Qx(N − 1) + A′λ(N) = 0

Note that the last equation gives


λ(N − 1) = Qx(N − 1) + A′ λ(N) = Π(N − 1)x(N − 1)
Using this relationship and continuing on to solve for x(N−1), λ(N−1),
u(N − 2) in terms of x(N − 2) gives
   
u(N − 2) = −(B′Π(N − 1)B + R)^{−1} B′Π(N − 1)A x(N − 2)
λ(N − 1) = Π(N − 1)(I − B(B′Π(N − 1)B + R)^{−1} B′Π(N − 1))A x(N − 2)
x(N − 1) = (I − B(B′Π(N − 1)B + R)^{−1} B′Π(N − 1))A x(N − 2)

Continuing on through each previous block of three equations pro-


duces the Riccati iteration and feedback gains of (1.10)–(1.13). The
other unknowns, the multipliers, are simply
λ(k) = Π(k)x(k) k = 1, 2, . . . , N
so the cost to go at each stage is simply x(k)′ λ(k), and we see the nice
connection between the Lagrange multipliers and the cost of the LQR
control problem.

6.1.2 Stability of Suboptimal MPC

When using distributed MPC, it may be necessary or convenient to im-


plement the control without solving the complete optimization. We
then have a form of suboptimal MPC, which was first considered in
Chapter 2, Section 2.7. Before adding the complexity of the distributed
version, we wish to further develop a few features of suboptimal MPC
in the centralized, single-player setting. These same features arise in
the distributed, many-player setting as we discuss subsequently.
We consider a specific variation of suboptimal MPC in which a start-
ing guess is available from the control trajectory at the previous time
and we take a fixed number of steps of an optimization algorithm. The
exact nature of the optimization method is not essential, but we do
restrict the method so that each iteration is feasible and decreases the
value of the cost function. To initialize the suboptimal controller, we
are given an initial state x(0) = x0 , and we generate an initial control
sequence u(0) = h(x0 ). We consider input constraints u(i) ∈ U ⊆ Rm ,
i ∈ I0:N−1 , which we also write as u ∈ UN ⊆ RN . As in Chapter 2 we
denote the set of feasible states as XN . These are the states for which
the initial control sequence h(x0 ) is well defined. The suboptimal MPC
algorithm is as follows.
Algorithm 6.1 (Suboptimal MPC (simplified)). Set current state x = x0 , current control sequence u = h(x0 ), and current warm start ũ = u. Then repeat

1. Obtain current measurement of state x.

2. The controller performs some number of iterations of a feasible path


optimization algorithm to obtain an improved control sequence u such that VN (x, u) ≤ VN (x, ũ).

3. Inject the first element of the input sequence u.

4. Compute the next warm start.


ũ+ = (u(1), u(2), . . . , u(N − 1), 0)
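In code, the warm start is a simple shift; for example, with the input sequence stored as a single stacked vector (a sketch, with block size m assumed):

import numpy as np

def warm_start(u, m):
    # shift the previous input sequence and append a zero final move
    return np.concatenate((u[m:], np.zeros(m)))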

This warm start is a simplified version of the one considered in


Chapter 2, in which the final control move in the warm start was deter-
mined by the control law κf (x). In distributed MPC it is simpler to use
zero for the final control move in the warm start. We establish later in
the chapter that the system cost function V (x, u) satisfies the follow-
ing properties for the form of suboptimal MPC generated by distributed
MPC. There exist constants a, b, c > 0 such that

a |(x, u)|2 ≤ V (x, u) ≤ b |(x, u)|2


V (x + , u+ ) − V (x, u) ≤ −c |(x, u(0))|2

These properties are similar to those required for a valid Lyapunov


function. The difference is that the cost decrease here does not de-
pend on the size of u, but only x and the first element of u, u(0). This
cost decrease is sufficient to establish that x(k) and u(k) converge to
zero, but allows the possibility that u(k) is large even though x(k) is
small. That fact prevents us from establishing the solution x(k) = 0
for all k is Lyapunov stable. We can establish that the solution x(k) = 0
for all k is Lyapunov stable at k = 0 only. We cannot establish uniform
Lyapunov stability nor Lyapunov stability for any k > 0. The problem
is not that our proof technique is deficient. There is no reason to expect
that the solution x(k) = 0 for all k is Lyapunov stable for suboptimal
MPC. The lack of Lyapunov stability of x(k) = 0 for all k is a subtle
issue and warrants some discussion. To make these matters more pre-
cise, consider the following standard definitions of Lyapunov stability
at time k and uniform Lyapunov stability (Vidyasagar, 1993, p. 136).

Definition 6.2 (Lyapunov stability). The zero solution x(k) = 0 for all
k is stable (in the sense of Lyapunov) at k = k0 if for any ε > 0 there
exists a δ(k0 , ε) > 0 such that

|x(k0 )| < δ =⇒ |x(k)| < ε ∀k ≥ k0 (6.5)



Lyapunov stability is defined at a time k0 . Uniform stability is the


concept that guarantees that the zero solution is not losing stability
with time. For a uniformly stable zero solution, δ in Definition 6.2 is
not a function of k0 , so that (6.5) holds for all k0 .

Definition 6.3 (Uniform Lyapunov stability). The zero solution x(k) =


0 for all k is uniformly stable (in the sense of Lyapunov) if for any ε > 0
there exists a δ(ε) > 0 such that

|x(k0 )| < δ =⇒ |x(k)| < ε ∀k ≥ k0 ∀k0

Exercise 6.6 gives an example of a linear system for which x(k)


converges exponentially to zero with increasing k for all x(0), but the
zero solution x(k) = 0 for all k is Lyapunov stable only at k = 0. It
is not uniformly Lyapunov stable nor Lyapunov stable for any k > 0.
Without further restrictions, suboptimal MPC admits this same type of
behavior.
To ensure uniform Lyapunov stability, we add requirements to sub-
optimal MPC beyond obtaining only a cost decrease. Here we impose
the constraint
|u| ≤ d |x| x ∈ rB

in which d, r > 0. This type of constraint is also included somewhat


indirectly by the suboptimal control approach discussed in Section 2.7.
In that arrangement, this constraint is implied by the first case in (2.29),
which leads to Proposition 2.44. For simplicity, in this chapter we in-
stead include the constraint explicitly in the distributed MPC optimiza-
tion problem. Both approaches provide (uniform) Lyapunov stability
of the solution x(k) = 0 for all k.
The following lemma summarizes the conditions we use later in
the chapter for establishing exponential stability of distributed MPC.
A similar lemma establishing asymptotic stability of suboptimal MPC
was given by Scokaert, Mayne, and Rawlings (1999) (Theorem 1).
First we recall the definition of exponential stability.

Definition 6.4 (Exponential stability). Let X be a positive invariant set for x+ = f (x). Then the origin is exponentially stable in X for x+ = f (x) if there exist c > 0 and 0 < γ < 1 such that for each x ∈ X

|φ(i; x)| ≤ c |x| γ^i

for all i ∈ I≥0 .



Consider next the suboptimal MPC controller. Let the system satisfy (x+, u+) = (f (x, u), g(x, u)) with initial sequence u(0) = h(x(0)). The

controller constraints are x(i) ∈ X ⊆ Rn for all i ∈ I0:N and u(i) ∈ U ⊆


Rm for all i ∈ I0:N−1 . Let XN denote the set of states for which the MPC
controller is feasible.

Lemma 6.5 (Exponential stability of suboptimal MPC). Assume that the


suboptimal MPC system satisfies the following inequalities with r , a, b,
c>0

a |(x, u)|2 ≤ V (x, u) ≤ b |(x, u)|2 x ∈ XN u ∈ UN


V (x + , u+ ) − V (x, u) ≤ −c |(x, u(0))|2 x ∈ XN u ∈ UN
|u| ≤ d |x| x ∈ rB

Then the origin is exponentially stable for the closed-loop system under
suboptimal MPC with region of attraction XN if either of the following
additional assumptions holds
(a) U is compact. In this case, XN may be unbounded.

(b) XN is compact. In this case U may be unbounded.

Proof. First we show that the origin of the extended state (x, u) is ex-
ponentially stable for x(0) ∈ XN .
(a) For the case U compact, we have |u| ≤ d |x| , x ∈ r B. Consider the
optimization
max_{u∈UN} |u| = s > 0

The solution exists by the Weierstrass theorem since U is compact,


which implies UN is compact. Then we have |u| ≤ (s/r ) |x| for x ∈
XN \r B, so we have |u| ≤ d′ |x| for x ∈ XN in which d′ = max(d, s/r ).

(b) For the case XN compact, consider the optimization

max_{x∈XN} V (x, h(x)) = V̄ > 0

The solution exists because XN is compact and h(·) and V (·) are con-
tinuous. Define the compact set Ū by

Ū = {u | V (x, u) ≤ V̄ , x ∈ XN }

The set is bounded because V (x, u) ≥ a |(x, u)|2 ≥ a |u|2 . The set is
closed because V is continuous. The significance of this set is that for

all k ≥ 0 and all x ∈ XN , u(k) ∈ Ū. Therefore we have established that


XN compact implies u(k) evolves in a compact set as in the previous
case when U is assumed compact. Using the same argument as in that
case, we have established that there exists d′ > 0 such that |u| ≤ d′ |x|
for all x ∈ XN .
For the two cases, we therefore have established for all x ∈ XN ,
u ∈ UN (case (a)) or u ∈ Ū (case (b))

|(x, u)| ≤ |x| + |u| ≤ |x| + d′ |x| ≤ (1 + d′ ) |x|

which gives |x| ≥ c ′ |(x, u)| with c ′ = 1/(1 + d′ ) > 0. Hence, there
exists a3 = c(c ′ )2 such that V (x + , u+ ) − V (x, u) ≤ −a3 |(x, u)|2 for
all x ∈ XN . Therefore the extended state (x, u) satisfies the standard
conditions of an exponential stability Lyapunov function (see Theorem
B.19 in Appendix B) with a1 = a, a2 = b, a3 = c(c ′ )2 , σ = 2 for (x,
u) ∈ XN ×UN (case (a)) or XN × Ū (case (b)). Therefore for all x(0) ∈ XN ,
k≥0
|(x(k), u(k))| ≤ α |(x(0), u(0))| γ k
in which α > 0 and 0 < γ < 1.
Finally we remove the input sequence and establish that the origin
for the state (rather than the extended state) is exponentially stable for
the closed-loop system. We have for all x(0) ∈ XN and k ≥ 0

|x(k)| ≤ |(x(k), u(k))| ≤ α |(x(0), u(0))| γ k


≤ α(|x(0)| + |u(0)|)γ k ≤ α(1 + d′ ) |x(0)| γ k
≤ α′ |x(0)| γ k

in which α′ = α(1 + d′ ) > 0, and we have established exponential


stability of the origin on the feasible set XN . ■

Exercises 6.7 and 6.8 explore what to conclude about exponential


stability when both U and XN are unbounded.
We also consider later in the chapter the effects of state estimation
error on the closed-loop properties of distributed MPC. For analyzing
stability under perturbations, the following lemma is useful. Here e
plays the role of estimation error.

Lemma 6.6 (Global asymptotic stability and exponential convergence


with mixed powers of norm). Consider a dynamic system

(x + , e+ ) = f (x, e)

with a zero steady-state solution, f (0, 0) = (0, 0). Assume there exists
a function V : Rn+m → R≥0 that satisfies the following for all (x, e) ∈
Rn × Rm
a(|x|σ + |e|γ ) ≤ V ((x, e)) ≤ b(|x|σ + |e|γ ) (6.6)
σ γ
V (f (x, e)) − V ((x, e)) ≤ −c(|x| + |e| ) (6.7)
with constants a, b, c, σ , γ > 0. Then the following holds for all (x(0),
e(0)) and k ∈ I≥0
|x(k), e(k)| ≤ δ(|x(0), e(0)|)λk
with λ < 1 and δ(·) ∈ K∞ .
The proof of this lemma is discussed in Exercise 6.9. We also require
a converse theorem for exponential stability.
Lemma 6.7 (Converse theorem for exponential stability). If the zero
steady-state solution of x + = f (x) is globally exponentially stable, then
there exists Lipschitz continuous V : Rn → R≥0 that satisfies the follow-
ing: there exist constants a, b, c, σ > 0, such that for all x ∈ Rn
a |x|σ ≤ V (x) ≤ b |x|σ
V (f (x)) − V (x) ≤ −c |x|σ
Moreover, any σ > 0 is valid, and the constant c can be chosen as large
as one wishes.
The proof of this lemma is discussed in Exercise B.3.

6.2 Unconstrained Two-Player Game


To introduce clearly the concepts and notation required to analyze dis-
tributed MPC, we start with a two-player game. We then generalize to
an M-player game in the next section.
Let (A11 , B11 , C11 ) be a minimal state space realization of the (u1 ,
y1 ) input-output pair. Similarly, let (A12 , B12 , C12 ) be a minimal state
space realization of the (u2 , y1 ) input-output pair. The dimensions are
u1 ∈ Rm1 , y1 ∈ Rp1 , x11 ∈ Rn11 , x12 ∈ Rn12 with n1 = n11 + n12 . Out-
put y1 can then be represented as the following, possibly nonminimal,
state space model
" #+ " #" # " # " #
x11 A11 0 x11 B11 0
= + u1 + u2
x12 0 A12 x12 0 B12
" #
h i x
11
y1 = C11 C12
x12

Proceeding in an analogous fashion with output y2 and inputs u1 and


u2 , we model y2 with the following state space model
" #+ " #" # " # " #
x22 A22 0 x22 B22 0
= + u2 + u1
x21 0 A21 x21 0 B21
" #
h i x
22
y2 = C22 C21
x21

We next define player one’s local cost functions

V1 (x1 (0), u1 , u2 ) = Σ_{k=0}^{N−1} ℓ1 (x1 (k), u1 (k)) + V1f (x1 (N))

in which x1 = (x11 , x12 ).
Note that the first local objective is affected by the second player’s
inputs through the model evolution of x1 , i.e., through the x12 states.
We choose the stage cost to account for the first player’s inputs and
outputs

ℓ1 (x1 , u1 ) = (1/2)(y1′ Q1 y1 + u1′ R1 u1 )

or, equivalently, in terms of the state

ℓ1 (x1 , u1 ) = (1/2)(x1′ Q1 x1 + u1′ R1 u1 )

in which the state penalty is built from the output penalty via

Q1 = C1′ Q1 C1        C1 = [C11  C12]

Motivated by the warm start to be described later, for stable systems,


we choose the terminal penalty to be the infinite horizon cost to go
under zero control

V1f (x1 (N)) = (1/2)x1′ (N)P1f x1 (N)

We choose P1f as the solution to the following Lyapunov equation as-


suming A1 is stable
A′1 P1f A1 − P1f = −Q1 (6.8)
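Numerically, P1f is obtained from a single call to a discrete Lyapunov solver; a minimal sketch (Python/SciPy) follows, in which Q1bar denotes the state-space stage penalty C1′Q1C1 defined above and A1 is a placeholder.

import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def terminal_penalty(A1, Q1bar):
    # solve_discrete_lyapunov(a, q) returns X with  a X a' - X + q = 0,
    # so passing A1' gives  A1' P1f A1 - P1f = -Q1bar, which is (6.8)
    return solve_discrete_lyapunov(A1.T, Q1bar)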
We proceed analogously to define player two’s local objective function
and penalties

V2 (x2 (0), u1 , u2 ) = Σ_{k=0}^{N−1} ℓ2 (x2 (k), u2 (k)) + V2f (x2 (N))

In centralized control and the cooperative game, the two players


share a common objective, which can be considered to be the overall
plant objective

V (x1 (0), x2 (0), u1 , u2 ) = ρ1 V1 (x1 (0), u1 , u2 ) + ρ2 V2 (x2 (0), u2 , u1 )

in which the parameters ρ1 , ρ2 are used to specify the relative weights


of the two subsystems in the overall plant objective. Their values are
restricted so ρ1 , ρ2 > 0, ρ1 + ρ2 = 1 so that both local objectives must
have some nonzero effect on the overall plant objective.

6.2.1 Centralized Control

Centralized control requires the solution of the systemwide control


problem. It can be stated as

min_{u1 ,u2} V (x1 (0), x2 (0), u1 , u2 )
s.t. x1+ = A1 x1 + B 11 u1 + B 12 u2
     x2+ = A2 x2 + B 22 u2 + B 21 u1

in which
" # " #
A11 0 A22 0
A1 = A2 =
0 A12 0 A21
" # " # " # " #
B11 0 0 B22
B 11 = B 12 = B 21 = B 22 =
0 B12 B21 0
This optimal control problem is more complex than all of the dis-
tributed cases to follow because the decision variables include both
u1 and u2 . Because the performance is optimal, centralized control is a
natural benchmark against which to compare the distributed cases: co-
operative, noncooperative, and decentralized MPC. The plantwide stage
cost and terminal cost can be expressed as quadratic functions of the
subsystem states and inputs

ℓ(x, u) = (1/2)(x ′ Qx + u′ Ru)


Vf (x) = (1/2)x ′ Pf x

in which
"# " # " #
x1 u1 ρ1 Q1 0
x= u= Q=
x2 u2 0 ρ2 Q2
" # " #
ρ1 R1 0 ρ1 P1f 0
R= Pf = (6.9)
0 ρ2 R2 0 ρ2 P2f

and we have the standard MPC problem considered in Chapters 1 and 2

min_u V (x(0), u)
s.t. x+ = Ax + Bu        (6.10)

in which

A = [A1  0; 0  A2]        B = [B 11  B 12; B 21  B 22]        (6.11)

Given the terminal penalty in (6.8), stability of the closed-loop central-


ized system is guaranteed for all choices of system models and tuning
parameters subject to the usual stabilizability assumption on the sys-
tem model.

6.2.2 Decentralized Control

Centralized and decentralized control define the two extremes in dis-


tributing the decision making in a large-scale system. Centralized con-
trol has full information and optimizes the full control problem over all
decision variables. Decentralized control, on the other hand, optimizes
only the local objectives and has no information about the actions of
the other subsystems. Player one’s objective function is

V1 (x1 (0), u1 ) = Σ_{k=0}^{N−1} ℓ1 (x1 (k), u1 (k)) + V1f (x1 (N))

We then have player one’s decentralized control problem

min_{u1} V1 (x1 (0), u1 )
s.t. x1+ = A1 x1 + B 11 u1

We know the optimal solution for this kind of LQ problem is a linear


feedback law
u01 = K1 x1 (0)

Notice that in decentralized control, player one’s model does not


account for the inputs of player two, and already contains model error.
In the decentralized problem, player one requires no information about
player two. The communication overhead for decentralized control
is therefore minimal, which is an implementation advantage, but the
resulting performance may be quite poor for systems with reasonably

strong coupling. We compute an optimal K1 for system one (A1 , B 11 ,


Q1 , R1 ) and optimal K2 for system 2. The closed-loop system evolution
is then

[x1; x2]+ = [A1 + B 11 K1    B 12 K2; B 21 K1    A2 + B 22 K2] [x1; x2]

and we know only that A1 + B 11 K1 and A2 + B 22 K2 are stable matri-
ces. Obviously the stability of the closed-loop, decentralized system is
fragile and depends in a sensitive way on the sizes of the interaction
terms B 12 and B 21 and feedback gains K1 , K2 .

6.2.3 Noncooperative Game

In the noncooperative game, player one optimizes V1 (x1 (0), u1 , u2 )


over u1 and player two optimizes V2 (x2 (0), u1 , u2 ) over u2 . From player
one’s perspective, player two’s planned inputs u2 are known distur-
bances affecting player one’s output through the dynamic model. Part
of player one’s optimal control problem is therefore to compensate for
player two’s inputs with his optimal u1 sequence in order to optimize
his local objective V1 . Similarly, player two considers player one’s in-
puts as a known disturbance and solves an optimal control problem
that removes their effect in his local objective V2 . Because this game
is noncooperative (V1 ≠ V2 ), the struggle between players one and two
can produce an outcome that is bad for both of them as we show sub-
sequently. Notice that unlike decentralized control, there is no model
error in the noncooperative game. Player one knows exactly the effect
of the actions of player two and vice versa. Any poor nominal perfor-
mance is caused by the noncooperative game, not model error.
Summarizing the noncooperative control problem statement, player
one’s model is
x1+ = A1 x1 + B 11 u1 + B 12 u2
and player one’s objective function is
N−1
X
V1 (x1 (0), u1 , u2 ) = ℓ1 (x1 (k), u1 (k)) + V1f (x1 (N))
k=0

Note that V1 here depends on u2 because the state trajectory x1 (k),


k ≥ 1 depends on u2 as shown in player one’s dynamic model. We then
have player one’s noncooperative control problem

min V1 (x1 (0), u1 , u2 )


u1

s.t. x1+ = A1 x1 + B 11 u1 + B 12 u2
6.2 Unconstrained Two-Player Game 379

Solution to player one’s optimal control problem. We now solve


player one’s optimal control problem. Proceeding as in Section 6.1.1
we define
 
u1 (0)
 
 x1 (1) 
  h i
 .. 
z= .  H = diag R1 Q1 · · · R1 P1f
 
 
u1 (N − 1)
x1 (N)

and can express player one’s optimal control problem as

min(1/2)(z′ Hz + x1 (0)′ Q1 x1 (0))


z

s.t. Dz = d

in which
 
B 11 −I
 A1 B 11 −I 
 
D = −
 ..


 . 
A1 B 11 −I
 
A1 x1 (0) + B 12 u2 (0)
 B 12 u2 (1) 
 

d= .. 

 . 
B 12 u2 (N − 1)

We then apply (1.57) to obtain


" #" # " # " #
H −D ′ z 0 0
−D 0 λ
= e 1 x1 (0) + −B
−A e 12 u2 (6.12)

in which we have defined


     
λ(1) A1 B 12
 λ(2)  0  B 12 
     
λ=  .. 
 e1 =  . 
A  . 
e 12 = 
B  ..


 .   .   . 
λ(N) 0 B 12

Solving this equation and picking out the rows of z corresponding to


the elements of u1 gives

u10 = K1 x1 (0) + L1 u2
380 Distributed Model Predictive Control

player two’s optimization


p
(u1 , u02 )

next iterate
p+1 p+1
(u1 , u2 )
w2

p
p p
(u1 , u2 ) w1 (u01 , u2 )
current iterate player one’s optimization

p p p+1 p+1
Figure 6.1: Convex step from (u1 , u2 ) to (u1 , u2 ); the param-
eters w1 , w2 with w1 + w2 = 1 determine location of
next iterate on line joining the two players’ optimiza-
p p
tions: (u01 , u2 ) and (u1 , u02 ).

and we see player one’s optimal decision depends linearly on his ini-
tial state, but also on player two’s decision. This is the key difference
between decentralized control and noncooperative control. In nonco-
operative control, player two’s decisions are communicated to player
one and player one accounts for them in optimizing the local objective.

Convex step. Let p ∈ I≥0 denote the integer-valued iteration in the


optimization problem. Looking ahead to the M-player game, we do
not take the full step, but a convex combination of the current optimal
p
solution, u10 , and the current iterate, u1
p+1 p
u1 = w1 u10 + (1 − w1 )u1 0 < w1 < 1

This iteration is displayed in Figure 6.1. Notice we have chosen a dis-


tributed optimization of the Gauss-Jacobi type (see Bertsekas and Tsit-
siklis, 1997, pp.219–223).
We place restrictions on the systems under consideration before
analyzing stability of the controller.

Assumption 6.8 (Unconstrained two-player game).


(a) All subsystems, Aij , i = 1, 2, j = 1, 2, are stable.

(b) The controller penalties Q1 , Q2 , R1 , R2 are positive definite.


6.2 Unconstrained Two-Player Game 381

The assumption of stable models is purely for convenience of expo-


sition. We treat unstable, stabilizable systems in Section 6.3.
Convergence of the players’ iteration. To understand the conver-
gence of the two players’ iterations, we express both players’ moves as
follows
p+1 p
u1 = w1 u10 + (1 − w1 )u1
p+1 p
u2 = w2 u20 + (1 − w2 )u2
1 = w1 + w2 0 < w1 , w2 < 1

or
" #p+1 " # " 0# " # " #p
u1 w1 I 0 u1 (1 − w1 )I 0 u1
= +
u2 0 w2 I u20 0 (1 − w2 )I u2

The optimal control for each player is


" 0# " #" # " # " #p
u1 K1 0 x1 (0) 0 L1 u1
= +
u20 0 K2 x2 (0) L2 0 u2

Substituting the optimal control into the iteration gives


" #p+1 " #" # " # " #p
u1 w1 K1 0 x1 (0) (1 − w1 )I w1 L1 u1
= +
u2 0 w2 K2 x2 (0) w2 L 2 (1 − w2 )I u2
| {z } | {z }
K L

Finally writing this equation in the plantwide notation, we express the


iteration as
up+1 = Kx(0) + Lup
The convergence of the two players’ control iteration is governed by
the eigenvalues of L. If L is stable, the control sequence converges to

u∞ = (I − L)−1 Kx(0) |λ| < 1 for λ ∈ eig(L)

in which
" #−1 " #
−1 w1 I −w1 L1 w1 K 1 0
(I − L) K =
−w2 L2 w2 I 0 w2 K 2
" #−1 " #
I −L1 K1 0
(I − L)−1 K =
−L2 I 0 K2
382 Distributed Model Predictive Control

Note that the weights w1 , w2 do not appear in the converged input


sequence. The u1∞ , u2∞ pair have the equilibrium property that nei-
ther player can improve his position given the other player’s current
decision. This point is called a Nash equilibrium (Başar and Olsder,
1999, p. 4). Notice that the distributed MPC game does not have a Nash
equilibrium if the eigenvalues of L are on or outside the unit circle. If
the controllers have sufficient time during the control system’s sam-
ple time to iterate to convergence, then the effect of the initial control
sequence is removed by using the converged control sequence. If the
iteration has to be stopped before convergence, the solution is

p−1
X
up+1 = Lp u[0] + Lj Kx(0) 0≤p
j=0

in which u[0] is the p = 0 (initial) input sequence. We use the brackets


with p = 0 to distinguish this initial input sequence from an optimal
input sequence.
Stability of the closed-loop system. We assume the Nash equilib-
rium is stable and there is sufficient computation time to iterate to
convergence.
We require a matrix of zeros and ones to select the first move from
the input sequence for injection into the plant. For the first player, the
required matrix is

u1 (0) = E1 u1
h i
E1 = Im1 0m1 ... 0m1 m1 × m1 N matrix

The closed-loop system is then


" #+ " #" #
x1 A1 0 x1
= +
x2 0 A2 x2
| {z }
A
" #" #" #−1 " #" #
B 11 B 12 E1 0 I −L1 K1 0 x1
B 21 B 22 0 E2 −L2 I 0 K2 x2
| {z }| {z }
B K

Using the plantwide notation for this equation and defining the feed-
back gain K gives
x + = (A + BK)x
6.2 Unconstrained Two-Player Game 383

The stability of the closed loop with converged, noncooperative control


is therefore determined by the eigenvalues of (A + BK).
We next present three simple examples to show that (i) the Nash
equilibrium may not be stable (L is unstable), (ii) the Nash equilibrium
may be stable but the closed loop is unstable (L is stable, A + BK is un-
stable), and (iii) the Nash equilibrium may be stable and the closed loop
is stable (L is stable, A + BK is stable). Which situation arises depends
in a nonobvious way on all of the problem data: A1 , A2 , B 11 , B 12 , B 21 ,
B 22 , Q1 , Q2 , P1f , P2f , R1 , R2 , w1 , w2 , N. One has to examine the eigen-
values of L and A + BK for each application of interest to know how the
noncooperative distributed MPC is going to perform. Even for a fixed
dynamic model, when changing tuning parameters such as Q, Pf , R, w,
one has to examine eigenvalues of L and A + BK to know the effect on
the closed-loop system. This is the main drawback of the noncoopera-
tive game. In many control system design methods, such as all forms of
MPC presented in Chapter 2, closed-loop properties such as exponen-
tial stability are guaranteed for the nominal system for all choices of
performance tuning parameters. Noncooperative distributed MPC does
not have this feature and a stability analysis is required. We show in the
next section that cooperative MPC does not suffer from this drawback,
at the cost of slightly more information exchange.

Example 6.9: Nash equilibrium is unstable


Consider the following transfer function matrix for a simple two-input
two-output system
" # " #" #
y1 (s) G11 (s) G12 (s) u1 (s)
=
y2 (s) G21 (s) G22 (s) u2 (s)

in which
 
1 0.5
 

G(s) =  s 2 + 2(0.2)s + 1 0.225s + 1 

 −0.5 1.5 
(0.5s + 1)(0.25s + 1) 0.75s 2 + 2(0.8)(0.75)s + 1
Obtain discrete time models (Aij , Bij , Cij ) for each of the four transfer
functions Gij (s) using a sample time of T = 0.2 and zero-order holds
on the inputs. Set the control cost function parameters to be

Q1 = Q2 = 1 P 1f = P 2f = 0 R1 = R2 = 0.01
N = 30 w1 = w2 = 0.5
384 Distributed Model Predictive Control

Compute the eigenvalues of the L matrix for this system using noncoop-
erative MPC. Show the Nash equilibrium is unstable and the closed-loop
system is therefore unstable. Discuss why this system is problematic
for noncooperative control.

Solution
For this problem L is a 60 × 60 matrix (N(m1 + m2 )). The magnitudes
of the largest eigenvalues are
h i
eig(L) = 1.11 1.11 1.03 1.03 0.914 0.914 · · ·

The noncooperative iteration does not converge. The steady-state gains


for this system are " #
1 0.5
G(0) =
−0.5 1.5
and we see that the diagonal elements are reasonably large compared
to the nondiagonal elements. So the steady-state coupling between the
two systems is relatively weak. The dynamic coupling is unfavorable,
however. The response of y1 to u2 is more than four times faster than
the response of y1 to u1 . The faster input is the disturbance and the
slower input is used for control. Likewise the response of y2 to u1 is
three times faster than the response of y2 to u2 . Also in the second
loop, the faster input is the disturbance and the slower input is used
for control. These pairings are unfavorable dynamically, and that fact
is revealed in the instability of L and lack of a Nash equilibrium for the
noncooperative dynamic regulation problem. □

Example 6.10: Nash equilibrium is stable but closed loop is unstable


Switch the outputs for the previous example and compute the eigenval-
ues of L and (A+BK) for the noncooperative distributed MPC regulator
for the system
 
−0.5 1.5
 
 (0.5s + 1)(0.25s + 1) 0.75s 2 + 2(0.8)(0.75)s + 1 
G(s) =  
 1 0.5 
s + 2(0.2)s + 1
2 0.225s + 1
Show in this case that the Nash equilibrium is stable, but the noncoop-
erative regulator destabilizes the system. Discuss why this system is
problematic for noncooperative control.
6.2 Unconstrained Two-Player Game 385

Solution
For this case the largest magnitude eigenvalues of L are
h i
eig(L) = 0.63 0.63 0.62 0.62 0.59 0.59 ···

and we see the Nash equilibrium for the noncooperative game is sta-
ble. So we have removed the first source of closed-loop instability by
switching the input-output pairings of the two subsystems. There are
seven states in the complete system model, and the magnitudes of the
eigenvalues of the closed-loop regulator (A + BK) are
h i
eig(A + BK) = 1.03 1.03 0.37 0.37 0.77 0.77 0.04

which also gives an unstable closed-loop system. We see the distributed


noncooperative regulator has destabilized a stable open-loop system.
The problem with this pairing is the steady-state gains are now
" #
−0.5 1.5
G(0) =
1 0.5

If one computes any steady-state interaction measure, such as the rel-


ative gain array (RGA), we see the new pairings are poor from a steady-
state interaction perspective
" #
0.14 0.86
RGA =
0.86 0.14

Neither pairing of the inputs and outputs is closed-loop stable with


noncooperative distributed MPC.
Decentralized control with this pairing is discussed in Exercise 6.10.

Example 6.11: Nash equilibrium is stable and the closed loop is stable
Next consider the system
 
1 0.5
 2 
 s + 2(0.2)s + 1 0.9s + 1 
G(s) =  
 −0.5 1.5 
(2s + 1)(s + 1) 2
0.75s + 2(0.8)(0.75)s + 1
Compute the eigenvalues of L and A + BK for this system. What do you
conclude about noncooperative distributed MPC for this system?
386 Distributed Model Predictive Control

Solution
This system is not difficult to handle with distributed control. The
gains are the same as in the original pairing in Example 6.9, and the
steady-state coupling between the two subsystems is reasonably weak.
Unlike Example 6.9, however, the responses of y1 to u2 and y2 to u1
have been slowed so they are not faster than the responses of y1 to u1
and y2 to u2 , respectively. Computing the eigenvalues of L and A + BK
for noncooperative control gives
h i
eig(L) = 0.61 0.61 0.59 0.59 0.56 0.56 0.53 0.53 · · ·
h i
eig(A + BK) = 0.88 0.88 0.74 0.67 0.67 0.53 0.53

The Nash equilibrium is stable since L is stable, and the closed loop is
stable since both L and A + BK are stable. □

These examples reveal the simple fact that communicating the ac-
tions of the other controllers does not guarantee acceptable closed-loop
behavior. If the coupling of the subsystems is weak enough, both dy-
namically and in steady state, then the closed loop is stable. In this
sense, noncooperative MPC has few advantages over completely decen-
tralized control, which has this same basic property.
We next show how to obtain much better closed-loop properties
while maintaining the small size of the distributed control problems.

6.2.4 Cooperative Game

In the cooperative game, the two players share a common objective,


which can be considered to be the overall plant objective

V (x1 (0), x2 (0), u1 , u2 ) = ρ1 V1 (x1 (0), u1 , u2 ) + ρ2 V2 (x2 (0), u2 , u1 )

in which the parameters ρ1 , ρ2 are used to specify the relative weights


of the two subsystems in the overall plant objective. In the coopera-
tive problem, each player keeps track of how his input affects the other
player’s output as well as his own output. We can implement this co-
operative game in several ways. The implementation leading to the
simplest notation is to combine x1 and x2 into a single model
" #+ " #" # " # " #
x1 A1 0 x1 B 11 B 12
= + u1 + u2
x2 0 A2 x2 B 21 B 22
6.2 Unconstrained Two-Player Game 387

and then express player one’s stage cost as


" #′ " #" #
1 x1 ρ1 Q1 0 x1 1
ℓ1 (x1 , x2 , u1 ) = + u′1 (ρ1 R1 )u1 + const.
2 x2 0 ρ2 Q2 x2 2
" #′ " #" #
1 x1 ρ1 P1f 0 x1
V1f (x1 , x2 ) =
2 x 2 0 ρ P
2 2f x2

Notice that u2 does not appear because the contribution of u2 to the


stage cost cannot be affected by player one, and can therefore be ne-
glected. The cost function is then expressed as

N−1
X
V (x1 (0), x2 (0), u1 , u2 ) = ℓ1 (x1 (k), x2 (k), u1 (k))+V1f (x1 (N), x2 (N))
k=0

Player one’s optimal control problem is

min V (x1 (0), x2 (0), u1 , u2 )


u1
" #+ " #" # " # " #
x1 A1 0 x1 B 11 B 12
s.t. = + u1 + u2
x2 0 A2 x2 B 21 B 22

Note that this form is identical to the noncooperative form presented


previously if we redefine the terms (noncooperative -→ cooperative)

" # " # " # " #


x1 A1 0 B 11 B 12
x1 → A1 → B 11 → B 12 →
x2 0 A2 B 21 B 22
" # " #
ρ1 Q1 0 ρ1 P1f 0
Q1 → R1 → ρ1 R1 P1f →
0 ρ2 Q2 0 ρ2 P2f

Any computational program written to solve either the cooperative or


noncooperative optimal control problem can be used to solve the other.

Eliminating states x2 . An alternative implementation is to remove


states x2 (k), k ≥ 1 from player one’s optimal control problem by sub-
stituting the dynamic model of system two. This implementation re-
duces the size of the dynamic model because only states x1 are re-
tained. This reduction in model size may be important in applications
with many players. The removal of states x2 (k), k ≥ 1 also introduces
linear terms into player one’s objective function. We start by using the
388 Distributed Model Predictive Control

dynamic model for x2 to obtain


      
x2 (1) A2 B 21 u1 (0)
   2   
 x2 (2)   A2   A2 B 21 B 21  u1 (1) 
      
 .  =  .  x2 (0) +  .. .. ..  .. +
 ..   ..   . . .  . 
      
x2 (N) AN2 AN−1
2 B 21 AN−2
2 B 21 ... B 21 u1 (N − 1)
  
B 22 u2 (0)
  
 A2 B 22 B 22   u2 (1) 
  

 .. .. .. 
 .. 

 . . .  . 
AN−1
2 B 22 AN−2
2 B 22 ... B 22 u2 (N − 1)

Using more compact notation, we have

x2 = A2 x2 (0) + B21 u1 + B22 u2

We can use this relation to replace the cost contribution of x2 with


linear and quadratic terms in u1 as follows

N−1
X
x2 (k)′ Q2 x2 (k) + x2 (N)′ P2f x2 (N) =
k=0
   
u1′ B′21 Q2 B21 u1 + 2 x2 (0)′ A′2 + u2′ B′22 Q2 B21 u1 + constant

in which
h i
Q2 = diag Q2 Q2 ... P2f Nn2 × Nn2 matrix

and the constant term contains products of x2 (0) and u2 , which are
constant with respect to player one’s decision variables and can there-
fore be neglected.
Next we insert the new terms created by eliminating x2 into the cost
function. Assembling the cost function gives

e z + h′ z
min(1/2)z′ H
z

s.t. Dz = d

and (1.57) again gives the necessary and sufficient conditions for the
optimal solution
" #" # " # " # " #
He −D ′ z 0 e2
−A e 22
−B
−D 0 λ
= e 1 x1 (0) +
−A 0
x2 (0) +
e 12
−B
u2 (6.13)
6.2 Unconstrained Two-Player Game 389

in which

e = H + E ′ B′ Q2 B21 E
H e 22 = E ′ B′ Q2 B22
B e 2 = E ′ B′ Q2 A2
A
21 21 21
h i
E = IN ⊗ Im1 0m1 ,n1

See also Exercise 6.13 for details on constructing the padding matrix E.
Comparing the cooperative and noncooperative dynamic games, (6.13)
and (6.12), we see the cooperative game has made three changes: (i)
the quadratic penalty H has been modified, (ii) the effect of x2 (0) has
been included with the term A e 2 , and (iii) the influence of u2 has been
e
modified with the term B 22 . Notice that the size of the vector z has not
changed, and we have accomplished the goal of keeping player one’s
dynamic model in the cooperative game the same size as his dynamic
model in the noncooperative game.
Regardless of the implementation choice, the cooperative optimal
control problem is no more complex than the noncooperative game con-
sidered previously. The extra information required by player one in the
cooperative game is x2 (0). Player one requires u2 in both the cooper-
ative and noncooperative games. Only in decentralized control does
player one not require player two’s input sequence u2 . The other ex-
tra required information, A2 , B21 , Q2 , R2 , P2f , are fixed parameters and
making their values available to player one is a minor communication
overhead.
Proceeding as before, we solve this equation for z0 and pick out the
rows corresponding to the elements of u10 giving
" #
h i x (0)
1
u10 (x(0), u2 ) = K11 K12 + L1 u2
x2 (0)

Combining the optimal control laws for each player gives


" # " #" # " # " #p
u10 K11 K12 x1 (0) 0 L1 u1
= +
u20 K21 K22 x2 (0) L2 0 u2

in which the gain matrix multiplying the state is a full matrix for the
cooperative game. Substituting the optimal control into the iteration
gives
" #p+1 " #" # " # " #p
u1 w1 K11 w1 K12 x1 (0) (1 − w1 )I w1 L1 u1
= +
u2 w2 K21 w2 K22 x2 (0) w2 L2 (1 − w2 )I u2
| {z } | {z }
K L
390 Distributed Model Predictive Control

Finally writing this equation in the plantwide notation, we express the


iteration as
up+1 = Kx(0) + Lup
Exponential stability of the closed-loop system. In the case of coop-
erative control, we consider the closed-loop system with a finite number
of iterations, p. With finite iterations, distributed MPC becomes a form
of suboptimal MPC as discussed in Sections 6.1.2 and 2.7. To analyze
the behavior of the cooperative controller with a finite number of it-
erations, we require the cost decrease achieved by a single iteration,
which we derive next. First we write the complete system evolution as
in (6.10)
x + = Ax + Bu
in which A and B are defined in (6.11). We can then use (6.3) to express
the overall cost function

V (x(0), u) = (1/2)x ′ (0)(Q + A′ QA)x(0) + u′ (B′ QA)x(0)+


(1/2)u′ Hu u
in which A and B are given in (6.1), the cost penalties Q and R are
given in (6.2) and (6.9), and
Hu = B′ QB + R
The overall cost is a positive definite quadratic function in u because
R1 and R2 are positive definite, and therefore so are R1 , R2 , and R.
The iteration in the two players’ moves satisfies
 
p+1 p+1 p p
(u1 , u2 ) = (w1 u10 + (1 − w1 )u1 ), (w2 u20 + (1 − w2 )u2 )
p p
= (w1 u10 , (1 − w2 )u2 ) + ((1 − w1 )u1 , w2 u20 )
p+1 p+1 p p
(u1 , u2 ) = w1 (u10 , u2 ) + w2 (u1 , u20 ) (6.14)
Exercise 6.18 analyzes the cost decrease for a convex step with a posi-
tive definite quadratic function and shows
p+1 p+1 p p
V (x(0), u1 , u2 ) = V (x(0), u1 , u2 )
1h p i′ h i
− u − u0 (x(0)) P up − u0 (x(0)) (6.15)
2
in which P > 0 is given by
e D −1 Hu
P = Hu D −1 H e =D−N
H
" # " #
w1−1 Hu,11 0 −w1−1 w2 Hu,11 Hu,12
D= −1 N=
0 w2 Hu,22 Hu,21 −w1 w2−1 Hu,22
6.2 Unconstrained Two-Player Game 391

and Hu is partitioned for the two players’ input sequences. Notice that
the cost decrease achieved in a single iteration is quadratic in the dis-
tance from the optimum. An important conclusion is that each iter-
ation in the cooperative game reduces the systemwide cost. This cost
reduction is the key property that gives cooperative MPC its excellent
convergence properties, as we show next.
The two players’ warm starts at the next sample are given by
+
e 1 = (u1 (1), u1 (2), . . . , u1 (N − 1), 0)
u
+
e 2 = (u2 (1), u2 (2), . . . , u2 (N − 1), 0)
u
p p
We define the following linear time-invariant functions g1 and g2 as
the outcome of applying the control iteration procedure p times
p p
u1 = g1 (x1 , x2 , u1 , u2 )
p p
u2 = g2 (x1 , x2 , u1 , u2 )

in which p ≥ 0 is an integer, x1 and x2 are the states, and u1 , u2 are the


input sequences from the previous sample, used to generate the warm
start for the iteration. Here we consider p to be constant with time, but
Exercise 6.20 considers the case in which the controller iterations may
vary with sample time. The system evolution is then given by

x1+ = A1 x1 + B 11 u1 + B 12 u2 x2+ = A2 x2 + B 21 u1 + B 22 u2
p p
u1+ = g1 (x1 , x2 , u1 , u2 ) u2+ = g2 (x1 , x2 , u1 , u2 ) (6.16)
+ +
e1 , u
By the construction of the warm start, u e 2 , we have
+ +
V (x1+ , x2+ , u
e1 , u
e 2 ) = V (x1 , x2 , u1 , u2 ) − ρ1 ℓ1 (x1 , u1 ) − ρ2 ℓ2 (x2 , u2 )
h i
+(1/2)ρ1 x1 (N)′ A′1 P1f A1 − P1f + Q1 x1 (N)
h i
+(1/2)ρ2 x2 (N)′ A′2 P2f A2 − P2f + Q2 x2 (N)

From our choice of terminal penalty satisfying (6.8), the last two terms
are zero giving
+ +
V (x1+ , x2+ , u
e1 , u
e 2 ) = V (x1 , x2 , u1 , u2 )
− ρ1 ℓ1 (x1 , u1 ) − ρ2 ℓ2 (x2 , u2 ) (6.17)

No optimization, p = 0. If we do no further optimization, then we


+ +
have u1+ = u
e 1 , u2+ = u
e 2 , and the equality

V (x1+ , x2+ , u1+ , u2+ ) = V (x1 , x2 , u1 , u2 ) − ρ1 ℓ1 (x1 , u1 ) − ρ2 ℓ2 (x2 , u2 )


392 Distributed Model Predictive Control

The input sequences add a zero at each sample until u1 = u2 = 0 at


time k = N. The system decays exponentially under zero control and
the closed loop is exponentially stable.
Further optimization, p ≥ 1. We next consider the case in which
optimization is performed. Equation 6.15 then gives

+ +
V (x1+ , x2+ , u1+ , u2+ ) ≤ V (x1+ , x2+ , u
e1 , u
e 2 )−
h + i′ h + i
ue − u0 (x + ) P u e − u0 (x + ) p≥1

with equality holding for p = 1. Using this result in (6.17) gives

V (x1+ , x2+ , u1+ , u2+ ) ≤ V (x1 , x2 , u1 , u2 ) − ρ1 ℓ1 (x1 , u1 ) − ρ2 ℓ2 (x2 , u2 )


h + i′ h + i
− u e − u0 (x + ) P u e − u0 (x + )

Since V is bounded below by zero and ℓ1 and ℓ2 are positive func-


tions, we conclude the time sequence V (x1 (k), x2 (k), u1 (k), u2 (k)) con-
verges. and therefore x1 (k), x2 (k), u1 (k), and u2 (k) converge to zero.
+
Moreover, since P > 0, the last term implies that u e converges to
u0 (x + ), which converges to zero because x + converges to zero. There-
fore, the entire input sequence u converges to zero. Because the total
system evolution is a linear time-invariant system, the convergence is
exponential. Even though we are considering here a form of subopti-
mal MPC, we do not require an additional inequality constraint on u
because the problem considered here is unconstrained and the itera-
tions satisfy (6.15).

6.2.5 Tracking Nonzero Setpoints

For tracking nonzero setpoints, we compute steady-state targets as dis-


cussed in Section 1.5. The steady-state input-output model is given by

ys = Gus G = C(I − A)−1 B

in which G is the steady-state gain of the system. The two subsystems


are denoted " # " #" #
y1s G11 G12 u1s
=
y2s G21 G22 u2s
For simplicity, we assume that the targets are chosen to be the mea-
surements (H = I). Further, we assume that both local systems are
square, and that the local targets can be reached exactly with the local
6.2 Unconstrained Two-Player Game 393

inputs. This assumption means that G11 and G22 are square matrices
of full rank. We remove all of these assumptions when we treat the con-
strained two-player game in the next section. If there is model error,
integrating disturbance models are required as discussed in Chapter 1.
We discuss these later.
The target problem also can be solved with any of the four ap-
proaches discussed so far. We consider each.

Centralized case. The centralized problem gives in one shot both in-
puts required to meet both output setpoints

us = G−1 ysp
ys = ysp

Decentralized case. The decentralized problem considers only the


diagonal terms and computes the following steady inputs

" #
−1
G11
us = −1 ysp
G22

Notice these inputs produce offset in both output setpoints

" #
−1
I G12 G22
ys = −1 ysp
G21 G11 I

Noncooperative case. In the noncooperative game, each player at-


tempts to remove offset in only its outputs. Player one solves the fol-
lowing problem

min(y1 − y1sp )′ Q1 (y1 − y1sp )


u1

s.t. y1 = G11 u1 + G12 u2

Because the target can be reached exactly, the optimal solution is to


find u1 such that y1 = y1sp , which gives

 
p
u01s = G11
−1
y1sp − G12 u2
394 Distributed Model Predictive Control

Player two solves the analogous problem. If we iterate on the two play-
ers’ solutions, we obtain
" #p+1 " #" #
−1
u1s w1 G11 y1sp
= −1 +
u2s w2 G22 y2sp
| {z }
Ks
" #" #p
−1
w2 I −w1 G11 G12 u1s
−1
−w2 G22 G21 w1 I u2s
| {z }
Ls

This iteration can be summarized by


p+1 p
us = K s ysp + Ls us
If Ls is stable, this iteration converges to
u∞ −1
s = (I − Ls ) K s ysp

u∞
s =G
−1
ysp
and we have no offset. We already have seen that we cannot expect
the dynamic noncooperative iteration to converge. The next several
examples explore the issue of whether we can expect at least the steady-
state iteration to be stable.
Cooperative case. In the cooperative case, both players work on min-
imizing the offset in both outputs. Player one solves
" #′ " #" #
y1 − y1sp ρ1 Q 1 y1 − y1sp
min(1/2)
u1 y2 − y2sp ρ2 Q 2 y2 − y2sp
" # " # " #
y1 G11 G12
s.t. = u1 + u2
y2 G21 G22
We can write this in the general form
min(1/2)rs′ Hrs + h′ rs
rs

s.t. Drs = d
in which
   
y1s ρ1 Q 1 " #
    −Qysp
rs = y2s  H= ρ2 Q 2  h=
0
u1s 0
" # " #
h i G11 G12
D = I −G1 d = G2 u 2 G1 = G2 =
G12 G22
6.2 Unconstrained Two-Player Game 395

We can then solve the linear algebra problem


" #" # " #
H −D ′ rs h
=−
−D 0 λs d

and identify the linear gains between the optimal u1s and the setpoint
ysp and player two’s input u2s
p
u01s = K1s ysp + L1s u2s

Combining the optimal control laws for each player gives


" 0 # " # " #" #p
u1s K1s 0 L1s u1s
= ysp +
u02s K2s L2s 0 u2s

Substituting the optimal control into the iteration gives


" #p+1 " # " #" #p
u1s w1 K1s (1 − w1 )I w1 L1s u1s
= ysp +
u2s w2 K2s w2 L2s (1 − w2 )I u2s
| {z } | {z }
Ks Ls

Finally writing this equation in the plantwide notation, we express the


iteration as
p+1 p
us = K s ysp + Ls us
As we did with the cooperative regulation problem, we can analyze the
optimization problem to show that this iteration is always stable and
converges to the centralized target. Next we explore the use of these
approaches in some illustrative examples.

Example 6.12: Stability and offset in the distributed target calcula-


tion
Consider the following two-input, two-output system with steady-state
gain matrix and setpoint
" # " #" # " # " #
y1s −0.5 1.0 u1s y1sp 1
= =
y2s 2.0 1.0 u2s y2sp 1

(a) Show the first 10 iterations of the noncooperative and cooperative


steady-state cases starting with the decentralized solution as the
initial guess.
Describe the differences. Compute the eigenvalues of L for the
cooperative and noncooperative cases. Discuss the relationship
396 Distributed Model Predictive Control

between these eigenvalues and the result of the iteration calcula-


tions.
Mark also the solution to the centralized and decentralized cases
on your plots.

(b) Switch the pairings and repeat the previous part. Explain your
results.

Solution
(a) The first 10 iterations of the noncooperative steady-state calcu-
lation are shown in Figure 6.2. Notice the iteration is unstable
and the steady-state target does not converge. The cooperative
case is shown in Figure 6.3. This case is stable and the iterations
converge to the centralized target and achieve zero offset. The
magnitudes of the eigenvalues of Ls for the noncooperative (nc)
and cooperative (co) cases are given by
eig(Lsnc ) = {1.12, 1.12} eig(Lsco ) = {0.757, 0.243}
Stability of the iteration is determined by the magnitudes of the
eigenvalues of Ls .

(b) Reversing the pairings leads to the following gain matrix in which
we have reversed the labels of the outputs for the two systems
" # " #" #
y1s 2.0 1.0 u1s
=
y2s −0.5 1.0 u2s
The first 10 iterations of the noncooperative and cooperative con-
trollers are shown in Figures 6.4 and 6.5. For this pairing, the
noncooperative case also converges to the centralized target. The
eigenvalues are given by
eig(Lsnc ) = {0.559, 0.559} eig(Lsco ) = {0.757, 0.243}
The eigenvalues of the cooperative case are unaffected by the re-
versal of pairings. □

Given the stability analysis of the simple unconstrained two-player


game, we remove from further consideration two options we have been
discussing to this point: noncooperative control and decentralized con-
trol. We next further develop the theory of cooperative MPC and com-
pare its performance to centralized MPC in more general and challeng-
ing situations.
6.2 Unconstrained Two-Player Game 397

y2 = 1
6

2
u2 ude uce
0

−2 y1 = 1

−4

−6
−5 −4 −3 −2 −1 0 1 2 3 4 5
u1

Figure 6.2: Ten iterations of noncooperative steady-state calcula-


tion, u[0] = ude ; iterations are unstable, up → ∞.

y2 = 1
6

2
u2 ude uce
0

−2 y1 = 1

−4

−6
−5 −4 −3 −2 −1 0 1 2 3 4 5
u1

Figure 6.3: Ten iterations of cooperative steady-state calculation,


u[0] = ude ; iterations are stable, up → uce .
398 Distributed Model Predictive Control

2.5

2
y1 = 1

1.5

u2 1
y2 = 1 uce ude

0.5

−0.5
−0.6 −0.4 −0.2 0 0.2 0.4 0.6
u1

Figure 6.4: Ten iterations of noncooperative steady-state calcula-


tion, u[0] = ude ; iterations are stable with reversed pair-
ing.

2.5

2
y1 = 1

1.5

u2 1
y2 = 1 uce ude

0.5

−0.5
−0.6 −0.4 −0.2 0 0.2 0.4 0.6
u1

Figure 6.5: Ten iterations of cooperative steady-state calculation,


u[0] = ude ; iterations are stable with reversed pairing.
6.2 Unconstrained Two-Player Game 399

6.2.6 State Estimation

Given output measurements, we can express the state estimation prob-


lem also in distributed form. Player one uses local measurements of
y1 and knowledge of both inputs u1 and u2 to estimate state x1

x̂1+ = A1 x̂1 + B 11 u1 + B 12 u2 + L1 (y1 − C1 x̂1 )

Defining estimate error to be e1 = x1 − x̂1 gives

e1+ = (A1 − L1 C1 )e1

Because all the subsystems are stable, we know L1 exists so that A1 −


L1 C1 is stable and player one’s local estimator is stable. The estimate
error for the two subsystems is then given by
" #+ " #" #
e1 AL1 e1
= (6.18)
e2 AL2 e2

in which ALi = Ai − Li Ci .
Closed-loop stability. The dynamics of the estimator are given by
" #+ " #" # " #" #
x̂1 A1 x̂1 B 11 B 12 u1
= + +
x̂2 A2 x̂2 B 21 B 22 u2
" #" #
L1 C1 e1
L2 C2 e2

In the control law we use the state estimate in place of the state, which
is unmeasured and unknown. We consider two cases.
Converged controller. In this case the distributed control law con-
verges to the centralized controller, and we have
" # " #" #
u1 K11 K12 x̂1
=
u2 K21 K22 x̂2

The closed-loop system evolves according to


" #+ (" # " #" #) " #
x̂1 A1 B 11 B 12 K11 K12 x̂1
= + +
x̂2 A2 B 21 B 22 K21 K22 x̂2
" #" #
L1 C1 e1
L2 C2 e2
400 Distributed Model Predictive Control

The A+BK term is stable because this term is the same as in the stabiliz-
ing centralized controller. The perturbation is exponentially decaying
because the distributed estimators are stable. Therefore x̂ goes to zero
exponentially, which, along with e going to zero exponentially, implies
x goes to zero exponentially.
Finite iterations. Here we use the state plus input sequence descrip-
tion given in (6.16), which, as we have already noted, is a linear time-
invariant system. With estimate error, the system equation is
 +    
x̂1 A1 x̂1 + B 11 u1 + B 12 u2 L1 C1 e1
x̂ +  A x̂ + B u + B u  L C e 
 2  2 2 21 1 22 2   2 2 2
 + =  p + 
 u1   g1 (x̂1 , x̂2 , u1 , u2 )   0 
p
u2+ g2 (x̂1 , x̂2 , u1 , u2 ) 0

Because there is again only one-way coupling between the estimate er-
ror evolution, (6.18), and the system evolution given above, the com-
posite system is exponentially stable.

6.3 Constrained Two-Player Game


Now that we have introduced most of the notation and the fundamen-
tal ideas, we consider more general cases. Because we are interested
in establishing stability properties of the controlled systems, we focus
exclusively on cooperative distributed MPC from this point forward. In
this section we consider convex input constraints on the two players.
We assume output constraints have been softened with exact soft con-
straints and added to the objective function, so do not consider output
constraints explicitly. The input constraints break into two significant
categories: coupled and uncoupled constraints. We treat each of these
in turn.
We also allow unstable systems and replace Assumption 6.8 with
the following more general restrictions on the systems and controller
parameters.

Assumption 6.13 (Constrained two-player game).


(a) The systemsh (Aii , B i ), i = 1, 2 are stabilizable, in which Ai = diag(A1i ,
B
A2i ) and B i = B1i
2i
.

(b) The systems (Ai , Ci ), i = 1, 2 are detectable.

(c) The input penalties R1 , R2 are positive definite, and the state penal-
ties Q1 , Q2 are semidefinite.
6.3 Constrained Two-Player Game 401

(d) The systems (A1 , Q1 ) and (A2 , Q2 ) are detectable.

(e) The horizon is chosen sufficiently long to zero the unstable modes,
N ≥ max i∈I1:2 nu u
i , in which ni is the number of unstable modes of Ai ,
i.e., number of λ ∈ eig(Ai ) such that |λ| ≥ 1.

Assumption (b) implies that we have Li such that (Ai − Li Ci ), i =


1, 2 is stable. Note that the stabilizable and detectable conditions of
Assumption 6.13 are automatically satisfied if we obtain the state space
models from a minimal realization of the input/output models for (ui ,
yj ), i, j = 1, 2.

Unstable modes. To handle unstable systems, we add constraints to


zero the unstable modes at the end of the horizon. To set up this
constraint, consider the real Schur decomposition of Aij for i, j ∈ I1:2
" # " s ′#
h i As − Sij
s u ij
Aij = Sij Sij u′ (6.19)
Au
ij Sij

in which Asij is upper triangular and stable, and Au


ij is upper triangular
3
with all unstable eigenvalues. Given the Schur decomposition (6.19),
we define the matrices

Sis = diag(Si1
s s
, Si2 ) Asi = diag(Asi1 , Asi2 ) i ∈ I1:2
Siu = u
diag(Si1 u
, Si2 ) Au
i = diag(Au u
i1 , Ai2 ) i ∈ I1:2

These matrices satisfy the Schur decompositions


" #" #
h i As − Sis

Ai = Sis Siu i
′ i ∈ I1:2
Au
i Siu

We further define the matrices Σ1 , Σ2 as the solutions to the Lyapunov


equations

′ ′ ′ ′
As1 Σ1 As1 − Σ1 = −S1s Q1 S1s As2 Σ2 As2 − Σ2 = −S2s Q2 S2s (6.20)

We then choose the terminal penalty for each subsystem to be the cost
to go under zero control

′ ′
P1f = S1s Σ1 S1s P2f = S2s Σ2 S2s
3 If Aij is stable, then there is no Au u
ij and Sij .
402 Distributed Model Predictive Control

6.3.1 Uncoupled Input Constraints

We consider convex input constraints of the following form


Hu(k) ≤ h k = 0, 1, . . . , N
Defining convex set U
U = {u|Hu ≤ h}
we express the input constraints as
u(k) ∈ U k = 0, 1, . . . , N
We drop the time index and indicate the constraints are applied over
the entire input sequence using the notation u ∈ U. In the uncoupled
constraint case, the two players’ inputs must satisfy
u 1 ∈ U1 u 2 ∈ U2
in which U1 and U2 are convex subsets of Rm1 and Rm2 , respectively.
The constraints are termed uncoupled because there is no interaction
or coupling of the inputs in the constraint relation. Player one then
solves the following constrained optimization
min V (x1 (0), x2 (0), u1 , u2 )
u1
" #+ " #" # " # " #
x1 A1 0 x1 B 11 B 12
s.t. = + u1 + u2
x2 0 A2 x2 B 21 B 22
u 1 ∈ U1
u′
Sj1 xj1 (N) =0 j ∈ I1:2
|u1 | ≤ d1 (|x11 (0)| + |x21 (0)|) x11 (0), x21 (0) ∈ r B
in which we include the system’s hard input constraints, the stabil-
ity constraint on the unstable modes, and the Lyapunov stability con-
straints. Exercise 6.22 discusses how to write the constraint |u1 | ≤
d1 |x1 (0)| as a set of linear inequalities on u1 . Similarly, player two
solves
min V (x1 (0), x2 (0), u1 , u2 )
u2
" #+ " #" # " # " #
x1 A1 0 x1 B 11 B 12
s.t. = + u1 + u2
x2 0 A2 x2 B 21 B 22
u 2 ∈ U2
u′
Sj2 xj2 (N) =0 j ∈ I1:2
|u2 | ≤ d2 (|x21 (0)| + |x22 (0)|) x12 (0), x22 (0) ∈ r B
6.3 Constrained Two-Player Game 403

We denote the solutions to these problems as

u10 (x1 (0), x2 (0), u2 ) u20 (x1 (0), x2 (0), u1 )

The feasible set XN for the unstable system is the set of states for which
the unstable modes can be brought to zero in N moves while satisfying
the input constraints.
p p
Given an initial iterate, (u1 , u2 ), the next iterate is defined to be
p p
(u1 , u2 )p+1 = w1 (u10 (x1 (0), x2 (0), u2 ), u2 )+
p p
w2 (u1 , u20 (x1 (0), x2 (0), u1 ))

To reduce the notational burden we denote this as


p p
(u1 , u2 )p+1 = w1 (u10 , u2 ) + w2 (u1 , u20 )

and the functional dependencies of u10 and u20 should be kept in mind.
This procedure provides three important properties, which we es-
tablish next.

1. The iterates are feasible: (u1 , u2 )p ∈ (U1 , U2 ) implies (u1 , u2 )p+1 ∈


(U1 , U2 ). This follows from convexity of U1 , U2 and the convex
p p
combination of the feasible points (u1 , u2 ) and (u10 , u20 ) to make
(u1 , u2 )p+1 .

2. The cost decreases on iteration: V (x1 (0), x2 (0), (u1 , u2 )p+1 ) ≤


V (x1 (0), x2 (0), (u1 , u2 )p ) for all x1 (0), x2 (0), and for all feasible
(u1 , u2 )p ∈ (U1 , U2 ). The systemwide cost satisfies the following
inequalities
  
p+1 p+1 p p
V (x(0), u1 , u2 ) = V x(0), w1 (u10 , u2 ) + w2 (u1 , u20 )
p p
≤ w1 V (x(0), (u10 , u2 )) + w2 V (x(0), (u1 , u20 ))
p p p p
≤ w1 V (x(0), (u1 , u2 )) + w2 V (x(0), (u1 , u2 ))
p p
= V (x(0), u1 , u2 )

The first equality follows from (6.14). The next inequality follows
from convexity of V . The next follows from optimality of u10 and
u20 , and the last follows from w1 + w2 = 1. Because the cost is
bounded below, the cost iteration converges.

3. The converged solution of the cooperative problem is equal to


the optimal solution of the centralized problem. Establishing this
property is discussed in Exercise 6.26.
404 Distributed Model Predictive Control

Exponential stability of the closed-loop system. We next consider


the closed-loop system. The two players’ warm starts at the next sam-
ple are as defined previously
+
e 1 = (u1 (1), u1 (2), . . . , u1 (N − 1), 0)
u
+
e 2 = (u2 (1), u2 (2), . . . , u2 (N − 1), 0)
u
p p
We define again the functions g1 , g2 as the outcome of applying the
control iteration procedure p times
p p
u1 = g1 (x1 , x2 , u1 , u2 )
p p
u2 = g2 (x1 , x2 , u1 , u2 )
The important difference between the previous unconstrained and this
p p
constrained case is that the functions g1 , g2 are nonlinear due to the
input constraints. The system evolution is then given by
x1+ = A1 x1 + B 11 u1 + B 12 u2 x2+ = A2 x2 + B 21 u1 + B 22 u2
p p
u1+ = g1 (x1 , x2 , u1 , u2 ) u2+ = g2 (x1 , x2 , u1 , u2 )
We have the following cost using the warm start at the next sample
+ +
V (x1+ , x2+ , u
e1 , u
e 2 ) = V (x1 , x2 , u1 , u2 ) − ρ1 ℓ1 (x1 , u1 ) − ρ2 ℓ2 (x2 , u2 )
h i
+(1/2)ρ1 x1 (N)′ A′1 P1f A1 − P1f + Q1 x1 (N)
h i
+(1/2)ρ2 x2 (N)′ A′2 P2f A2 − P2f + Q2 x2 (N)
u ′
Using the Schur decomposition (6.19) and the constraints Sji xji (N) =
0 for i, j ∈ I1:2 , the last two terms can be written as
h i
′ ′ ′
(1/2)ρ1 x1 (N)′ S1s As1 Σ1 As1 − Σ1 + S1s Q1 S1s S1s x1 (N)
h i
′ ′ ′
+(1/2)ρ2 x2 (N)′ S2s As2 Σ2 As2 − Σ2 + S2s Q2 S2s S2s x2 (N)
These terms are zero because of (6.20). Using this result and applying
the iteration for the controllers gives
V (x1+ , x2+ , u1+ , u2+ ) ≤ V (x1 , x2 , u1 , u2 ) − ρ1 ℓ1 (x1 , u1 ) − ρ2 ℓ2 (x2 , u2 )
The Lyapunov stability constraints give (see also Exercise 6.28)
|(u1 , u2 )| ≤ 2 max(d1 , d2 ) |(x1 , x2 )| (x1 , x2 ) ∈ r B
Given the cost decrease and this constraint on the size of the input
sequence, we satisfy the conditions of Lemma 6.5, and conclude the
solution x(k) = 0 for all k is exponentially stable on all of XN if either
XN is compact or U is compact.
6.3 Constrained Two-Player Game 405

6.3.2 Coupled Input Constraints

By contrast, in the coupled constraint case, the constraints are of the


form
H1 u1 + H2 u2 ≤ h or (u1 , u2 ) ∈ U (6.21)
These constraints represent the players sharing some common resource.
An example would be different subsystems in a chemical plant drawing
steam or some other utility from a single plantwide generation plant.
The total utility used by the different subsystems to meet their control
objectives is constrained by the generation capacity.
The players solve the same optimization problems as in the un-
coupled constraint case, with the exception that both players’ input
constraints are given by (6.21). This modified game provides only two
of the three properties established for the uncoupled constraint case.
These are

1. The iterates are feasible: (u1 , u2 )p ∈ U implies (u1 , u2 )p+1 ∈ U.


This follows from convexity of U and the convex combination of
p p
the feasible points (u1 , u2 ) and (u10 , u20 ) to make (u1 , u2 )p+1 .

2. The cost decreases on iteration: V (x1 (0), x2 (0), (u1 , u2 )p+1 ) ≤


V (x1 (0), x2 (0), (u1 , u2 )p ) for all x1 (0), x2 (0), and for all feasible
(u1 , u2 )p ∈ U. The systemwide cost satisfies the same inequalities
established for the uncoupled constraint case giving
p+1 p+1 p p
V (x(0), u1 , u2 ) ≤ V (x(0), u1 , u2 )

Because the cost is bounded below, the cost iteration converges.

The converged solution of the cooperative problem is not equal to the


optimal solution of the centralized problem, however. We have lost
property 3 of the uncoupled case. To see how the convergence property
is lost, consider Figure 6.6. Region U is indicated by the triangle and its
interior. Consider point up on the boundary of U. Neither player one
nor player two can improve upon the current point up so the iteration
has converged. But the converged point is clearly not the optimal point,
uce .
Because of property 2, the nominal stability properties for the cou-
pled and uncoupled cases are identical. The differences arise when the
performance of cooperative control is compared to the benchmark of
centralized control. Improving the performance of cooperative con-
trol in the case of coupled constraints is therefore a topic of current
406 Distributed Model Predictive Control

cost decrease for player two

u2

uce

up cost decrease for player one


U

u1

Figure 6.6: Cooperative control stuck on the boundary of U under


coupled constraints; up+1 = up ≠ uce .

research. Current approaches include adding another player to the


game, whose sole objective is to parcel out the coupled resource to the
other players in a way that achieves optimality on iteration. This ap-
proach also makes sense from an engineering perspective because it
is commonplace to design a dedicated control system for managing a
shared resource such as steam or power among many plant units. The
design of this single unit’s control system is a reasonably narrow and
well-defined task compared to the design of a centralized controller for
the entire plant.

6.3.3 Exponential Convergence with Estimate Error

Consider next the constrained system evolution with estimate error


 +  
x̂ Ax̂ + B 1 u1 + B 2 u2 + Le
 +  
u  =  g p (x̂, u)  (6.22)
e+ AL e

The estimate error is globally exponentially stable so we know from


Lemma 6.7 that there exists a Lipschitz continuous Lyapunov function
J(·) such that for all e ∈ Rn

a |e| ≤ J(e) ≤ b |e|


+
J(e ) − J(e) ≤ −c |e|
6.3 Constrained Two-Player Game 407

in which b > 0, a > 0, and we can choose constant c > 0 as large


as desired. In the subsequent development, we require this Lyapunov
function to be based on the first power of the norm rather than the
usual square of the norm to align with Lipschitz continuity of the Lya-
punov function. From the stability of the solution x(k) = 0 for all k for
the nominal system, the cost function V (x̂, u) satisfies for all x̂ ∈ XN ,
u ∈ UN

e |(x̂, u)|2 ≤ V (x̂, u) ≤ b


a e |(x̂, u)|2

V (Ax̂ + B 1 u1 + B 2 u2 , u+ ) − V (x̂, u) ≤ −ce |x̂|2


|u| ≤ d |x̂| x̂ ∈ re B

in which a e , ce , re > 0. We propose W (x̂, u, e) = V (x̂, u) + J(e) as a


e, b
Lyapunov function candidate for the perturbed system. We next derive
the required properties of W (·) to establish exponential stability of the
solution (x(k), e(k)) = 0. From the definition of W (·) we have for all
(x̂, u, e) ∈ XN × UN × Rn

e |(x̂, u)|2 + a |e| ≤ W (x̂, u, e) ≤ b


a e |(x̂, u)|2 + b |e|

a(|(x̂, u)|2 + |e|) ≤ W (x̂, u, e) ≤ b(|(x̂, u)|2 + |e|) (6.23)

in which a = min(a e , b). Next we compute the cost


e , a) > 0, b = max(b
change

W (x̂ + , u+ , e+ ) − W (x̂, u, e) = V (x̂ + , u+ ) − V (x̂, u) + J(e+ ) − J(e)

The Lyapunov function V is quadratic in (x, u) and therefore Lipschitz


continuous on bounded sets. Therefore, for all x̂, u1 , u2 , u+ , e in some
bounded set

V (Ax̂ + B 1 u1 + B 2 u2 + Le, u+ ) − V (Ax̂ + B 1 u1 + B 2 u2 , u+ ) ≤ LV |Le|

in which LV is the Lipschitz constant for V with respect to its first


argument. Using the system evolution we have

V (x̂ + , u+ ) ≤ V (Ax̂ + B 1 u1 + B 2 u2 , u+ ) + L′V |e|

in which L′V = LV |L|. Subtracting V (x̂, u) from both sides gives

V (x̂ + , u+ ) − V (x̂, u) ≤ −ce |x̂|2 + L′V |e|


408 Distributed Model Predictive Control

Substituting this result into the equation for the change in W gives

W (x̂ + , u+ , e+ ) − W (x̂, u, e) ≤ −ce |x̂|2 + L′V |e| − c |e|


≤ −ce |x̂|2 − (c − L′V ) |e|
W (x̂ + , u+ , e+ ) − W (x̂, u, e) ≤ −c(|x̂|2 + |e|) (6.24)

in which we choose c > L′V , which is possible because we may choose


c as large as we wish, and c = min(ce , c − L′V ) > 0. Notice this step is
what motivated using the first power of the norm in J(·). Lastly, we
require the constraint

|u| ≤ d |x̂| x̂ ∈ re B (6.25)

Lemma 6.14 (Global asymptotic stability and exponential convergence


of perturbed system). If either XN or U is compact, there exist λ < 1
and δ(·) ∈ K∞ such that the combined system (6.22) satisfies for all
(x(0), e(0)) and k ≥ 0

|x(k), e(k)| ≤ δ(|x(0), e(0)|)λk

The proof is based on the properties (6.23), (6.24), and (6.25) of func-
tion W (x̂, u, e), and is basically a combination of the proofs of Lemmas
6.5 and 6.6. The region of attraction is the set of states and initial es-
timate errors for which the unstable modes of the two subsystems can
be brought to zero in N moves while satisfying the respective input
constraints. If both subsystems are stable, for example, the region of
attraction is (x, e) ∈ XN × Rn .

6.3.4 Disturbance Models and Zero Offset

Integrating disturbance model. As discussed in Chapter 1, we model


the disturbance with an integrator to remove steady offset. The aug-
mented models for the local systems are
" #+ " #" # " # " #
xi Ai Bdi xi B i1 B i2
= + u1 + u2
di 0 I di 0 0
" #
h i x
i
yi = Ci Cdi i = 1, 2
di

We wish to estimate both xi and di from measurements yi . To ensure


this goal is possible, we make the following restriction on the distur-
bance models.
6.3 Constrained Two-Player Game 409

Assumption 6.15 (Disturbance models).


" #
I − Ai −Bdi
rank = ni + pi i = 1, 2
Ci Cdi

It is always possible to satisfy this assumption by proper choice of


Bdi , Cdi . From Assumption 6.13 (b), (Ai , Ci ) is detectable, which implies
that the first ni columns of the square (ni + pi ) × (ni + pi ) matrix in
Assumption
h i 6.15 are linearly independent. Therefore the columns of
−Bdi
Cdi can be chosen so that the entire matrix has rank ni + pi . As-
sumption 6.15 is equivalent to detectability of the following augmented
system.

Lemma 6.16 (Detectability of distributed disturbance model). Consider


the augmented systems
" #
Ai Bdi h i
e
Ai = e i = Ci Cdi
C i = 1, 2
0 I

The augmented systems (A e i, C


e i ), i = 1, 2 are detectable if and only if
Assumption 6.15 is satisfied.

Proving this lemma is discussed in Exercise 6.29. The detectability


assumption then establishes the existence of L ei − L
e i such that (A e i ),
eiC
i = 1, 2 are stable and the local integrating disturbances can be esti-
mated from the local measurements.
Centralized target problem. We can solve the target problem at the
plantwide level or as a distributed target problem at the subunit con-
troller level. Consider first the centralized target problem with the dis-
turbance model discussed in Chapter 1, (1.45)

1 2 1 2
min us − usp Rs + Cxs + Cd d̂(k) − ysp
xs ,us 2 2 Qs

subject to
" #" # " #
I−A −B xs Bd d̂(k)
=
HC 0 us rsp − HCd d̂(k)
Eus ≤ e

in which we have removed the state inequality constraints to be consis-


tent with the regulator problem. We denote the solution to this prob-
lem (xs (k), us (k)). Notice first that the solution of the target problem
410 Distributed Model Predictive Control

depends only on the disturbance estimate, d̂(k), and not the solution
of the control problem. So we can analyze the behavior of the target
by considering only the exponential convergence of the estimator. We
restrict the plant disturbance d so that the target problem is feasible,
and denote the solution to the target problem for the plant disturbance,
d̂(k) = d, as (xs∗ , u∗
s ). Because the estimator is exponentially stable,
we know that d̂(k) → d as k → ∞. Because the target problem is a posi-
tive definite quadratic program (QP), we know the solution is Lipschitz
continuous on bounded sets in the term d̂(k), which appears linearly
in the objective function and the right-hand side of the equality con-
straint. Therefore, if we also restrict the initial disturbance estimate
error so that the target problem remains feasible for all time, we know
(xs (k), us (k)) → (xs∗ , u∗
s ) and the rate of convergence is exponential.

Distributed target problem. Consider next the cooperative approach,


in which we assume the input inequality constraints are uncoupled. In
the constrained case, we try to set things up so each player solves a
local target problem
" #′ " #" #
1 y1s − y1sp Q1s y1s − y1sp
min +
x1s ,u1s 2 y2s − y2sp Q2s y2s − y2sp
" #′ " #" #
1 u1s − u1sp R1s u1s − u1sp
2 u2s − u2sp R2s u2s − u2sp

subject to
    
I − A1 −B 11 −B 12 x1s Bd1 d̂1 (k)
 I − A2 −B 22     
 −B 21   x2s   Bd2 d̂2 (k) 
  = 
 H1 C1  u1s  r1sp − H1 Cd1 d̂1 (k)

H2 C2 u2s r2sp − H2 Cd2 d̂2 (k)
E1 u1s ≤ e1

in which

y1s = C1 x1s + Cd1 d̂1 (k) y2s = C2 x2s + Cd2 d̂2 (k) (6.27)

But here we run into several problems. First, the constraints to ensure
zero offset in both players’ controlled variables are not feasible with
only the u1s decision variables. We require also u2s , which is not avail-
able to player one. We can consider deleting the zero offset condition
for player two’s controlled variables, the last equality constraint. But
6.3 Constrained Two-Player Game 411

if we do that for both players, then the two players have different and
coupled equality constraints. That is a path to instability as we have
seen in the noncooperative target problem. To resolve this issue, we
move the controlled variables to the objective function, and player one
solves instead the following
" #′ " #" #
1 H1 y1s − r1sp T1s H1 y1s − r1sp
min
x1s ,u1s 2 H2 y2s − r2sp T2s H2 y2s − r2sp

subject to (6.27) and


 
" #  x1s  " #
I − A1 −B 11 −B 12  x2s  Bd1 d̂1 (k)
 =
I − A2 −B 21 −B 22 u1s  Bd2 d̂2 (k)
u2s
E1 u1s ≤ e1 (6.28)

The equality constraints for the two players appear coupled when writ-
ten in this form. Coupled constraints admit the potential for the op-
timization to become stuck on the boundary of the feasible region,
and not achieve the centralized target solution after iteration to con-
vergence. But Exercise 6.30 discusses how to show that the equality
constraints are, in fact, uncoupled. Also, the distributed target prob-
lem as expressed here may not have a unique solution when there are
more manipulated variables than controlled variables. In such cases,
a regularization term using the input setpoint can be added to the ob-
jective function. The controlled variable penalty can be converted to a
linear penalty with a large penalty weight to ensure exact satisfaction
of the controlled variable setpoint.
If the input inequality constraints are coupled, however, then the
distributed target problem may indeed become stuck on the boundary
of the feasible region and not eliminate offset in the controlled vari-
ables. If the input inequality constraints are coupled, we recommend
using the centralized approach to computing the steady-state target.
As discussed above, the centralized target problem eliminates offset in
the controlled variables as long as it remains feasible given the distur-
bance estimates.
Zero offset. Finally we establish the zero offset property. As de-
scribed in Chapter 1, the regulator is posed in deviation variables

x
e (k) = x̂(k) − xs (k) u
e (k) = u(k) − us (k) e = u − us (k)
u
412 Distributed Model Predictive Control

in which the notation u − us (k) means to subtract us (k) from each


element of the u sequence. Player one then solves

min V (x
e 1 (0), x
e 2 (0), u
e 1, u
e 2)
u
e1
" #+ " #" # " # " #
x
e1 A1 0 x
e1 B 11 B 12
s.t. = + u
e1 + u
e2
x
e2 0 A2 x
e2 B 21 B 22
e 1 ∈ U1 ⊖ us (k)
u
′ e
S1u x 1 (N) = 0
e 1 ≤ d1 x
u e 1 (0)

Notice that because the input constraint is shifted by the input tar-
get, we must retain feasibility of the regulation problem by restrict-
ing also the plant disturbance and its initial estimate error. If the two
players’ regulation problems remain feasible as the estimate error con-
verges to zero, we have exponential stability of the zero solution from
Lemma 6.14. Therefore we conclude

(x
e (k), u
e (k)) → (0, 0) Lemma 6.14
=⇒ (x̂(k), u(k)) → (xs (k), us (k)) definition of deviation variables
=⇒ (x̂(k), u(k)) → (xs∗ , u∗
s) target problem convergence
=⇒ x(k) → xs∗ estimator stability
=⇒ r (k) → rsp target equality constraint

and we have zero offset in the plant controlled variable r = Hy. The
rate of convergence of r (k) to rsp is also exponential. As we saw here,
this convergence depends on maintaining feasibility in both the target
problem and the regulation problem at all times.

6.4 Constrained M-Player Game

We have set up the constrained two-player game so that the approach


generalizes naturally to the M-player game. We do not have a lot of
work left to do to address this general case. Recall I1:M denotes the set
6.4 Constrained M-Player Game 413

of integers {1, 2, . . . , M}. We define the following systemwide variables


  
    
x1 (0) u1 B 1i B1i
 x (0)  u  B  B 
 2   2  2i   2i 
x(0) = 
 .. 
 u= 
 ..  Bi =  .. 
 B i =  . 
 .  i ∈ I1:M
 .   .   .   . 
xM (0) uM B Mi BMi
X
V (x(0), u) = ρj Vj (xj (0), u)
j∈I1:M

Each player solves a similar optimization, so for i ∈ I1:M

min V (x(0), u)
ui
X
+
s.t. x = Ax + Bj uj
j∈I1:M

u i ∈ Ui
u′
Sji xji (N) =0 j ∈ I1:M
X
|ui | ≤ di xji (0) if xji (0) ∈ r B, j ∈ I1:M
j∈I1:M

This optimization can be expressed as a quadratic program, whose con-


straints and linear cost term depend affinely on parameter x. The warm
start for each player at the next sample is generated from purely local
information
+
e i = (ui (1), ui (2), . . . , ui (N − 1), 0)
u i ∈ I1:M

The controller iteration is given by


X  
p p
up+1 = wj u1 , . . . , uj0 , . . . , uM
j∈I1:M
 
p
in which ui0 = ui0 x(0), uj∈I1:M ,j≠i . The plantwide cost function then
satisfies for any p ≥ 0
X
V (x + , u+ ) ≤ V (x, u) − ρj ℓj (xj , uj )
j∈I1:M

|u| ≤ d |x| x ∈ rB

For the M-player game, we generalize Assumption 6.13 of the two-


player game to the following.

Assumption 6.17 (Constrained M-player game).


414 Distributed Model Predictive Control

(a) The systems (Ai , B i ), i ∈ I1:M are stabilizable, in which Ai = diag(A1i ,


A2i , · · · , AMi ).

(b) The systems (Ai , Ci ), i ∈ I1:M are detectable.

(c) The input penalties Ri , i ∈ I1:M are positive definite, and Qi , i ∈ I1:M
are semidefinite.

(d) The systems (Ai , Qi ), i ∈ I1:M are detectable.

(e) The horizon is chosen sufficiently long to zero the unstable modes;
N ≥ max over i ∈ I1:M of (n_i^u), in which n_i^u is the number of unstable modes of Ai.

(f) Zero offset. For achieving zero offset, we augment the models with
integrating disturbances such that
" #
I − Ai −Bdi
rank = ni + p i i ∈ I1:M
Ci Cdi

Applying Theorem 6.5 then establishes exponential stability of the


solution x(k) = 0 for all k. The region of attraction is the set of states
for which the unstable modes of each subsystem can be brought to zero
in N moves, while satisfying the respective input constraints. These
conclusions apply regardless of how many iterations of the players’
optimizations are used in the control calculation. Although the closed-
loop system is exponentially stable for both coupled and uncoupled
constraints, the converged distributed controller is equal to the cen-
tralized controller only for the case of uncoupled constraints.
The exponential stability of the regulator implies that the states and
inputs of the constrained M-player system converge to the steady-state
target. The steady-state target can be calculated as a centralized or
distributed problem. We assume the centralized target has a feasible,
zero offset solution for the true plant disturbance. The initial state of
the plant and the estimate error must be small enough that feasibility
of the target is maintained under the nonzero estimate error.

6.5 Nonlinear Distributed MPC

In the nonlinear case, the usual model comes from physical principles
and conservation laws of mass, energy, and momentum. The state has
a physical meaning and the measured outputs usually are a subset of

the state. We assume the model is of the form

$$\frac{dx_1}{dt} = f_1(x_1, x_2, u_1, u_2) \qquad y_1 = C_1 x_1$$
$$\frac{dx_2}{dt} = f_2(x_1, x_2, u_1, u_2) \qquad y_2 = C_2 x_2$$

in which C1 , C2 are matrices of zeros and ones selecting the part of the
state that is measured in subsystems one and two. We generally cannot
avoid state x2 dependence in the differential equation for x1 . But often
only a small subset of the entire state x2 appears in f1 , and vice versa.
The reason in chemical process systems is that the two subsystems are
generally coupled through a small set of process streams transferring
mass and energy between the systems. These connecting streams iso-
late the coupling between the two systems and reduce the influence to
a small part of the entire state required to describe each system.
Given these physical system models of the subsystems, the overall
plant model is
$$\frac{dx}{dt} = f(x, u) \qquad y = Cx$$

with

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \qquad
u = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \qquad
f = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} \qquad
y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \qquad
C = \operatorname{diag}(C_1, C_2)$$
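For concreteness, the following sketch shows one way to stack two subsystem models into the plantwide model for simulation. The particular right-hand sides f1 and f2 below are illustrative placeholders with weak coupling and scalar states, not a model from the text.

% Assemble the plantwide model dx/dt = f(x,u), y = Cx from two subsystem
% models.  f1 and f2 are illustrative placeholders (scalar states).
f1 = @(x1, x2, u1, u2) -x1 + 0.1*x2 + u1;   % x2 enters only weakly
f2 = @(x1, x2, u1, u2) -x2 + 0.1*x1 + u2;
C1 = 1; C2 = 1;

f = @(x, u) [f1(x(1), x(2), u(1), u(2)); f2(x(1), x(2), u(1), u(2))];
C = blkdiag(C1, C2);

[t, x] = ode45(@(t, x) f(x, [0; 0]), [0 10], [1; -1]);   % open-loop response
y = x*C';                                                % measured outputs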

6.5.1 Nonconvexity

The basic difficulty in both the theory and application of nonlinear MPC
is the nonconvexity in the control objective function caused by the non-
linear dynamic model. This difficulty applies even to centralized non-
linear MPC as discussed in Section 2.7, and motivates the development
of suboptimal MPC. In the distributed case, nonconvexity causes extra
difficulties. As an illustration, consider the simple two-player, noncon-
vex game depicted in Figure 6.7. The cost function is

$$V(u_1, u_2) = e^{-2u_1} - 2e^{-u_1} + e^{-2u_2} - 2e^{-u_2} + a \exp\big({-\beta}\big((u_1 + 0.2)^2 + (u_2 + 0.2)^2\big)\big)$$

in which $a = 1.1$ and $\beta = 0.4$. Each player optimizes the cooperative
objective starting at ➀ and produces the points $(u_1^0, u_2^p)$, denoted ➁,


Figure 6.7: Cost contours for a two-player, nonconvex game; cost increases for the convex combination of the two players' optimal points. (Axes: u1 from 0 to 5, u2 from 0 to 2.)

and $(u_1^p, u_2^0)$, denoted ➂. Consider taking a convex combination of the
two players' optimal points for the next iterate

$$(u_1^{p+1}, u_2^{p+1}) = w_1 (u_1^0, u_2^p) + w_2 (u_1^p, u_2^0) \qquad w_1 + w_2 = 1, \quad w_1, w_2 \geq 0$$

We see in Figure 6.7 that this iterate causes the objective function to
increase rather than decrease for most values of w1 , w2 . For w1 = w2 =
1/2, we see clearly from the contours that V at point ➃ is greater than
V at point ➀.
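A quick numerical check of this effect is sketched below. The starting point and the grid search used for the players' individual optimizations are illustrative choices (the starting point is not the point ➀ marked in the figure); for this start, the half-and-half combination of the two individually optimal points increases the cost.

% Check the cost increase for a convex combination of the players'
% individually optimal points (illustrative starting point).
a = 1.1; beta = 0.4;
V = @(u1, u2) exp(-2*u1) - 2*exp(-u1) + exp(-2*u2) - 2*exp(-u2) ...
    + a*exp(-beta*((u1 + 0.2).^2 + (u2 + 0.2).^2));

u = [0.1; 0.1];                    % starting point (u1, u2)
ugrid = linspace(0, 5, 5001);      % brute-force search for each player

[~, i1] = min(V(ugrid, u(2)));  u1opt = ugrid(i1);   % player one's optimum
[~, i2] = min(V(u(1), ugrid));  u2opt = ugrid(i2);   % player two's optimum

w1 = 0.5; w2 = 0.5;                % convex combination of the two proposals
unew = w1*[u1opt; u(2)] + w2*[u(1); u2opt];

fprintf('V at start:       %8.4f\n', V(u(1), u(2)));
fprintf('V at combination: %8.4f\n', V(unew(1), unew(2)));
% Here the combined step increases V, which is why this simple update
% cannot be used directly for nonconvex problems.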
The possibility of a cost increase leads to the possibility of closed-
loop instability and precludes developing even a nominal control theory
for this simple approach, which was adequate for the convex, linear
plant case.4 In the centralized MPC problem, this nonconvexity issue
can be addressed in the optimizer, which can move both inputs simul-
taneously and always avoid a cost increase. One can of course consider
4 This point marked the state of affairs at the time of publication of the first edi-

tion of this text. The remaining sections summarize one approach that addresses the
nonconvexity problem (Stewart, Wright, and Rawlings, 2011).

adding another player to the game who has access to more systemwide
information. This player takes the optimization results of the indi-
vidual players and determines a search direction and step length that
achieve a cost decrease for the overall system. This player is often
known as a coordinator. The main drawback of this approach is that
the design of the coordinator may not be significantly simpler than the
design of the centralized controller.
Rather than design a coordinator, we instead let each player evaluate
the effect of taking a combination of all the players’ optimal moves. The
players can then easily find an effective combination that leads to a cost
decrease. We describe one such algorithm in the next section, which
we call the distributed gradient algorithm.

6.5.2 Distributed Algorithm for Nonconvex Functions

We consider the problem

$$\min_{u}\ V(u) \quad \text{s.t.} \quad u \in \mathcal{U} \qquad (6.29)$$

in which u ∈ Rm and V : Rm → R≥0 is twice continuously differentiable


and not necessarily convex. We assume U is closed, convex, and can
be expressed as U = U1 × · · · × UM with Ui ⊂ Rmi for all i ∈ I1:M. We
solve approximately the following subproblems at iterate p ≥ 0 for all
i ∈ I1:M
$$\min_{u_i \in \mathcal{U}_i}\ V(u_i, u_{-i}^p)$$

in which $u_{-i}^p = (u_1^p, \ldots, u_{i-1}^p, u_{i+1}^p, \ldots, u_M^p)$. Let $\overline{u}_i^p$ denote the approximate
solution to these optimizations. We compute the approximate
solutions via the standard technique of line search with gradient pro-
jection. At iterate $p \geq 0$

$$\overline{u}_i^p = P_i\big(u_i^p - \nabla_i V(u^p)\big) \qquad (6.30)$$

in which $\nabla_i V(u^p)$ is the $i$th component of $\nabla V(u^p)$ and $P_i(\cdot)$ denotes
projection onto the set $\mathcal{U}_i$. Define the step $v_i^p = \overline{u}_i^p - u_i^p$. The step-
size $\alpha_i^p$ is chosen as follows; each suboptimizer initializes the stepsize
with $\alpha_i$, and then uses backtracking until $\alpha_i^p$ satisfies the Armijo rule
(Bertsekas, 1999, p.230)

$$V(u^p) - V(u_i^p + \alpha_i^p v_i^p,\, u_{-i}^p) \geq -\sigma\, \alpha_i^p\, \nabla_i V(u^p)'\, v_i^p \qquad (6.31)$$

in which σ ∈ (0, 1). After all suboptimizers finish backtracking, they


exchange proposed steps. Each suboptimizer forms a candidate step
$$u_i^{p+1} = u_i^p + w_i\, \alpha_i^p\, v_i^p \qquad \forall i \in I_{1:M} \qquad (6.32)$$

and checks the following inequality

$$V(u^{p+1}) \leq \sum_{i \in I_{1:M}} w_i\, V(u_i^p + \alpha_i^p v_i^p,\, u_{-i}^p) \qquad (6.33)$$

with $\sum_{i \in I_{1:M}} w_i = 1$ and $w_i > 0$ for all $i \in I_{1:M}$. If condition (6.33) is
not satisfied, then we remove the direction with the least cost improve-
ment, $i_{\max} = \arg\max_i \{V(u_i^p + \alpha_i^p v_i^p,\, u_{-i}^p)\}$, by setting $w_{i_{\max}}$ to zero and
repartitioning the remaining wi so that they sum to one. The candidate
step (6.32) is recalculated and condition (6.33) is checked again. This
process is repeated until (6.33) is satisfied. It may happen that con-
dition (6.33) is satisfied with only a single direction. The distributed
algorithm thus eliminates poor suboptimizer steps and ensures that
the objective function decreases at each iterate, even for nonconvex
objective functions. The proposed algorithm has the following proper-
ties.

Lemma 6.18 (Distributed gradient algorithm properties). The distributed


gradient projection algorithm has the following properties.
(a) (Feasibility.) Given a feasible initial condition, the iterates up are
feasible for all p ≥ 0.

(b) (Objective decrease.) The objective function decreases at every iter-


ate: V (up+1 ) ≤ V (up ).

(c) (Convergence.) Every accumulation point of the sequence (up )p≥0 is


a stationary point.

The proof of Lemma 6.18 is given in Stewart et al. (2011). Note


that the test of inequality (6.33) does not require a coordinator. At
each iteration the subsystems exchange the solutions of the gradient
projection. Because each subsystem has access to the plantwide model,
they can evaluate the objective function, and the algorithm can be run
independently on each controller. This computation is likely a smaller
overhead than a coordinating optimization.
Figure 6.8 shows the results of applying the proposed distributed
gradient algorithm to the previous example. The problem has two
global minima located at (0.007, 2.28) and (2.28, 0.007), and a local

Figure 6.8: Nonconvex function optimized with the distributed gradient algorithm. Iterations converge to local minima from all starting points. (Axes: u1 from 0 to 5, u2 from 0 to 2.)

minimum at (0.23, 0.23). The inputs are constrained: 0.1 ≤ ui ≤ 4 for


i ∈ I1:2 . The algorithm is initialized at three different starting points
(0.5, 0.5), (3.9, 3.6), and (2.99, 3). From Figure 6.8 we see that each of
these starting points converges to a different local minimum.
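To make the steps above concrete, the following Octave/MATLAB sketch applies iterations (6.30)-(6.33) to this two-variable example. The Armijo parameter, initial stepsize, weights, and iteration count are illustrative choices, not values from the text; for two players, if (6.33) fails, dropping the direction with the smaller improvement and keeping the other single-player step always satisfies it.

% Distributed gradient projection sketch for the example of Figure 6.8.
a = 1.1; beta = 0.4;
V  = @(u) exp(-2*u(1)) - 2*exp(-u(1)) + exp(-2*u(2)) - 2*exp(-u(2)) ...
        + a*exp(-beta*((u(1) + 0.2)^2 + (u(2) + 0.2)^2));
gV = @(u) [-2*exp(-2*u(1)) + 2*exp(-u(1)); ...
           -2*exp(-2*u(2)) + 2*exp(-u(2))] ...
        - 2*a*beta*exp(-beta*((u(1) + 0.2)^2 + (u(2) + 0.2)^2))*[u(1) + 0.2; u(2) + 0.2];

lb = 0.1; ub = 4;                    % input constraints 0.1 <= ui <= 4
proj  = @(t) min(max(t, lb), ub);    % projection onto [lb, ub]
sigma = 0.1; w = [0.5; 0.5];         % Armijo parameter and weights (illustrative)

u = [0.5; 0.5];                      % one of the starting points in the text
for p = 1:50
    g    = gV(u);
    ubar = proj(u - g);              % gradient projection step (6.30)
    v    = ubar - u;
    alpha = ones(2, 1); Vcand = zeros(2, 1);
    for i = 1:2                      % per-player backtracking to satisfy (6.31)
        for k = 1:30
            ui = u; ui(i) = u(i) + alpha(i)*v(i);
            Vcand(i) = V(ui);
            if V(u) - Vcand(i) >= -sigma*alpha(i)*g(i)*v(i), break; end
            alpha(i) = alpha(i)/2;
        end
    end
    wi = w;
    unew = u + wi.*alpha.*v;         % candidate step (6.32)
    if V(unew) > wi'*Vcand           % if (6.33) fails, keep only the better
        [~, ibest] = min(Vcand);     % of the two single-player steps
        wi = zeros(2, 1); wi(ibest) = 1;
        unew = u + wi.*alpha.*v;
    end
    u = unew;
end
disp(u')   % approaches a stationary point; cf. the minima in Figure 6.8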

6.5.3 Distributed Nonlinear Cooperative Control

Next we design a controller based on the distributed optimization al-


gorithm. For simplicity of presentation, we assume that the plant con-
sists of two subsystems. We consider the standard MPC cost function
for each system i ∈ I1:2

$$V_i\big(x(0), \mathbf{u}_1, \mathbf{u}_2\big) = \sum_{k=0}^{N-1} \ell_i\big(x_i(k), u_i(k)\big) + V_{if}\big(x(N)\big)$$

with ℓi(xi, ui) denoting the stage cost, Vif(x) the terminal cost of sys-
tem i, and xi(k) = φi(k; x(0), u1, u2). Because xi is a function of both u1

and u2 , Vi is a function of both u1 and u2 . As in the case for linear


plants, we define the plantwide objective
  
$$V\big(x_1(0), x_2(0), \mathbf{u}_1, \mathbf{u}_2\big) = \rho_1 V_1\big(x(0), \mathbf{u}_1, \mathbf{u}_2\big) + \rho_2 V_2\big(x(0), \mathbf{u}_1, \mathbf{u}_2\big)$$

in which ρ1 , ρ2 > 0 are weighting factors. To simplify notation we


use V (x, u) to denote the plantwide objective. Similarly we define
the system stage cost and terminal cost as the combined stage costs
ℓ(x, u) := ρ1 ℓ1 (x1 , u1 ) + ρ2 ℓ2 (x2 , u2 ), and terminal costs Vf (x) :=
ρ1 V1f (x) + ρ2 V2f (x). Each subsystem has constraints of the form

u1 (k) ∈ U1 u2 (k) ∈ U2 k ∈ I0:N−1

in which each Ui ⊂ Rmi is compact, convex, and contains the origin.


Finally, we define the terminal region Xf to be a sublevel set of Vf .

Xf = {x | Vf (x) ≤ a}

for some a > 0.


We next modify the standard stability assumption to account for
the distributed nature of the problem.
Assumption 6.19 (Basic stability assumption (distributed)). Vf (·), Xf ,
and ℓ(·) have the following properties.
(a) For all x ∈ Xf , there exists (u1 , u2 ) (such that (x, u1 , u2 ) ∈ Rn ×
U1 × U2 ) satisfying

f (x, u1 , u2 ) ∈ Xf
Vf (f (x, u1 , u2 )) − Vf (x) ≤ −ℓ(x, u1 , u2 )

(b) For each i ∈ I1:2 , there exist K∞ functions αi (·), and αf (·) satisfy-
ing

ℓi (xi , ui ) ≥ αi (|xi |) ∀(xi , ui ) ∈ XN × Ui


Vf (x) ≤ αf (|x|) ∀x ∈ Xf

This assumption implies that there exist local controllers κif : Xf →


Ui for all i ∈ I1:2 such that for all x ∈ Xf
 
$$V_f\big(f(x, \kappa_{1f}(x), \kappa_{2f}(x))\big) - V_f(x) \leq -\ell\big(x, \kappa_{1f}(x), \kappa_{2f}(x)\big) \qquad (6.34)$$

with f (x, κ1f (x), κ2f (x)) ∈ Xf . Each terminal controller κif (·) may be
found offline.

Removing the terminal constraint in suboptimal MPC. To show sta-


bility, we require that φ(N; x, u) ∈ Xf . But the terminal constraint on
the state shows up as a coupled input constraint in each subsystem’s
optimization problem. As we have already discussed, coupled input
constraints may prevent the distributed algorithm from converging to
the optimal plantwide control (Stewart, Venkat, Rawlings, Wright, and
Pannocchia, 2010). The terminal constraint can be removed from the
control problem by modifying the terminal penalty, however, as we
demonstrate next.
For some β ≥ 1, we define the objective function

$$V^\beta(x, \mathbf{u}) = \sum_{k=0}^{N-1} \ell\big(x(k), u(k)\big) + \beta V_f\big(x(N)\big) \qquad (6.35)$$

and the set of admissible initial (x, u) as

Z0 = {(x, u) ∈ X × UN | V β (x, u) ≤ V , φ(N; x, u) ∈ Xf } (6.36)

in which V > 0 is an arbitrary constant and X = Rn . The set of initial


states X0 is the projection of Z0 onto X

X0 = {x ∈ X | ∃u such that (x, u) ∈ Z0 }

We have the following result.

Proposition 6.20 (Terminal constraint satisfaction). Let {(x(k), u(k)) |
k ∈ I≥0} denote the set of states and control sequences generated by
the suboptimal system. There exists a β̄ > 1 such that for all β ≥ β̄, if
(x(0), u(0)) ∈ Z0, then (x(k), u(k)) ∈ Z0 with φ(N; x(k), u(k)) ∈ Xf
for all k ∈ I≥0.

The proof of this proposition is given in Stewart et al. (2011). We


are now ready to define the cooperative control algorithm for nonlinear
systems.

Cooperative control algorithm. Let x(0) be the initial state and ũ ∈ U
be the initial feasible input sequence for the cooperative MPC algorithm
such that φ(N; x(0), ũ) ∈ Xf. At each iterate p, an approximate solu-

tion of the following optimization problem is computed



$$\min_{\mathbf{u}}\ V\big(x_1(0), x_2(0), \mathbf{u}_1, \mathbf{u}_2\big)$$
$$\text{s.t.}\quad x_1^+ = f_1(x_1, x_2, u_1, u_2)$$
$$\phantom{\text{s.t.}\quad} x_2^+ = f_2(x_1, x_2, u_1, u_2)$$
$$\mathbf{u}_i \in \mathcal{U}_i^N \quad \forall i \in I_{1:2}$$
$$|\mathbf{u}_i| \leq \delta_i\big(|x_i(0)|\big) \ \text{ if } x(0) \in \mathcal{B}_r \quad \forall i \in I_{1:2} \qquad (6.37)$$

in which δi (·) ∈ K∞ and r > 0 can be chosen as small as desired.


We can express (6.37) in the form of (6.29) by eliminating the model
equality constraints. To implement distributed control, we simply use
the distributed gradient algorithm to solve (6.37).
Denote the solution returned by the algorithm as $\mathbf{u}^{\bar{p}}(x, \tilde{\mathbf{u}})$. The
first element of the sequence, denoted $\kappa^{\bar{p}}(x(0)) = u^{\bar{p}}(0; x(0), \tilde{\mathbf{u}})$, is
injected into the plant. To reinitialize the algorithm at the next sample
time, we compute the warm start

$$\tilde{\mathbf{u}}_1^+ = \big\{u_1(1), u_1(2), \ldots, u_1(N-1), \kappa_{1f}(x(N))\big\}$$
$$\tilde{\mathbf{u}}_2^+ = \big\{u_2(1), u_2(2), \ldots, u_2(N-1), \kappa_{2f}(x(N))\big\}$$

in which x(N) = φ(N; x(0), u1 , u2 ). We expect that it is not possible


to solve (6.37) to optimality in the available sample time, and the dis-
tributed controller is therefore a form of suboptimal MPC. The proper-
ties of the closed-loop system are therefore analyzed using suboptimal
MPC theory.

6.5.4 Stability of Distributed Nonlinear Cooperative Control

We first show that the plantwide objective function decreases between


sampling times. Let (x, u) be the state and input sequence at some
time. Using the warm start as the initial condition at the next sample
time, we have
$$\begin{aligned}
V(x^+, \tilde{\mathbf{u}}^+) = V(x, \mathbf{u}) &- \rho_1 \ell_1(x_1, u_1) - \rho_2 \ell_2(x_2, u_2) \\
&- \rho_1 V_{1f}(x(N)) - \rho_2 V_{2f}(x(N)) \\
&+ \rho_1 \ell_1\big(x_1(N), \kappa_{1f}(x(N))\big) + \rho_2 \ell_2\big(x_2(N), \kappa_{2f}(x(N))\big) \\
&+ \rho_1 V_{1f}\big(f_1(x_1(N), x_2(N), \kappa_{1f}(x(N)), \kappa_{2f}(x(N)))\big) \\
&+ \rho_2 V_{2f}\big(f_2(x_1(N), x_2(N), \kappa_{1f}(x(N)), \kappa_{2f}(x(N)))\big)
\end{aligned}$$

From (6.34) of the stability assumption, we have that


$$V(x^+, \tilde{\mathbf{u}}^+) \leq V(x, \mathbf{u}) - \rho_1 \ell_1(x_1, u_1) - \rho_2 \ell_2(x_2, u_2)$$

By Lemma 6.18(b), the objective function cost only decreases from this
warm start, so that

V (x + , u+ ) ≤ V (x, u) − ρ1 ℓ1 (x1 , u1 ) − ρ2 ℓ2 (x2 , u2 )

and we have the required cost decrease of a Lyapunov function

V (x + , u+ ) − V (x, u) ≤ −α(|(x, u)|) (6.38)

in which α(|(x, u)|) = ρ1 α1 (|(x1 , u1 )|) + ρ2 α2 (|(x2 , u2 )|).


We can now state the main result. Let XN be the admissible set of
initial states for which the control optimization (6.37) is feasible.

Theorem 6.21 (Asymptotic stability). Let Assumptions 2.2, 2.3, and 6.19
hold, and let V (·) ← V β (·) from Proposition 6.20. Then for every x(0) ∈
XN , the origin is asymptotically stable for the closed-loop system x + =
f (x, κ p̄ (x)).

The proof follows, with minor modification, the proof that subopti-
mal MPC is asymptotically stable in Theorem 2.48. As in the previous
sections, the controller has been presented for the case of two subsys-
tems, but can be extended to any finite number of subsystems.
We conclude the discussion of nonlinear distributed MPC by revis-
iting the unstable nonlinear example system presented in Stewart et al.
(2011).

Example 6.22: Nonlinear distributed control


We consider the unstable nonlinear system

$$x_1^+ = x_1^2 + x_2 + u_1^3 + u_2$$
$$x_2^+ = x_1 + x_2^2 + u_1 + u_2^3$$

with initial condition (x1 , x2 ) = (3, −3). The control objective is to


stabilize the system and regulate the states to the origin. We use a
standard quadratic stage cost
$$\ell_1(x_1, u_1) = \frac{1}{2}\big(x_1' Q_1 x_1 + u_1' R_1 u_1\big) \qquad
\ell_2(x_2, u_2) = \frac{1}{2}\big(x_2' Q_2 x_2 + u_2' R_2 u_2\big)$$

Figure 6.9: Closed-loop state and control evolution with (x1(0), x2(0)) = (3, −3). Setting p = 10 approximates the centralized controller. (Left panel: p = 3; right panel: p = 10; states x1, x2 and inputs u1, u2 plotted versus time.)

with Q1 , Q2 > 0 and R1 , R2 > 0. This stage cost gives the objective
function
$$V(x, \mathbf{u}) = \frac{1}{2}\sum_{k=0}^{N-1} \big(x(k)'Qx(k) + u(k)'Ru(k)\big) + V_f\big(x(N)\big)$$

in which Q = diag(Q1 , Q2 ), R = diag(R1 , R2 ). The terminal penalty


is defined in the standard way for centralized MPC; we linearize the
system at the steady state, and design an LQ controller for the lin-
earized system. The terminal region is then a sublevel set of the termi-
nal penalty chosen small enough to satisfy the input constraints. We
use the following parameter values in the simulation study

Q=I R=I N=2 p=3 Ui = [−2.5, 2.5] ∀i ∈ I1:2
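The terminal penalty construction just described can be carried out directly from the linearization. The sketch below shows the calculation; dlqr requires the control toolbox in MATLAB or the control package in Octave, and taking the terminal penalty as a quadratic in the Riccati solution P is the standard construction assumed here.

% Linearize the model of Example 6.22 at the origin and compute the LQ
% terminal controller and penalty (requires dlqr).
A = [0 1; 1 0];              % Jacobian of the dynamics wrt x at (x,u) = (0,0)
B = [0 1; 1 0];              % Jacobian of the dynamics wrt u at (x,u) = (0,0)
Q = eye(2); R = eye(2);
[K, P] = dlqr(A, B, Q, R);   % terminal controller u = -K*x, penalty from P
% The terminal region is a sublevel set of the quadratic in P chosen small
% enough that u = -K*x satisfies the input constraints.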

Figure 6.9 shows that the controller is stabilizing for as few as p = 3


iterations. Increasing the maximum number of iterations can signifi-

Figure 6.10: Contours of V(x(0), u1, u2) with N = 1 at k = 0, (x1(0), x2(0)) = (3, −3). Iterations of the subsystem controllers with initial condition $(u_1^0, u_2^0) = (0, 0)$. (Axes: u1 from −3 to 2, u2 from −3 to 0.5.)

cantly improve the performance. Figure 6.9 shows the performance


improvement for p = 10, which is close to the centralized MPC per-
formance. To see the difficulty in optimizing the nonconvex objec-
tive function, iterations of the initial control optimization are shown
in Figure 6.10 for the N = 1 case. Clearly the distributed optimization
method is able to efficiently handle this nonconvex objective with only
a few iterations. □

6.6 Notes
At least three different fields have contributed substantially to the ma-
terial presented in this chapter. We attempt here to point out briefly
what each field has contributed, and indicate what literature the inter-
ested reader may wish to consult for further pursuing this and related
subjects.

Game theory. Game theory emerged in the mid-1900s to analyze situ-


ations in which multiple players follow a common set of rules but have
their own and different objectives that they try to optimize in com-
petition with each other. Von Neumann and Morgenstern introduced
the classic text on this subject, “Theory of Games and Economic Behav-
ior," in 1944. A principal aim of game theory since its inception was to
model and understand human economic behavior, especially as it arises
in a capitalistic, free-market system. For that reason, much of the sub-
sequent game theory literature was published in economics journals
rather than systems theory journals. This field has contributed richly
to the ideas and vocabulary used in this chapter to describe distributed
control. For example, the game in which players have different objec-
tives is termed noncooperative. The equilibrium of a noncooperative
game is known as a Nash equilibrium (Nash, 1951). The Nash equilib-
rium is usually not Pareto optimal, which means that the outcomes for
all players can be improved simultaneously from the Nash solution. A
comprehensive overview of the game theory literature, especially the
parts relevant to control theory, is provided by Başar and Olsder (1999,
Chapter 1), which is a highly recommended reference. Analyzing the
equilibria of a noncooperative game is usually more complex than the
cooperative game (optimal control problem). The closed-loop proper-
ties of a receding horizon implementation of any of these game theory
solutions is not addressed in game theory. That topic is addressed by
control theory.

Distributed optimization. The optimization community has exten-


sively studied the issue of solving large-scale optimization problems
using distributed optimization methods. The primary motivation in
this field is to exploit parallel computing hardware and distributed
data communication networks to solve large optimization problems
faster. Bertsekas and Tsitsiklis provide an excellent and comprehen-
sive overview of this field, focusing on numerical algorithms for imple-
menting the distributed approaches. The important questions that are
addressed in designing a distributed optimization are: task allocation,
communication, and synchronization (Bertsekas and Tsitsiklis, 1997,
Chapter 1).
These basic concepts arise in distributed problems of all types, and
therefore also in the distributed MPC problem, which provides good
synergy between these fields. But one should also note the structural
distinctions between distributed optimization and distributed MPC. The
primary obstacle to implementing centralized MPC for large-scale plants

is not computational but organizational. The agents considered in dis-


tributed MPC are usually existing MPC systems already built for units
or subsystems within an existing large-scale process. The plant man-
agement often is seeking to improve the plant performance by bet-
ter coordinating the behavior of the different agents already in opera-
tion. Ignoring these structural constraints and treating the distributed
MPC problem purely as a form of distributed optimization, ignores as-
pects of the design that are critical for successful industrial applica-
tion (Rawlings and Stewart, 2008).
Control theory. Researchers have long studied the issue of how to dis-
tribute control tasks in a complex large-scale plant (Mesarović, Macko,
and Takahara, 1970; Sandell Jr., Varaiya, Athans, and Safonov, 1978).
The centralized controller and decentralized controller define two lim-
iting design extremes. Centralized control accounts for all possible
interactions, large and small, whereas decentralized control ignores
them completely. In decentralized control the local agents have no
knowledge of each others’ actions. It is well known that the nominal
closed-loop system behavior under decentralized control can be arbi-
trarily poor (unstable) if the system interactions are not small. The
following reviews provide general discussion of this and other perfor-
mance issues involving decentralized control (Šiljak, 1991; Lunze, 1992;
Larsson and Skogestad, 2000; Cui and Jacobsen, 2002).
The next level up in design complexity from decentralized control is
noncooperative control. In this framework, the agents have interaction
models and communicate at each iteration (Jia and Krogh, 2002; Motee
and Sayyar-Rodsari, 2003; Dunbar and Murray, 2006). The advantage
of noncooperative control over decentralized control is that the agents
have accurate knowledge of the effects of all other agents on their local
objectives. The basic issue to analyze and understand in this setup is
the competition between the agents. Characterizing the noncoopera-
tive equilibrium is the subject of noncooperative game theory, and the
impact of using that solution for feedback control is the subject of con-
trol theory. For example, Dunbar (2007) shows closed-loop stability for
an extension of noncooperative MPC described in (Dunbar and Murray,
2006) that handles systems with interacting subsystem dynamics. The
key assumptions are the existence of a stabilizing decentralized feed-
back law valid near the origin, and an inequality condition limiting the
coupling between the agents.
Cooperative MPC was introduced by Venkat, Rawlings, and Wright
(2007). They show that a receding horizon implementation of a coop-

erative game with any number of iterates of the local MPC controllers
leads to closed-loop stability for linear dynamics. Venkat, Rawlings,
and Wright (2006a,b) show that state estimation errors (output instead
of state feedback) do not change the system closed-loop stability if the
estimators are also asymptotically stable. Most of the theoretical re-
sults on cooperative MPC of linear systems given in this chapter are
presented in Venkat (2006) using an earlier, different notation. If im-
plementable, this form of distributed MPC clearly has the best control
properties. Although one can easily modify the agents’ objective func-
tions in a single large-scale process owned by a single company, this
kind of modification may not be possible in other situations in which
competing interests share critical infrastructure.
The requirements of the many different classes of applications con-
tinue to create exciting opportunities for continued research in this
field. An excellent recent review provides a useful taxonomy of the dif-
ferent features of the different approaches (Scattolini, 2009). A recent
text compiles no less than 35 different approaches to distributed MPC
from more than 80 contributors (Maestre and Negenborn, 2014). The
growth in the number and diversity of applications of distributed MPC
shows no sign of abating.

6.7 Exercises

Exercise 6.1: Three looks at solving the LQ problem (LQP)


In the following exercise, you will write three codes to solve the LQR using Octave or
MATLAB. The objective function is the LQR with mixed term
$$V = \frac{1}{2}\sum_{k=0}^{N-1} \big(x(k)'Qx(k) + u(k)'Ru(k) + 2x(k)'Mu(k)\big) + (1/2)\,x(N)'P_f\,x(N)$$

First, implement the method described in Section 6.1.1 in which you eliminate the
state and solve the problem for the decision variable

u = (u(0), u(1), . . . , u(N − 1))

Second, implement the method described in Section 6.1.1 in which you do not elim-
inate the state and solve the problem for

z = (u(0), x(1), u(1), x(2), . . . , u(N − 1), x(N))

Third, use backward dynamic programming (DP) and the Riccati iteration to com-
pute the closed-form solution for u(k) and x(k).

(a) Let
" # " # " #
4/3 −2/3 1 h i 1
A= B= C = −2/3 1 x(0) =
1 0 0 1

Q = C ′ C + 0.001I Pf = Π R = 0.001 M =0
in which the terminal penalty, Pf is set equal to Π, the steady-state cost to go.
Compare the three solutions for N = 5. Plot x(k), u(k) versus time for the
closed-loop system.

(b) Let N = 50 and repeat. Do any of the methods experience numerical problems
generating an accurate solution? Plot the condition number of the matrix that
is inverted in the first two methods versus N.

(c) Now consider the following unstable system


     
$$A = \begin{bmatrix} 27.8 & -82.6 & 34.6 \\ 25.6 & -76.8 & 32.4 \\ 40.6 & -122.0 & 51.9 \end{bmatrix} \qquad
B = \begin{bmatrix} 0.527 & 0.548 \\ 0.613 & 0.530 \\ 1.06 & 0.828 \end{bmatrix} \qquad
x(0) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

Consider regulator tuning parameters and constraints

Q=I Pf = Π R=I M =0

Repeat parts (a) and (b) for this system. Do you lose accuracy in any of the
solution methods? What happens to the condition number of H(N) and S(N)
as N becomes large? Which methods are still accurate for this case? Can you
explain what happened?

Exercise 6.2: LQ as least squares


Consider the standard LQP
$$\min_{\mathbf{u}}\ V = \frac{1}{2}\sum_{k=0}^{N-1} \big(x(k)'Qx(k) + u(k)'Ru(k)\big) + (1/2)\,x(N)'P_f\,x(N)$$

subject to
x + = Ax + Bu
(a) Set up the dense Hessian least squares problem for the LQP with a horizon of
three, N = 3. Eliminate the state equations and write out the objective function
in terms of only the decision variables u(0), u(1), u(2).

(b) What are the conditions for an optimum, i.e., what linear algebra problem do
you solve to compute u(0), u(1), u(2)?

Exercise 6.3: Lagrange multiplier method


Consider the general least squares problem
$$\min_{x}\ V(x) = \frac{1}{2}x'Hx + \text{const}$$
subject to
Dx = d
(a) What is the Lagrangian L for this problem? What is the dimension of the Lagrange
multiplier vector, λ?

(b) What are necessary and sufficient conditions for a solution to the optimization
problem?

(c) Apply this approach to the LQP of Exercise 6.2 using the equality constraints to
represent the model equations. What are H, D, d for the LQP?

(d) Write out the linear algebra problem to be solved for the optimum.

(e) Contrast the two different linear algebra problems in these two approaches.
Which do you want to use when N is large and why?

Exercise 6.4: Reparameterizing an unstable system


Consider again the LQR problem with cross term
$$\min_{\mathbf{u}}\ V = \frac{1}{2}\sum_{k=0}^{N-1} \big(x(k)'Qx(k) + u(k)'Ru(k) + 2x(k)'Mu(k)\big) + (1/2)\,x(N)'P_f\,x(N)$$

subject to
x + = Ax + Bu
and the three approaches of Exercise 6.1.
1. The method described in Section 6.1.1 in which you eliminate the state and solve
the problem for the decision variable

u = (u(0), u(1), . . . , u(N − 1))



2. The method described in Section 6.1.1 in which you do not eliminate the state
and solve the problem for
z = (u(0), x(1), u(1), x(2), . . . , u(N − 1), x(N))
3. The method of DP and the Riccati iteration to compute the closed-form solution
for u(k) and x(k).
(a) You found that unstable A causes numerical problems in the first method using
large horizons. So let’s consider a fourth method. Reparameterize the input in
terms of a state feedback gain via
u(k) = Kx(k) + v(k)
in which K is chosen so that A + BK is a stable matrix. Consider the matrices in
a transformed LQP
$$\min_{\mathbf{v}}\ V = \frac{1}{2}\sum_{k=0}^{N-1} \big(x(k)'\tilde{Q}x(k) + v(k)'\tilde{R}v(k) + 2x(k)'\tilde{M}v(k)\big) + (1/2)\,x(N)'\tilde{P}_f\,x(N)$$

subject to $x^+ = \tilde{A}x + \tilde{B}v$.

What are the matrices $\tilde{A}, \tilde{B}, \tilde{P}_f, \tilde{Q}, \tilde{R}, \tilde{M}$ such that the two problems give the same
solution (state trajectory)?

(b) Solve the following problem using the first method and the fourth method and
describe differences between the two solutions. Compare your results to the DP
approach. Plot x(k) and u(k) versus k.

$$A = \begin{bmatrix} 27.8 & -82.6 & 34.6 \\ 25.6 & -76.8 & 32.4 \\ 40.6 & -122.0 & 51.9 \end{bmatrix} \qquad
B = \begin{bmatrix} 0.527 & 0.548 \\ 0.613 & 0.530 \\ 1.06 & 0.828 \end{bmatrix} \qquad
x(0) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$
Consider regulator tuning parameters and constraints
Q = Pf = I R=I M =0 N = 50

Exercise 6.5: Recursively summing quadratic functions


Consider generalizing Example 1.1 to an N-term sum. Let the N-term sum of quadratic
functions be defined as
$$V(N, x) = \frac{1}{2}\sum_{i=1}^{N} (x - x(i))'\,X_i\,(x - x(i))$$
in which x, x(i) ∈ Rn are real n-vectors and Xi ∈ Rn×n are positive definite matrices.
(a) Show that V (N, x) can be found recursively
V (N, x) = (1/2)(x − v(N))′ H(N)(x − v(N)) + constant
in which v(i) and H(i) satisfy the recursion

$$H(i+1) = H(i) + X_{i+1} \qquad v(i+1) = H^{-1}(i+1)\big(H(i)v(i) + X_{i+1}x(i+1)\big)$$
$$H(1) = X_1 \qquad v(1) = x(1)$$

Notice the recursively defined v(m) and H(m) provide the solutions and the
Hessian matrices of the sequence of optimization problems
min V (m, x) 1≤m≤N
x

(b) Check your answer by solving the equivalent, but larger dimensional, constrained
least squares problem (see Exercise 1.16)

$$\min_{z}\ (z - z_0)'\,\tilde{H}\,(z - z_0)$$

subject to

$$Dz = 0$$

in which $z, z_0 \in \mathbb{R}^{nN}$, $\tilde{H} \in \mathbb{R}^{nN \times nN}$ is a block diagonal matrix, $D \in \mathbb{R}^{n(N-1) \times nN}$

$$z_0 = \begin{bmatrix} x(1) \\ \vdots \\ x(N-1) \\ x(N) \end{bmatrix} \qquad
\tilde{H} = \begin{bmatrix} X_1 & & & \\ & \ddots & & \\ & & X_{N-1} & \\ & & & X_N \end{bmatrix} \qquad
D = \begin{bmatrix} I & -I & & \\ & \ddots & \ddots & \\ & & I & -I \end{bmatrix}$$

(c) Compare the size and number of matrix inverses required for the two approaches.

Exercise 6.6: Why call the Lyapunov stability nonuniform?


Consider the following linear system

w + = Aw w(0) = Hx(0)
x = Cw

with solution w(k) = Ak w(0) = Ak Hx(0), x(k) = CAk Hx(0). Notice that x(0) com-
pletely determines both w(k) and x(k), k ≥ 0. Also note that zero is a solution, i.e.,
x(k) = 0, k ≥ 0 satisfies the model.

(a) Consider the following case


" # " #
cos θ − sin θ 0 h i
A=ρ H= C= 1 −1
sin θ cos θ −1
ρ = 0.925 θ = π /4 x(0) = 1

Plot the solution x(k). Does x(k) converge to zero? Does x(k) achieve zero
exactly for finite k > 0?

(b) Is the zero solution x(k) = 0 Lyapunov stable? State your definition of Lyapunov
stability, and prove your answer. Discuss how your answer is consistent with
the special case considered above.

Exercise 6.7: Exponential stability of suboptimal MPC with unbounded fea-


sible set
Consider again Lemma 6.5 when both U and XN are unbounded. Show that the subop-
timal MPC controller is exponentially stable on the following sets.
(a) Any sublevel set of V (x, h(x))

(b) Any compact subset of XN



Exercise 6.8: A refinement to the warm start


Consider the following refinement to the warm start in the suboptimal MPC strategy.
First add the requirement that the initialization strategy satisfies the following bound

h(x) ≤ d̄ |x| x ∈ XN

in which d̄ > 0. Notice that all initializations considered in the chapter satisfy this
requirement.
Then, at time k and state x, in addition to the shifted input sequence from time
k − 1, u
e , evaluate the initialization sequence applied to the current state, u = h(x).
Select whichever of these two input sequence has lower cost as the warm start for time
k. Notice also that this refinement makes the constraint

|u| ≤ d |x| x ∈ rB

redundant, and it can be removed from the MPC optimization.


Prove that this refined suboptimal strategy is exponentially stabilizing on the set
XN . Notice that with this refinement, we do not have to assume that XN is bounded
or that U is bounded.

Exercise 6.9: Global asymptotic stability and exponential convergence with


mixed powers of the norm
Prove Lemma 6.6.
Hints: exponential convergence can be established as in standard exponential sta-
bility theorems. To establish Lyapunov stability, notice that |x(0)| ≤ |(x(0), e(0))| and
|e(0)| ≤ |(x(0), e(0))| and that (·)α for α > 0 is a K∞ function.

Exercise 6.10: Decentralized control of Examples 6.9–6.11

Apply decentralized control to the systems in Examples 6.9–6.11. Which of these sys-
tems are closed-loop unstable with decentralized control? Compare this result to the
result for noncooperative MPC.

Exercise 6.11: Cooperative control of Examples 6.9–6.11

Apply cooperative MPC to the systems in Examples 6.9–6.11. Are any of these systems
closed-loop unstable? Compare the closed-loop eigenvalues of converged cooperative
control to centralized MPC, and discuss any differences.

Exercise 6.12: Adding norms


Establish the following result used in the proof of Lemma 6.14. Given that w ∈ Rm ,
e ∈ Rn
$$\frac{1}{\sqrt{2}}\big(|w| + |e|\big) \leq |(w, e)| \leq |w| + |e| \qquad \forall w, e$$

Figure 6.11: Optimizing a quadratic function in one set of variables at a time. (Contours V(u) = constant in the (u1, u2) plane, showing u*, u^p, and u^{p+1}.)

Exercise 6.13: Padding matrices


Given a vector z and subvector u
 
$$z = \begin{bmatrix} u(0) \\ x(1) \\ u(1) \\ x(2) \\ \vdots \\ u(N-1) \\ x(N) \end{bmatrix} \qquad
\mathbf{u} = \begin{bmatrix} u(0) \\ u(1) \\ \vdots \\ u(N-1) \end{bmatrix} \qquad
x \in \mathbb{R}^n \quad u \in \mathbb{R}^m$$

and quadratic function of u


(1/2)u′ Hu + h′ u
Find the corresponding quadratic function of z so that

(1/2)z′ Hz z + h′z z = (1/2)u′ Hu + h′ u ∀z, u

Hint: first find the padding matrix E such that u = Ez.

Exercise 6.14: A matrix inverse


Compute the four partitioned elements in the two-player feedback gain (I − L)−1 K

u∞ = (I − L)−1 Kx(0) eig(L) < 1

in which
" #−1 " #
−1 I −L1 K1 0
(I − L) K=
−L2 I 0 K2

Exercise 6.15: Optimizing one variable at a time


Consider the positive definite quadratic function partitioned into two sets of variables

V (u) = (1/2)u′ Hu + c ′ u + d
" #" # " #
h i H H12 u1 h i u
11 1
V (u1 , u2 ) = (1/2) u′1 u′2 + c1′ c2′ +d
H21 H22 u2 u2

in which H > 0. Imagine we wish to optimize this function by first optimizing over
the u1 variables holding u2 fixed and then optimizing over the u2 variables holding u1
fixed as shown in Figure 6.11. Let’s see if this procedure, while not necessarily efficient,
is guaranteed to converge to the optimum.
(a) Given an initial point $(u_1^p, u_2^p)$, show that the next iteration is

$$u_1^{p+1} = -H_{11}^{-1}\big(H_{12}u_2^p + c_1\big) \qquad
u_2^{p+1} = -H_{22}^{-1}\big(H_{21}u_1^p + c_2\big) \qquad (6.39)$$

The procedure can be summarized as

$$\mathbf{u}^{p+1} = A\mathbf{u}^p + b \qquad (6.40)$$

in which the iteration matrix A and constant b are given by

$$A = \begin{bmatrix} 0 & -H_{11}^{-1}H_{12} \\ -H_{22}^{-1}H_{21} & 0 \end{bmatrix} \qquad
b = \begin{bmatrix} -H_{11}^{-1}c_1 \\ -H_{22}^{-1}c_2 \end{bmatrix} \qquad (6.41)$$

(b) Establish that the optimization procedure converges by showing the iteration
matrix is stable
eig(A) < 1

(c) Given that the iteration converges, show that it produces the same solution as

u∗ = −H −1 c

Exercise 6.16: Monotonically decreasing cost


Consider again the iteration defined in Exercise 6.15.
(a) Prove that the cost function is monotonically decreasing when optimizing one
variable at a time

V (up+1 ) < V (up ) ∀up ≠ −H −1 c

(b) Show that the following expression gives the size of the decrease

V (up+1 ) − V (up ) = −(1/2)(up − u∗ )′ P (up − u∗ )

in which
" # " #
e D −1 H e =D−N H11 0 0 H12
P = HD −1 H H D= N=
0 H22 H21 0

and u∗ = −H −1 c is the optimum.


Hint: to simplify the algebra, first change coordinates and move the origin of the coor-
dinate system to u∗ .

Exercise 6.17: One variable at a time with convex step


Consider Exercise 6.15 but with the convex step for the iteration
  "
p+1 0 p # " p #
u1  = w1 u1 (u 2) +w u1
p+1 p 2 0 p 0 ≤ w1 , w2 w1 + w2 = 1
u2 u2 u2 (u1 )
(a) Show that the iteration for the convex step is also of the form
up+1 = Aup + b
and the A matrix and b vector for this case are
" # " #" #
−1 −1
w2 I −w1 H11 H12 −w1 H11 c1
A= −1 b= −1
−w2 H22 H21 w1 I −w2 H22 c2

(b) Show that A is stable.

(c) Show that this iteration also converges to u∗ = −H −1 c.

Exercise 6.18: Monotonically decreasing cost with convex step


Consider again the problem of optimizing one variable at a time with the convex step
given in Exercise 6.17.
(a) Prove that the cost function is monotonically decreasing
V (up+1 ) < V (up ) ∀up ≠ −H −1 c

(b) Show that the following expression gives the size of the decrease
V (up+1 ) − V (up ) = −(1/2)(up − u∗ )′ P (up − u∗ )
in which
$$P = HD^{-1}\tilde{H}D^{-1}H \qquad \tilde{H} = D - N$$

$$D = \begin{bmatrix} w_1^{-1}H_{11} & 0 \\ 0 & w_2^{-1}H_{22} \end{bmatrix} \qquad
N = \begin{bmatrix} -w_1^{-1}w_2 H_{11} & H_{12} \\ H_{21} & -w_1 w_2^{-1} H_{22} \end{bmatrix}$$
and u∗ = −H −1 c is the optimum.
Hint: to simplify the algebra, first change coordinates and move the origin of the coor-
dinate system to u∗ .

Exercise 6.19: Splitting more than once


Consider the generalization of Exercise 6.15 in which we repeatedly decompose a prob-
lem into one-variable-at-a-time optimizations. For a three-variable problem we have the
three optimizations
$$u_1^{p+1} = \arg\min_{u_1} V(u_1, u_2^p, u_3^p)$$
$$u_2^{p+1} = \arg\min_{u_2} V(u_1^p, u_2, u_3^p) \qquad
u_3^{p+1} = \arg\min_{u_3} V(u_1^p, u_2^p, u_3)$$

Is it true that

$$V(u_1^{p+1}, u_2^{p+1}, u_3^{p+1}) \leq V(u_1^p, u_2^p, u_3^p)$$

Hint: you may wish to consider the following example, $V(u) = (1/2)u'Hu + c'u$, in which

$$H = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 2 \end{bmatrix} \qquad
c = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \qquad
u^p = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$$

Exercise 6.20: Time-varying controller iterations


We let pk ≥ 0 be a time-varying integer-valued index representing the iterations applied
in the controller at time k.

$$x_1(k+1) = A_1 x_1(k) + \overline{B}_{11} u_1(0; k) + \overline{B}_{12} u_2(0; k)$$
$$x_2(k+1) = A_2 x_2(k) + \overline{B}_{21} u_1(0; k) + \overline{B}_{22} u_2(0; k)$$
$$\mathbf{u}_1(k+1) = g_1^{p_k}\big(x_1(k), x_2(k), \mathbf{u}_1(k), \mathbf{u}_2(k)\big)$$
$$\mathbf{u}_2(k+1) = g_2^{p_k}\big(x_1(k), x_2(k), \mathbf{u}_1(k), \mathbf{u}_2(k)\big)$$

Notice the system evolution is time-varying even though the models are time invariant
because we allow a time-varying sequence of controller iterations.
Show that cooperative MPC is exponentially stabilizing for any pk ≥ 0 sequence.

Exercise 6.21: Stable interaction models


In some industrial applications it is preferable to partition the plant so that there are
no unstable connections between subsystems. Any inputs uj that have unstable con-
nections to outputs yi should be included in the ith subsystem inputs. Allowing an
unstable connection between two subsystems may not be robust to faults and other
kinds of system failures.5 To implement this design idea in the two-player case, we
replace Assumption 6.13 (b) with the following

Modified Assumption 6.13 (Constrained two-player game).


(b) The interaction models Aij , i ≠ j are stable.

Prove that Modified Assumption 6.13 (b) implies Assumption 6.13 (b). It may be
helpful to first prove the following lemma.

Lemma 6.23 (Local detectability). Given partitioned system matrices


" #
A 0 h i
A= C = C Cs
0 As

in which As is stable, the system (A, C) is detectable if and only if the system (A, C) is
detectable.

Hint: use the Hautus lemma as the test for detectability.


Next show that this lemma and Modified Assumption 6.13 (b) establishes the dis-
tributed detectability assumption, Assumption 6.13 (b).

Exercise 6.22: Norm constraints as linear inequalities


Consider the quadratic program (QP) in decision variable u with parameter x

$$\min_{u}\ (1/2)u'Hu + x'Du$$
$$\text{s.t.}\quad Eu \leq Fx$$

5 We are not considering the common instability of base-level inventory management

in this discussion. It is assumed that level control in storage tanks (integrators) is


maintained at all times with simple, local level controllers. The internal unit flowrates
dedicated for inventory management are not considered available inputs in the MPC
problem.

Figure 6.12: (a) Optimality of u* means the angle between −∇V and any point z in the feasible region must be greater than 90° and less than 270°. (b) The same result restated: u* is optimal if and only if the negative gradient is in the normal cone to the feasible region at u*, −∇V|_{u*} ∈ N(U, u*).

in which u ∈ Rm , x ∈ Rn , and H > 0. The parameter x appears linearly (affinely) in


the cost function and constraints. Assume that we wish to add a norm constraint of
the following form
|u|α ≤ c |x|α α = 2, ∞
(a) If we use the infinity norm, show that this problem can be posed as an equivalent
QP with additional decision variables, and the cost function and constraints re-
main linear (affine) in parameter x. How many decision variables and constraints
are added to the problem?

(b) If we use the two norm, show that this problem can be approximated by a QP
whose solution does satisfy the constraints, but the solution may be suboptimal
compared to the original problem.

Exercise 6.23: Steady-state noncooperative game


Consider again the steady-state target problem for the system given in Example 6.12.
(a) Resolve the problem for the choice of convex step parameters w1 = 0.2, w2 =
0.8. Does the iteration for noncooperative control converge? Plot the iterations
for the noncooperative and cooperative cases.

(b) Repeat for the convex step w1 = 0.8, w2 = 0.2. Are the results identical to the
previous part? If not, discuss any differences.

(c) For what choices of w1 , w2 does the target iteration converge using noncooper-
ative control for the target calculation?

Exercise 6.24: Optimality conditions for constrained optimization


Consider the convex quadratic optimization problem
$$\min_{u}\ V(u) \quad \text{subject to} \quad u \in \mathcal{U}$$

in which V is a convex quadratic function and U is a convex set. Show that u∗ is an


optimal solution if and only if
⟨z − u∗ , −∇ V |u∗ ⟩ ≤0 ∀z ∈ U (6.42)
Figure 6.12(a) depicts this condition for u ∈ R2 . This condition motivates defining the
normal cone (Rockafellar, 1970) to U at u∗ as follows
N(U, u∗ ) = {y | ⟨z − u∗ , y − u∗ ⟩ ≤ 0 ∀z ∈ U}
The optimality condition can be stated equivalently as: u* is an optimal point if and
only if the negative gradient is in the normal cone to U at u*

$$-\nabla V|_{u^*} \in N(\mathcal{U}, u^*)$$
This condition and the normal cone are depicted in Figure 6.12(b).

Exercise 6.25: Partitioned optimality conditions with constraints


Consider a partitioned version of the constrained optimization problem of Exercise 6.24
with uncoupled constraints
$$\min_{u_1, u_2}\ V(u_1, u_2) \quad \text{subject to} \quad u_1 \in \mathcal{U}_1 \quad u_2 \in \mathcal{U}_2$$

in which V is a quadratic function and U1 and U2 are convex and nonempty.


(a) Show that $(u_1^*, u_2^*)$ is an optimal solution if and only if

$$\langle z_1 - u_1^*,\ -\nabla_{u_1} V|_{(u_1^*, u_2^*)} \rangle \leq 0 \qquad \forall z_1 \in \mathcal{U}_1$$
$$\langle z_2 - u_2^*,\ -\nabla_{u_2} V|_{(u_1^*, u_2^*)} \rangle \leq 0 \qquad \forall z_2 \in \mathcal{U}_2 \qquad (6.43)$$

(b) Extend the optimality conditions to cover the case


$$\min_{u_1, \ldots, u_M}\ V(u_1, \ldots, u_M) \quad \text{subject to} \quad u_j \in \mathcal{U}_j \quad j = 1, \ldots, M$$

in which V is a quadratic function and the Uj are convex and nonempty.

Exercise 6.26: Constrained optimization of M variables


Consider an optimization problem with M variables and uncoupled constraints
$$\min_{u_1, u_2, \ldots, u_M}\ V(u_1, u_2, \ldots, u_M) \quad \text{subject to} \quad u_j \in \mathcal{U}_j \quad j = 1, \ldots, M$$

in which V is a strictly convex function. Assume that the feasible region is convex and
nonempty and denote the unique optimal solution as $(u_1^*, u_2^*, \ldots, u_M^*)$ having cost
$V^* = V(u_1^*, \ldots, u_M^*)$. Denote the M one-variable-at-a-time optimization problems at
iteration p

$$z_j^{p+1} = \arg\min_{u_j} V(u_1^p, \ldots, u_j, \ldots, u_M^p) \quad \text{subject to} \quad u_j \in \mathcal{U}_j$$

Then define the next iterate to be the following convex combination of the previous
and new points

$$u_j^{p+1} = \alpha_j^p z_j^{p+1} + (1 - \alpha_j^p)u_j^p \qquad j = 1, \ldots, M$$
$$\varepsilon \leq \alpha_j^p < 1 \qquad 0 < \varepsilon \qquad j = 1, \ldots, M, \quad p \geq 1$$
$$\sum_{j=1}^{M} \alpha_j^p = 1, \qquad p \geq 1$$

Prove the following results.



(a) Starting with any feasible point, $(u_1^0, u_2^0, \ldots, u_M^0)$, the iterations $(u_1^p, u_2^p, \ldots, u_M^p)$
are feasible for $p \geq 1$.

(b) The objective function decreases monotonically from any feasible initial point

$$V(u_1^{p+1}, \ldots, u_M^{p+1}) \leq V(u_1^p, \ldots, u_M^p) \qquad \forall u_j^0 \in \mathcal{U}_j,\ j = 1, \ldots, M, \quad p \geq 1$$

(c) The cost sequence $V(u_1^p, u_2^p, \ldots, u_M^p)$ converges to the optimal cost $V^*$ from
any feasible initial point.

(d) The sequence $(u_1^p, u_2^p, \ldots, u_M^p)$ converges to the optimal solution $(u_1^*, u_2^*, \ldots, u_M^*)$ from any feasible initial point.

Exercise 6.27: The constrained two-variable special case


Consider the special case of Exercise 6.26 with M = 2

$$\min_{u_1, u_2}\ V(u_1, u_2) \quad \text{subject to} \quad u_1 \in \mathcal{U}_1 \quad u_2 \in \mathcal{U}_2$$

in which V is a strictly positive quadratic function. Assume that the feasible region
is convex and nonempty and denote the unique optimal solution as $(u_1^*, u_2^*)$ having
cost $V^* = V(u_1^*, u_2^*)$. Consider the two one-variable-at-a-time optimization problems
at iteration p

$$u_1^{p+1} = \arg\min_{u_1} V(u_1, u_2^p) \qquad u_2^{p+1} = \arg\min_{u_2} V(u_1^p, u_2)$$
$$\text{subject to } u_1 \in \mathcal{U}_1 \qquad\qquad \text{subject to } u_2 \in \mathcal{U}_2$$

We know from Exercise 6.15 that taking the full step in the unconstrained problem
with M = 2 achieves a cost decrease. We know from Exercise 6.19 that taking the full
step for an unconstrained problem with M ≥ 3 does not provide a cost decrease in
general. We know from Exercise 6.26 that taking a reduced step in the constrained
problem for all M achieves a cost decrease. That leaves open the case of a full step for
a constrained problem with M = 2.
Does the full step in the constrained case for M = 2 guarantee a cost decrease? If
so, prove it. If not, provide a counterexample.

Exercise 6.28: Subsystem stability constraints


Show that the following uncoupled subsystem constraints imply an overall system con-
straint of the same type. The first is suitable for asymptotic stability and the second
for exponential stability.
(a) Given r1 , r2 > 0, and functions γ1 and γ2 of class K, assume the following
constraints are satisfied

|u1 | ≤ γ1 (|x1 |) x1 ∈ r 1 B
|u2 | ≤ γ2 (|x2 |) x2 ∈ r 2 B

Show that there exists r > 0 and function γ of class K such that

|(u1 , u2 )| ≤ γ(|(x1 , x2 )|) (x1 , x2 ) ∈ r B (6.44)



(b) Given r1 , r2 > 0, and constants c1 , c2 , σ1 , σ2 > 0, assume the following con-
straints are satisfied

|u1 | ≤ c1 |x1 |σ1 x1 ∈ r 1 B


|u2 | ≤ c2 |x2 |σ2 x2 ∈ r 2 B

Show that there exist r > 0 and constants c, σ > 0 such that

|(u1 , u2 )| ≤ c |(x1 , x2 )|σ (x1 , x2 ) ∈ r B (6.45)

Exercise 6.29: Distributed disturbance detectability


Prove Lemma 6.16.
Hint: use the Hautus lemma as the test for detectability.

Exercise 6.30: Distributed target problem and uncoupled constraints


Player one’s distributed target problem in the two-player game is given in (6.28)
" #′ " #" #
H1 y1s − z1sp T1s H1 y1s − z1sp
min (1/2)
x11s ,x21s ,u1s H2 y2s − z2sp T2s H2 y2s − z2sp

subject to
 
" #  x1s  " #
I − A1 −B 11 −B 12  x2s  B1d d̂1 (k)
 =
I − A2 −B 21 −B 22 u1s  B2d d̂2 (k)
u2s
E1 u1s ≤ e1

Show that the constraints can be expressed so that the target problem constraints are
uncoupled.
Bibliography

T. Başar and G. J. Olsder. Dynamic Noncooperative Game Theory. SIAM,


Philadelphia, 1999.

D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, MA, sec-


ond edition, 1999.

D. P. Bertsekas and J. N. Tsitsiklis. Parallel and Distributed Computation.


Athena Scientific, Belmont, Massachusetts, 1997.

A. E. Bryson and Y. Ho. Applied Optimal Control. Hemisphere Publishing, New


York, 1975.

H. Cui and E. W. Jacobsen. Performance limitations in decentralized control.


J. Proc. Cont., 12:485–494, 2002.

W. B. Dunbar. Distributed receding horizon control of dynamically coupled


nonlinear systems. IEEE Trans. Auto. Cont., 52(7):1249–1263, 2007.

W. B. Dunbar and R. M. Murray. Distributed receding horizon control with


application to multi-vehicle formation stabilization. Automatica, 42(4):549–
558, 2006.

G. H. Golub and C. F. Van Loan. Matrix Computations. The Johns Hopkins


University Press, Baltimore, Maryland, third edition, 1996.

D. Jia and B. H. Krogh. Min-max feedback model predictive control for dis-
tributed control with communication. In Proceedings of the American Con-
trol Conference, pages 4507–4512, Anchorage, Alaska, May 2002.

T. Larsson and S. Skogestad. Plantwide control- A review and a new design


procedure. Mod. Ident. Control, 21(4):209–240, 2000.

J. Lunze. Feedback Control of Large Scale Systems. Prentice-Hall, London, U.K.,


1992.

J. M. Maestre and R. R. Negenborn. Distributed Model Predictive Control Made


Easy. Springer Netherlands, 2014.

M. Mesarović, D. Macko, and Y. Takahara. Theory of hierarchical, multilevel


systems. Academic Press, New York, 1970.


N. Motee and B. Sayyar-Rodsari. Optimal partitioning in distributed model


predictive control. In Proceedings of the American Control Conference, pages
5300–5305, Denver, Colorado, June 2003.

J. Nash. Noncooperative games. Ann. Math., 54:286–295, 1951.

J. B. Rawlings and B. T. Stewart. Coordinating multiple optimization-based


controllers: New opportunities and challenges. J. Proc. Cont., 18:839–845,
2008.

R. T. Rockafellar. Convex Analysis. Princeton University Press, Princeton, N.J.,


1970.

N. R. Sandell Jr., P. Varaiya, M. Athans, and M. Safonov. Survey of decentralized


control methods for large scale systems. IEEE Trans. Auto. Cont., 23(2):108–
128, 1978.

R. Scattolini. Architectures for distributed and hierarchical model predictive


control - a review. J. Proc. Cont., 19(5):723–731, May 2009.

P. O. M. Scokaert, D. Q. Mayne, and J. B. Rawlings. Suboptimal model predictive


control (feasibility implies stability). IEEE Trans. Auto. Cont., 44(3):648–654,
March 1999.

D. D. Šiljak. Decentralized Control of Complex Systems. Academic Press, Lon-


don, 1991.

B. T. Stewart, A. N. Venkat, J. B. Rawlings, S. J. Wright, and G. Pannocchia.


Cooperative distributed model predictive control. Sys. Cont. Let., 59:460–
469, 2010.

B. T. Stewart, S. J. Wright, and J. B. Rawlings. Cooperative distributed model


predictive control for nonlinear systems. J. Proc. Cont., 21(5):698–704, June
2011.

A. N. Venkat. Distributed Model Predictive Control: Theory and Applications.


PhD thesis, University of Wisconsin–Madison, October 2006.

A. N. Venkat, J. B. Rawlings, and S. J. Wright. Stability and optimality of


distributed, linear MPC. Part 1: state feedback. Technical Report 2006–
03, TWMCC, Department of Chemical and Biological Engineering, Univer-
sity of Wisconsin–Madison (Available at http://jbrwww.che.wisc.edu/tech-
reports.html), October 2006a.

A. N. Venkat, J. B. Rawlings, and S. J. Wright. Stability and optimality of


distributed, linear MPC. Part 2: output feedback. Technical Report 2006–
04, TWMCC, Department of Chemical and Biological Engineering, Univer-
sity of Wisconsin–Madison (Available at http://jbrwww.che.wisc.edu/tech-
reports.html), October 2006b.

A. N. Venkat, J. B. Rawlings, and S. J. Wright. Distributed model predictive con-


trol of large-scale systems. In Assessment and Future Directions of Nonlinear
Model Predictive Control, pages 591–605. Springer, 2007.

M. Vidyasagar. Nonlinear Systems Analysis. Prentice-Hall, Inc., Englewood


Cliffs, New Jersey, second edition, 1993.

J. von Neumann and O. Morgenstern. Theory of Games and Economic Behavior.


Princeton University Press, Princeton and Oxford, 1944.

S. J. Wright. Applying new optimization algorithms to model predictive con-


trol. In J. C. Kantor, C. E. García, and B. Carnahan, editors, Chemical Process
Control–V, pages 147–155. CACHE, AIChE, 1997.
7
Explicit Control Laws for Constrained
Linear Systems

7.1 Introduction
In preceding chapters we show how model predictive control (MPC) can
be derived for a variety of control problems with constraints. It is in-
teresting to recall the major motivation for MPC; solution of a feedback
optimal control problem for constrained and/or nonlinear systems to
obtain a stabilizing control law is often prohibitively difficult. MPC
sidesteps the problem of determining a control law κ(·) by determin-
ing, instead, at each state x encountered, a control action u = κ(x)
by solving a mathematical programming problem. This procedure, if
repeated at every state x, yields an implicit control law κ(·) that solves
the original feedback problem. In many cases, determining an explicit
control law is impractical while solving a mathematical programming
problem online for a given state is possible; this fact has led to the
wide-scale adoption of MPC in the chemical process industry.
Some of the control problems for which MPC has been extensively
used, however, have recently been shown to be amenable to analysis,
at least for relatively simple systems. One such problem is control of
linear discrete time systems with polytopic constraints, for which de-
termination of a stabilizing control law was thought in the past to be
prohibitively difficult. It has been shown that it is possible, in principle,
to determine a stabilizing control law for some of these control prob-
lems. This result is often referred to as explicit MPC because it yields an
explicit control law in contrast to MPC that yields a control action for
each encountered state, thereby implicitly defining a control law. There
are two objections to this terminology. First, determination of control
laws for a wide variety of control problems has been the prime concern
of control theory since its birth and certainly before the advent of MPC,


an important tool in this endeavor being dynamic programming (DP).


The new result shows that classical control-theoretic tools, such as DP,
can be successfully applied to a wider range of problems than was pre-
viously thought possible. MPC is a useful method for implementing
an implicit control law that can, in principle, be explicitly determined
using control-theoretic tools.
Second, some authors using this terminology have, perhaps inad-
vertently, implied that these results can be employed in place of con-
ventional MPC. This is far from the truth, since only relatively simple
problems, far simpler than those routinely solved in MPC applications,
can be solved. That said, the results may be useful in applications
where models with low state dimension are sufficiently accurate and
where it is important that the control be rapidly computed. A previ-
ously determined control law may yield the control action more rapidly
than solving an optimal control problem. Potential applications include
vehicle control.
In the next section we give a few simple examples of parametric
programming. In subsequent sections we show how the solutions to
parametric linear and quadratic programs may be obtained, and also
show how these solutions may be used to solve optimal control prob-
lems when the system is linear, the cost quadratic or affine, and the
constraints polyhedral.

7.2 Parametric Programming


A conventional optimization problem has the form V 0 = minu {V (u) |
u ∈ U} where u is the “decision” variable, V (u) is the cost to be min-
imized, and U is the constraint set. The solution to a conventional
optimization is a point or set in U; the value V 0 of the problem satis-
fies V 0 = V (u0 ) where u0 is a minimizer. A simple example of such
a problem is V 0 = minu {a + bu + (1/2)cu2 | u ∈ [−1, 1]} where
the solution is required for only one value of the parameters a, b and
c. The solution to this problem u0 = −b/c if |b/c| ≤ 1, u0 = −1 if
b/c ≥ 1 and u0 = 1 if b/c ≤ −1. This may be written more compactly
as u0 = −sat(b/c) where sat(·) is the saturation function. The corre-
sponding value is V 0 = a − b2 /2c if |b/c| ≤ 1, V 0 = a − b + c 2 /2 if
b/c ≥ 1 and V 0 = a + b + c 2 /2 if b/c ≤ −1.
A parametric programming problem P(x) on the other hand, takes
the form V 0 (x) = minu {V (x, u) | u ∈ U(x)} where x is a parame-
ter so that the optimization problem, and its solution, depend on the

value of the parameter. Hence, the solution to a parametric program-


ming problem P(x) is not a point or set but a function x ↦ u0 (x)
that may be set valued; similarly the value of the problem is a function
x ↦ V 0 (x). At each x, the minimizer u0 (x) may be a point or a set.
Optimal control problems often take this form, with x being the state,
and u, in open-loop discrete time optimal control, being a control se-
quence; u0 (x), the optimal control sequence, is a function of the initial
state. In state feedback optimal control, necessary when uncertainty is
present, DP is employed yielding a sequence of parametric optimiza-
tion problems in each of which x is the state and u a control action;
see Chapter 2. The programming problem in the first paragraph of this
section may be regarded as a parametric programming problem with
the parameter x := (a, b, c), V (x, u) := x1 + x2 u + (1/2)x3 u2 , and
U(x) := [−1, 1]; U(x), in this example, does not depend on x. The
solution to this problem yields the functions u0 (·) and V 0 (·) defined
by u0 (x) = −sat(x2 /x3 ) and V 0 (x) = V (x, u0 (x)) = x1 + x2 u0 (x) +
(x3 /2)(u0 (x))2 .
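This parametric solution can be evaluated directly. The following Python sketch (illustrative only; the function names u_opt and V_opt are ours) computes u0 (x) and V 0 (x) for a few parameter values x = (a, b, c) with c > 0.

    import numpy as np

    def u_opt(x):
        # minimizer of a + b*u + (1/2)*c*u^2 over u in [-1, 1], i.e., -sat(b/c)
        a, b, c = x
        return -np.clip(b / c, -1.0, 1.0)

    def V_opt(x):
        a, b, c = x
        u = u_opt(x)
        return a + b * u + 0.5 * c * u**2

    # the solution and the value are functions of the parameter x = (a, b, c)
    for x in [(0.0, 0.5, 1.0), (0.0, 2.0, 1.0), (0.0, -3.0, 1.0)]:
        print(x, u_opt(x), V_opt(x))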
Because the minimizer and value of a parametric programming prob-
lem are functions rather than points or sets, we would not, in general,
expect to be able to compute a solution. Surprisingly, parametric pro-
grams may be solved when the cost function V (·) is affine (V (x, u) =
a + b′ x + c ′ u) or quadratic (V (x, u) = (1/2)x ′ Qx + x ′ Su + (1/2)u′ Ru)
and U(x) is defined by a set of affine inequalities: U(x) = {u | Mu ≤
Nx + p}. The parametric constraint u ∈ U(x) may be conveniently
expressed as (x, u) ∈ Z where Z is a subset of (x, u)-space which we
will take to be Rn × Rm ; for each x

U(x) = {u | (x, u) ∈ Z}

We assume that x ∈ Rn and u ∈ Rm . Let X ⊂ Rn be defined by

X := {x | ∃u such that (x, u) ∈ Z} = {x | U(x) ≠ ∅}

The set X is the domain of V 0 (·) and u0 (·) and is thus the set of points
x for which a feasible solution of P(x) exists; it is the projection of Z
(which is a set in (x, u)-space) onto x-space. See Figure 7.1, which
illustrates Z and U(x) for the case when U(x) = {u | Mu ≤ Nx + p};
the set Z is thus defined by Z := {(x, u) | Mu ≤ Nx + p}. In this case,
both Z and U(x) are polyhedral.
Before proceeding to consider parametric linear and quadratic pro-
gramming, some simple examples may help the reader to appreciate
the underlying ideas. Consider first a very simple parametric linear

Figure 7.1: The sets Z, X, and U(x).

Figure 7.2: Parametric linear program.

program minu {V (x, u) | (x, u) ∈ Z} where V (x, u) := x + u and


Z := {(x, u) | u + x ≥ 0, u − x ≥ 0} so that U(x) = {u | u ≥ −x,
u ≥ x}. The problem is illustrated in Figure 7.2. The set Z is the region
lying above the two solid lines u = −x and u = x, and is convex.
The gradient ∇u V (x, u) = 1 everywhere, so the solution, at each x,
to the parametric program is the smallest u in U(x), i.e., the smallest
u lying above the two lines u = −x and u = x. Hence u0 (x) = −x
if x ≤ 0 and u0 (x) = x if x ≥ 0, i.e., u0 (x) = |x|; the graph of
u0 (·) is the dashed line in Figure 7.2. Both u0 (·) and V 0 (·), in which
V 0 (x) = x + u0 (x), are piecewise affine, being affine in each of the two
regions X1 := {x | x ≤ 0} and X2 := {x | x ≥ 0}.
Next consider an unconstrained parametric quadratic program (QP)
minu V (x, u) where V (x, u) := (1/2)(x − u)2 + u2 /2. The problem is

Figure 7.3: Unconstrained parametric quadratic program.

Figure 7.4: Parametric quadratic program.

illustrated in Figure 7.3. For each x ∈ R, ∇u V (x, u) = −x + 2u and


∇uu V (x, u) = 2 so that u0 (x) = x/2 and V 0 (x) = x 2 /4. Hence u0 (·)
is affine and V 0 (·) is quadratic in R.
We now add the constraint set Z := {(x, u) | u ≥ 1, u+x/2 ≥ 2, u+
x ≥ 2}; see Figure 7.4. The solution is defined on three regions, X1 :=
(−∞, 0], X2 := [0, 2], and X3 := [2, ∞). From the preceding example,
the unconstrained minimum is achieved at u0uc (x) = x/2 shown by the
solid straight line in Figure 7.4. Since ∇u V (x, u) = −x + 2u, ∇u V (x,

u) > 0 for all u > u0uc (x) = x/2. Hence, in X1 , u0 (x) lies on the
boundary of Z and satisfies u0 (x) = 2 − x. Similarly, in X2 , u0 (x)
lies on the boundary of Z and satisfies u0 (x) = 2 − x/2. Finally, in
X3 , u0 (x) = u0uc (x) = x/2, the unconstrained minimizer, and lies in
the interior of Z for x > 2. The third constraint u ≥ 2 − x is active
in X1 , the second constraint u ≥ 2 − x/2 is active in X2 , while no
constraints are active in X3 . Hence the minimizer u0 (·) is piecewise
affine, being affine in each of the regions X1 , X2 and X3 . Since V 0 (x) =
(1/2)(x − u0 (x))2 + u0 (x)2 /2, the value function V 0 (·) is piecewise
quadratic, being quadratic in each of the regions X1 , X2 and X3 .
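The piecewise structure can be checked numerically. The Python sketch below (illustrative only) exploits the fact that, for this scalar problem, the constrained minimizer is the unconstrained minimizer x/2 raised, where necessary, to the largest of the three constraint boundaries.

    import numpy as np

    def u0(x):
        # constraints: u >= 1, u >= 2 - x/2, u >= 2 - x; unconstrained minimizer x/2
        return max(x / 2.0, 1.0, 2.0 - x / 2.0, 2.0 - x)

    def V0(x):
        u = u0(x)
        return 0.5 * (x - u)**2 + 0.5 * u**2

    for x in [-2.0, 0.0, 1.0, 2.0, 4.0]:
        print(f"x = {x:5.1f}   u0 = {u0(x):5.2f}   V0 = {V0(x):7.3f}")
    # u0 is affine on X1 = (-inf, 0] (u0 = 2 - x), X2 = [0, 2] (u0 = 2 - x/2),
    # and X3 = [2, inf) (u0 = x/2), as derived above.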
We require, in the sequel, the following definitions.

Definition 7.1 (Polytopic (polyhedral) partition). A set P = {Zi | i ∈ I},


for some index set I, is called a polytopic (polyhedral) partition of the
polytopic (polyhedral) set Z if Z = ∪i∈I Zi and the sets Zi , i ∈ I, are
polytopes (polyhedrons) with nonempty interiors (relative to Z)1 that
are nonintersecting: int(Zi ) ∩ int(Zj ) = ∅ if i ≠ j.

Definition 7.2 (Piecewise affine function). A function f : Z → Rm is


said to be piecewise affine on a polytopic (polyhedral) partition P =
{Zi | i ∈ I} if it satisfies, for some Ki , ki , i ∈ I, f (x) = Ki x + ki for all
x ∈ Zi , all i ∈ I. Similarly, a function f : Z → R is said to be piecewise
quadratic on a polytopic (polyhedral) partition P = {Zi | i ∈ I} if it
satisfies, for some Qi , ri , and si , i ∈ I, f (x) = (1/2)x ′ Qi x + ri′ x + si
for all x ∈ Zi , all i ∈ I.

Note the piecewise affine and piecewise quadratic functions defined


this way are not necessarily continuous and may, therefore, be set val-
ued at the intersection of the defining polyhedrons. An example is the
piecewise affine function f (·) defined by

f (x) := −x − 1 x ∈ (−∞, 0]
:= x + 1 x ∈ [0, ∞)

This function is set valued at x = 0 where it has the value f (0) = {−1,
1}. We shall mainly be concerned with continuous piecewise affine and
piecewise quadratic functions.
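For concreteness, a piecewise affine function on a polyhedral partition may be stored as a list of regions together with their affine pieces and evaluated by locating a region containing x; the Python sketch below (the data layout and function name eval_pwa are our own choice, not a construction from the text) encodes the continuous piecewise affine minimizer u0 (x) = |x| of Figure 7.2.

    import numpy as np

    # each region is a tuple (F, f, K, k): region {x | F x <= f}, value K x + k
    regions = [
        (np.array([[ 1.0]]), np.array([0.0]), np.array([[-1.0]]), np.array([0.0])),  # x <= 0
        (np.array([[-1.0]]), np.array([0.0]), np.array([[ 1.0]]), np.array([0.0])),  # x >= 0
    ]

    def eval_pwa(x, regions, tol=1e-9):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        for F, f, K, k in regions:
            if np.all(F @ x <= f + tol):
                return K @ x + k
        raise ValueError("x lies outside the partition")

    print(eval_pwa(-2.0, regions), eval_pwa(3.0, regions))   # [2.] [3.], i.e., |x|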
We now generalize the points illustrated by our example above and
consider, in turn, parametric quadratic programming and parametric
1 The interior of a set S ⊆ Z relative to the set Z is the set {z ∈ S | ε(z)B ∩ aff(Z) ⊆
Z for some ε > 0} where aff(Z) is the intersection of all affine sets containing Z.

linear programming and their application to optimal control problems.


We deal with parametric quadratic programming first because it is more
widely used and because, with reasonable assumptions, the minimizer
is unique making the underlying ideas somewhat simpler to follow.

7.3 Parametric Quadratic Programming


7.3.1 Preliminaries

The parametric QP P(x) is defined by

V 0 (x) = min_u {V (x, u) | (x, u) ∈ Z}

where x ∈ Rn and u ∈ Rm . The cost function V (·) is defined by

V (x, u) := (1/2)x ′ Qx + u′ Sx + (1/2)u′ Ru + q′ x + r ′ u + c

and the polyhedral constraint set Z is defined by

Z := {(x, u) | Mu ≤ Nx + p}

where M ∈ Rr ×m , N ∈ Rr ×n and p ∈ Rr ; thus Z is defined by r affine


inequalities. Let u0 (x) denote the solution of P(x) if it exists, i.e., if
x ∈ X, the domain of V 0 (·); thus

u0 (x) := arg min_u {V (x, u) | (x, u) ∈ Z}

The solution u0 (x) is unique if V (·) is strictly convex in u; this is the


case if R is positive definite. Let the matrix Q be defined by
" #
Q S′
Q :=
S R

For simplicity we assume the following in the sequel.

Assumption 7.3 (Strict convexity). The matrix Q is positive definite.

Assumption 7.3 implies that both R and Q are positive definite. The
cost function V (·) may be written in the form

V (x, u) = (1/2)(x, u)′ Q(x, u) + q′ x + r ′ u + c

where, as usual, the vector (x, u) is regarded as a column vector (x ′ ,


u′ )′ in algebraic expressions. The parametric QP may also be expressed
as
V 0 (x) := min_u {V (x, u) | u ∈ U(x)}

where the parametric constraint set U(x) is defined by


U(x) := {u | (x, u) ∈ Z} = {u ∈ Rm | Mu ≤ Nx + p}
For each x the set U(x) is polyhedral. The domain X of V 0 (·) and
u0 (·) is defined by
X := {x | ∃u ∈ Rm such that (x, u) ∈ Z} = {x | U(x) ≠ ∅}
For all (x, u) ∈ Z, let the index set I(x, u) specify the constraints that
are active at (x, u), i.e.,
I(x, u) := {i ∈ I1:r | Mi u = Ni x + pi }
where Mi , Ni , and pi denote, respectively, the ith row of M, N, and p.
Similarly, for any matrix or vector A and any index set I, AI denotes
the matrix or vector with rows Ai , i ∈ I. For any x ∈ X, the index set
I 0 (x) specifies the constraints that are active at (x, u0 (x)), namely
I 0 (x) := I(x, u0 (x)) = {i ∈ I1:r | Mi u0 (x) = Ni x + pi }
Since u0 (x) is unique, I 0 (x) is well defined. Thus u0 (x) satisfies the
equation
Mx0 u = Nx0 x + px0
where
Mx0 := MI 0 (x) , Nx0 := NI 0 (x) , px0 := pI 0 (x) (7.1)

7.3.2 Preview

We show in the sequel that V 0 (·) is piecewise quadratic and u0 (·) piece-
wise affine on a polyhedral partition of X, the domain of both these
functions. To do this, we take an arbitrary point x in X, and show that
u0 (x) is the solution of an equality constrained QP P(x) : minu {V (x,
u) | Mx0 u = Nx0 x + px0 } in which the equality constraint is Mx0 u =
Nx0 x + px0 . We then show that there is a polyhedral region Rx0 ⊂ X in
which x lies and such that, for all w ∈ Rx0 , u0 (w) is the solution of
the equality constrained QP P(w) : minu {V (w, u) | Mx0 u = Nx0 w + px0 }
in which the equality constraints are the same as those for P(x). It
follows that u0 (·) is affine and V 0 (·) is quadratic in Rx0 . We then show
that there are only a finite number of such polyhedral regions so that
u0 (·) is piecewise affine, and V 0 (·) piecewise quadratic, on a polyhe-
dral partition of X. To carry out this program, we require a suitable
characterization of optimality. We develop this in the next subsection.
Some readers may prefer to jump to Proposition 7.8, which gives the
optimality condition we employ in the sequel.

7.3.3 Optimality Condition for a Convex Program

Necessary and sufficient conditions for nonlinear optimization prob-


lems are developed in Section C.2 of Appendix C. Since we are con-
cerned here with a relatively simple optimization problem where the
cost is convex and the constraint set polyhedral, we give a self-contained
exposition that uses the concept of a polar cone.

Definition 7.4 (Polar cone). The polar cone of a cone C ⊆ Rn is the cone
C ∗ defined by

C ∗ := {g ∈ Rn | ⟨g, h⟩ ≤ 0 ∀h ∈ C}

We recall that a set C ⊆ Rn is a cone if 0 ∈ C and that h ∈ C implies


λh ∈ C for all λ > 0. A cone C is said to be generated by {ai | i ∈ I}
where I is an index set if C = { Σi∈I µi ai | µi ≥ 0, i ∈ I }, in which case
we write C = cone{ai | i ∈ I}. We need the following result.

Proposition 7.5 (Farkas’s lemma). Suppose C is a polyhedral cone de-


fined by
C := {h | Ah ≤ 0} = {h | ⟨ai , h⟩ ≤ 0 | i ∈ I1:m }
in which, for each i, ai is the ith row of A. Then

C ∗ = cone{ai | i ∈ I1:m }

A proof of this result is given in Section C.2 of Appendix C; that


g ∈ cone{ai | i ∈ I1:m } implies ⟨g, h⟩ ≤ 0 for all h ∈ C is easily shown.
An illustration of Proposition 7.5 is given in Figure 7.5.
Next we make use of a standard necessary and sufficient condition
of optimality for optimization problems in which the cost is convex
and differentiable and the constraint set is convex.

Proposition 7.6 (Optimality conditions for convex set). Suppose, for


each x ∈ X, u ↦ V (x, u) is convex and differentiable and U(x) is
convex. Then u is optimal for minu {V (x, u) | u ∈ U(x)} if and only if

u ∈ U(x) and ⟨∇u V (x, u), v − u⟩ ≥ 0 ∀v ∈ U(x)

Proof. This Proposition appears as Proposition C.9 in Appendix C where


a proof is given. ■

In our case U(x), x ∈ X, is polyhedral and is defined by

U(x) := {v ∈ Rm | Mv ≤ Nx + p} (7.2)

Figure 7.5: Polar cone.

so v ∈ U(x) if and only if, for all u ∈ U(x), v − u ∈ U(x) − {u} :=


{v − u | v ∈ U(x)}. With h := v − u
U(x) − {u} = {h ∈ Rm | Mi h ≤ 0, i ∈ I(x, u); Mj h < Nj x + pj − Mj u, j ∈ I1:r \ I(x, u)}

since Mi u = Ni x + pi for all i ∈ I(x, u). For each z = (x, u) ∈ Z, let


C(x, u) denote the cone of feasible directions2 h = v − u at u, i.e.,
C(x, u) is defined by

C(x, u) := {h ∈ Rm | Mi h ≤ 0, i ∈ I(x, u)}

Clearly

U(x)−{u} = C(x, u)∩{h ∈ Rm | Mi h < Ni x+pi −Mi u, i ∈ I1:r \I(x, u)}

so that U(x) − {u} ⊆ C(x, u); for any (x, u) ∈ Z, any h ∈ C(x, u),
there exists an α > 0 such that u + αh ∈ U(x). Proposition 7.6 may be
expressed as: u is optimal for minu {V (x, u) | u ∈ U(x)} if and only if

u ∈ U(x) and ⟨∇u V (x, u), h⟩ ≥ 0 ∀h ∈ U(x) − {u}

We may now state a modified form of Proposition 7.6.


2A direction h at u is feasible if there exists an ε > 0 such that u + λh ∈ U(x) for
all λ ∈ [0, ε].

Proposition 7.7 (Optimality conditions in terms of polar cone). Suppose


for each x ∈ X, u ↦ V (x, u) is convex and differentiable, and U(x) is
defined by (7.2). Then u is optimal for minu {V (x, u) | u ∈ U(x)} if
and only if

u ∈ U(x) and ⟨∇u V (x, u), h⟩ ≥ 0 ∀h ∈ C(x, u)

Proof. We show that the condition ⟨∇u V (x, u), h⟩ ≥ 0 for all h ∈ C(x,
u) is equivalent to the condition ⟨∇u V (x, u), h⟩ ≥ 0 for all h ∈ U(x) −
{u} employed in Proposition 7.6. (i) Since U(x) − {u} ⊆ C(x, u),
⟨∇u V (x, u), h⟩ ≥ 0 for all h ∈ C(x, u) implies ⟨∇u V (x, u), h⟩ ≥ 0
for all h ∈ U(x) − {u}. (ii) ⟨∇u V (x, u), h⟩ ≥ 0 for all h ∈ U(x) − {u}
implies ⟨∇u V (x, u), αh⟩ ≥ 0 for all h ∈ U(x) − {u}, all α > 0. But,
for any h∗ ∈ C(x, u), there exists an α ≥ 1 such that h∗ = αh with
h := (1/α)h∗ ∈ U(x) − {u}. Hence ⟨∇u V (x, u), h∗ ⟩ = ⟨∇u V (x, u),
αh⟩ ≥ 0 for all h∗ ∈ C(x, u). ■

We now make use of Proposition 7.7 to obtain the optimality condi-


tion in the form we use in the sequel. For all (x, u) ∈ Z, let C ∗ (x, u)
denote the polar cone to C(x, u).

Proposition 7.8 (Optimality conditions for linear inequalities). Sup-


pose, for each x ∈ X, u ↦ V (x, u) is convex and differentiable, and
U(x) is defined by (7.2). Then u is optimal for minu {V (x, u) | u ∈
U(x)} if and only if

u ∈ U(x) and − ∇u V (x, u) ∈ C ∗ (x, u) = cone{Mi′ | i ∈ I(x, u)}

Proof. The desired result follows from a direct application of Proposi-


tion 7.5 to Proposition 7.7. ■

Note that C(x, u) and C ∗ (x, u) are both cones so that each set con-
tains the origin. In particular, C ∗ (x, u) is generated by the gradients
of the constraints active at z = (x, u), and may be defined by a set of
affine inequalities: for each z ∈ Z, there exists a matrix Lz such that

C ∗ (x, u) = C ∗ (z) = {g ∈ Rm | Lz g ≤ 0}

The importance of this result for us lies in the fact that the necessary
and sufficient condition for optimality is satisfaction of two polyhedral
constraints, u ∈ U(x) and −∇u V (x, u) ∈ C ∗ (x, u). Proposition 7.8
may also be obtained by direct application of Proposition C.12 of Ap-
pendix C; C ∗ (x, u) may be recognized as NU(x) (u), the regular normal
cone to the set U(x) at u.
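At a given point, the condition of Proposition 7.8 can be checked numerically: membership of −∇u V (x, u) in the finitely generated cone cone{Mi′ | i ∈ I(x, u)} is a nonnegative least squares problem. The Python sketch below (an illustrative check, not an algorithm from the text; the function name is_optimal is ours) applies this test to the example of Section 7.2 at x = 1, u = 3/2.

    import numpy as np
    from scipy.optimize import nnls

    def is_optimal(u, grad, M, N, p, x, tol=1e-8):
        """Check u in U(x) and -grad in cone{M_i' | i in I(x, u)} (Proposition 7.8)."""
        slack = M @ u - (N @ x + p)
        if np.any(slack > tol):
            return False                        # u is not feasible
        active = np.abs(slack) <= tol           # indices with M_i u = N_i x + p_i
        if not np.any(active):
            return bool(np.linalg.norm(grad) <= tol)
        A = M[active].T                         # columns are the active gradients M_i'
        lam, resid = nnls(A, -grad)             # -grad ~ A lam with lam >= 0
        return bool(resid <= tol)

    # example of Section 7.2: V(x, u) = (1/2)(x - u)^2 + u^2/2,  M u <= N x + p
    M = np.array([[-1.0], [-1.0], [-1.0]])
    N = np.array([[0.0], [0.5], [1.0]])
    p = np.array([-1.0, -2.0, -2.0])
    x, u = np.array([1.0]), np.array([1.5])
    grad = np.array([-x[0] + 2.0 * u[0]])       # grad_u V(x, u) = -x + 2u
    print(is_optimal(u, grad, M, N, p, x))      # True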

7.3.4 Solution of the Parametric Quadratic Program

For the parametric programming problem P(x), the parametric cost


function is

V (x, u) := (1/2)x ′ Qx + u′ Sx + (1/2)u′ Ru + q′ x + r ′ u + c

and the parametric constraint set is

U(x) := {u | Mu ≤ Nx + p}

Hence, the cost gradient is

∇u V (x, u) = Ru + Sx + r

in which, because of Assumption 7.3, R is positive definite. Hence,


the necessary and sufficient condition for the optimality of u for the
parametric QP P(x) is

Mu ≤ Nx + p
− (Ru + Sx + r ) ∈ C ∗ (x, u)

in which C ∗ (x, u) = cone{Mi′ | i ∈ I(x, u)}, the cone generated by


the gradients of the active constraints, is polyhedral. We cannot use
this characterization of optimality directly to solve the parametric pro-
gramming problem since I(x, u) and, hence, C ∗ (x, u), varies with (x,
u). Given any x ∈ X, however, there exists the possibility of a region
containing x such that I 0 (x) ⊆ I 0 (w) for all w in this region. We make
use of this observation as follows. It follows from the definition of
I 0 (x) that the unique solution u0 (x) of P(x) satisfies the equation

Mi u = Ni x + pi , i ∈ I 0 (x), i.e.,
Mx0 u = Nx0 x + px0

where Mx0 , Nx0 , and px0 are defined in (7.1). Hence u0 (x) is the solution
of the equality constrained problem

V 0 (x) = min_u {V (x, u) | Mx0 u = Nx0 x + px0 }

If the active constraint set remains constant near the point x or, more
precisely, if I 0 (x) ⊆ I 0 (w) for all w in some region in Rn containing
x, then, for all w in this region, u0 (w) satisfies the equality constraint

Mx0 u = Nx0 w + px0 . This motivates us to consider the simple equality


constrained problem Px (w) defined by

Vx0 (w) = min_u {V (w, u) | Mx0 u = Nx0 w + px0 }
u0x (w) = arg min_u {V (w, u) | Mx0 u = Nx0 w + px0 }

The subscript x indicates that the equality constraints in Px (w) depend


on x. Problem Px (w) is an optimization problem with a quadratic cost
function and linear equality constraints and is, therefore, easily solved;
see the exercises at the end of this chapter. Its solution is

Vx0 (w) = (1/2)w ′ Qx w + rx′ w + sx (7.3)


u0x (w) = Kx w + kx (7.4)

for all w such that I 0 (w) = I 0 (x) where Qx ∈ Rn×n , rx ∈ Rn , sx ∈ R,


Kx ∈ Rm×n and kx ∈ Rm are easily determined. Clearly, u0x (x) =
u0 (x); but, is u0x (w), the optimal solution to Px (w), the optimal so-
lution u0 (w) to P(w) in some region containing x and, if it is, what
is the region? Our optimality condition answers this question. For all
x ∈ X, let the region Rx0 be defined by
Rx0 := {w | u0x (w) ∈ U(w), −∇u V (w, u0x (w)) ∈ C ∗ (x, u0 (x))} (7.5)

Because of the equality constraint Mx0 u = Nx0 w +px0 in problem Px (w),


it follows that I(w, u0x (w)) ⊇ I(x, u0 (x)), and that C(w, u0x (w)) ⊆
C(x, u0 (x)) and C ∗ (w, u0x (w)) ⊇ C ∗ (x, u0 (x)) for all w ∈ Rx0 . Hence
w ∈ Rx0 implies u0x (w) ∈ U(w) and −∇u V (w, u0x (w)) ∈ C ∗ (w, u0x (w))
for all w ∈ Rx0 which, by Proposition 7.8, is a necessary and sufficient
condition for u0x (w) to be optimal for P(w). In fact, I(w, u0x (w)) =
I(x, u0 (x)) so that C ∗ (w, u0x (w)) = C ∗ (x, u0 (x)) for all w in the inte-
rior of Rx0 . The obvious conclusion of this discussion is the following.
Proposition 7.9 (Solution of P(w), w ∈ Rx0 ). For any x ∈ X, u0x (w) is
optimal for P(w) for all w ∈ Rx0 .
The constraint u0x (w) ∈ U(w) may be expressed as

M(Kx w + kx ) ≤ Nw + p

which is an affine inequality in w. Similarly, since ∇u V (w, u) = Ru +


Sw + r and since C ∗ (x, u0 (x)) = {g | L0x g ≤ 0} where L0x = L(x,u0 (x)) ,
the constraint −∇u V (x, u0x (w)) ∈ C(x, u0 (x)) may be expressed as

−L0x (R(Kx w + kx ) + Sw + r ) ≤ 0

which is also an affine inequality in the variable w. Thus, for each x,


there exists a matrix Fx and vector fx such that

Rx0 = {w | Fx w ≤ fx }

so that Rx0 is polyhedral. Since u0x (x) = u0 (x), it follows that u0x (x) ∈
U(x) and −∇u V (x, u0x (x)) ∈ C ∗ (x, u0 (x)) so that x ∈ Rx0 .
Our next task is to bound the number of distinct regions Rx0 that
exist as we permit x to range over X. We note, from its definition, that
Rx0 is determined, through the constraint Mx0 u = Nx0 w + px0 in Px (w),
through u0x (·) and through C ∗ (x, u0 (x)), by I 0 (x) so that Rx0 1 ≠ Rx0 2
implies that I 0 (x1 ) ≠ I 0 (x2 ). Since the number of subsets of {1, 2, . . . ,
p} is finite, the number of distinct regions Rx0 as x ranges over X is
finite. Because each x ∈ X lies in the set Rx0 , there exists a discrete set
of points X ⊂ X such that X = ∪{Rx0 | x ∈ X}. We have proved the
following.

Proposition 7.10 (Piecewise quadratic (affine) cost (solution)).


(a) There exists a set X of a finite number of points in X such that X =
∪{Rx0 | x ∈ X} and {Rx0 | x ∈ X} is a polyhedral partition of X.

(b) The value function V 0 (·) of the parametric QP P(x) is piecewise
quadratic in X, being quadratic and equal to Vx0 (·), defined in (7.3), in
each polyhedron Rx0 , x ∈ X. Similarly, the minimizer u0 (·) is piecewise
affine in X, being affine and equal to u0x (·), defined in (7.4), in each
polyhedron Rx0 , x ∈ X.

Example 7.11: Parametric QP


Consider the example in Section 7.2. This may be expressed as

V 0 (x) = min_u {V (x, u) | Mu ≤ Nx + p}, V (x, u) := (1/2)x 2 − ux + u2

where

M = [−1; −1; −1], N = [0; 1/2; 1], p = [−1; −2; −2]

At x = 1, u0 (x) = 3/2 and I 0 (x) = {2}. The equality constrained


optimization problem Px (w) is

Vx0 (w) = min_u {(1/2)w 2 − uw + u2 | −u = (1/2)w − 2}

so that u0x (w) = 2 − w/2. Hence

Rx0 := {w | Mu0x (w) ≤ Nw + p, −∇u V (w, u0x (w)) ∈ C ∗ (x, u0 (x))}

Since M2 = −1, C ∗ (x) = cone{Mi′ | i ∈ I 0 (x)} = cone{M2′ } = {h ∈ R |


h ≤ 0}; also

∇u V (w, u0x (w)) = −w + 2u0 (w) = −w + 2(2 − w/2) = −2w + 4

so that Rx0 is defined by the following inequalities

(1/2)w − 2 ≤ −1 or w ≤ 2
(1/2)w − 2 ≤ (1/2)w − 2 or w ∈ R
(1/2)w − 2 ≤ w − 2 or w ≥ 0
2w − 4 ≤ 0 or w ≤ 2

which reduces to w ∈ [0, 2] so Rx0 = [0, 2] when x = 1; [0, 2] is the set


X2 determined in Section 7.2. □
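The computations in Example 7.11 can be automated. The Python sketch below (our own illustrative implementation, not code from the text) reads off the active set at the sampled x, solves the equality constrained problem Px (w) through its KKT system to obtain u0x (w) = Kx w + kx together with the multiplier λ(w), and assembles Rx0 from primal feasibility Mu0x (w) ≤ Nw + p and dual feasibility λ(w) ≥ 0; the dual feasibility condition replaces the polar cone condition and is equivalent to it when the active constraint gradients are linearly independent, which is assumed here.

    import numpy as np

    # parametric QP of Section 7.2 (Example 7.11):
    # V(x, u) = (1/2) x' Q x + u' S x + (1/2) u' R u,  constraint M u <= N x + p
    Q = np.array([[1.0]]); S = np.array([[-1.0]]); R = np.array([[2.0]]); r = np.zeros(1)
    M = np.array([[-1.0], [-1.0], [-1.0]])
    N = np.array([[0.0], [0.5], [1.0]])
    p = np.array([-1.0, -2.0, -2.0])

    x = np.array([1.0]); u0 = np.array([1.5])          # solution at x = 1 (Section 7.2)
    active = np.isclose(M @ u0, N @ x + p)             # I0(x) = {2}
    M0, N0, p0 = M[active], N[active], p[active]

    m, na = R.shape[0], M0.shape[0]
    KKT = np.block([[R, M0.T], [M0, np.zeros((na, na))]])
    Gw = np.vstack([-S, N0])                           # right-hand side term in w
    gc = np.concatenate([-r, p0])                      # constant right-hand side term
    sol_w = np.linalg.solve(KKT, Gw); sol_c = np.linalg.solve(KKT, gc)
    Kx, kx = sol_w[:m], sol_c[:m]                      # u0x(w) = Kx w + kx
    Lam, lam0 = sol_w[m:], sol_c[m:]                   # lambda(w) = Lam w + lam0

    F = np.vstack([M @ Kx - N, -Lam])                  # Rx0 = {w | F w <= f}
    f = np.concatenate([p - M @ kx, lam0])
    print("u0x(w) = %.2f w + %.2f" % (Kx[0, 0], kx[0]))   # 2 - w/2
    print(np.hstack([F, f.reshape(-1, 1)]))            # inequalities reduce to 0 <= w <= 2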

Example 7.12: Explicit optimal control


We return to the MPC problem presented in Example 2.5 of Chapter 2

V 0 (x) = min_u {V (x, u) | u ∈ U}

V (x, u) := (3/2)x 2 + [2x, x]u + (1/2)u′ Hu

H := [3, 1; 1, 2]

U := {u | Mu ≤ p}

where

M := [1, 0; −1, 0; 0, 1; 0, −1], p := [1; 1; 1; 1]
It follows from the solution to Example 2.5 that
" #
0 −1
u (2) =
−(1/2)

and I 0 (x) = {2}. The equality constrained optimization problem at


x = 2 is

Vx0 (w) = min_u {(3/2)w 2 + 2wu1 + wu2 + (1/2)u′ Hu | u1 = −1}

so that

u0x (w) = [−1; (1/2) − (1/2)w]
Hence u0x (2) = [−1, −1/2]′ = u0 (2) as expected. Since Mx0 = M2 =
[−1, 0], C ∗ (x, u0 (x)) = {g ∈ R2 | g1 ≤ 0, g2 = 0}. Also
" #
2w + 3u1 + u2
∇u V (w, u) =
w + u1 + 2u2

so that " #
0 (3/2)w − (5/2)
∇u V (w, ux (w)) =
0
Hence Rx0 , x = 2 is the set of w satisfying the following inequalities

(1/2) − (1/2)w ≤ 1 or w ≥ −1
(1/2) − (1/2)w ≥ −1 or w ≤ 3
−(3/2)w + (5/2) ≤ 0 or w ≥ (5/3)

which reduces to w ∈ [5/3, 3]; hence Rx0 = [5/3, 3] when x = 2 as


shown in Example 2.5. □

7.3.5 Continuity of V 0 (·) and u0 (·)

Continuity of V 0 (·) and u0 (·) follows from Theorem C.34 in Appendix C.


We present here a simpler proof based on the above analysis. We use
the fact that the parametric quadratic problem is strictly convex, i.e.,
for each x ∈ X, u ↦ V (x, u) is strictly convex and U(x) is convex,
so that the minimizer u0 (x) is unique as shown in Proposition C.8 of
Appendix C.
Let X = {xi | i ∈ I1:I } denote the set defined in Proposition 7.10(a).
For each i ∈ I1:I , let Ri := Rx0 i , Vi (·) := Vx0i (·) and ui (·) := u0xi (·). From
Proposition 7.10, u0 (x) = ui (x) for each x ∈ Ri , each i ∈ I1:I so that
u0 (·) is affine and hence continuous in the interior of each Ri , and also
continuous at any point x on the boundary of X such that x lies in a
single region Ri . Consider now a point x lying in the intersection of
several regions, x ∈ ∩i∈J Ri , where J is a subset of I1:I . Then, by Propo-
sition 7.10, ui (x) = u0 (x) for all x ∈ ∩i∈J Ri , all i ∈ J. Each ui (·) is
affine and, therefore, continuous, so that u0 (·) is continuous in ∩i∈J Ri .
Hence u0 (·) is continuous in X. Because V (·) is continuous and u0 (·)
is continuous in X, the value function V 0 (·) defined by V 0 (x) = V (x,
u0 (x)) is also continuous in X. Let S denote any bounded subset of X.

Then, since V 0 (x) = Vi (x) = (1/2)x ′ Qi x + ri′ x + si for all x ∈ Ri , all


i ∈ I1:I where Qi := Qxi , ri := rxi and si := sxi , it follows that V 0 (·) is
Lipschitz continuous in each set Ri ∩S and, hence, Lipschitz continuous
in X ∩ S. We have proved the following.

Proposition 7.13 (Continuity of cost and solution). The value function


V 0 (·) and the minimizer u0 (·) are continuous in X. Moreover, the value
function and the minimizer are Lipschitz continuous on bounded sets.

7.4 Constrained Linear Quadratic Control


We now show how parametric quadratic programming may be used to
solve the optimal receding horizon control problem when the system
is linear, the constraints polyhedral, and the cost is quadratic. The
system is described, as before, by

x + = Ax + Bu (7.6)

and the constraints are, as before

x∈X u∈U (7.7)

where X is a polyhedron containing the origin in its interior and U is


a polytope also containing the origin in its interior. There may be a
terminal constraint of the form

x(N) ∈ Xf (7.8)

where Xf is a polyhedron containing the origin in its interior. The cost


is

VN (x, u) = Σ_{i=0}^{N−1} ℓ(x(i), u(i)) + Vf (x(N)) (7.9)

in which, for all i, x(i) = φ(i; x, u), the solution of (7.6) at time i
if the initial state at time 0 is x and the control sequence is u :=
(u(0), u(1), . . . , u(N − 1)). The functions ℓ(·) and Vf (·) are quadratic

ℓ(x, u) := (1/2)x ′ Qx + (1/2)u′ Ru, Vf (x) := (1/2)x ′ Qf x (7.10)

The state and control constraints (7.7) induce, via the difference equa-
tion (7.6), an implicit constraint (x, u) ∈ Z where

Z := {(x, u) | x(i) ∈ X, u(i) ∈ U, i ∈ I0:N−1 , x(N) ∈ Xf } (7.11)



Figure 7.6: Regions Rx , x ∈ X for a second-order example; after Mayne and Raković (2003).

where, for all i, x(i) = φ(i; x, u). It is easily seen that Z is polyhedral
since, for each i, x(i) = Ai x + Mi u for some matrix Mi in Rn×Nm ; here
u is regarded as the column vector (u(0)′ , u(1)′ , . . . , u(N − 1)′ )′ .

Clearly x(i) = φ(i; x, u) is linear in (x, u). The constrained linear op-
timal control problem may now be defined by

VN0 (x) = min_u {VN (x, u) | (x, u) ∈ Z}

Using the fact that for each i, x(i) = Ai x + Mi u, it is possible to deter-


mine matrices Q ∈ Rn×n , R ∈ RNm×Nm , and S ∈ RNm×n such that

VN (x, u) = (1/2)x ′ Q x + (1/2)u′ Ru + u′ Sx (7.12)

Similarly, as shown above, there exist matrices M, N and a vector p such


that
Z = {(x, u) | Mu ≤ Nx + p} (7.13)

This is precisely the parametric problem studied in Section 7.3, so that


the solution u0 (x) to P(x) is piecewise affine on a polytopic partition
P = {Rx | x ∈ X} of X, the projection of Z ⊂ Rn × RNm onto Rn , being
affine in each of the constituent polytopes of P. The receding horizon
control law is x ↦ u0 (0; x), the first element of u0 (x). An example is
shown in Figure 7.6.
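To make the transcription from (7.6)-(7.11) to (7.12)-(7.13) concrete, the Python sketch below builds the matrices Q, R, S and M, N, p for the special case of elementwise bounds |x(i)| ≤ xmax and |u(i)| ≤ umax with Xf = X; it is an illustrative construction under these simplifying assumptions (and with the function name condense our own), not code from the text.

    import numpy as np
    from scipy.linalg import block_diag

    def condense(A, B, Q, R, Qf, Npred, xmax, umax):
        n, m = B.shape
        # stacked prediction [x(0); ...; x(N)] = Phi x + Gam u
        Phi = np.vstack([np.linalg.matrix_power(A, i) for i in range(Npred + 1)])
        Gam = np.zeros(((Npred + 1) * n, Npred * m))
        for i in range(1, Npred + 1):
            for j in range(i):
                Gam[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i-1-j) @ B
        Qbar = block_diag(*([Q] * Npred + [Qf]))
        Rbar = block_diag(*([R] * Npred))
        Qcal = Phi.T @ Qbar @ Phi                      # VN = (1/2) x' Qcal x + ...
        Rcal = Rbar + Gam.T @ Qbar @ Gam
        Scal = Gam.T @ Qbar @ Phi
        # constraints |x(i)| <= xmax (i = 0..N) and |u(i)| <= umax (i = 0..N-1)
        Ex = np.vstack([np.eye(n), -np.eye(n)])
        Eu = np.vstack([np.eye(m), -np.eye(m)])
        Ex_all = block_diag(*([Ex] * (Npred + 1)))
        Eu_all = block_diag(*([Eu] * Npred))
        Mc = np.vstack([Ex_all @ Gam, Eu_all])         # M u <= N x + p
        Nc = np.vstack([-Ex_all @ Phi, np.zeros((Eu_all.shape[0], n))])
        pc = np.concatenate([np.full((Npred + 1) * 2 * n, xmax),
                             np.full(Npred * 2 * m, umax)])
        return Qcal, Rcal, Scal, Mc, Nc, pc

    A = np.array([[1.0, 1.0], [0.0, 1.0]]); B = np.array([[0.0], [1.0]])
    Qcal, Rcal, Scal, Mc, Nc, pc = condense(A, B, np.eye(2), 0.1*np.eye(1), np.eye(2),
                                            Npred=3, xmax=2.0, umax=1.0)
    print(Rcal.shape, Mc.shape)                        # (3, 3) (22, 3)

For this small example the decision variable u has dimension Nm = 3 and Z is described by 22 affine inequalities.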

7.5 Parametric Piecewise Quadratic Programming


The dimension of the decision variable u in the constrained linear quad-
ratic control problem discussed in Section 7.4 is Nm which is large. It
may be better to employ dynamic programming by solving a sequence
of problems P1 , P2 , . . . , PN . Although P1 is a conventional parametric
QP, each problem Pi , i = 2, 3, . . . , N, has the form

Vi0 (x) = min_u {V^0_{i−1} (Ax + Bu) + ℓ(x, u) | u ∈ U, Ax + Bu ∈ Xi−1 }

in which V^0_{i−1} (·) is piecewise quadratic and Xi−1 is polyhedral. The
decision variable u in each problem Pi has dimension m. But each
problem Pi (x), x ∈ Xi , is a parametric piecewise QP rather than a
conventional parametric QP. Hence a method for solving parametric
piecewise quadratic programming problems is required if dynamic pro-
gramming is employed to obtain a parametric solution to PN . Readers
not concerned with this extension should proceed to Section 7.7.
The parametric QP P(x) is defined, as before, by

V 0 (x) = min_u {V (x, u) | (x, u) ∈ Z} (7.14)

where x ∈ X ⊂ Rn and u ∈ Rm , but now the cost function V (·) is


assumed to be continuous, strictly convex, and piecewise quadratic on
a polytopic partition P = {Zi | i ∈ I} of the set Z so that

V (z) = Vi (z) = (1/2)z′ Qi z + si′ z + ci

for all z ∈ Zi , all i ∈ I where I is an index set.3 In (7.14), the matrix Qi


and the vector si have the structure
" # " #
Qi Si′ qi
Qi = si =
Si Ri ri

so that for all i ∈ I

Vi (x, u) = (1/2)x ′ Qi x + u′ Si x + (1/2)u′ Ri u + qi′ x + ri′ u + c

For each x, the function u ↦ Vi (x, u) is quadratic and depends on x.


The constraint set Z is defined, as above, by

Z := {(x, u) | Mu ≤ Nx + p}
3 Note that in this section the subscript i denotes partition i rather than “time to go.”

Let u0 (x) denote the solution of P(x), i.e.,

u0 (x) = arg min_u {V (x, u) | (x, u) ∈ Z}

The solution u0 (x) is unique if V (·) is strictly convex in u at each x;


this is the case if each Ri is positive definite. The parametric piecewise
QP may also be expressed, as before, as

V 0 (x) = min_u {V (x, u) | u ∈ U(x)}
u0 (x) = arg min_u {V (x, u) | u ∈ U(x)}

where the parametric constraint set U(x) is defined by

U(x) := {u | (x, u) ∈ Z} = {u | Mu ≤ Nx + p}

Let X ⊂ Rn be defined by

X := {x | ∃u such that (x, u) ∈ Z} = {x | U(x) ≠ ∅}

The set X is the domain of V 0 (·) and of u0 (·) and is thus the set of
points x for which a feasible solution of P(x) exists; it is the projection
of Z, which is a set in (x, u)-space, onto x-space as shown in Figure 7.1.
We make the following assumption in the sequel.

Assumption 7.14 (Continuous, piecewise quadratic function). The func-


tion V (·) is continuous, strictly convex, and piecewise quadratic on the
polytopic partition P = {Zi | i ∈ I := I1:q } of the polytope Z in Rn × Rm ;
V (x, u) = Vi (x, u) where Vi (·) is a positive definite quadratic function
of (x, u) for all (x, u) ∈ Zi , all i ∈ I, and q is the number of constituent
polytopes in P.

The assumption of continuity places restrictions on the quadratic


functions Vi (·), i ∈ I. For example, we must have Vi (z) = Vj (z) for
all z ∈ Zi ∩ Zj . Assumption 7.14 implies that the piecewise quadratic
programming problem P(x) satisfies the hypotheses of Theorem C.34
so that the value function V 0 (·) is continuous. It follows from Assump-
tion 7.14 and Theorem C.34 that V 0 (·) is strictly convex and continuous
and that the minimizer u0 (·) is continuous. Assumption 7.14 implies
that Qi is positive definite for all i ∈ I. For each x, let the set U(x) be
defined by
U(x) := {u | (x, u) ∈ Z}
Thus U(x) is the set of admissible u at x, and P(x) may be expressed
in the form V 0 (x) = minu {V (x, u) | u ∈ U(x)}.

For each i ∈ I, we define an “artificial” problem Pi (x) as follows

Vi0 (x) := min_u {Vi (x, u) | (x, u) ∈ Zi }
u0i (x) := arg min_u {Vi (x, u) | (x, u) ∈ Zi }

The cost Vi (x, u) in the above equations may be replaced by V (x, u)


since V (x, u) = Vi (x, u) in Zi . The problem is artificial because it in-
cludes constraints (the boundaries of Zi ) that are not necessarily con-
straints of the original problem. We introduce this problem because it
helps us to understand the solution of the original problem. For each
i ∈ I1:p , let the set Ui (x) be defined as follows

Ui (x) := {u | (x, u) ∈ Zi }

Thus the set Ui (x) is the set of admissible u at x, and problem Pi (x)
may be expressed as Vi0 (x) := minu {Vi (x, u) | u ∈ Ui (x)}; the set
Ui (x) is polytopic. For each i, problem Pi (x) may be recognized as a
standard parametric QP discussed in Section 7.4. Because of the piece-
wise nature of V (·), we require another definition.

Definition 7.15 (Active polytope (polyhedron)). A polytope (polyhe-


dron) Zi in a polytopic (polyhedral) partition P = {Zi | i ∈ I} of a
polytope (polyhedron) Z is said to be active at z ∈ Z if z = (x, u) ∈ Zi .
The index set specifying the polytopes active at z ∈ Z is

S(z) := {i ∈ I | z ∈ Zi }

A polytope Zi in a polytopic partition P = {Zi | i ∈ I} of a polytope Z


is said to be active for problem P(x) if (x, u0 (x)) ∈ Zi . The index set
specifying polytopes active at (x, u0 (x)) is S 0 (x) defined by

S 0 (x) := S(x, u0 (x)) = {i ∈ I | (x, u0 (x)) ∈ Zi }

Because we know how to solve the “artificial” problems Pi (x), i ∈ I


that are parametric quadratic programs, it is natural to ask if we can
recover the solution of the original problem P(x) from the solutions
to these simpler problems. This question is answered by the following
proposition.

Proposition 7.16 (Solving P using Pi ). For any x ∈ X, u is optimal for


P(x) if and only if u is optimal for Pi (x) for all i ∈ S(x, u).

Proof. (i) Suppose u is optimal for P(x) but, contrary to what we wish
to prove, there exists an i ∈ S(x, u) = S 0 (x) such that u is not optimal
for Pi (x). Hence there exists a v ∈ Rm such that (x, v) ∈ Zi and
V (x, v) = Vi (x, v) < Vi (x, u) = V (x, u) = V 0 (x), a contradiction
of the optimality of u for P(x). (ii) Suppose u is optimal for Pi (x)
for all i ∈ S(x, u) but, contrary to what we wish to prove, u is not
optimal for P(x). Hence V 0 (x) = V (x, u0 (x)) < V (x, u). If u0 (x) ∈
Z(x,u) := ∪i∈S(x,u) Zi , we have a contradiction of the optimality of u
in Z(x,u) . Assume then that u0 (x) ∈ Zj , j ∉ S(x, u); for simplicity,
assume further that Zj is adjacent to Z(x,u) . Then, there exists a λ ∈
(0, 1] such that uλ := u + λ(u0 (x) − u) ∈ Z(x,u) ; if not, j ∈ S(x,
u), a contradiction. Since V (·) is strictly convex, V (x, uλ ) < V (x, u),
which contradicts the optimality of u in Z(x,u) . The case when Zj is not
adjacent to Z(x,u) may be treated similarly. ■

To obtain a parametric solution, we proceed as before. We select a


point x ∈ X and obtain the solution u0 (x) to P(x) using a standard
algorithm for convex programs. The solution u0 (x) satisfies an equal-
ity constraint Ex u = Fx x + gx , which we employ to define, for any
w ∈ X near x an easily solved equality constrained optimization prob-
lem Px (w) that is derived from the problems Pi (x), i ∈ S 0 (x). Finally,
we show that the solution to this simple problem is also a solution to
the original problem P(w) at all w in a polytope Rx ⊂ X in which x
lies.
For each i ∈ I, Zi is defined by

Zi := {(x, u) | M i u ≤ N i x + p i }

Let Mji , Nji and pji denote, respectively, the jth row of M i , N i and p i ,
and let Ii (x, u) and Ii0 (x), defined by

Ii (x, u) := {j | Mji u = Nji x + pji }, Ii0 (x) := Ii (x, u0i (x))

denote, respectively, the active constraint set at (x, u) ∈ Zi and the ac-
tive constraint set for Pi (x). Because we now use subscript i to specify
Zi , we change our notation slightly and now let Ci (x, u) denote the
cone of first-order feasible variations for Pi (x) at u ∈ Ui (x), i.e.,

Ci (x, u) := {h ∈ Rm | Mji h ≤ 0 ∀j ∈ Ii (x, u)}

Similarly, we define the polar cone Ci∗ (x, u) of the cone Ci (x, u) at

h = 0 by

Ci∗ (x, u) := {v ∈ Rm | v ′ h ≤ 0 ∀h ∈ Ci (x, u)}
= { Σ_{j∈Ii (x,u)} (Mji )′ λj | λj ≥ 0, j ∈ Ii (x, u) }

As shown in Proposition 7.7, a necessary and sufficient condition for


the optimality of u for problem Pi (x) is

−∇u Vi (x, u) ∈ Ci∗ (x, u), u ∈ Ui (x) (7.15)

If u lies in the interior of Ui (x) so that Ii0 (x) = ∅, condition (7.15)


reduces to ∇u Vi (x, u) = 0. For any x ∈ X, the solution u0 (x) of the
piecewise parametric program P(x) satisfies

Mji u = Nji x + pji , ∀j ∈ Ii0 (x), ∀i ∈ S 0 (x) (7.16)

To simplify our notation, we rewrite the equality constraint (7.16) as

Ex u = Fx x + gx

where the subscript x denotes the fact that the constraints are precisely
those constraints that are active for the problems Pi (x), i ∈ S 0 (x). The
fact that u0 (x) satisfies these constraints and is, therefore, the unique
solution of the optimization problem

V 0 (x) = min_u {V (x, u) | Ex u = Fx x + gx }

motivates us to define the equality constrained problem Px (w) for w ∈


X near x by

Vx0 (w) = min_u {Vx (w, u) | Ex u = Fx w + gx }

where Vx (w, u) := Vi (w, u) for all i ∈ S 0 (x) and is, therefore, a posi-
tive definite quadratic function of (x, u). The notation Vx0 (w) denotes
the fact that the parameter in the parametric problem Px (w) is now
w but the data for the problem, namely (Ex , Fx , gx ), is derived from
the solution u0 (x) of P(x) and is, therefore, x-dependent. Problem
Px (w) is a simple equality constrained problem in which the cost Vx (·)
is quadratic and the constraints Ex u = Fx w + gx are linear. Let Vx0 (w)
denote the value of Px (w) and u0x (w) its solution. Then

Vx0 (w) = (1/2)w ′ Qx w + rx′ w + sx


u0x (w) = Kx w + kx (7.17)

where Qx , rx , sx , Kx and kx are easily determined. It is easily seen that


u0x (x) = u0 (x) so that u0x (x) is optimal for P(x). Our hope is that
u0x (w) is optimal for P(w) for all w in some neighborhood Rx of x.
We now show this is the case.
Proposition 7.17 (Optimality of u0x (w) in Rx ). Let x be an arbitrary
point in X. Then
(a) u0 (w) = u0x (w) and V 0 (w) = Vx0 (w) for all w in the set Rx defined
by
Rx := {w ∈ Rn | u0x (w) ∈ Ui (w) ∀i ∈ S 0 (x), −∇u Vi (w, u0x (w)) ∈ Ci∗ (x, u0 (x)) ∀i ∈ S 0 (x)}

(b) Rx is a polytope

(c) x ∈ Rx

Proof.
(a) Because of the equality constraint (7.16) it follows that Ii (w, u0x (w)) ⊇
Ii (x, u0 (x)) and that S(w, u0x (w)) ⊇ S(x, u0 (x)) for all i ∈ S(x,
u0 (x)) = S 0 (x), all w ∈ Rx . Hence Ci (w, u0x (w)) ⊆ Ci (x, u0 (x)),
which implies Ci∗ (w, u0x (w)) ⊇ Ci∗ (x, u0 (x)) for all i ∈ S(x, u0 (x)) ⊆
S(w, u0x (w)). It follows from the definition of Rx that u0x (w) ∈ Ui (w)
and that −∇u Vi (w, u0x (w)) ∈ Ci∗ (w, u0x (w)) for all i ∈ S(w, u0x (w)).
Hence u = u0x (w) satisfies the necessary and sufficient conditions for optimality for
Pi (w) for all i ∈ S(w, u), all w ∈ Rx and, by Proposition 7.16, neces-
sary and sufficient conditions of optimality for P(w) for all w ∈ Rx .
Hence u0x (w) = u0 (w) and Vx0 (w) = V 0 (w) for all w ∈ Rx .

(b) That Rx is a polytope follows from the facts that the functions
w ↦ u0x (w) and w ↦ ∇u Vi (w, u0x (w)) are affine, the sets Zi are poly-
topic and the sets Ci∗ (x, u0 (x)) are polyhedral; hence (w, u0x (w)) ∈ Zi
is a polytopic constraint and −∇u Vi (w, u0x (w)) ∈ Ci∗ (x, u0 (x)) a
polyhedral constraint on w.

(c) That x ∈ Rx follows from Proposition 7.16 and the fact that u0x (x) =
u0 (x). ■

Reasoning as in the proof of Proposition 7.10, we obtain the follow-


ing.
Proposition 7.18 (Piecewise quadratic (affine) solution). There exists a
finite set of points X in X such that {Rx | x ∈ X} is a polytopic par-
tition of X. The value function V 0 (·) for P(x) is strictly convex and

piecewise quadratic and the minimizer u0 (·) is piecewise affine in X


being equal, respectively, to the quadratic function Vx0 (·) and the affine
function u0x (·) in each region Rx , x ∈ X.

7.6 DP Solution of the Constrained LQ Control Problem


A disadvantage in the procedure described in Section 7.4 for determin-
ing the piecewise affine receding horizon control law is the dimension
Nm of the decision variable u. It seems natural to inquire whether
or not dynamic programming (DP), which replaces a multistage deci-
sion problem by a sequence of relatively simple single-stage problems,
provides a simpler solution. We answer this question by showing how
DP may be used to solve the constrained linear quadratic (LQ) problem
discussed in Section 7.4. For all j ∈ I1:N , let Vj0 (·), the optimal value
function at time-to-go j, be defined by

Vj0 (x) := min_u {Vj (x, u) | (x, u) ∈ Zj }

Vj (x, u) := Σ_{i=0}^{j−1} ℓ(x(i), u(i)) + Vf (x(j))

Zj := {(x, u) | x(i) ∈ X, u(i) ∈ U, i ∈ I0:j−1 , x(j) ∈ Xf }

with x(i) := φ(i; x, u); Vj0 (·) is the value function for Pj (x). As shown
in Chapter 2, the constrained DP recursion is
V^0_{j+1} (x) = min_u {ℓ(x, u) + Vj0 (f (x, u)) | u ∈ U, f (x, u) ∈ Xj } (7.18)

Xj+1 = {x ∈ X | ∃ u ∈ U such that f (x, u) ∈ Xj } (7.19)

where f (x, u) := Ax + Bu with boundary condition

V00 (·) = Vf (·), X0 = Xf

The minimizer of (7.18) is κj+1 (x). In the equations, the subscript j


denotes time to go, so that current time i = N − j. For each j, Xj is
the domain of the value function Vj0 (·) and of the control law κj (·),
and is the set of states that can be steered to the terminal set Xf in
j steps or less by an admissible control that satisfies the state and
control constraints. The time-invariant receding horizon control law
for horizon j is κj (·) whereas the optimal policy for problem Pj (x) is
{κj (·), κj−1 (·), . . . , κ1 (·)}. The data of the problem are identical to the
data in Section 7.4.

We know from Section 7.4 that Vj0 (·) is continuous, strictly convex
and piecewise quadratic, and that κj (·) is continuous and piecewise
affine on a polytopic partition PXj of Xj . Hence the function (x, u) ↦
V (x, u) := ℓ(x, u) + Vj0 (Ax + Bu) is continuous, strictly convex and
piecewise quadratic on a polytopic partition PZj+1 of the polytope Zj+1
defined by

Zj+1 := {(x, u) | x ∈ X, u ∈ U, Ax + Bu ∈ Xj }

The polytopic partition PZj+1 of Zj+1 may be computed as follows: if


X is a constituent polytope of Xj , then, from (7.19), the corresponding
constituent polytope of PZj+1 is the polytope Z defined by

Z := {z = (x, u) | x ∈ X, u ∈ U, Ax + Bu ∈ X}

Thus Z is defined by a set of linear inequalities; also ℓ(x, u) + Vj0 (f (x,


u)) is quadratic on Z. Thus the techniques of Section 7.5 can be em-
ployed for its solution, yielding the piecewise quadratic value func-
tion V^0_{j+1} (·), the piecewise affine control law κj+1 (·), and the polytopic
partition PXj+1 on which V^0_{j+1} (·) and κj+1 (·) are defined. Each prob-
lem (7.18) is much simpler than the problem considered in Section 7.4
since m, the dimension of u, is much less than Nm, the dimension of
the control sequence u. Thus, the DP solution is preferable to the direct method described
in Section 7.4.
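The recursion (7.18)-(7.19) can also be explored by brute force on a grid. The Python sketch below does this for a scalar system with simple bounds; it is only an approximate, gridded check of the recursion's structure (the system data are chosen arbitrarily), not the exact parametric solution described in this section.

    import numpy as np

    A, B = 1.0, 1.0
    Q, R, Qf = 1.0, 1.0, 1.0
    xmax, umax = 2.0, 1.0                       # X = [-2, 2], U = [-1, 1], Xf = X
    xs = np.linspace(-xmax, xmax, 401)
    us = np.linspace(-umax, umax, 401)
    stage = lambda x, u: 0.5 * Q * x**2 + 0.5 * R * u**2

    V = 0.5 * Qf * xs**2                        # V0^0 = Vf on X0 = Xf
    for j in range(3):                          # three backward steps of (7.18)-(7.19)
        Vnew = np.full_like(xs, np.inf)
        for k, x in enumerate(xs):
            xplus = A * x + B * us              # successors for every gridded u
            ok = np.abs(xplus) <= xmax          # require f(x, u) in Xj (= X here)
            if np.any(ok):
                Vsucc = np.interp(xplus[ok], xs, V)
                Vnew[k] = np.min(stage(x, us[ok]) + Vsucc)
        V = Vnew
    print(V[200], V[300])                       # approximate V3^0 at x = 0 and x = 1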

7.7 Parametric Linear Programming


7.7.1 Preliminaries

The parametric linear program P(x) is

V 0 (x) = min_u {V (x, u) | (x, u) ∈ Z}

where x ∈ X ⊂ Rn and u ∈ Rm , the cost function V (·) is defined by

V (x, u) = q′ x + r ′ u

and the constraint set Z is defined by

Z := {(x, u) | Mu ≤ Nx + p}

Let u0 (x) denote the solution of P(x), i.e.,

u0 (x) = arg min_u {V (x, u) | (x, u) ∈ Z}

The solution u0 (x) may be set valued. The parametric linear program
(LP) may also be expressed as

V 0 (x) = min_u {V (x, u) | u ∈ U(x)}

where, as before, the parametric constraint set U(x) is defined by

U(x) := {u | (x, u) ∈ Z} = {u | Mu ≤ Nx + p}

Also, as before, the domain of V 0 (·) and u0 (·), i.e., the set of points x
for which a feasible solution of P(x) exists, is the set X defined by

X := {x | ∃u such that (x, u) ∈ Z} = {x | U(x) ≠ ∅}

The set X is the projection of Z (which is a set in (x, u)-space) onto


x-space; see Figure 7.1. We assume in the sequel that the problem is
well posed, i.e., for each x ∈ X, V 0 (x) > −∞. This excludes problems
like V 0 (x) = inf u {x + u | −x ≤ 1, x ≤ 1} for which V 0 (x) = −∞ for all
x ∈ X = [−1, 1]. Let I1:p denote, as usual, the index set {1, 2, . . . , p}.
For all (x, u) ∈ Z, let I(x, u) denote the set of active constraints at
(x, u), i.e.,
I(x, u) := {i ∈ I1:p | Mi u = Ni x + pi }
where Ai denotes the ith row of any matrix (or vector) A. Similarly, for
any matrix A and any index set I, AI denotes the matrix with rows Ai ,
i ∈ I. If, for any x ∈ X, u0 (x) is unique, the set I 0 (x) of constraints
active at (x, u0 (x)) is defined by

I 0 (x) := I(x, u0 (x))

When u0 (x) is unique, it is a vertex (a face of dimension zero) of


the polyhedron U(x) and is the unique solution of

Mx0 u = Nx0 x + px0

where
Mx0 := MI 0 (x) , Nx0 := NI 0 (x) , px0 := pI 0 (x)
In this case, the matrix Mx0 has rank m.
Any face F of U(x) with dimension d ∈ {1, 2, . . . , m} satisfies Mi u =
Ni x + pi for all i ∈ IF , all u ∈ F for some index set IF ⊆ I1:p . The matrix
MIF with rows Mi , i ∈ IF , has rank m − d, and the face F is defined by

F := {u | Mi u = Ni x + pi , i ∈ IF } ∩ U(x)

When u0 (x) is not unique, it is a face of dimension d ≥ 1 and the set


I 0 (x) of active constraints is defined by

I 0 (x) := {i | Mi u = Ni x+pi ∀u ∈ u0 (x)} = {i | i ∈ I(x, u) ∀u ∈ u0 (x)}

The set {u | Mi u = Ni x + pi , i ∈ I 0 (x)} is a hyperplane in which


u0 (x) lies. See Figure 7.7 where u0 (x1 ) is unique, a vertex of U(x1 ),
and I 0 (x1 ) = {2, 3}. If, in Figure 7.7, r = −e1 , then u0 (x1 ) = F2 (x1 ),
a face of dimension 1; u0 (x1 ) is, therefore, set valued. Since u ∈ Rm
where m = 2, u0 (x1 ) is a facet, i.e., a face of dimension m − 1 = 1.
Thus u0 (x1 ) is a set defined by u0 (x1 ) = {u | M1 u ≤ N1 x1 + p1 ,
M2 u = N2 x1 + p2 , M3 u ≤ N3 x1 + p3 }. At each z = (x, u) ∈ Z, i.e., for
each (x, u) such that x ∈ X and u ∈ U(x), the cone C(z) = C(x, u)
of first-order feasible variations is defined, as before, by

C(z) := {h ∈ Rm | Mi h ≤ 0, i ∈ I(z)} = {h ∈ Rm | MI(z) h ≤ 0}

If I(z) = I(x, u) = ∅ (no constraints are active), C(z) = Rm (all varia-


tions are feasible).
Since u ↦ V (x, u) is convex and differentiable, and U(x) is poly-
hedral for all x, the parametric LP P(x) satisfies the assumptions of
Proposition 7.8. Hence, repeating Proposition 7.8 for convenience, we
have

Proposition 7.19 (Optimality conditions for parametric LP). A neces-


sary and sufficient condition for u to be a minimizer for the parametric
LP P(x) is
u ∈ U(x) and − ∇u V (x, u) ∈ C ∗ (x, u)
where ∇u V (x, u) = r and C ∗ (x, u) is the polar cone of C(x, u).

An important difference between this result and that for the para-
metric QP is that ∇u V (x, u) = r and, therefore, does not vary with x
or u. We now use this result to show that both V 0 (·) and u0 (·) are
piecewise affine. We consider the simple case when u0 (x) is unique
for all x ∈ X.

7.7.2 Minimizer u0 (x) Is Unique for all x ∈ X

Before proceeding to obtain the solution to a parametric LP when the


minimizer u0 (x) is unique for each x ∈ X, we look first at the simple
example illustrated in Figure 7.7, which shows the constraint set U(x)
for various values of the parameter x in the interval [x1 , x3 ]. The set
Figure 7.7: Solution to a parametric LP.

U(x1 ) has six faces: F1 (x1 ), F2 (x1 ), F3 (x1 ), F4 (x1 ), F5 (x1 ), and F6 (x1 ).
Face F1 (x) lies in the hyperplane H1 (x) that varies linearly with x;
each face Fi (x), i = 2, . . . , 6, lies in the hyperplane Hi that does not
vary with x. All the faces vary with x as shown so that U(x2 ) has four
faces: F1 (x2 ), F3 (x2 ), F4 (x2 ), and F5 (x2 ); and U(x3 ) has three faces:
F1 (x3 ), F4 (x3 ), and F5 (x3 ). The face F1 (x) is shown for three values
of x: x = x1 (the bold line), and x = x2 and x = x3 (dotted lines).
It is apparent that for x ∈ [x1 , x2 ], u0 (x) = u2,3 in which u2,3 is the
intersection of H2 and H3 , and u0 (x3 ) = u3,4 , in which u3,4 is the
intersection of H3 and H4 . It can also be seen that u0 (x) is unique for
all x ∈ X.
We now return to the general case. Suppose, for some x ∈ X, u0 (x)
is the unique solution of P(x); u0 (x) is the unique solution of

Mx0 u = Nx0 x + px0

It follows that u0 (x) is the trivial solution of the simple equality con-
strained problem defined by

V 0 (x) = min_u {V (x, u) | Mx0 u = Nx0 x + px0 } (7.20)

The solution u0 (x) of this equality constrained problem is trivial be-


cause it is determined entirely by the equality constraints; the cost

plays no part.
The optimization problem (7.20) motivates us, as in parametric quad-
ratic programming, to consider, for any parameter w “close” to x, the
simpler equality constrained problem Px (w) defined by

Vx0 (w) = min_u {V (w, u) | Mx0 u = Nx0 w + px0 }
u0x (w) = arg min_u {V (w, u) | Mx0 u = Nx0 w + px0 }

Let u0x (w) denote the solution of Px (w). Because, for each x ∈ X,
the matrix Mx0 has full rank m, there exists an index set Ix such that
MIx ∈ Rm×m is invertible. Hence, for each w, u0x (w) is the unique
solution of
MIx u = NIx w + pIx
so that for all x ∈ X, all w ∈ Rm

u0x (w) = Kx w + kx (7.21)

where Kx := (MIx )−1 NIx and kx := (MIx )−1 pIx . In particular, u0 (x) =
u0x (x) = Kx x + kx . Since Vx0 (w) = V (w, u0x (w)) = q′ w + r ′ u0x (w), it
follows that
Vx0 (w) = (q′ + r ′ Kx )w + r ′ kx
for all x ∈ X, all w ∈ Rm . Both Vx0 (·) and u0x (·) are affine.
It follows from Proposition 7.19 that −r ∈ C ∗ (x, u0 (x)) = cone{Mi′ |
i ∈ I 0 (x) = I(x, u0 (x))} = cone{Mi′ | i ∈ Ix }. Since Px (w) satisfies
the conditions of Proposition 7.8, we may proceed as in Section 7.3.4
and define, for each x ∈ X, the set Rx0 as in (7.5)
Rx0 := {w ∈ Rn | u0x (w) ∈ U(w), −∇u V (w, u0x (w)) ∈ C ∗ (x, u0 (x))}

It then follows, as shown in Proposition 7.9, that for any x ∈ X, u0x (w)
is optimal for P(w) for all w ∈ Rx0 . Because P(w) is a parametric LP,
however, rather than a parametric QP, it is possible to simplify the def-
inition of Rx0 . We note that ∇u V (w, u0x (w)) = r for all x ∈ X, all
w ∈ Rm . Also, it follows from Proposition 7.8, since u0 (x) is optimal
for P(x), that −∇u V (x, u0 (x)) = −r ∈ C ∗ (x) so that the second con-
dition in the definition above for Rx0 is automatically satisfied. Hence
we may simplify our definition for Rx0 ; for the parametric LP, Rx0 may
be defined by
Rx0 := {w ∈ Rn | u0x (w) ∈ U(w)} (7.22)

Because u0x (·) is affine, it follows from the definition of U(w) that
Rx0 is polyhedral. The next result follows from the discussion in Sec-
tion 7.3.4.

Proposition 7.20 (Solution of P). For any x ∈ X, u0x (w) is optimal for
P(w) for all w in the set Rx0 defined in (7.22).

Finally, the next result characterizes the solution of the parametric


LP P(x) when the minimizer is unique.

Proposition 7.21 (Piecewise affine cost and solution).


(a) There exists a finite set of points X in X such that {Rx0 | x ∈ X} is a
polyhedral partition of X.

(b) The value function V 0 (·) for P(x) and the minimizer u0 (·) are piece-
wise affine in X, being equal, respectively, to the affine functions Vx0 (·)
and u0x (·) in each region Rx0 , x ∈ X.

(c) The value function V 0 (·) and the minimizer u0 (·) are continuous in
X.

Proof. The proof of parts (a) and (b) follows, apart from minor changes,
the proof of Proposition 7.10. The proof of part (c) uses the fact that
u0 (x) is unique, by assumption, for all x ∈ X and is similar to the
proof of Proposition 7.13. ■
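The construction can be sketched numerically as follows (Python; illustrative only, and assuming that the minimizer at the sampled x is unique with m linearly independent active constraints, so that the selected submatrix of M is invertible). For the parametric LP of Figure 7.2 it recovers u0x (w) = w and the region Rx0 = {w | w ≥ 0}.

    import numpy as np
    from scipy.optimize import linprog

    # parametric LP of Figure 7.2: V(x, u) = x + u, U(x) = {u | u >= -x, u >= x}
    r = np.array([1.0])
    M = np.array([[-1.0], [-1.0]]); N = np.array([[1.0], [-1.0]]); p = np.array([0.0, 0.0])

    def critical_region(x, tol=1e-8):
        res = linprog(r, A_ub=M, b_ub=N @ x + p,
                      bounds=[(None, None)] * r.size, method="highs")
        u0 = res.x
        active = np.flatnonzero(np.abs(M @ u0 - (N @ x + p)) <= tol)
        Ix = active[: r.size]              # assume m linearly independent active rows
        Kx = np.linalg.solve(M[Ix], N[Ix])
        kx = np.linalg.solve(M[Ix], p[Ix])
        F = M @ Kx - N                     # Rx0 = {w | F w <= f}
        f = p - M @ kx
        return u0, Kx, kx, F, f

    u0, Kx, kx, F, f = critical_region(np.array([1.0]))
    print(u0, Kx.ravel(), kx)              # u0(1) = [1.],  u0x(w) = 1.0 w + 0.0
    print(np.hstack([F, f.reshape(-1, 1)]))   # rows: -2 w <= 0 and 0 w <= 0, i.e., w >= 0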

7.8 Constrained Linear Control


The previous results on parametric linear programming may be applied
to obtain the optimal receding horizon control law when the system is
linear, the constraints polyhedral, and the cost linear as is done in a
similar fashion in Section 7.4 where the cost is quadratic. The optimal
control problem is therefore defined as in Section 7.4, except that the
stage cost ℓ(·) and the terminal cost Vf (·) are now defined by

ℓ(x, u) := q′ x + r ′ u Vf (x) := qf′ x

As in Section 7.4, the optimal control problem PN (x) may be expressed


as
VN0 (x) = min_u {VN (x, u) | Mu ≤ Nx + p}

where, now
VN (x, u) = q′ x + r′ u

Hence the problem has the same form as that discussed in Section 7.7
and may be solved as shown there.
It is possible, using a simple transcription, to use the solution of
PN (x) to solve the optimal control problem when the stage cost and
terminal cost are defined by

ℓ(x, u) := |Qx|p + |Ru|p , Vf (x) := |Qf x|p

where |·|p denotes the p-norm and p is either 1 or ∞.
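The transcription rests on the epigraph reformulation of the 1- and ∞-norms; for example, |v|∞ is the smallest t satisfying v ≤ t1 and −v ≤ t1, a set of affine inequalities in (v, t). The short Python sketch below illustrates this for a single vector; it is not the full optimal control transcription, which introduces one such auxiliary variable per stage cost term (see also Exercise 7.6).

    import numpy as np
    from scipy.optimize import linprog

    # |v|_inf = min{ t | v <= t 1, -v <= t 1 }, a linear program in t
    v = np.array([0.3, -1.7, 0.8])
    n = v.size
    A_ub = np.vstack([-np.ones((n, 1)), -np.ones((n, 1))])   # rows encode -t <= -v_i and -t <= v_i
    b_ub = np.concatenate([-v, v])
    res = linprog(np.array([1.0]), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)], method="highs")
    print(res.x[0], np.linalg.norm(v, np.inf))                # both equal 1.7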

7.9 Computation
Our main purpose above was to establish the structure of the solution
of parametric linear or quadratic programs and, hence, of the solutions of constrained
linear optimal control problems when the cost is quadratic or linear.
We have not presented algorithms for solving these problems, although
there is now a considerable literature on this topic. One of the ear-
liest algorithms (Serón, De Doná, and Goodwin, 2000) is enumeration
based: checking every active set to determine if it defines a non-empty
region in which the optimal control is affine. There has recently been
a return to this approach because of its effectiveness in dealing with
systems with relatively high state dimension but a low number of con-
straints (Feller, Johansen, and Olaru, 2013). The enumeration based
procedures can be extended to solve mixed-integer problems. While
the early algorithms for parametric linear and quadratic programming
have exponential complexity, most later algorithms are based on a lin-
ear complementarity formulation and execute in polynomial time in
the number of regions; they also use symbolic perturbation to select a
unique and continuous solution when one exists (Columbano, Fukuda,
and Jones, 2009). Some research has been devoted to obtaining ap-
proximate solutions with lower complexity but guaranteed properties
such as stability (Borrelli, Bemporad, and Morari, 2017, Chapter 13).
Toolboxes for solving parametric linear and quadratic programming
problems include the Multi-Parametric Toolbox for MATLAB, MPT3,
described in (Herceg, Kvasnica, Jones, and Morari, 2013).
A feature of parametric problems is that state dimension is not a
reliable indicator of complexity. There exist problems with two states
that require over 10^5 regions and problems with 80 states that require
only hundreds of regions. While problems with state dimension less
than, say, 4 can be expected to have reasonable complexity, higher di-
mension problems may or may not have manageable complexity.

7.10 Notes
Early work on parametric programming, e.g., (Dantzig, Folkman, and
Shapiro, 1967) and (Bank, Guddat, Klatte, Kummer, and Tanner, 1983),
was concerned with the sensitivity of optimal solutions to parameter
variations. Solutions to the parametric linear programming problem
were obtained relatively early (Gass and Saaty, 1955) and (Gal and Ne-
doma, 1972). Solutions to parametric QPs were obtained in (Serón et al.,
2000) and (Bemporad, Morari, Dua, and Pistikopoulos, 2002) and ap-
plied to the determination of optimal control laws for linear systems
with polyhedral constraints. Since then a large number of papers on
this topic have appeared, many of which are reviewed in (Alessio and
Bemporad, 2009). Most papers employ the Kuhn-Tucker conditions of
optimality in deriving the regions Rx , x ∈ X. Use of the polar cone con-
dition was advocated in (Mayne and Raković, 2002) in order to focus on
the geometric properties of the parametric optimization problem and
avoid degeneracy problems. Section 7.5, on parametric piecewise quad-
ratic programming, is based on (Mayne, Raković, and Kerrigan, 2007).
The example in Section 7.4 was first computed by Raković (Mayne and
Raković, 2003). That results from parametric linear and quadratic pro-
gramming can be employed, instead of maximum theorems, to estab-
lish continuity of u0 (·) and, hence, of V 0 (·), was pointed out by Bem-
porad et al. (2002) and Borrelli (2003, p. 37).
Much research has been devoted to obtaining reliable algorithms;
see the survey papers (Alessio and Bemporad, 2009) and (Jones, Barić,
and Morari, 2007) and the references therein. Jones (2017, Chapter
13) provides a useful review of approximate explicit control laws of
specified complexity that nevertheless guarantee stability and recursive
feasibility.

7.11 Exercises

Exercise 7.1: QP with equality constraints


Obtain the solution u0 and the value V 0 of the equality constrained optimization
problem V 0 = minu {V (u) | h(u) = 0} where V (u) = (1/2)u′ Ru + r ′ u + c and
h(u) := Mu − p.

Exercise 7.2: Parametric QP with equality constraints


Show that the solution u0 (x) and the value V 0 (x) of the parametric optimization prob-
lem V 0 (x) = minu {V (x, u) | h(x, u) = 0} where V (x, u) := (1/2)x ′ Qx + u′ Sx +
(1/2)u′ Ru + q′ x + r ′ u + c and h(x, u) := Mu − Nx − p have the form u0 (x) = Kx + k
and V 0 (x) = (1/2)x ′ Q̄x + q̄′ x + s. Determine Q̄, q̄, s, K, and k.

Exercise 7.3: State and input trajectories in constrained LQ problem


For the constrained linear quadratic problem defined in Section 7.4, show that u :=
(u(0), u(1), . . . , u(N − 1)) and x := (x(0), x(1), . . . , x(N)), where x(0) = x and x(i) =
φ(i; x, u), i = 0, 1, . . . , N, satisfy
x = Fx + Gu
and determine the matrices F and G; in this equation u and x are column vectors.
Hence show that VN (x, u) and Z, defined respectively in (7.9) and (7.11), satisfy (7.12)
and (7.13), and determine Q , R, M, N, and p.

Exercise 7.4: The parametric LP with unique minimizer


For the example of Figure 7.7, determine u0 (x), V 0 (x), I 0 (x), and C ∗ (x) for all x in
the interval [x1 , x3 ]. Show that −r lies in C ∗ (x) for all x in [x1 , x3 ].

Exercise 7.5: Cost function and constraints in constrained LQ control prob-


lem
For the constrained linear control problem considered in Section 7.8, determine the
matrices M, N, and p that define the constraint set Z, and the vectors q and r that
define the cost VN (·).

Exercise 7.6: Cost function in constrained linear control problem


Show that |x|p , p = 1 and p = ∞, may be expressed as max j {sj′ x | j ∈ J} and
determine si , i ∈ I for the two cases p = 1 and p = ∞. Hence show that the optimal
control problem in Section 7.8 may be expressed as
VN0 (x) = min_v {VN (x, v) | Mv ≤ Nx + p}

where, now, v is a column vector whose components are u(0), u(1), . . . , u(N−1), ℓx (0),
ℓx (1), . . . , ℓx (N), ℓu (0), ℓu (1), . . . , ℓu (N − 1) and f ; the cost VN (x, v) is now defined
by
VN (x, v) = ∑_{i=0}^{N−1} (ℓx (i) + ℓu (i)) + f

Finally, Mv ≤ Nx + p now specifies the constraints u(i) ∈ U and x(i) ∈ X, |Ru(i)|p ≤


ℓu (i), |Qx(i)|p ≤ ℓx (i), i = 0, 1, . . . , N − 1, x(N) ∈ Xf , and Qf x(N) ≤ f . As before,
x+ = Fx + Gu.

Exercise 7.7: Is QP constraint qualification relevant to MPC?


Continuity properties of the MPC control law are often used to establish robustness
properties of MPC such as robust asymptotic stability. In early work on continuity
properties of linear model MPC, Scokaert, Rawlings, and Meadows (1997) used results
on continuity of QPs with respect to parameters to establish MPC stability under per-
turbations. For example, Hager (1979) considered the following QP
min_u  (1/2)u′ Hu + h′ u + c
subject to
Du ≤ d
and established that the QP solution u0 and cost V 0 are Lipschitz continuous in the
data of the QP, namely the parameters H, h, D, d. To establish this result Hager (1979)
made the following assumptions.
• The solution is unique for all H, h, D, d in a chosen set of interest.
• The rows of D corresponding to the constraints active at the solution are linearly
independent. The assumption of linear independence of active constraints is a
form of constraint qualification.
(a) First we show that some form of constraint qualification is required to establish
continuity of the QP solution with respect to matrix D. Consider the following
QP example that does not satisfy Hager’s constraint qualification assumption.
" # " # " # " #
1 0 1 1 1 −1
H= D= d= h= c=1
0 1 −1 −1 −1 −1
Find the solution u0 for this problem.
Next perturb the D matrix to
" #
1 1
D=
−1 + ϵ −1
in which ϵ > 0 is a small perturbation. Find the solution to the perturbed prob-
lem. Are V 0 and u0 continuous in parameter D for this QP? Draw a sketch of
the feasible region and cost contours for the original and perturbed problems.
What happens to the feasible set when D is perturbed?

(b) Next consider MPC control of the following system with state inequality con-
straint and no input constraints
" # " # " #
−1/4 1 1 1 1
A= B= x(k) ≤ k ∈ I0:N
−1 1/2 −1 −1 1
Using a horizon N = 1, eliminate the state x(1) and write out the MPC QP for
the input u(0) in the form given above for Q = R = I and zero terminal penalty.
Find an initial condition x0 such that the MPC constraint matrix D and vector d
are identical to those given in the previous part. Is this x0 ∈ XN ?
Are the rows of the matrix of active constraints linearly independent in this MPC
QP on the set XN ? Are the MPC control law κN (x) and optimal value function
VN0 (x) Lipschitz continuous on the set XN for this system? Explain the reason
if these two answers differ.
Figure 7.8: Solution times for explicit and implicit MPC for N = 20. Plot shows kernel density estimate for 10,000 samples using a Gaussian kernel (σ = 1 ms).

Exercise 7.8: Explicit versus implicit


Using the system from Figure 7.6, find the explicit control law for horizon N = 20
(you should find 1719 regions). Implement a simple lookup function for the explicit
control law. Randomly sample a large number of points (≥ 1000) from XN and compare
execution times for explicit MPC (via the lookup function) and implicit MPC (via solving
a QP). Which method is better? Example results are shown in Figure 7.8, although your
times may vary significantly. How could you improve your lookup function?

Exercise 7.9: Cascaded MPC and PID


Consider a Smart Tank™ of liquid whose height h evolves according to

τ dh/dt + h = Kq,        τ = 10, K = 1
with q the (net) inflow. The tank is Smart™ in that it has an integrated PI controller
that computes

q = Kc ( hsp − h + (1/τc) ϵ )
ϵ = ∫ (hsp − h) dt

so that the height of the tank returns to hsp automatically. Unfortunately, the controller
parameters are not very Smart™, as they are fixed permanently at Kc = 1/2 and τc = 1.

(a) Simulate the closed-loop behavior of the system starting from h = −1, ϵ = 0
with hsp ≡ 0.

(b) Design an MPC controller to choose hsp . As a cost function take

ℓ(h, ϵ, q, hsp) = 5(h² + ϵ²) + q² + 10 hsp²

so that the controller drives the system to h = ϵ = 0. Choose ∆ = 1. How does
performance compare to the previous case? How much storage (i.e., how many
floating-point numbers must be stored) is needed to implement this controller?

(c) Add the constraint q ∈ [−0.2, 0.2] to the MPC formulation, and design an explicit
MPC controller valid for h ∈ [−5, 5] and ϵ ∈ [−10, 10] (use solvempqp.m from
Figure 7.6, and add constraints Ep ≤ e to only search the region of interest). How
large does N have to be so that the full region is covered? How much storage is
needed to implement this controller?

Exercise 7.10: Explicit economic MPC for electricity arbitrage


Electricity markets are often subject to real-time pricing, whereby the cost of purchas-
ing electricity varies with time. Suppose that you have a large battery that allows you
to buy electricity at one time and then sell it back to the grid at another. We can model
this as a simple integrator system

x+ = x + u

with x representing the amount of stored energy in the battery, and u giving the amount
of electricity that is purchased for the battery (u > 0) or discharged from the battery
and sold back to the grid (u < 0). We wish to find an explicit control law based on the
initial condition x(0) and a known forecast of electricity prices c(0), c(1), . . . , c(N − 1).

(a) To start, suppose that u is constrained to the interval [−1, 1] but x is uncon-
strained. A reasonable optimization problem is
min_u  ∑_{k=0}^{N−1} c(k)u(k) + 0.1u(k)²
s.t. x(k + 1) = x(k) + u(k)
u(k) ∈ [−1, 1]

where the main component of the objective function is the cost of electricity
purchase/sale with a small penalty added to discourage larger transactions. By
removing the state evolution equation, formulate an explicit quadratic program-
ming problem with N variables (the u(k)) and N + 1 parameters (x(0) and the
price forecast c(k)). What is a theoretical upper bound on the number of regions
in the explicit control law? Assuming that x(0) ∈ [−10, 10] and each c(k) ∈ [−1,
1], find the explicit control law for a few small values of N. (Consider using
solvempqp.m from Figure 7.6; you will need to add constraints Ep ≤ e on the
parameter vector to make sure the regions are bounded.) How many regions do
you find?

(b) To make the problem more realistic, we add the constraint x(k) ∈ [−10, 10]
to the optimization, as well as an additional penalty on stored inventory. The
optimization problem is then
min_u  ∑_{k=0}^{N−1} c(k)u(k) + 0.1u(k)² + 0.01x(k)²
s.t. x(k + 1) = x(k) + u(k)
u(k) ∈ [−1, 1]
x(k) ∈ [−10, 10]

Repeat the previous part but using the new optimization problem.

(c) Suppose you wish to solve this problem with a 7-day horizon and a 1-hour time
step. Can you use the explicit solution of either formulation? (Hint: for comparison,
there are roughly 10⁸⁰ atoms in the observable universe.)
Bibliography

A. Alessio and A. Bemporad. A survey on explicit model predictive control. In


L. Magni, D. Raimondo, and F. Allgöwer, editors, Nonlinear Model Predictive
Control - Towards New Challenging Applications, pages 345–369. Springer
Berlin / Heidelberg, 2009.

B. Bank, J. Guddat, D. Klatte, B. Kummer, and K. Tanner. Non-linear parametric


optimization. Birkhäuser Verlag, Basel, Boston, Stuttgart, 1983.

A. Bemporad, M. Morari, V. Dua, and E. N. Pistikopoulos. The explicit linear


quadratic regulator for constrained systems. Automatica, 38(1):3–20, 2002.

F. Borrelli. Constrained Optimal Control of Linear and Hybrid Systems.


Springer-Verlag Berlin Heidelberg, 2003.

F. Borrelli, A. Bemporad, and M. Morari. Predictive Control for Linear and Hy-
brid Systems. Cambridge University Press, 2017.

S. Columbano, K. Fukuda, and C. N. Jones. An output sensitive algorithm for


multiparametric LCPs with sufficient matrices. Polyhedral Computation, 48:
73, 2009.

G. B. Dantzig, J. Folkman, and N. Z. Shapiro. On the continuity of the minimum


set of a continuous function. J. Math. Anal. Appl., 17(3):519–548, 1967.

C. Feller, T. A. Johansen, and S. Olaru. An improved algorithm for combina-


torial multi-parametric quadratic programming. Automatica, 49(5):1370–
1376, 2013.

T. Gal and J. Nedoma. Multiparametric linear programming. Management


Science, 18(7):406–422, 1972.

S. I. Gass and T. L. Saaty. The computational algorithm for the parametric


objective function. Naval Research Logistics Quarterly, 2:39–45, 1955.

W. W. Hager. Lipschitz continuity for constrained processes. SIAM J. Cont.


Opt., 17(3):321–338, 1979.

M. Herceg, M. Kvasnica, C. N. Jones, and M. Morari. Multi-Parametric Toolbox


3.0. In Proc. of the European Control Conference, pages 502–510, Zürich,
Switzerland, July 17–19 2013. http://control.ee.ethz.ch/~mpt.


C. N. Jones. Approximate receding horizon control. In F. Borrelli, A. Bemporad,


and M. Morari, editors, Predictive Control for Linear and Hybrid Systems,
pages 277–300. Cambridge University Press, 2017.

C. N. Jones, M. Barić, and M. Morari. Multiparametric linear programming with


applications in control. Eur. J. Control, 13:152–170, 2007.

D. Q. Mayne and S. V. Raković. Optimal control of constrained piecewise affine


discrete-time systems using reverse transformation. In Proceedings of the
IEEE 2002 Conference on Decision and Control, volume 2, pages 1546 – 1551
vol.2, Las Vegas, USA, 2002.

D. Q. Mayne and S. V. Raković. Optimal control of constrained piecewise affine


discrete-time systems. Comp. Optim. Appl., 25(1-3):167–191, 2003.

D. Q. Mayne, S. V. Raković, and E. C. Kerrigan. Optimal control and piecewise


parametric programming. In Proceedings of the European Control Confer-
ence 2007, pages 2762–2767, Kos, Greece, July 2–5 2007.

P. O. M. Scokaert, J. B. Rawlings, and E. S. Meadows. Discrete-time stability with


perturbations: Application to model predictive control. Automatica, 33(3):
463–470, 1997.

M. M. Serón, J. A. De Doná, and G. C. Goodwin. Global analytical model predic-


tive control with input constraints. In Proceedings of the 39th IEEE Confer-
ence on Decision and Control, pages 154–159, Sydney, Australia, December
2000.
8
Numerical Optimal Control

8.1 Introduction
Numerical optimal control methods are at the core of every model pre-
dictive control implementation, and algorithmic choices strongly affect
the reliability and performance of the resulting MPC controller. The
aim of this chapter is to explain some of the most widely used algo-
rithms for the numerical solution of optimal control problems. Before
we start, recall that the ultimate aim of the computations in MPC is to
find a numerical approximation of the optimal feedback control u0 (x0 )
for a given current state x0 . This state x0 serves as initial condition for
an optimal control problem, and u0 (x0 ) is obtained as the first control
of the trajectory that results from the numerical solution of the optimal
control problem. Due to a multitude of approximations, the feedback
law usually is not exact. Some of the reasons are the following.

• The system model is only an approximation of the real plant.

• The horizon length is finite instead of infinite.

• The system’s differential equation is discretized.

• The optimization problem is not solved exactly.

While the first two of the above are discussed in Chapters 2 and 3 of
this book, the last two are due to the numerical solution of the opti-
mal control problems arising in model predictive control and are the
focus of this chapter. We argue throughout the chapter that it is not a
good idea to insist that the finite horizon MPC problem shall be solved
exactly. First, it usually is impossible to solve a simulation or opti-
mization problem without any numerical errors, due to finite precision
arithmetic and finite computation time. Second, it might not even be
desirable to solve the problem as exactly as possible, because the neces-
sary computations might lead to large feedback delays or an excessive
use of CPU resources. Third, in view of the other errors that are nec-
essarily introduced in the modeling process and in the MPC problem


formulation, errors due to an inexact numerical solution do not sig-


nificantly change the closed-loop performance, at least as long as they
are smaller than the other error sources. Thus, the optimal choice of a
numerical method for MPC should be based on a trade-off between ac-
curacy and computation time. There are, however, tremendous differ-
ences between different numerical choices, and it turns out that some
methods, compared to others, can have significantly lower computa-
tional cost for achieving the same accuracy. Also, reliability is an issue,
as some methods might more often fail to find an approximate solu-
tion than other methods. Thus, the aim of this chapter is to give an
overview of the necessary steps toward the numerical solution of the
MPC problem, and to discuss the properties of the different choices
that can be made in each step.

8.1.1 Discrete Time Optimal Control Problem

When working in a discrete time setting, the MPC optimization problem


that needs to be solved numerically in each time step, for a given system
state x0 , can be stated as follows. For ease of notation, we introduce
the sequence of future control inputs on the prediction horizon, u :=
(u(0), u(1), . . . , u(N − 1)), as well as the predicted state trajectories
x := (x(0), x(1), . . . , x(N)).
minimize_{x, u}   ∑_{k=0}^{N−1} ℓ(x(k), u(k)) + Vf (x(N))        (8.1a)

subject to x(0) = x0 (8.1b)


x(k + 1) = f (x(k), u(k)), k = 0, 1, . . . , N − 1 (8.1c)
(x(k), u(k)) ∈ Z, k = 0, 1, . . . , N − 1 (8.1d)
x(N) ∈ Xf (8.1e)

We call the above optimization problem PN (x0 ) to indicate its depen-


dence on the parameter x0 , and denote the resulting optimal value
function by VN (x0 ). The value function VN (x0 ) is mostly of theoret-
ical interest, and is in practice computed only for those values of x0
that actually arise in the MPC context. In this chapter, we are mostly
interested in fast and efficient ways to find an optimal solution, which
we denote by (x0 (x0 ), u0 (x0 )). The solution need not be unique for a
given problem PN (x0 ), and in a mathematically correct notation one
could only define the set S 0 (x0 ) of all solutions to PN (x0 ). Usually
one tries to ensure by a proper formulation that the MPC optimization

problems have unique solutions, however, so that the set of solutions


is a singleton, S 0 (x0 ) = {(x0 (x0 ), u0 (x0 ))}.
A few remarks are in order regarding the statement of the optimiza-
tion problem (8.1a)-(8.1e). First, as usual in the field of optimization, we
list the optimization variables of problem PN (x0 ) below the word “min-
imize.” Here, they are given by the sequences x and u. The constraints
of the problem appear after the keywords “subject to” and restrict the
search for the optimal solution. Let us discuss each of them briefly:
constraint (8.1b) ensures that the trajectory x = (x(0), . . .) starts at x0 ,
and uniquely determines x(0). Constraints (8.1c) ensure that the state
and control trajectories obey the system dynamics for all time steps
k = 0, . . . , N − 1. If in addition to x(0) one would also fix the controls
u, the whole state trajectory x would be uniquely determined by these
constraints. Constraints (8.1d) shall ensure that the state control pairs
(x(k), u(k)) are contained in the set Z at each time step k. Finally,
the terminal state constraint (8.1e) requires the final state to be in a
given terminal set Xf . The set of all variables (x, u) that satisfy all
constraints (8.1b)-(8.1e) is called the feasible set. Note that the feasible
set is the intersection of all constraint sets defined by the individual
constraints.

8.1.2 Convex Versus Nonconvex Optimization

The most important dividing line in the field of optimization is between


convex and nonconvex optimization problems. If an optimization prob-
lem is convex, every local minimum is also a global one. One can reli-
ably solve most convex optimization problems of interest, finding the
globally optimal solution in polynomial time. On the other hand, if a
problem is not convex, one can usually not find the global minimum.
Even if one has accidentally found the global minimum, one usually
cannot certify that it is the global minimum. Thus, in nonconvex opti-
mization, one has usually to accept that one is only able to find feasible
or locally optimal points. Fortunately, if one has found such a point,
one usually is also able to certify that it is a feasible or locally opti-
mal point. But in the worst case, one might not be able to find even a
feasible point, without knowing if this is due to the problem being in-
feasible, or the optimization algorithm being just unable to find points
in the feasible set. Thus, the difference between convex and nonconvex
has significant implications in practice. To say it in the words of the
famous mathematical optimizer R. Tyrrell Rockafellar, “The great wa-
tershed in optimization is not between linearity and nonlinearity, but

convexity and nonconvexity.”


When is a given optimization problem a convex optimization prob-
lem? By definition, an optimization problem is convex if its feasible
set is a convex set and if its objective function is a convex function. In
MPC, we usually have freedom in choosing the objective function, and
in most cases one chooses a convex objective function. For example,
the sum of quadratic functions of the form ℓ(x, u) = x ′ Qx + u′ Ru
with positive semidefinite matrices Q and R is a convex function. Usu-
ally, one also chooses the terminal cost Vf to be a convex function, so
that the objective function is a convex function.
Likewise, one usually chooses the terminal set Xf to be a convex
set. For example, one might choose an ellipsoid Xf = {x | x ′ P x ≤ 1}
with a positive definite matrix P , which is a convex set. Very often, one
is lucky and also has convex constraint sets Z, for example box con-
straints on x(k) and u(k). The initial-value constraint (8.1b) restricts
the variable x(0) to be in the point set {x0 }, which is convex. Thus,
most of the constraints in the MPC optimization problem usually can
be chosen to be convex. On the other hand, the constraints (8.1c) re-
flect the system dynamics x(k + 1) = f (x(k), u(k)) for all k, and these
might or might not describe a convex set. Interestingly, it turns out that
they describe a convex set if the system model is linear or affine, i.e., if
f (x(k), u(k)) = Ax(k) + Bu(k) + c with matrices A, B and vector c of
appropriate dimensions. This follows because the solution set of linear
equalities is an affine set, which is convex. Conversely, if the system
model is nonlinear, the solution set of the dynamic constraints (8.1c) is
most likely not a convex set. Thus, we can formulate a modification of
Rockafellar’s statement above: in MPC practice, the great watershed be-
tween convex and nonconvex optimization problems usually coincides
with the division line between linear and nonlinear system models.
One speaks of linear MPC if a linear or affine simulation model is
used, and of nonlinear MPC otherwise. When speaking of linear MPC,
one implicitly assumes that all other constraints and the objective func-
tion are chosen to be convex, but not necessarily linear. In particular, in
linear MPC, the objective function usually is chosen to be convex quad-
ratic. Thus, in the MPC literature, the term linear MPC is used as if
it coincides with “convex linear MPC.” Theoretically possible “noncon-
vex linear MPC” methods, where the system model is linear but where
the cost or constraint sets are not convex, are not of great practical
interest. On the other hand, for nonlinear MPC, i.e., when a nonlin-
ear model is used, convexity usually is lost anyway, and there are no

implicit convexity assumptions on the objective and constraints, such


that the term nonlinear MPC nearly always coincides with “nonconvex
nonlinear MPC.”

Example 8.1: Nonlinear MPC


We regard a simple MPC optimization problem of the form (8.1) with
one-dimensional state x and control u, system dynamics f (x, u) =
x + u − 2u², initial value x0 = 1, and horizon length N = 1, as follows

minimize_{x(0), x(1), u(0)}   x(0)² + u(0)² + 10x(1)²        (8.2a)

subject to   x(0) = x0                                       (8.2b)
             x(1) = x(0) + u(0) − 2u(0)²                     (8.2c)
             −1 ≤ u(0) ≤ 1                                   (8.2d)

First, we observe that the optimization problem has a three-dimensional


space of optimization variables. To check convexity of the problem,
we first regard the objective, which is a sum of positive quadratic func-
tions, thus a convex function. On the other hand, we need to check
convexity of the feasible set. The initial-value constraint (8.2b) fixes
one of the three variables, thus selects a two-dimensional affine subset
in the three-dimensional space. This subset is described by x(0) = 1
while u(0) and x(1) remain free. Likewise, the control bounds in (8.2d)
cut away all values for u(0) that are less than −1 or more than +1,
thus, there remains only a straight stripe of width 2 in the affine sub-
set, still extending to infinity in the x(1) direction. This straight two-
dimensional stripe still is a convex set. The system equation (8.2c) is a
nonlinear constraint that selects a curve out of the stripe, which is visu-
alized on the left of Figure 8.1. This curve is not a convex set, because
the connecting lines between two points on the curve are not always
contained in the curve. In a formula, the feasible set is given by {(x(0),
x(1), u(0)) | x(0) = 1, u(0) ∈ [−1, 1], x(1) = 1 + u(0) − 2u(0)²}.
Even though the objective function is convex, the fact that the op-
timization problem has a nonconvex feasible set can lead to different
local minima. This is indeed the case in our example. To see this, let
us evaluate the objective function on all feasible points and plot it as a
function of u(0). This reduced objective function ψ(u) can be obtained
by inserting x(0) = 1 and x(1) = x(0) + u(0) − 2u(0)² into the objective
x(0)² + u(0)² + 10x(1)², which yields ψ(u) = 1 + u² + 10(1 + u − 2u²)² =
11 + 20u − 29u² − 40u³ + 40u⁴. This reduced objective is visualized
Figure 8.1: Feasible set and reduced objective ψ(u(0)) of the nonlinear MPC Example 8.1.

on the right of Figure 8.1 and it can clearly be seen that two different
locally optimal solutions exist, only one of which is the globally optimal
choice. □
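The two local minima can also be located numerically. The following is a minimal Octave/MATLAB sketch; the bracketing intervals passed to fminbnd are assumptions chosen by inspecting Figure 8.1.

    % Reduced objective of Example 8.1, psi(u) = 1 + u^2 + 10*(1 + u - 2*u^2)^2
    psi = @(u) 1 + u.^2 + 10*(1 + u - 2*u.^2).^2;

    % Search for a local minimizer in each half of the control interval [-1, 1]
    uL = fminbnd(psi, -1.0, 0.0);
    uR = fminbnd(psi,  0.0, 1.0);

    fprintf('u = %8.5f with psi = %8.5f\n', uL, psi(uL));
    fprintf('u = %8.5f with psi = %8.5f\n', uR, psi(uR));

Comparing the two objective values identifies which of the two local solutions is the global one.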

8.1.3 Simultaneous Versus Sequential Optimal Control

The optimal control problem (OCP) (8.1) can be passed to an appro-


priate optimization routine without any modification. In this case, the
optimization variables are given by both, the state trajectory x as well
as the control trajectory u. The pair (x, u) is consistent with the initial
value x0 and the simulation model if and only if the constraints (8.1b)
and (8.1c) are satisfied, which is the case for any feasible solution of
the problem. During the optimization calculations, however, these con-
straints might be violated, and the state trajectory x might not be a valid
simulation corresponding to the controls u. Since the optimization rou-
tine has to simultaneously solve the simulation and the optimization
problem, one calls this approach the simultaneous approach to optimal
control.
On the other hand, one could use the constraints (8.1b)-(8.1c) to find
the unique feasible state trajectory x for any given control trajectory
u. We denote, as before in Chapter 2, the state x(k) that results from
a given initial condition x0 and a given control trajectory u = (u(0),
u(1), . . . , u(N − 1)) by φ(k; x0 , u). Using this expression, that can be
computed by a simple forward simulation routine, we can replace the
equalities (8.1b)-(8.1c) by the trivial equalities x(k) = φ(k; x0 , u) for

k = 0, 1, . . . , N. And these constraints can be used to eliminate the com-


plete state trajectory x = (x(0), x(1), . . . , x(N)) from the optimization
problem. The optimization problem in this reduced variable space is
given by

minimize_u   ∑_{k=0}^{N−1} ℓ(φ(k; x0 , u), u(k)) + Vf (φ(N; x0 , u))        (8.3a)

subject to (φ(k; x0 , u), u(k)) ∈ Z, k = 0, 1, . . . , N − 1 (8.3b)


φ(N; x0 , u) ∈ Xf (8.3c)

If this reduced optimization problem is solved by an iterative optimiza-


tion routine, in each iteration, one performs a sequence of two steps.
First, for given u, the simulation routine computes the state trajectory
x, and second, the optimization routine updates the control variables
u to iterate toward an optimal solution. Due to this sequential evalu-
ation of simulation and optimization routines, one calls this approach
the sequential approach to optimal control. Though the simultaneous
and the sequential approach solve equivalent optimization problems,
their approach toward finding the solutions is different.
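A minimal sketch of the forward simulation used by the sequential approach is given below; the function name forward_sim and its interface are assumptions made for illustration, with the model f, the initial state x0, and the controls u (one column per time step) as inputs.

    % Forward simulation phi(k; x0, u) for a discrete time model x+ = f(x, u).
    % Returns the trajectory x = (x(0), x(1), ..., x(N)) as columns of a matrix.
    function x = forward_sim(f, x0, u)
      N = size(u, 2);
      x = zeros(numel(x0), N + 1);
      x(:, 1) = x0;                        % x(0) = x0
      for k = 1:N
        x(:, k+1) = f(x(:, k), u(:, k));   % x(k+1) = f(x(k), u(k))
      end
    end

In the sequential approach, an optimization routine varies u and calls such a simulation routine in every iteration to evaluate the objective and constraints of (8.3).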
For linear MPC problems, where the system model is linear, the dif-
ference between the two approaches regards mostly the sparsity struc-
ture of the optimization problem, as discussed in Chapter 6 and in
Section 8.8.4. In this case, one usually calls the reduced optimization
problem (8.3) the condensed problem, and the computational process
to generate the data for the condensed problem (8.3) from the data of
the original problem (8.1) is called condensing. Though the condensed
problem has fewer variables, the matrices defining it may have more
nonzero entries than the original problem. Which of the two formula-
tions leads to shorter computation times for a given problem depends
on the number of states, controls and constraints, the specific sparsity
structures, and on the horizon length N. For small N, condensing is
typically preferable, while for large N, it is advisable to apply a sparse
convex solver to the original problem in the full variable space. Despite
the different sparsity structure, and different cost per iteration, many
widely used convex optimization algorithms perform identical iterates
on both problems, because the eliminated constraints are linear and
are exactly respected in each iteration in both the condensed as well as
the original problem formulation.
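For a linear model x(k + 1) = Ax(k) + Bu(k), condensing amounts to building the prediction matrices that express the stacked state trajectory as an affine function of x0 and the stacked controls. A minimal sketch is given below; the helper name condense and its interface are assumptions made for illustration.

    % Build Sx and Su such that [x(1); ...; x(N)] = Sx*x0 + Su*[u(0); ...; u(N-1)]
    function [Sx, Su] = condense(A, B, N)
      n = size(A, 1);  m = size(B, 2);
      Sx = zeros(n*N, n);
      Su = zeros(n*N, m*N);
      for i = 1:N
        Sx((i-1)*n+1:i*n, :) = A^i;
        for j = 1:i
          Su((i-1)*n+1:i*n, (j-1)*m+1:j*m) = A^(i-j)*B;   % coefficient of u(j-1)
        end
      end
    end

Substituting this affine expression into the cost and constraints of (8.1) yields the condensed problem in the controls only.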
For nonlinear MPC problems, the sequential and simultaneous ap-
proach can lead to significantly different optimization iterations. Even

if both problems are addressed with the same optimization algorithm


and are initialized with the same initial guess, i.e., the same u for both,
together with the corresponding simulation result x, the optimization
iterations typically differ after the first iteration, such that the two for-
mulations can need a significantly different number of iterations to
converge; they might even converge to different local solutions or one
formulation might converge while the other does not. As a rule of
thumb, the sequential approach is preferable if the optimization solver
cannot exploit sparsity and the system is stable, while the simultaneous
approach is preferable for unstable nonlinear systems, for problems
with state constraints, and for systems which need implicit simulation
routines.

Example 8.2: Sequential approach


We regard again the simple MPC optimization problem (8.2a), but elim-
inate the states as a function of u = (u(0)) by x(0) = φ(0; x0 , u) = x0
and x(1) = φ(1; x0 , u) = x0 + u(0) − 2u(0)². The reduced optimization
problem in the sequential approach is then given by
minimize_{u(0)}   x0² + u(0)² + 10 (x0 + u(0) − 2u(0)²)²        (8.4a)

subject to   −1 ≤ u(0) ≤ 1                                      (8.4b)

8.1.4 Continuous Time Optimal Control Problem

In most nonlinear MPC applications and many linear MPC applications,


the system dynamics are not given in discrete time but in continuous
time, in form of differential equations

dx/dt = fc (x, u)

For notational convenience, we usually denote differentiation with re-


spect to time by a dot above the quantity, i.e., we can abbreviate the
above equations by ẋ = fc (x, u). Both the state and control trajecto-
ries are functions of continuous time, and we denote them by x(t) and
u(t). The trajectories need only to be defined on the time horizon of
interest, i.e., for all t ∈ [0, T ], where T is the horizon length. If we do
not assume any discretization, and if we use the shorthand symbols

x(·) and u(·) to denote the state and control trajectories, the continu-
ous time optimal control problem (OCP) can be formulated as follows

minimize_{x(·), u(·)}   ∫_0^T ℓc (x(t), u(t)) dt + Vf (x(T ))        (8.5a)

subject to x(0) = x0 (8.5b)


ẋ(t) = fc (x(t), u(t)), t ∈ [0, T ] (8.5c)
(x(t), u(t)) ∈ Z, t ∈ [0, T ] (8.5d)
x(T ) ∈ Xf (8.5e)

It is important to note that the continuous time optimal control


problem is an infinite-dimensional optimization problem with infinite-
dimensional decision variables and an infinite number of constraints,
because the time index t runs through infinitely many values t ∈ [0, T ].
This is in contrast to discrete time, where the finite number of time in-
dices k ∈ I0:N leads to finitely many decision variables and constraints.
There exists a variety of methods to numerically solve continuous
time OCPs. What all approaches have in common is that at one point,
the infinite-dimensional problem needs to be discretized. One fam-
ily of methods first formulates what is known as the Hamilton-Jacobi-
Bellman (HJB) equation, a partial differential equation for the value
function, which depends on both state space and time, and then dis-
cretizes and solves it. Unfortunately, due to the “curse of dimensional-
ity,” this approach is only practically applicable to systems with small
state dimensions, say less than five, or to the special case of uncon-
strained linear systems with quadratic costs.
A second family of methods, the indirect methods, first derive opti-
mality conditions in continuous time by algebraic manipulations that
use similar expressions as the HJB equation; they typically result in the
formulation of a boundary-value problem (BVP), and only discretize the
resulting continuous time BVP at the very end of the procedure. One
characterizes the indirect methods often as “first optimize, then dis-
cretize.” A third class of methods, the direct methods, first discretizes
the continuous time OCP, to convert it into a finite-dimensional opti-
mization problem. The finite-dimensional optimization problem can
then be solved by tailored algorithms from the field of numerical op-
timization. The direct methods are often characterized as “first dis-
cretize, then optimize.” These methods are most widely used in MPC
applications and are therefore the focus of this chapter.

To sketch the discretization methods, we look at the continuous


time optimal control problem (8.5). In a direct method, we replace
the continuous index t ∈ [0, T ] by a discrete integer index. For this
aim, we can divide the time horizon T into N intervals, each of length
h = T/N, and evaluate the quantities of interest only for the discrete time
points t = hk with k ∈ I0:N . We use the notation hI0:N = {0, h, 2h, . . . ,
Nh}, such that we can use the expression “t ∈ hI0:N ” to indicate that
t is only considered at these discrete time points. To discretize the
OCP, the objective integral is replaced by a Riemann sum, and the time
derivative by a finite difference approximation: ẋ(t) ≈ (x(t + h) − x(t))/h. As
before in discrete time, we denote the sequence of discrete states by
x = (x(0), x(h), x(2h), . . . , x(Nh)) and the sequence of controls by
u = (u(0), u(h), . . . , u(Nh − h)).

minimize_{x, u}   ∑_{t ∈ hI0:N−1} h ℓc (x(t), u(t)) + Vf (x(Nh))        (8.6a)

subject to x(0) = x0 (8.6b)


(x(t + h) − x(t))/h = fc (x(t), u(t)),   t ∈ hI0:N−1        (8.6c)
(x(t), u(t)) ∈ Z, t ∈ hI0:N−1 (8.6d)
x(Nh) ∈ Xf (8.6e)

It is easily checked that the constraints (8.6b)-(8.6c) uniquely determine


all states x if the control sequence u is given. The above problem is ex-
actly in the form of the discrete time optimization problem (8.1), if one
uses the definitions ℓ(x, u) := hℓc (x, u) and f (x, u) := x + hfc (x, u).
This simple way to go from continuous to discrete time, in particular
the idea to solve a differential equation ẋ = fc (x, u) by the simple
difference method x + = x + hfc (x, u), is originally due to Leonhard
Euler (1707–1783), and is therefore called the Euler integration method.
The Euler method is not the only possible integration method, and in
fact, not the most efficient one. Numerical analysts have investigated
the simulation of differential equations for more than two centuries,
and discovered powerful discretization methods that have much lower
computational cost and higher accuracy than the Euler method and are
therefore more widely used in practice. These are the topic of the next
section.

8.2 Numerical Simulation


The classical task of numerical simulation is the solution of initial-value
problems. An initial-value problem is characterized by an initial state
value x0 at time 0, and a differential equation ẋ = f (t, x) that the
solution x(t) should satisfy on the time interval of interest, i.e., for all
t ∈ [0, T ] with T > 0. In particular, we are interested in computing
an approximation of the final state x(T ). In this section, we allow an
explicit dependence of the right-hand-side function f (t, x) on time. To
be consistent with the literature in the field of numerical simulation—
and deviating from the notation in other chapters of this book—we use
t here as the first input argument of f (t, x). The time dependence
might in particular be due to a fixed control trajectory u(t), and if a
given system is described by the continuous time ODE ẋ = fc (x, u), the
time dependent right-hand-side function is defined by f (t, x) := fc (x,
u(t)). The choice of the control trajectory u(t) is not the focus in
this section, but becomes important later when we treat the solution
of optimal control problems. Instead, in this section, we just review
results from the field of numerical simulation of ordinary differential
equations—which is sometimes also called numerical integration—that
are most relevant to continuous time optimal control computations.
Throughout this section we consider the following initial-value
problem

x(0) = x0 , ẋ(t) = f (t, x(t)) for t ∈ [0, T ] (8.7)

with a given right-hand-side function f : [0, T ] × Rn → Rn . We denote


the exact solution, if it exists, by x(t). Existence of a unique solution of
the initial-value problem is guaranteed by a classical theorem by Émile
Picard (1856–1941) and Ernst Lindelöf (1870–1946), which requires the
function f to be continuous with respect to time t and Lipschitz contin-
uous with respect to the state x. Lipschitz continuity is stronger than
continuity and requires the existence of a constant L > 0 such that the
following inequality

|f (t, x) − f (t, y)| ≤ L |x − y|        (8.8)

holds for all t ∈ [0, T ] and all x, y ∈ Rn . In many cases of interest,


the function f is not defined on the whole state space, or there might
exist no global Lipschitz constant L for all states x and y. Fortunately,
a local version of the Picard-Lindelöf Theorem exists that only needs
Lipschitz continuity in a neighborhood of the point (0, x0 ) and still

ensures the existence of a unique solution x(t) for sufficiently small


T . Local Lipschitz continuity is implied by continuous differentiabil-
ity, which is easy to verify and holds for most functions f arising in
practice. In fact, the function f usually is many times differentiable in
both its arguments, and often even infinitely many times—for example,
in the case of polynomials or other analytic functions. The higher dif-
ferentiability of f also leads to higher differentiability of the solution
trajectory x(t) with respect to t, and is at the basis of the higher-order
integration methods that are widely used in practice.
Because all numerical integration methods produce only approxi-
mations to the true solution x(t), we use a different symbol for these
approximations, namely x̃(t). The numerical approximation is usually
only exact for the initial value, where we simply set x̃(0) := x0 .
For the final state at time T , we aim to have a small error E(T ) :=
|x̃(T ) − x(T )|, at low computational cost. All integration methods di-
vide the time horizon of interest into smaller intervals, and proceed by
making a sequence of integration steps, one per interval. For simplic-
ity, assume that the steps are equidistant, and that in total N steps of
size h = T /N are taken. In each step, the integration method makes a
local error, and the combined effect of the accumulated local errors at
time t, i.e., the distance E(t) = x e (t) − x(t) , is called the global error.
After the first integrator step, local and global error coincide because
the integration starts on the exact trajectory, but in subsequent steps,
the global error typically grows while the local errors remain of similar
size.

8.2.1 Explicit Runge-Kutta Methods

Let us first investigate the Euler integrator, that iterates according to


the update rule
x̃(t + h) = x̃(t) + hf (t, x̃(t))

starting with x̃(0) = x0 . Which local error do we make in each step? For
local error analysis, we assume that the starting point x̃(t) was on an
exact trajectory, i.e., equal to x(t), while the result of the integrator step
x̃(t + h) is different from x(t + h). For the analysis, we assume that the
true trajectory x(t) is twice continuously differentiable with bounded
second derivatives, which implies that its first-order Taylor series satis-
fies x(t + h) = x(t) + hẋ(t) + O(h²), where O(h²) denotes an arbitrary
function whose size shrinks faster than h² for h → 0. Since the first
derivative is known exactly, ẋ(t) = f (t, x(t)), and was used in the Euler

integrator, we immediately obtain that |x̃(t + h) − x(t + h)| = O(h²).


Because the global error is the accumulated and propagated effect of
the local errors, and because the total number of integrator steps grows
linearly with 1/h, one can show that the global error at the end of the
interval of interest is of size (1/h) O(h²) = O(h), i.e., of first order. For
this reason one says that the Euler method is a first-order integration
method. The Euler integrator is easy to remember and easy to imple-
ment, but the number of time steps that are needed to obtain even a
moderate accuracy can be reduced significantly if higher-order meth-
ods are used.
Like the Euler integrator, all one-step integration methods create a
discrete time system of the form

x̃(t + h) = x̃(t) + Φ(t, x̃(t), h)
Here, the map Φ approximates the integral ∫_t^{t+h} f (τ, x(τ)) dτ. If Φ
were equal to this integral, the integration method would be exact,
due to the identity

x(t + h) − x(t) = ∫_t^{t+h} ẋ(τ) dτ = ∫_t^{t+h} f (τ, x(τ)) dτ

While the Euler integrator approximates the integral by the expression


Φ(t, x, h) = hf (t, x(t)) that has an error of O(h²) and needs only
one evaluation of the function f per step, one can find more accurate
approximations by allowing more than one function evaluation per in-
tegration step. This idea leads directly to the Runge-Kutta (RK) integra-
tion methods, that are named after Carl Runge (1856–1927) and Martin
Wilhelm Kutta (1867–1944).
The classical Runge-Kutta method (RK4). One of the most widely
used methods invented by Runge and Kutta performs four function
evaluations, as follows.

k1 = f (t, x)
k2 = f (t + h/2, x + (h/2)k1 )
k3 = f (t + h/2, x + (h/2)k2 )
k4 = f (t + h, x + hk3 )
Φ = (h/6)k1 + (h/3)k2 + (h/3)k3 + (h/6)k4

It is a fourth-order method, and therefore often abbreviated RK4. Since


it is one of the most competitive methods for the accuracies that are

typically needed in applications, the RK4 integrator is one of the most


widely used integration methods for simulation of ordinary differential
equations. A comparison of the RK4 method with Euler’s first-order
method and a second-order method named after Karl Heun (1859–
1929) is shown in Figure 8.2.
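A minimal fixed-step implementation of the RK4 method might look as follows; the function name rk4_integrate and its interface are assumptions made for illustration.

    % Integrate xdot = f(t, x) from t = 0 to t = T with N classical RK4 steps.
    function x = rk4_integrate(f, x0, T, N)
      h = T/N;  x = x0;  t = 0;
      for i = 1:N
        k1 = f(t,       x);
        k2 = f(t + h/2, x + (h/2)*k1);
        k3 = f(t + h/2, x + (h/2)*k2);
        k4 = f(t + h,   x + h*k3);
        x  = x + (h/6)*k1 + (h/3)*k2 + (h/3)*k3 + (h/6)*k4;
        t  = t + h;
      end
    end

For the oscillator of Example 8.3 below, for instance, norm(rk4_integrate(@(t,x) [0 1; -1 0]*x, [1; 0], 2*pi, 50) - [1; 0]) gives the global error E(2π) for N = 50 steps.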

Example 8.3: Integration methods of different order


We regard the simulation of the linear ordinary differential equation
(ODE)

ẋ = Ax    with    A = [0 1; −1 0]
over the interval T = 2π , starting at x0 = [1, 0]′ . The analytic solution
of this system is known to be x(t) = exp(At)x0 = [cos(t), − sin(t)]′ ,
such that the final state is given by x(2π ) = [1, 0]′ . To investigate
the performance of different methods, we divide the time horizon into
N equal integration intervals of length h = 2π /N. Note that a Runge-
Kutta method with s stages needs in total M := Ns function evaluations.
We compare the Euler (s = 1), Heun (s = 2), and RK4 method (s = 4).
For each integration method we evaluate the global error at the end
of the integration interval, E(2π) = |x̃(2π) − x(2π)|, and plot it as
a function of the number of function evaluations, M, in Figure 8.2.
We use a doubly logarithmic scale, i.e., plot log(ϵ) versus log(M), to
show the effect of the order. Note that the slope of the higher-order
methods is an integer multiple of the slope of the Euler method. Also
note that the accuracy for each investigated method cannot exceed a
certain base value due to the finite precision arithmetic, and that this
limit is reached for the RK4 integrator at approximately M = 10⁵. After
this point, increasing the number of integration steps does not further
improve the accuracy. □

The Butcher tableau. A general explicit Runge-Kutta method with s


stages performs the following computations in each integration step

k1 = f (t +c1 h, x )
k2 = f (t +c2 h, x + h (a21 k1 ) )
k3 = f (t +c3 h, x + h (a31 k1 + a32 k2 ) )
.. ..
. .
ks = f (t +cs h, x + h (as1 k1 + ... + as,s−1 ks−1 ) )

Φ = h (b1 k1 + ... + bs−1 ks−1 + bs k s )


Figure 8.2: Performance of different integration methods. (a) Accuracy vs. function evaluations. (b) Simulation results for M = 32.

It is important to note that on the right-hand side of each row, only


those ki values are used that are already computed. This property
holds for every explicit integration method, and makes it possible to
explicitly evaluate the first s equations one after the other to obtain
all values k1 , . . . , ks for the summation in the last line. One usually
summarizes the coefficients of a Runge-Kutta method in what is known
as a Butcher tableau (after John C. Butcher, born 1933) given by

c1   |
c2   | a21
c3   | a31   a32
...  | ...   ...   ...
cs   | as1   ···   as,s−1
     | b1    b2    ···    bs

The Butcher tableau of three popular RK methods is stated below

Euler:
  0 |
    | 1

Heun:
  0 |
  1 | 1
    | 1/2   1/2

RK4:
  0   |
  1/2 | 1/2
  1/2 | 0     1/2
  1   | 0     0     1
      | 1/6   2/6   2/6   1/6

Note that the bi coefficients on the bottom row always add to one. An
interesting fact is that an s-stage explicit Runge-Kutta method can never
have a higher order than s, and only for orders less than or equal to four
do there exist explicit Runge-Kutta methods for which the order and the number
of stages coincide.
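One step of a general explicit Runge-Kutta method can be implemented directly from its Butcher tableau. The following minimal sketch assumes the tableau is given as a vector c, a strictly lower-triangular matrix A, and a vector b; the function name erk_step is an assumption made for illustration.

    % One explicit Runge-Kutta step for xdot = f(t, x) with Butcher tableau (c, A, b).
    function xnext = erk_step(f, t, x, h, c, A, b)
      s = numel(b);
      k = zeros(numel(x), s);
      for i = 1:s
        xi = x;
        for j = 1:i-1
          xi = xi + h*A(i,j)*k(:,j);   % only previously computed stages enter
        end
        k(:,i) = f(t + c(i)*h, xi);
      end
      xnext = x + h*(k*b(:));          % Phi = h*(b1*k1 + ... + bs*ks)
    end

The RK4 step, for example, corresponds to c = [0 1/2 1/2 1], b = [1/6 2/6 2/6 1/6], and A = [0 0 0 0; 1/2 0 0 0; 0 1/2 0 0; 0 0 1 0].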

8.2.2 Stiff Equations and Implicit Integrators

Unfortunately, some differential equations cannot reliably be solved by


explicit integration methods; it can occur that even if the underlying
ODE is stable, the integration method is not. Let us regard the scalar
linear ODE
ẋ = λx

with initial condition x0 as a test case. The exact solution is known to


be x(t) = e^{λt} x0 . When this ODE is solved by an explicit Euler method,
it iterates like x+ = x + hλx, and it is easy to see that the explicit so-
lution is given by x̃(kh) = (1 + hλ)^k x0 . For positive λ, this leads to
exponential growth, which is not surprising given that the exact ODE
solution grows exponentially. If λ is a large negative number, how-
ever, the exact solution x(t) would decay very fast to zero, while the
Euler integrator is unstable and oscillates with exponentially growing
amplitude if h is larger than 2/(−λ). A similar observation holds for
all explicit integration methods.
The most perturbing fact is that the explicit integration methods
are extremely unstable exactly because of the fact that the system is
extremely stable. Extremely stable ODEs are called stiff equations. For
stiff ODE ẋ = f (t, x), some of the eigenvalues of the Jacobian fx have
extremely large negative real parts, which lead to extremely stable sub-
dynamics. Exactly these extremely stable subdynamics let the explicit
integrators fail; even for relatively short stepsizes h, they overshoot
the true solution and exhibit unstable oscillations. These oscillations
do not just lead to inaccurate solutions, but in fact they quickly ex-
ceed the range of computer-representable numbers (10³⁰⁸ for double
precision), such that the explicit integrator just outputs “NaN” (“not a
number”) most of the time.
Fortunately, there exist integration methods that remain stable even
for stiff ODE. Their only drawback is that they are implicit, i.e., they re-
quire the solution of an equation system to compute the next step. The
simplest of these implicit methods is called the implicit Euler method
and it iterates according to

x + = x + hf (t + h, x + )

Note that the desired output value x + appears also on the right side of
the equation. For the scalar linear ODE ẋ = λx, the implicit Euler step
is determined by x + = x + hλx + , which can explicitly be solved to give
x + = x/(1 − hλ). For any negative λ, the denominator is larger than
one, and the numerical approximation x̃(kh) = x0 /(1 − hλ)^k there-
fore decays exponentially, similar to the exact solution. An integration
method which has the desirable property that it remains stable for the
test ODE ẋ = λx whenever Re(λ) < 0 is called A-stable. While none of
the explicit Runge-Kutta methods is A-stable, the implicit Euler method
is A-stable. But it has a low order. Can we devise A-stable methods that
have a higher order?
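The different stability behavior of the two Euler methods on the test equation can be observed with a few lines of code; the values of λ, h, and N below are assumptions chosen to make the effect visible.

    % Explicit vs. implicit Euler on the stiff test equation xdot = lambda*x.
    lambda = -1e3;  h = 0.01;  N = 20;       % note: h > 2/(-lambda)
    xe = 1;  xi = 1;
    for k = 1:N
      xe = xe + h*lambda*xe;                 % explicit: x+ = (1 + h*lambda)*x
      xi = xi/(1 - h*lambda);                % implicit: x+ = x/(1 - h*lambda)
    end
    fprintf('explicit: %g  implicit: %g  exact: %g\n', xe, xi, exp(lambda*h*N));

The explicit iterates alternate in sign and grow without bound, while the implicit iterates decay monotonically like the exact solution.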

8.2.3 Implicit Runge-Kutta and Collocation Methods

Once we accept that we need to solve a nonlinear equation system in or-


der to compute an integration step, we can extend the family of Runge-
Kutta methods by allowing diagonal and upper-triangular entries in the
Butcher tableau. Our hope is to find integration methods that are both
A-stable and have a high order. A general implicit Runge-Kutta method
with s stages solves the following nonlinear system in each integration
step

k1 = f (t + c1 h , x + h ( a11 k1 + a12 k2 +... + a1,s ks ) )


k2 = f (t + c2 h , x + h ( a21 k1 + a22 k2 +... + a2,s ks ) )
.. .. ..
. . .
ks = f (t + cs h , x + h ( as1 k1 + as,2 k2 +... + as,s ks ) )

Φ = h ( b1 k1 + b2 k2 +... + bs ks )

Note that the upper s equations are implicit and form a root-finding
problem with sn nonlinear equations in sn unknowns, where s is the
number of RK stages and n is the state dimension of the differen-
tial equation ẋ = f (t, x). Nonlinear root-finding problems are usually
solved by Newton’s method, which is treated in the next section. For
Newton’s method to work, one has to assume that the Jacobian of the
residual function is invertible. For the RK equations above, this can be
shown to always hold if the time step h is sufficiently small, depending
on the right-hand-side function f . After the values k1 , . . . , ks have been
computed, the last line can be executed and yields the resulting map
Φ(t, x, h). The integrator then uses the map Φ to proceed to the next
integration step exactly as the other one-step methods, according to

the update equation

x̃(t + h) = x̃(t) + Φ(t, x̃(t), h)

For implicit integrators, contrary to the explicit ones, the map Φ cannot
easily be written down as a series of function evaluations. Evaluation of
Φ(t, x, h) includes the root-finding procedure and typically needs sev-
eral evaluations of the root-finding equations and of their derivatives.
Thus, an s-stage implicit Runge-Kutta method is significantly more ex-
pensive per step compared to an s-stage explicit Runge-Kutta method.
Implicit integrators are usually preferable for stiff ordinary differential
equations, however, due to their better stability properties.
Many different implicit Runge-Kutta methods exist, and each of
them can be defined by its Butcher tableau. For an implicit RK method,
at least one of the diagonal and upper-triangular entries (aij with j ≥ i)
is nonzero. Some methods try to limit the implicit part for easier com-
putations. For example, the diagonally implicit Runge-Kutta methods
have only the diagonal entries nonzero while the upper-triangular part
remains zero.

Collocation methods. One particularly popular subclass of implicit


Runge-Kutta methods is formed by the collocation methods. An s-stage
collocation method first fixes the values ci of the Butcher tableau, and
chooses them so that they are all different and in the unit interval,
i.e., 0 ≤ c1 < c2 < . . . < cs ≤ 1. The resulting time points (t + hci )
are called the collocation points, and their choice uniquely determines
all other entries in the Butcher tableau. The idea of collocation is to
approximate the trajectory on the collocation interval by a polynomial
x̃(τ) for τ ∈ [t, t + h], and to require satisfaction of the ODE ẋ = f (t, x)
only on the collocation points, i.e., impose the conditions x̃̇(t + hci ) =
f (t + hci , x̃(t + hci )) for i = 1, . . . , s. Together with the requirement that
the approximating polynomial x̃(τ) should start at the initial value, i.e.,
x̃(t) = x, we have (s + 1) conditions such that the polynomial needs
to have (s + 1) coefficients, i.e., have the degree s, to yield a well-posed
root-finding problem.
The polynomial x̃(τ) can be represented in different ways, which
are related via linear basis changes and therefore lead to numerically
equivalent root-finding problems. One popular way is to parameterize
x̃(τ) as the interpolating polynomial through the initial value x and the
state values at the collocation points. This only gives a unique param-
eterization if c1 ≠ 0. To have a more generally applicable derivation

of collocation, we use instead the value x together with the s deriva-
tive values k1 , . . . , ks at the collocation time points to parameterize
x̃(τ). More precisely, we use the identity x̃(τ) = x + ∫_t^τ x̃̇(τ1 ; k1 , k2 ,
. . . , ks ) dτ1 , where x̃̇(·) is the time derivative of x̃(τ), and therefore a
e (·) is the time derivative of x
e (τ), and therefore a
polynomial of degree (s − 1) that can be represented by s coefficients.
Fortunately, due to the fact that all collocation points are different, the
interpolating polynomial through the s vectors k1 , . . . , ks is well defined
and  can easily be represented in a Lagrange basis, with basis functions
Li τ−t h that are one on the i-th collocation point and zero on all oth-
ers.1 Collocation thus approximates ẋ(τ) by the polynomial
     
τ −t τ −t τ −t
ẋ (τ; k1 , k2 , . . . , ks )
e := k1 L1 + k2 L2 + . . . + ks Ls
h h h

and x(τ) by its integral



x
e (τ; x, k1 , k2 , . . . , ks ) := x + ẋ
e (τ1 ; k1 , k2 , . . . , ks ) dτ1
t

To obtain the state at the collocation point (t + ci h), we just need to


evaluate x̃(t + ci h; x, k1 , k2 , . . . , ks ), which is given by the following integral

x + ∫_t^{t+ci h} x̃̇(τ1 ; k1 , k2 , . . . , ks ) dτ1 = x + ∑_{j=1}^{s} kj h ∫_0^{ci} Lj (σ ) dσ,    with aij := ∫_0^{ci} Lj (σ ) dσ

Note that the integrals over the Lagrange basis polynomials depend
only on the relative positions of the collocation time points, and directly
yield the coefficients aij . Likewise, to obtain the coefficients bi , we
evaluate x̃(t + h; x, k1 , k2 , . . . , ks ), which is given by

x + ∫_t^{t+h} x̃̇(τ; k1 , k2 , . . . , ks ) dτ = x + ∑_{i=1}^{s} ki h ∫_0^1 Li (σ ) dσ,    with bi := ∫_0^1 Li (σ ) dσ
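The coefficients aij and bi can thus be computed numerically from the collocation points by integrating the Lagrange basis polynomials. A minimal sketch using the standard polynomial routines poly, polyint, and polyval is given below; the function name collocation_coeff is an assumption made for illustration.

    % Butcher coefficients of a collocation method from collocation points c(1..s).
    function [A, b] = collocation_coeff(c)
      s = numel(c);
      A = zeros(s, s);  b = zeros(s, 1);
      for j = 1:s
        others = c([1:j-1, j+1:s]);
        Lj = poly(others)/prod(c(j) - others);   % coefficients of L_j(sigma)
        LjInt = polyint(Lj);                     % antiderivative with value 0 at 0
        for i = 1:s
          A(i, j) = polyval(LjInt, c(i));        % a_ij = integral from 0 to c_i of L_j
        end
        b(j) = polyval(LjInt, 1);                % b_j = integral from 0 to 1 of L_j
      end
    end

For c = [1/2 - sqrt(3)/6, 1/2 + sqrt(3)/6] this reproduces the GL4 tableau given below.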

In Figure 8.3, the difference between the exact solution x(τ) and the
collocation polynomial x e (τ) as well as the difference between their
1 The Lagrange basis polynomials are defined by
Y (σ − cj )
Li (σ ) :=
(ci − cj )
1≤j≤s, j≠i
Figure 8.3: Polynomial approximation x̃1(t) and true trajectory x1(t) of the first state and its derivative, computed at the first integration step of the GL4 collocation method applied to the stiff ODE from Example 8.4. Note that the accuracy of the polynomial at the end of the interval is significantly higher than in the interior. The result of this first GL4 step can also be seen on the right side of Figure 8.4.

In Figure 8.3, the difference between the exact solution x(τ) and the
collocation polynomial x̃(τ), as well as the difference between their
time derivatives, is visualized for a collocation method with s = 2
collocation points (GL4) applied to the ODE from Example 8.4. Note that
in this example, x̃̇(τ; k1 , k2 , . . . , ks ) is a polynomial of order one, i.e., an
affine function, and its integral, x̃(τ; x, k1 , k2 , . . . , ks ), is a polynomial
of order two.
The Butcher tableau of three popular collocation methods is

Implicit Euler:
  1 | 1
    | 1

Midpoint rule (GL2):
  1/2 | 1/2
      | 1

Gauss-Legendre of order 4 (GL4):
  1/2 − √3/6 | 1/4           1/4 − √3/6
  1/2 + √3/6 | 1/4 + √3/6    1/4
             | 1/2           1/2

An interesting remark is that the highest order that an s-stage implicit


Runge-Kutta method can achieve is given by 2s, and that the Gauss-
Legendre collocation methods achieve this order, due to a particularly
smart choice of collocation points (namely as roots of the orthogonal
Legendre polynomials, following the idea of Gaussian quadrature). The
midpoint rule is a Gauss-Legendre method of second order (GL2). The
Gauss-Legendre methods, like many other popular collocation meth-
ods, are A-stable. Some methods, such as the Radau IIA collocation
methods, have even stronger stability properties (they are also L-stable),
and are often preferable for stiff problems. All collocation methods
need to solve a nonlinear system of equations in ns dimensions in each
step, which can become costly for large state dimensions and many
stages.

Example 8.4: Implicit integrators for a stiff ODE system


We consider the following ODE

ẋ = Ax − 500 x (|x|² − 1)

with A and initial conditions as before in Example 8.3. In contrast to


the previous example, this ODE is nonlinear and stiff, due to the ad-
ditive nonlinear term −500 x (|x|² − 1). This term is zero only if the
norm of x is one, i.e., if the state lies on the unit circle. If not, the
state is strongly pushed toward the unit circle. This makes the system
a stiff ODE. As we start at [1, 0]′ , the exact solution lies again on the
unit circle, and also ends at [1, 0]′ . For comparison, we solve the initial
value problem with three implicit integration methods, all of colloca-
tion type (implicit Euler, GL2, GL4). To have an approximate measure of
the computational costs of the different methods, we denote by M the
total number of collocation points on the time horizon. The results are
shown in Figure 8.4. On the left-hand side, the different order behav-
ior is observed. On the right-hand side, the trajectories resulting from
a total of M = 10 collocation points are shown for the three different
methods. In Figure 8.3, the first step of the GL4 method is visualized in
detail, showing both the trajectory of the first state as well as its time
derivative, together with their polynomial approximations. □

Figure 8.4: Performance of implicit integration methods on a stiff ODE.
(a) Accuracy vs. number of collocation points M. (b) Simulation result
for M = 10 points.

8.2.4 Differential Algebraic Equations

Some system models contain not only differential but also algebraic
equations, and therefore belong to the class of differential algebraic
equations (DAEs). The algebraic equations might, for example, reflect
conservation laws in chemical reaction models or kinematic constraints
in robot models. DAE models come in many different forms, some
of which are easier to treat numerically than others. One particularly
favorable class of DAE are the semiexplicit DAE of index one, which can
be written as

ẋ = f (t, x, z) (8.9a)
0 = g(t, x, z) (8.9b)

Here, the differential states x ∈ Rn are accompanied by algebraic states


z ∈ Rnz , and the algebraic states are implicitly determined by the alge-
braic equations (8.9b). Here, the number of algebraic equations is equal
to the number of algebraic states, i.e., g : R × Rn × Rnz → Rnz , such
that for fixed t and x, the algebraic equation (8.9b) forms a nonlinear
system of nz equations for nz unknowns.
The assumption of index one requires the Jacobian matrix of g with
respect to z to be invertible at all points of interest. The fact that ẋ
appears alone on the left side of the differential equation (8.9a) makes
the DAE semiexplicit. An interesting observation is that it is possible to
reduce a semiexplicit DAE of index one to an ODE if one finds an explicit
symbolic expression z∗ (t, x) for the implicit function defined by g(t, x,
z∗ (t, x)) = 0. The resulting ODE that is equivalent to the original DAE
is given by ẋ = f (t, x, z∗ (t, x)). Usually, this reduction from an index-
one DAE to an ordinary differential equation is not possible analytically.
A numerical computation of z∗ (t, x) is always possible in principle, but

requires the use of an underlying root-finding method. This way it is


possible to solve a DAE with explicit integration methods. For implicit
integration methods, however, one can simply augment the nonlinear
equation system by the algebraic equations g at all evaluation points
of the right-hand-side of the differential function f , and then rely on
the root-finding method of the integrator. For this reason, and because
they are often stiff, DAE are usually addressed with implicit integrators.
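As a minimal sketch of this augmentation (assuming Octave/MATLAB with fsolve available; the functions f and g below are hypothetical stand-ins chosen only to make the sketch self-contained), one implicit Euler step for a semiexplicit index-one DAE of the form (8.9) simply appends the algebraic state z to the unknowns of the step equation.

% Sketch: one implicit Euler step for a semiexplicit index-one DAE (8.9).
f = @(t, x, z) -x + z;                          % hypothetical differential right-hand side
g = @(t, x, z) x + z - 1;                       % hypothetical algebraic equation, dg/dz invertible
h = 0.1; t = 0; x0 = 1; z0 = 0;
res = @(w) [w(1) - x0 - h*f(t+h, w(1), w(2));   % implicit Euler residual for x
            g(t+h, w(1), w(2))];                % algebraic condition determining z
w = fsolve(res, [x0; z0]);                      % solve the augmented nonlinear system
xnext = w(1); znext = w(2);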

8.2.5 Integrator Adaptivity

Many practical integration methods use an adaptive stepsize selection


to attain a good trade-off between numerical accuracy and computa-
tional effort. Instead of performing steps of equal length h, adaptive
methods vary h in each step. Usually, they try to keep an estimate of
the local error constant. The details are beyond our interest here, but
we note that integrator adaptivity can be a crucial feature for the ef-
ficiency of nonlinear MPC implementations, in particular for the long
simulation intervals which appear when one appends a prediction hori-
zon at the end of the control horizon. On the other hand, integrator
adaptivity needs to be treated with care when numerical derivatives of
the simulation result are computed, as discussed in Section 8.4.6.

8.3 Solving Nonlinear Equation Systems


We have seen that an important subtask within numerical simulation—
as well as in numerical optimization—is the solution of nonlinear equa-
tion systems. In this section, we therefore discuss the basic technolo-
gies that make it possible to solve implicit equation systems with thou-
sands of unknowns within a few milliseconds. We start with linear
equations, and then proceed to nonlinear equations and their solution
with Newton-type methods.

8.3.1 Linear Systems

Solving a linear system of equations Az = b with a square invert-


ible matrix A ∈ Rnz ×nz is an easy task in the age of digital comput-
ers. The direct solution of the system requires only two computational
steps: first, a factorization of the matrix A, for example, a lower-upper-
factorization (LU-factorization) that yields a lower-triangular matrix L
and an upper-triangular matrix U such that LU = A. Second, one

needs to perform a forward and a back substitution, yielding the so-


lution as z = U⁻¹(L⁻¹b). The computation of the LU-factorization, or
LU-decomposition, requires (2/3)nz³ floating-point operations (FLOPs),
while the forward and back substitution together require nz² operations.
Additional row or column permutations—in a process called pivoting—
usually need to be employed and improve numerical stability, but only
add little extra computational cost. The LU-decomposition algorithm
was introduced by Alan Turing (1912–1954), and can be traced back to
Gaussian elimination, after Carl Friedrich Gauss (1777–1855). Solving
a dense linear system with nz = 3000 variables needs about 18 · 10⁹
FLOPs, which on a current quadcore processor (2.9 GHz Intel Core i5)
need only 600 ms.
The runtime of the LU-decomposition and the substitutions can sig-
nificantly be reduced if the matrix A is sparse, i.e., if it has many more
zero than nonzero entries. Sparsity is particularly simple to exploit
in case of banded matrices, which have their nonzero entries only in
a band around the diagonal. Tailored direct methods also can exploit
other structures, like block sparsity, or symmetry of the matrix A. For
symmetric A, one usually performs a lower-diagonal-lower-transpose-
factorization (LDLT-factorization) of the form LDL′ = A (with lower-
triangular L and diagonal D), which reduces the computational cost by
a factor of two compared to an LU-factorization. For symmetric and
positive definite matrices A, one can even apply a Cholesky decomposi-
tion of the form LL′ = A, with similar costs as the LDLT-factorization.
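As a small Octave/MATLAB illustration, the two computational steps of the direct solution look as follows; the test matrix is an arbitrary well-conditioned example.

% Sketch: direct solution of A z = b via LU factorization and substitution.
nz = 3000;
A = randn(nz) + nz*eye(nz);     % a generic, well-conditioned test matrix
b = randn(nz, 1);
[L, U, P] = lu(A);              % factorization with row pivoting: P*A = L*U
z = U \ (L \ (P*b));            % forward substitution, then back substitution
norm(A*z - b)                   % small residual; same result as z = A\b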

For huge linear systems that cannot be addressed by direct factor-


ization approaches, there exist a variety of indirect or iterative solvers.
Linear system solving is one of the most widely used numerical tech-
niques in science and engineering, and the field of computational lin-
ear algebra is investigated by a vibrant and active research community.
Contrary to only a century ago, when linear system solving was a te-
dious and error-prone task, today we rarely notice when we solve a
linear equation, e.g., by using the backslash operator in MATLAB in the
expression A\b, because computational linear algebra is such a reliable
and mature technology.

8.3.2 Nonlinear Root-Finding Problems

A more difficult situation occurs when a nonlinear equation system


R(z) = 0 needs to be solved, for example, in each step of an im-
plicit Runge-Kutta method, or in nonlinear optimization. Depending

on the problem, one can usually not even be sure that a solution z0
with R(z0 ) = 0 exists. And if one has found a solution, one usually
cannot be sure that it is the only one. Despite these theoretical diffi-
culties with nonlinear root-finding problems, they are nearly as widely
formulated and solved in science and engineering as linear equation
systems.
In this section we therefore consider a continuously differentiable
function R : Rnz → Rnz , z ↦ R(z), where our aim is to solve the non-
linear equation
R(z) = 0
Nearly all algorithms to solve this system derive from an algorithm
called Newton’s method or Newton-Raphson method that is accredited
to Isaac Newton (1643–1727) and Joseph Raphson (about 1648–1715),
but which was first described in its current form by Thomas Simpson
(1710–1761). The idea is to start with an initial guess z0 , and to gener-
ate a sequence of iterates (zk)_{k=0}^{∞} by linearizing the nonlinear equation
at the current iterate

    R(zk) + (∂R/∂z)(zk) (z − zk) = 0

This equation is a linear system in the variable z, and if the Jacobian
J(zk) := (∂R/∂z)(zk) is invertible, we can explicitly compute the next iterate
as

    zk+1 = zk − J(zk)⁻¹ R(zk)
Here, we use the notation J(zk)⁻¹R(zk) as a shorthand for the algo-
rithm that solves the linear system J(zk)∆z = R(zk). In the actual
computation of a Newton step, the inverse J(zk)⁻¹ is never formed;
instead, one computes an LU-decomposition of J(zk) and performs a
forward and a back substitution, as described in the previous subsection.
More generally, we can use an invertible approximation Mk of the Ja-
cobian J(zk ), leading to the Newton-type methods. The general Newton-
type method iterates according to

    zk+1 = zk − Mk⁻¹ R(zk)

Depending on how closely Mk approximates J(zk ), the local conver-


gence can be fast or slow, or the sequence may even not converge. The
advantages of using an Mk that is different from J(zk ) could be that it
can be chosen to be invertible even if J(zk ) is not, or that computation
of Mk , or of its factorization, can be cheaper. For example, one could
reuse one matrix and its factorization throughout several Newton-type
iterations.

Figure 8.5: Newton-type iterations for solution of R(z) = 0 from
Example 8.5. Left: exact Newton method. Right: constant Jacobian
approximation.

Example 8.5: Finding a fifth root with Newton-type iterations


We find the zero of R(z) = z⁵ − 2 for z ∈ R. Here, the derivative is
(∂R/∂z)(z) = 5z⁴, such that the Newton method iterates

    zk+1 = zk − (5zk⁴)⁻¹ (zk⁵ − 2)

When starting at z0 = 2, the first step is given by z1 = 2 − (80)⁻¹(32 −
2) = 13/8, and the following iterates quickly converge to the solution
z∗ with R(z∗) = 0, as visualized in Figure 8.5 on the left side.
    Alternatively, we could use a Jacobian approximation Mk ≠ J(zk),
e.g., the constant value Mk = 80 corresponding to the true Jacobian at
z = 2. The resulting iteration would be

    zk+1 = zk − (80)⁻¹ (zk⁵ − 2)

When started at z0 = 2, the first iteration would be the same as for New-
ton's method, but then the Newton-type method with constant Jacobian
produces a different sequence, as can be seen on the right side of Fig-
ure 8.5. Here, the approximate method also converges; but in general,
when does a Newton-type method converge, and when it converges,
how quickly? □
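A minimal Octave/MATLAB sketch of the two iterations in Example 8.5 is the following.

% Sketch: exact Newton vs. Newton-type iteration with constant Jacobian Mk = 80.
R = @(z) z^5 - 2;
J = @(z) 5*z^4;
z_newton = 2; z_fixed = 2; M = J(2);
for k = 1:30
    z_newton = z_newton - R(z_newton)/J(z_newton);  % exact Newton step
    z_fixed  = z_fixed  - R(z_fixed)/M;             % fixed-Jacobian step
end
[z_newton, z_fixed, 2^(1/5)]   % Newton reaches 2^(1/5) in a few steps; the
                               % fixed-Jacobian variant converges only linearly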

8.3.3 Local Convergence of Newton-Type Methods

Next we investigate the conditions on R(z), z0 and on Mk required to


ensure local convergence of Newton-type iterations. In particular we
discuss the speed of convergence. In fact, even if we assume that a
sequence of iterates zk ∈ Rn converges to a solution point z∗ , i.e., if
zk → z∗ , the rate of convergence can be painstakingly slow or light-
ning fast. The speed of convergence can make the difference between
a method being useful or useless for practical computations. Math-
ematically speaking, a sequence (zk) is said to converge q-linearly if
there exist a positive integer k0, a positive real number cmax < 1, and a
sequence (ck)_{k=k0}^{∞} with ck ≤ cmax for all k ≥ k0, such that

    |zk+1 − z∗| ≤ ck |zk − z∗|                                        (8.10)

If in addition ck → 0, the sequence is said to converge q-superlinearly.
If in addition ck = O(|zk − z∗|), the sequence is said to converge q-
quadratically.²

Example 8.6: Convergence rates


We discuss and visualize four examples with zk ∈ (0, ∞) and zk → 0,
see Figure 8.6.
• zk = 1/2^k converges q-linearly: zk+1/zk = 1/2.

• zk = 0.99^k also converges q-linearly: zk+1/zk = 0.99. This example
  converges very slowly. In practice we desire cmax to be smaller
  than, say, 1/2.

• zk = 1/k! converges q-superlinearly, as zk+1/zk = 1/(k + 1).

• zk = 1/2^(2^k) converges q-quadratically, because zk+1/(zk)² =
  (2^(2^k))²/2^(2^(k+1)) = 1 < ∞. For k = 6, zk = 1/2^64 ≈ 0. This is a
  typical feature of q-quadratic convergence: often, convergence up to
  machine precision is obtained in about six iterations. □

2 The historical prefix “q” stands for “quotient,” to distinguish it from a weaker form
of convergence that is called “r-convergence,” where “r” stands for “root.”
Figure 8.6: Convergence of different sequences as a function of k.

Local convergence of a Newton-type method can be guaranteed by the


following classical result (see, e.g., Bock (1983) or Deuflhard (2011)),
which also specifies the rate of convergence.
Theorem 8.7 (Local contraction for Newton-type methods). Regard a
nonlinear continuously differentiable function R : D → Rnz defined on
an open domain D ⊂ Rnz and a solution point z∗ ∈ D with R(z∗ ) = 0.
We start the Newton-type iteration with the initial guess z0 ∈ D and
iterate according to zk+1 = zk − Mk⁻¹R(zk). The sequence (zk) converges
at least q-linearly to z∗ and obeys the contraction inequality

    |zk+1 − z∗| ≤ ( κk + (ω/2) |zk − z∗| ) |zk − z∗|                  (8.11)

if there exist constants ω ∈ [0, ∞), κmax ∈ [0, 1), and a sequence (κk)_{k=0}^{∞}
with κk ∈ [0, κmax], that satisfy for all zk and all z ∈ D the following
two inequalities

    |Mk⁻¹ (J(zk) − J(z))| ≤ ω |zk − z|        (Lipschitz condition)
    |Mk⁻¹ (J(zk) − Mk)|   ≤ κk                (compatibility condition)

and if the ball B := {z ∈ Rnz | |z − z∗| < 2(1 − κmax)/ω} is completely con-
tained in D and if z0 ∈ B. If in addition κk → 0, the sequence converges
q-superlinearly. If in addition κk = O(|zk − z∗|) or even κmax = 0, the
sequence converges q-quadratically.

Corollary 8.8 (Convergence of exact Newton’s method). For an exact


Newton’s method, the convergence rate is q-quadratic, because we have
Mk = J(zk ), i.e., κmax = 0.

8.3.4 Affine Invariance

An iterative method to solve a root-finding problem R(z) = 0 is called


affine invariant if affine basis transformations of the equations or vari-
ables do not change the resulting iterations. This is an important prop-
erty in practice. It is not unreasonable to ask that a good numerical
method should behave the same if it is applied to problems formulated
in different units or coordinate systems.
The exact Newton method is affine invariant, and also some popu-
lar Newton-type optimization methods like the Gauss-Newton method
for nonlinear least squares problems share this property. Their affine
invariance makes them insensitive to the chosen problem scaling, and
this is one reason why they are successful in practice. On the other
hand, a method that is not affine invariant usually needs careful scal-
ing of the model equations and decision variables to work well.

8.3.5 Globalization for Newton-Type Methods

The iterations of a Newton-type method can be regarded as the trajec-
tory of a nonlinear discrete time system, with the solution z∗ a fixed
point. This system is autonomous if Mk is constant or a function of z,
i.e., if Mk = M(zk). In this case, the discrete time system is given by
z+ = f(z) with f(z) := z − M(z)⁻¹R(z). When designing the Newton-
type method, one usually wants the solution z∗ to be a stable fixed point
with a large area of attraction.
usually can be guaranteed under conditions stated in Theorem 8.7, in
particular if the exact Jacobian is available. On the other hand, the area
of attraction for the full-step Newton-type methods described so far
is unfortunately not very large in practice, and Newton-type methods
usually need extra globalization features to make them globally conver-
gent from arbitrary initial guesses. Some globalization techniques are
based on a merit function that plays the role of a Lyapunov function to
be reduced in each iteration; others are based on a filter as a measure
of merit of a new iterate. To ensure progress from one iteration to
the next, some form of damping is applied that either reduces the un-
modified Newton-type step by doing a line-search along the proposed
direction, or changes the step computation by adding a trust-region

constraint. For a detailed description of globalization techniques, we


refer to textbooks on optimization such as Nocedal and Wright (2006).

8.4 Computing Derivatives

Whenever a Newton-type method is used for numerical simulation or


optimization, we need to provide derivatives of nonlinear functions
that exist as computer code. Throughout this section, we consider a
differentiable function F (u) with m inputs and p outputs y = F (u),
i.e., a function F : Rm → Rp . The main object of interest is the Jacobian
J(u) ∈ Rp×m of F at the point u, or some of its elements.
Among the many ways to compute the derivatives of F (u), the most
obvious would be to apply the known differentiation rules on paper for
each of its components, and then to write another computer code by
hand that delivers the desired derivatives. This process can become
tedious and error prone, but can be automated by using symbolic com-
puter algebra systems such as Maple or Mathematica. This symbolic
differentiation often works well, but typically suffers from two disad-
vantages. First, it requires the code to exist in the specific symbolic lan-
guage. Second, the resulting derivative expressions can become much
longer than the original function, such that the CPU time needed to
evaluate the Jacobian J(u) by symbolic differentiation can become sig-
nificantly larger than the CPU time to evaluate F (u).
In contrast, we next present three ways to evaluate the Jacobian J(u)
of any computer-represented function F (u) by algorithms that have
bounded costs: numerical differentiation, as well as the algorithmic
differentiation (AD) in forward mode and in reverse mode. All three
ways are based on the evaluation of directional derivatives of the form
J(u)u̇ with a vector u̇ ∈ Rm (forward directional derivatives used in
numerical differentiation and forward AD) or of the form ȳ ′ J(u) with
ȳ ∈ Rp (reverse directional derivatives used in reverse AD). When unit
vectors are used for u̇ or ȳ, the directional derivatives correspond to
columns or rows of J(u), respectively. Evaluation of the full Jacobian
thus needs either m forward derivatives or p reverse derivatives. Note
that in this section, the use of a dot or a bar above a vector as in u̇ and
ȳ just denotes another arbitrary vector with the same dimensions as
the original one.

8.4.1 Numerical Differentiation

Numerical differentiation is based on multiple calls of the function


F (u) at different input values. In its simplest and cheapest form, it
computes a forward difference approximation of J(u)u̇ for given u
and u̇ ∈ Rm by using a small but finite perturbation size t∗ > 0 as
follows

    ( F(u + t∗ u̇) − F(u) ) / t∗

The optimal size of t∗ for the forward difference approximation de-
pends on the numerical accuracy of the evaluations of F , which we
denote by ϵ > 0, and on the relative size of the second derivatives of F
compared to F , which we denote by L > 0. A detailed derivation leads
to the optimal choice

    t∗ ≈ √(ϵ/L)

While ϵ is typically known and given by the machine precision, i.e.,
ϵ = 10⁻¹⁶ for double-precision floating-point computations, the rela-
tive size of the second derivative L is typically not known, but can be
estimated. Often, L is just assumed to be of size one, resulting in the
choice t∗ = √ϵ, i.e., t∗ = 10⁻⁸ for double precision. One can show that
the accuracy of the forward derivative approximation is then also given
by √ϵ, i.e., one loses half of the valid digits compared to the function
evaluation. To compute the full Jacobian J(u), one needs to evaluate
m forward differences, for the m seed vectors u̇ = (1, 0, 0, . . .)′ , u̇ = (0,
1, 0 . . .)′ , etc. Together with the evaluation at the center point, one needs in
total (m + 1) evaluations of the function F . Thus, we can summarize
the cost for computation of the full Jacobian J (as well as the function
F ) by the statement

cost(F , J) = (1 + m) cost(F )
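A minimal Octave/MATLAB sketch of this forward difference Jacobian computation, applied to a hypothetical test function with m = 2 and p = 2, is the following.

% Sketch: forward difference approximation of the full Jacobian J(u).
F = @(u) [u(1)*exp(u(2)); u(1)^2 + sin(u(2))];   % hypothetical test function
u = [1; 2]; m = 2; p = 2;
tstar = sqrt(1e-16);              % perturbation size, assuming L of order one
y = F(u);                         % evaluation at the center point
J = zeros(p, m);
for i = 1:m
    du = zeros(m, 1); du(i) = 1;                  % i-th unit seed vector
    J(:, i) = (F(u + tstar*du) - y)/tstar;        % forward difference, i-th column
end
disp(J)                           % (1 + m) evaluations of F in total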

There exists a variety of more accurate, but also more expensive, forms
of numerical differentiation, which can be derived from polynomial in-
terpolation of multiple function evaluations of F . The easiest of these
are central differences, which are based on a positive and a negative
perturbation. Using such higher-order formulas with adaptive pertur-
bation size selection, one can obtain high-accuracy derivatives with nu-
merical differentiation, but at significant cost. One interesting way to
actually reduce the cost of the numerical Jacobian calculation arises if
the Jacobian is known to be sparse, and if many of its columns are struc-
turally orthogonal, i.e., have their nonzero entries at different locations.

To efficiently generate a full Jacobian, one can, for example, use the al-
gorithm by Curtis, Powell, and Reid (1974) that is implemented in the
FORTRAN routine TD12 from the HSL Mathematical Software Library
(formerly Harwell Subroutine Library). For details of sparse Jacobian
evaluations, we refer to the review article by Gebremedhin, Manne, and
Pothen (2005).
In summary, and despite the tricks to improve accuracy or effi-
ciency, one has to conclude that numerical differentiation often re-
sults in quite inaccurate derivatives, and its only—but practically
important—advantage is that it works for any black-box function that
can be evaluated on a given computer. Fortunately, there exists a dif-
ferent technology, called AD, that also has tight bounds on the com-
putational cost of the Jacobian evaluation, but avoids the numerical
inaccuracies of numerical differentiation. It is often even faster than
numerical differentiation, and in the case of reverse derivatives ȳ ′ J, it
can be tremendously faster. It does so, however, by opening the black
box.

8.4.2 Algorithmic Differentiation

We next consider a function F : Rm → Rp that is composed of a se-


quence of N elementary operations, where an elementary operation
acts on only one or two variables. We also introduce a vector x ∈ Rn
with n = m + N that contains all intermediate variables including the
inputs, x1 = u1 , x2 = u2 , . . . xm = um . While the inputs are given be-
fore the function is called, each elementary operation generates a new
intermediate variable, xm+i , for i = 1, . . . , N. Some of these intermedi-
ate variables are used as output y ∈ Rp of the code. This decompo-
sition into elementary operations is automatically performed in each
executable computer code, and best illustrated with an example.

Example 8.9: Function evaluation via elementary operations

We consider the function

    F(u1 , u2 , u3 ) = [u1 u2 u3 ,  sin(u1 u2 ) + exp(u1 u2 u3 )]′

with m = 3 and p = 2. We can decompose this function into N =
5 elementary operations that are preceded by m and followed by p
renaming operations, as follows


x1 = u1
x2 = u2
x3 = u3
x4 = x1 x2
x5 = sin(x4 )
(8.12)
x6 = x4 x3
x7 = exp(x6 )
x8 = x5 + x7
y1 = x6
y2 = x8
Thus, if the m = 3 inputs u1 , u2 , u3 are given, the N = 5 nontrivial
elementary operations compute the intermediate quantities x4 , . . . , x8 ,
and the sixth and eighth of the intermediate quantities are then used
as the output y = F (u) of our function. □
The idea of AD is to use the chain rule and differentiate each of the
elementary operations separately. There exist two modes of AD, the
forward mode and the reverse mode. Both can be derived in a mathe-
matically rigorous way by interpreting the computer function y = F (u)
as the output of an implicit function, as explained next.

8.4.3 Implicit Function Interpretation

Let us regard all equations that recursively define the intermediate


quantities x ∈ Rn for a given u ∈ Rm as one large nonlinear equa-
tion system
G(x, u) = 0 (8.13)
∂G
with G : Rn × Rm → Rn . Here, the partial derivative ∂x ∈ Rn×n is a
lower-triangular invertible matrix and ∂G ∂u ∈ R
n×m
turns out to be an
m-dimensional unit matrix augmented by zeros, which we will denote
by B. The function G defines an implicit function x ∗ : Rm → Rn , u ,
x ∗ (u) that satisfies G(x ∗ (u), u) = 0. The output y = F (u) is given by
the selection of some entries of x ∗ (u) via a selection matrix C ∈ Rp×n ,
i.e., the computer function is represented by the expression F (u) =
∗ ∂G dx ∗
Cx ∗ (u). The derivative dx du of the implicit function satisfies ∂x du +
∂G
∂u = 0 and is therefore given by
   
dx ∗ ∂G −1 ∂G ∂G −1
= − = − B
du ∂x |∂u
{z } ∂x
=:B
and the Jacobian of F is simply given by J(u) = C (dx∗/du)(u). The forward
directional derivative is given by

    J(u)u̇ = C ( −(∂G/∂x)⁻¹ B u̇ ) = C ẋ

Here, we have introduced the dot quantities ẋ that denote the direc-
tional derivative of x∗(u) into the direction u̇, i.e., ẋ = (dx∗/du) u̇. An effi-
cient algorithm to compute ẋ corresponds to the solution of a lower-
triangular linear equation system that is given by

    −(∂G/∂x) ẋ = B u̇                                                 (8.14)

Since the matrix ∂G/∂x is lower triangular, the linear system can be solved
by a forward sweep that computes the components of ẋ in the same
order as the elementary operations, i.e., it first computes ẋ1 , then ẋ2 ,
etc. This leads to the forward mode of AD.
    The reverse directional derivative, on the other hand, is given by

    ȳ′ J(u) = ȳ′ C ( −(∂G/∂x)⁻¹ B ) = x̄′ B

where we define the bar quantities x̄ that have a different meaning than
the dot quantities. For computing x̄, we need to also solve a linear
system, but with the transposed system matrix

    −(∂G/∂x)′ x̄ = C′ ȳ                                               (8.15)

Due to the transpose, this system involves an upper-triangular matrix


and can thus be solved by a reverse sweep, i.e., one first computes x̄n ,
then x̄n−1 , etc. This procedure leads to the reverse mode of AD.

Example 8.10: Implicit function representation

Let us regard Example 8.9 and find the corresponding function G(x, u)
as well as the involved matrices. The function G corresponds to the
first n = 8 rows of (8.12) and is given by

    G(x, u) = [ u1 − x1 ,  u2 − x2 ,  u3 − x3 ,  x1 x2 − x4 ,  sin(x4 ) − x5 ,
                x4 x3 − x6 ,  exp(x6 ) − x7 ,  x5 + x7 − x8 ]′

It is obvious that the nonlinear equation G(x, u) = 0 can be solved for


any given u by a simple forward elimination of the variables x1 , x2 ,
. . ., yielding the map x ∗ (u). This fact implies also the lower-triangular
structure of the Jacobian ∂G/∂x, which is given by

    ∂G/∂x =
      [ −1                                                    ]
      [  0   −1                                               ]
      [  0    0   −1                                          ]
      [  x2   x1   0   −1                                     ]
      [  0    0    0   cos(x4)   −1                           ]
      [  0    0    x4  x3         0   −1                      ]
      [  0    0    0   0          0   exp(x6)   −1            ]
      [  0    0    0   0          1   0          1   −1       ]

The derivative of G with respect to u is given by a unit matrix to which
zero rows are appended

    B := ∂G/∂u =
      [ 1   0   0 ]
      [ 0   1   0 ]
      [ 0   0   1 ]
      [ 0   0   0 ]
      [ 0   0   0 ]
      [ 0   0   0 ]
      [ 0   0   0 ]
      [ 0   0   0 ]

The identity y = Cx corresponds to the last p = 2 rows of (8.12), and


the matrix C ∈ Rp×n is therefore given by

    C = [ 0  0  0  0  0  1  0  0 ]
        [ 0  0  0  0  0  0  0  1 ]

The right-hand-side vectors in the equations (8.14) and (8.15) are given
by

    B u̇ = [u̇1 , u̇2 , u̇3 , 0, 0, 0, 0, 0]′    and    C′ȳ = [0, 0, 0, 0, 0, ȳ1 , 0, ȳ2 ]′

8.4.4 Algorithmic Differentiation in Forward Mode

The forward mode of AD computes ẋ by solving the lower-triangular


linear system (8.14) with a forward sweep. After the trivial definition of
the first m components of ẋ, it goes through all elementary operations
in the same order as in the original function to compute the compo-
nents of ẋ one by one. If an original line of code reads xk = φk (xi , xj ),
the corresponding line to compute ẋk by forward AD is simply given
by

    ẋk = (∂φk/∂xi)(xi , xj ) ẋi + (∂φk/∂xj)(xi , xj ) ẋj

In forward AD, the function evaluation and the derivative evaluation


can be performed simultaneously, if desired, eliminating the need to
store any internal information. The algorithm is best explained by look-
ing again at the example.

Example 8.11: Forward algorithmic differentiation

We differentiate the algorithm from Example 8.9. To highlight the rela-


tion to the original code, we list the original command again on the left
side, and show the algorithm to compute ẋ on the right side. For given
u = [u1 u2 u3 ]′ and u̇ = [u̇1 u̇2 u̇3 ]′ , the two algorithms proceed as

follows

x1 = u1 ẋ1 = u̇1
x2 = u2 ẋ2 = u̇2
x3 = u3 ẋ3 = u̇3
x4 = x1 x2 ẋ4 = x2 ẋ1 + x1 ẋ2
x5 = sin(x4 ) ẋ5 = cos(x4 )ẋ4
x6 = x4 x3 ẋ6 = x3 ẋ4 + x4 ẋ3
x7 = exp(x6 ) ẋ7 = exp(x6 )ẋ6
x8 = x5 + x7 ẋ8 = ẋ5 + ẋ7
y1 = x6 ẏ1 = ẋ6
y2 = x8 ẏ2 = ẋ8

The result of the original algorithm is y = [y1 y2 ]′ and the result of the
forward AD sweep is ẏ = [ẏ1 ẏ2 ]′ . If desired, one could perform both
algorithms in parallel, i.e., evaluate first the left side, then the right side
of each row consecutively. This procedure would allow one to delete
each intermediate variable and the corresponding dot quantity after its
last usage, making the memory demands of the joint evaluation just
twice as big as those of the original function evaluation. □
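The forward sweep of this example can also be written directly as executable Octave/MATLAB code; the input u and the seed direction u̇ below are arbitrary example values.

% Sketch: forward AD sweep of Example 8.11, evaluating y = F(u) and ydot = J(u)*udot.
u = [1; 2; 3]; udot = [1; 0; 0];           % example input and seed direction
x1 = u(1);      x1d = udot(1);
x2 = u(2);      x2d = udot(2);
x3 = u(3);      x3d = udot(3);
x4 = x1*x2;     x4d = x2*x1d + x1*x2d;
x5 = sin(x4);   x5d = cos(x4)*x4d;
x6 = x4*x3;     x6d = x3*x4d + x4*x3d;
x7 = exp(x6);   x7d = exp(x6)*x6d;
x8 = x5 + x7;   x8d = x5d + x7d;
y    = [x6; x8];                           % function value
ydot = [x6d; x8d];                         % forward directional derivative J(u)*udot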
One can see that the dot-quantity evaluations on the right-hand
side—which we call a forward sweep—are never longer than about twice
the original line of code. This is because each elementary operation de-
pends on at maximum two intermediate variables. More generally, it
can be proven that the computational cost of one forward sweep in
AD is smaller than a small constant times the cost of a plain function
evaluation. This constant depends on the chosen set of elementary
operations, but is usually much less than two, so that we conclude

cost(J u̇) ≤ 2 cost(F )

To obtain the full Jacobian J, we need to perform the forward sweep


several times, each time with the seed vector corresponding to one of
the m unit vectors in Rm . The m forward sweeps all could be per-
formed simultaneously with the evaluation of the function itself, so
that one needs in total one function evaluation plus m forward sweeps,
i.e., we have
cost(F , J) ≤ (1 + 2m) cost(F )
This is a conservative bound, and depending on the AD tool used the
cost of several combined forward sweeps can be significantly reduced,

and often become much cheaper than a finite difference approxima-


tion. Most important, the result of forward AD is exact up to machine
precision.

8.4.5 Algorithmic Differentiation in Reverse Mode

The reverse mode of AD computes x̄ by solving the upper-triangular


linear system (8.15) with a reverse sweep. It does so by first computing
the right-hand-side C ′ ȳ vector and initializing all bar quantities with
the respective values, i.e., it initially sets x̄ = C ′ ȳ. Then, the reverse
AD algorithm modifies the bar quantities by going through all elemen-
tary operations in reverse order. The value of x̄i is modified for each
elementary operation in which xi is involved. If two quantities xi and
xj are used in the elementary operation xk = φk (xi , xj ), then the cor-
responding two update equations are given by

    x̄i = x̄i + x̄k (∂φk/∂xi)(xi , xj )    and    x̄j = x̄j + x̄k (∂φk/∂xj)(xi , xj )

Again, the algorithm is best illustrated with the example.

Example 8.12: Algorithmic differentiation in reverse mode


We consider again the code from Example 8.9. In contrast to before
in Example 8.11, now we compute the reverse directional derivative
ȳ ′ J(u) for given [u1 u2 u3 ]′ and ȳ = [ȳ1 ȳ2 ]′ . After the forward
evaluation of the function, which is needed to define all intermediate
quantities, we need to solve the linear system (8.15) to obtain x̄. In the
example, this system is explicitly given by
    
    [ 1   0   0   −x2        0    0          0    0  ] [ x̄1 ]   [ 0  ]
    [ 0   1   0   −x1        0    0          0    0  ] [ x̄2 ]   [ 0  ]
    [ 0   0   1    0         0   −x4         0    0  ] [ x̄3 ]   [ 0  ]
    [ 0   0   0    1    −cos(x4) −x3         0    0  ] [ x̄4 ] = [ 0  ]
    [ 0   0   0    0         1    0          0   −1  ] [ x̄5 ]   [ 0  ]
    [ 0   0   0    0         0    1    −exp(x6)   0  ] [ x̄6 ]   [ ȳ1 ]
    [ 0   0   0    0         0    0          1   −1  ] [ x̄7 ]   [ 0  ]
    [ 0   0   0    0         0    0          0    1  ] [ x̄8 ]   [ ȳ2 ]

To solve this equation without forming the matrix explicitly, we process


the elementary operations in reverse order, i.e., one column after the

other, noting that the final result for each x̄i will be a sum of the right-
hand-side vector component C ′ ȳ and a weighted sum of the values x̄j
for those j > i which correspond to elementary operations that have
xi as an input. We therefore initialize all variables by x̄ = C ′ ȳ, which
results for the example in the initialization

x̄1 = 0 x̄5 = 0
x̄2 = 0 x̄6 = ȳ1
x̄3 = 0 x̄7 = 0
x̄4 = 0 x̄8 = ȳ2

In the reverse sweep, the algorithm updates the bar quantities in re-
verse order compared to the original algorithm, processing one column
after the other.

// differentiation of x8 = x5 + x7
x̄5 = x̄5 + x̄8
x̄7 = x̄7 + x̄8
// differentiation of x7 = exp(x6 )
x̄6 = x̄6 + x̄7 exp(x6 )
// differentiation of x6 = x4 x3
x̄4 = x̄4 + x̄6 x3
x̄3 = x̄3 + x̄6 x4
// differentiation of x5 = sin(x4 )
x̄4 = x̄4 + x̄5 cos(x4 )
// differentiation of x4 = x1 x2
x̄1 = x̄1 + x̄4 x2
x̄2 = x̄2 + x̄4 x1

At the very end, the algorithm sets

ū1 = x̄1
ū2 = x̄2
ū3 = x̄3

to read out the desired result ȳ ′ J(x) = [ū1 ū2 ū3 ]. Note that all three
of the components are returned by only one reverse sweep. □
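As with the forward mode, the reverse sweep of this example can be written directly as executable Octave/MATLAB code; the input u and the reverse seed ȳ below are arbitrary example values.

% Sketch: reverse AD sweep of Example 8.12, returning ubar = ybar'*J(u).
u = [1; 2; 3]; ybar = [1; 0];              % example input and reverse seed
x1 = u(1); x2 = u(2); x3 = u(3);           % forward evaluation of the function
x4 = x1*x2; x5 = sin(x4); x6 = x4*x3; x7 = exp(x6); x8 = x5 + x7;
x1b = 0; x2b = 0; x3b = 0; x4b = 0; x5b = 0; x7b = 0;
x6b = ybar(1); x8b = ybar(2);              % initialization xbar = C'*ybar
x5b = x5b + x8b;   x7b = x7b + x8b;        % from x8 = x5 + x7
x6b = x6b + x7b*exp(x6);                   % from x7 = exp(x6)
x4b = x4b + x6b*x3;   x3b = x3b + x6b*x4;  % from x6 = x4*x3
x4b = x4b + x5b*cos(x4);                   % from x5 = sin(x4)
x1b = x1b + x4b*x2;   x2b = x2b + x4b*x1;  % from x4 = x1*x2
ubar = [x1b, x2b, x3b]                     % one row of J(u) from a single sweep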
It can be shown that the cost of one reverse sweep of AD is less than
a small constant (which is certainly less than three) times the cost of a

function evaluation, i.e.,

cost(ȳ ′ J) ≤ 3 cost(F )

To obtain the full Jacobian of F , we need to call the reverse sweep p


times, with the seed vectors corresponding to the unit vectors in Rp ,
i.e., together with one forward evaluation, we have

cost(F , J) ≤ (1 + 3p) cost(F )

Remarkably, reverse AD can compute the full Jacobian at a cost that is


independent of the input dimension m. This is particularly advanta-
geous if p ≪ m, e.g., if we compute the gradient of a scalar function
like the objective in optimization. The reverse mode can be much faster
than what we can obtain by forward finite differences, where we always
need (m + 1) function evaluations. For example, to compute the gradi-
ent of a scalar function f : Rm → R when m = 1, 000, 000 and each call
of the function requires one second of CPU time, the finite difference
approximation of the gradient would take 1, 000, 001 seconds, while
the computation of the same quantity with the backward mode of AD
requires only four seconds (one call of the function plus one backward
sweep). Thus, besides being more accurate, reverse AD can also be
much faster than numerical finite differences. This astonishing fact is
also known as the “cheap gradient result” in the AD community, and
in the field of neural networks it is exploited in the back propagation
algorithm. The only disadvantage of the reverse mode of AD is that
we have to store all intermediate variables and partial derivatives, in
contrast to finite differences or forward AD.

Backward sweep for discrete time optimal control. In numerical op-


timal control we often have to differentiate a function that is the result
of a dynamic system simulation. If the system simulation is in discrete
time, one can directly apply the principles of AD to compute the de-
sired derivatives by the forward or the reverse mode. For evaluating
the gradient of the objective, the reverse mode is most efficient. If
the controls are given by u = [u(0)′ · · · u(N − 1)′ ]′ and the states
x(k) are obtained by a discrete time forward simulation of the form
x(k + 1) = f (x(k), u(k)) for k = 0, . . . , N − 1 started at x(0) = x0 , and
if the objective function is given by J(u) := Σ_{k=0}^{N−1} ℓ(x(k), u(k)) + V(x(N)),
then the backward sweep to compute ∇u J(u) performs the following

steps

x̄(N)′ = Vx (x(N))
for k = N − 1, N − 2, . . . , 0 (8.16)
    x̄(k)′ = ℓx (x(k), u(k)) + x̄(k + 1)′ fx (x(k), u(k))
ū(k)′ = ℓu (x(k), u(k)) + x̄(k + 1)′ fu (x(k), u(k))
end

The output of this algorithm is the vector ū = [ū(0)′ · · · ū(N − 1)′ ]′


which equals the gradient ∇u J(u). This method to compute the ob-
jective gradient in the sequential approach was well known in the field
of optimal control even before the field of algorithmic differentiation
developed. From a modern perspective, however, it is simply an ap-
plication of reverse AD to the algorithm that computes the objective
function.
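A minimal Octave/MATLAB sketch of this backward sweep is given below; the dynamics f, stage cost ℓ, terminal cost V, and their derivatives are hypothetical stand-ins chosen only to make the sketch self-contained.

% Sketch: objective gradient by forward simulation plus backward (adjoint) sweep (8.16).
A = [1 0.1; 0 1]; B = [0; 0.1]; N = 20;
f   = @(x, u) A*x + B*u;      fx = @(x, u) A;      fu = @(x, u) B;
ell = @(x, u) x'*x + u^2;     ellx = @(x, u) 2*x'; ellu = @(x, u) 2*u;
V   = @(x) 10*(x'*x);         Vx = @(x) 20*x';
u = 0.1*ones(N, 1); x = zeros(2, N+1); x(:,1) = [1; 0];
for k = 1:N                               % forward simulation x(k+1) = f(x(k), u(k))
    x(:,k+1) = f(x(:,k), u(k));
end
xbar = Vx(x(:,N+1));                      % xbar(N)' = Vx(x(N))
gradJ = zeros(N, 1);
for k = N:-1:1                            % backward sweep (8.16)
    gradJ(k) = ellu(x(:,k), u(k)) + xbar*fu(x(:,k), u(k));
    xbar     = ellx(x(:,k), u(k)) + xbar*fx(x(:,k), u(k));
end
% gradJ now contains the gradient of J(u) with respect to u(0), ..., u(N-1)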

8.4.6 Differentiation of Simulation Routines

When a continuous time system is simulated by numerical integration


methods and one wants to compute the derivatives of the state trajec-
tory with respect to initial values or controls, as needed in shooting
methods, there are many different approaches and many possible pit-
falls. While a complete textbook could be written on the differentiation
of just numerical integrators, we present and discuss only three popu-
lar approaches here.
External numerical differentiation (END). Probably the simplest ap-
proach to differentiate an integrator is to regard the integrator call as
a black box, and to compute the desired derivatives by numerical fi-
nite differences. Here one computes one nominal trajectory, and one
or more perturbed trajectories, depending on the desired number of
forward derivatives. This approach, called external numerical differ-
entiation (END), is easy to implement, but it is generally not recommended
because it suffers from the following disadvantages.
• It is typically inaccurate because integrator accuracies ϵint are well
above machine precision, e.g., ϵint ≈ 10−6 , such that the perturba-
tion size needs to be chosen rather large, in particular for adaptive
integrators.

• It usually is expensive because each call of the integrator for a per-


turbed trajectory creates some overhead, such as error control or
matrix factorizations, which can be avoided in other approaches.

• It can only compute forward derivatives.

The first disadvantage can be mitigated for explicit integrators with


fixed stepsize, where one is allowed to choose smaller perturbation
sizes, in the order of the square root of the machine precision. For this
special case, END becomes equivalent to the approach described next.
Internal numerical differentiation (IND). The idea behind internal
numerical differentiation (IND) (Bock, 1981) is to regard the numerical
integrator as a differentiable computer code in the spirit of algorithmic
differentiation (AD). Similar to END, it works with perturbed trajecto-
ries. What is different from END is that all perturbed trajectories are
treated in one single forward sweep, and that all adaptive integrator
components are switched off for the perturbed trajectories. Thus, for
an adaptive explicit integrator, the stepsize selection works only on the
nominal trajectory; once the stepsize is chosen, the same size also is
used for all perturbed trajectories.
For implicit integrators, where one performs Newton-type iterations
in each step, the philosophy of IND is to choose the sequence of itera-
tion matrices and numbers of Newton-type iterations for only the nom-
inal trajectory, and to regard the iteration matrices as constant for all
perturbed trajectories. Because all adaptive components are switched
off during the numerical differentiation process, one can regard the
integrator code as a function that evaluates its output with machine
precision. For this reason, the perturbation size can be chosen sig-
nificantly smaller than in END. Thus, IND is both more accurate and
cheaper than END.
Algorithmic differentiation of integrators. Another approach that
is related to IND is to directly apply the principles of AD to the integra-
tion algorithm. In an extreme case, one could just take the integrator
code and process it with an AD tool—this approach can work well for
explicit integrators with fixed stepsize, as we show in Example 8.13,
but otherwise needs to be applied with care to avoid the many possible
pitfalls of a blind application of AD. In particular, for adaptive integra-
tors, one needs to avoid the differentiation of the stepsize selection
procedure. If this simple rule is respected, AD in both forward and re-
verse modes can be easily applied to adaptive explicit integrators, and
is both efficient and yields highly accurate results.
For implicit integrators, one should also regard the number and type
of Newton-type iterations in each step as constant. Otherwise, the AD
tool also tries to differentiate the Jacobian evaluations and factoriza-

tions, which would create unnecessary overhead. When AD is imple-


mented in this way, i.e., if it respects the same guidelines as the IND
approach, its forward mode has similar costs, but yields more accurate
derivatives than IND. Depending on input and output dimensions, the
reverse mode can accelerate computations further.

8.4.7 Algorithmic and Symbolic Differentiation Software

A crucial property of many AD tools is that they are able to pro-


cess generic code from a standard programming language like C, C++,
MATLAB, or FORTRAN, with no or only minor modifications to the source
code. For example, the AD tools ADOL-C and CppAD can process
generic user-supplied C or C++ code. This is in contrast to computer al-
gebra systems such as Maple, Mathematica, or MATLAB’s Symbolic Math
Toolbox, which require the user to define the function to be differenti-
ated using symbolic expressions in a domain-specific language. A fur-
ther advantage of AD over symbolic differentiation is that it is able to
provide tight bounds on the length of the resulting derivative code, as
well as its runtime and memory requirements. On the other hand, some
symbolic tools—such as AMPL or CasADi—make use of AD internally,
so the performance differences between algorithmic and symbolic dif-
ferentiation can become blurry.
An overview of nearly all available AD tools is given at
www.autodiff.org. Most AD tools implement both the forward and re-
verse mode of AD, and allow recursive application of AD to generate
higher-order derivatives. Some AD tools automatically perform graph-
coloring strategies to reduce the cost of Jacobian evaluations, similar
to the sparse numerical differentiation algorithm by Curtis et al. (1974)
mentioned before in the context of numerical differentiation. We refer
to the textbook on algorithmic differentiation by Griewank and Walther
(2008) for an in-depth analysis of the different concepts of AD.

8.4.8 CasADi for Optimization

Many of the computational exercises in this text use the open-source


tool CasADi, which implements AD on user-defined symbolic expres-
sions. CasADi also provides standardized interfaces to a variety of
numerical routines: simulation and optimization, and solution of lin-
ear and nonlinear equations. A key feature of these interfaces is that
every user-defined CasADi function passed to a numerical solver au-
tomatically provides the necessary derivatives to this solver, without

any additional user input. Often, the result of the numerical solver it-
self can be interpreted as a differentiable CasADi function, such that
derivatives up to any order can be generated without actually differen-
tiating the source code of the solver. Thus, concatenated and recursive
calls to numerical solvers are possible and still result in differentiable
CasADi functions.
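As a minimal sketch (assuming CasADi is installed and used from Octave or MATLAB), the exact Jacobian of the function from Example 8.9 can be obtained by AD on a symbolic CasADi expression as follows.

% Sketch: exact Jacobian of the function from Example 8.9 via CasADi.
u = casadi.SX.sym('u', 3);
y = [u(1)*u(2)*u(3); sin(u(1)*u(2)) + exp(u(1)*u(2)*u(3))];
J = casadi.Function('J', {u}, {jacobian(y, u)});   % AD-generated Jacobian function
disp(full(J([1; 2; 3])))                           % evaluate J(u) at u = (1, 2, 3)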
CasADi is written in C++, but allows user input to be provided from
either C++, Python, Octave, or MATLAB. When CasADi is used from the
interpreter languages Python, Octave, or MATLAB, the user does not have
any direct contact with C++; but because the internal handling of all
symbolic expressions as well as the numerical computations are per-
formed in a compiled environment, the speed of simulation or op-
timization computations is similar to the performance of compiled
C-code. One particularly powerful optimization solver interfaced to
CasADi is IPOPT, an open-source C++ code developed and described
by Wächter and Biegler (2006). IPOPT is automatically provided in the
standard CasADi installation. For more information on CasADi and
how to install it, we refer the reader to casadi.org. Here, we illustrate
the use of CasADi for optimal control in a simple example.

Example 8.13: Sequential optimal control using CasADi from Octave


In the following example we formulate and solve a simple nonlinear
MPC problem. The problem is formulated and solved by the sequential
approach in discrete time, but the discrete time dynamics are the result
of one step of an integrator applied to a continuous time ordinary dif-
ferential equation (ODE). We go through the example problem and the
corresponding solution using CasADi from Octave, which works with-
out changes from MATLAB. The code is available from the book website as
the file casadi-example-mpc-book-1.m along with a Python version
of the same code, casadi-example-mpc-book-1.py.
As a first step, we define the ODE describing the system, which is
given by a nonlinear oscillator described by the following ODE with
x ∈ R2 and u ∈ R
" # " #
d x1 x2
=
dt x2 −x1 − x13 + u
| {z }
=:fc (x,u)

with the initial condition x(0) = [0, 1]′ . We can encode this in Oc-
tave as follows

% Continuous time dynamics


f_c = @(x, u) [x(2); -x(1) - x(1)^3 + u];

To define the discrete time dynamics x + = f (x, u), we perform one


step of the classical Runge-Kutta method of fourth order. We choose
a stepsize of 0.2 seconds. Given x + = f (x, u), we can state an MPC
optimization problem with zero terminal constraint that we solve, as
follows
    minimize     Σ_{k=0}^{N−1}  x(k)′ [10 0; 0 5] x(k) + u(k)²        (8.17a)
      x, u

subject to x(0) = [1, 0]′ (8.17b)


x(k + 1) = f (x(k), u(k)), k = 0, 1, . . . , N − 1 (8.17c)
u(k) ∈ [−1, 1], k = 0, 1, . . . , N − 1 (8.17d)
x(N) = [0, 0]′ (8.17e)

For its numerical solution, we formulate this problem using the se-
quential approach, i.e., we regard only u as optimization variables and
eliminate x by a system simulation. This elimination allows us to gen-
erate a cost function c(u) and a constraint function G(u) such that the
above problem is equivalent to

    minimize     c(u)                                                 (8.18a)
        u
    subject to   u ∈ [−1, 1]^N                                        (8.18b)
                 G(u) = 0                                             (8.18c)

Here, c : RN → R and G : RN → R2 , with N = 50.


To code this into CasADi/Octave, we begin by declaring a symbolic
variable corresponding to u as follows

% Decision variable
N = 50;
U = casadi.SX.sym('U', N);

This symbolic variable can be used to construct expressions for c and


G

% System simulation
xk = [1; 0];
c = 0;
for k=1:N
% RK4 method
dt = 0.2;
k1 = f_c(xk, U(k));
k2 = f_c(xk+0.5*dt*k1, U(k));
k3 = f_c(xk+0.5*dt*k2, U(k));
k4 = f_c(xk+dt*k3, U(k));
xk = xk + dt/6.0*(k1 + 2*k2 + 2*k3 + k4);
% Add contribution to objective function
c = c + 10*xk(1)^2 + 5*xk(2)^2 + U(k)^2;
end
% Terminal constraint
G = xk - [0; 0];

The last remaining step is to pass the expressions for c and G to an


optimization solver, more specifically, to the nonlinear programming
solver IPOPT. The solver expects an optimization problem with lower
and upper bounds for all variables and constraints of the form
    minimize     f(x)
        x
    subject to   xlb ≤ x ≤ xub                                        (8.19)
                 glb ≤ g(x) ≤ gub
To formulate equality constraints in the CasADi syntax for NLPs, one
just sets the upper and lower bounds to equal values. The solver also
expects an initial guess x0 for the optimization variables (the initial
guess x0 for the NLP solver is not to be confused with the initial value
x0 for the state trajectory). The interface to the NLP solver uses the
keywords f and g for the functions f and g, x for the variables x, lbx
for xlb etc. The corresponding CasADi code to pass all data to the NLP
solver, call it, and retrieve the solution looks as follows.
% Create an NLP solver object
nlp = struct('x', U, 'f', c, 'g', G);
solver = casadi.nlpsol('solver', 'ipopt', nlp);
% Solve the NLP
solution = solver('x0', 0, 'lbx', -1, 'ubx', 1, ...
                  'lbg', 0, 'ubg', 0);
U_opt = solution.x;

8.5 Direct Optimal Control Parameterizations


Direct optimal control methods transform a continuous time optimal
control problem of the form (8.5) into a finite-dimensional optimization

problem. For convenience, we restate the OCP (8.5) in a form that re-
places the constraint sets Z and Xf by equivalent inequality constraints,
as follows
    minimize     ∫_0^T ℓc (x(t), u(t)) dt + Vf (x(T ))                (8.20a)
    x(·), u(·)

    subject to   x(0) = x0                                            (8.20b)
                 ẋ(t) = fc (x(t), u(t)),   t ∈ [0, T ]                (8.20c)
                 h(x(t), u(t)) ≤ 0,        t ∈ [0, T ]                (8.20d)
                 hf (x(T )) ≤ 0                                       (8.20e)

While the above problem has infinitely many variables and constraints,
the idea of direct optimal control methods is to solve instead a related
finite-dimensional problem of the general form

    minimize     F(w)
    w ∈ Rnw
    subject to   G(x0 , w) = 0                                        (8.21)
                 H(w) ≤ 0

This finite-dimensional optimization problem is solved for given initial


value x0 with any of the Newton-type optimization methods described
in the following section, Section 8.6. In this section, we are concerned
only with the transformation of the continuous problem (8.20) into a
finite-dimensional problem of form (8.21).
First, one chooses a finite representation of the continuous func-
tions, which is often called discretization. This encompasses three parts
of the OCP, namely the control trajectory (which is often represented by
a piecewise constant function), the state trajectory (which is often dis-
cretized using a numerical integration rule), and the path constraints
(which are often only imposed on some grid points). Second, one selects
the variables w that are finally passed to the optimization solver. These
can be all of the discretization variables (in the fully simultaneous or
direct transcription approach), but are often only a subset of the param-
eters that represent the control and state trajectories. The remaining
discretization parameters are hidden to the optimization solver, but
are implicitly computed during the optimization computations—such
as the state trajectories in the sequential approach, or the intermediate
quantities in a Runge-Kutta step. Next we present some of the most
widely used direct optimal control parameterizations.

8.5.1 Direct Single Shooting

Like most direct methods, the single-shooting approach first parame-


terizes the control trajectory with a finite-dimensional vector q ∈ Rnq
and sets u(t) = ũ(t; q) for t ∈ [0, T ]. One sometimes calls this step
“control vector parameterization.” One example for such a function
ũ : [0, T ] × Rnq → Rm is a polynomial of degree p, which requires
(p + 1) coefficients for each component of u(t) ∈ Rm . With this
choice, the resulting control parameter q would have the dimension
nq = (p + 1)m. A disadvantage of the polynomials—as of any other
“global” parameterization—is that the inherent problem sparsity due
to the dynamic system structure is inevitably lost. For this reason, and
also because it better corresponds to the discrete time implementation
of MPC, most often one chooses basis functions with local support, for
example, a piecewise constant control parameterization. In this case,
one divides the time horizon [0, T ] into N subintervals [ti , ti+1 ] with
0 = t0 < t1 < . . . < tN = T , and sets

    ũ(t; q) := qi    for t ∈ [ti , ti+1 )

For each interval, one needs one vector qi ∈ Rm , such that the to-

tal dimension of q = q0 , q1 , . . . , qN−1 is given by nq = Nm. In the
following, we assume this form of piecewise constant control parame-
terization.
Regarding the state discretization, the direct single-shooting
method relies on any of the numerical simulation methods described in
Section 8.2 to find an approximation x̃(t; x0 , q) of the state trajectory,
given the initial value x0 at t = 0 and the control trajectory ũ(t; q).
Often, adaptive integrators are chosen. In case of piecewise constant
controls, the integration needs to stop and restart briefly at the time
points ti to avoid integrating a nonsmooth right-hand-side function.
Due to state continuity, the state x̃(ti ; x0 , q) is both the initial state
of the interval [ti , ti+1 ] as well as the last state of the previous inter-
val [ti−1 , ti ]. The control values used in the numerical integrators on
both sides differ, due to the jump at ti , and are given by qi−1 and qi ,
respectively.
Evaluating the integral in the objective (8.20a) requires an integra-
tion rule. One option is to just augment the ODE system with a quadra-
ture state xquad (t) starting at xquad (0) = 0, and obeying the trivial dif-
ferential equation ẋquad (t) = ℓc (x(t), u(t)) that can be solved with
the same numerical solver as the standard ODE. Another option is to
evaluate ℓc (x̃(t; x0 , q), ũ(t; q)) on some grid and to apply another inte-
gration rule that is external with respect to the integrator. For example,
one can use a refinement of the grid that was used for the control dis-
cretization, where each interval [ti , ti+1 ] is divided into M equally sized
subintervals [τi,j , τi,j+1 ] with τi,j := ti + (j/M)(ti+1 − ti ) for j = 0, . . . , M
and i = 0, . . . , N − 1, and just apply a Riemann sum on each interval to
yield the objective function

    F(x0 , q) := Σ_{i=0}^{N−1} Σ_{j=0}^{M−1} ℓc (x̃(τi,j ; x0 , q), ũ(τi,j ; q)) (τi,j+1 − τi,j )
                 + Vf (x̃(T ; x0 , q))

In the context of the Gauss-Newton method for least squares integrals,


this second option is preferable because it allows one to easily obtain
a Gauss-Newton Hessian approximation from the sensitivities which
are provided by the integrator. Note that the fine grid evaluation as
described here requires an integrator able to output the states at ar-
bitrary locations; collocation methods, for example, have this ability.
If not, one must select points τi,j that coincide with the intermediate
steps or stages of the integrator.
The last discretization choice considers the path constraints (8.20d).
These often are evaluated on the same grid as the control discretization,
or, more generally, on a finer grid, e.g., the time points τi,j defined
above for the objective integral. Then, only finitely many constraints
h(x̃(τi,j ; x0 , q), ũ(τi,j ; q)) ≤ 0 are imposed for j = 0, . . . , M and i = 0,
1, . . . , N − 1. Together with the terminal constraint, one defines the
inequality constraint function
 
    H(x0 , q) := [ h(x̃(τ0,0 ; x0 , q), ũ(τ0,0 ; q))
                   h(x̃(τ0,1 ; x0 , q), ũ(τ0,1 ; q))
                   ...
                   h(x̃(τ1,0 ; x0 , q), ũ(τ1,0 ; q))
                   h(x̃(τ1,1 ; x0 , q), ũ(τ1,1 ; q))
                   ...
                   h(x̃(τN−1,M−1 ; x0 , q), ũ(τN−1,M−1 ; q))
                   hf (x̃(T ; x0 , q)) ]

If the function h maps to Rnh and hf to Rnhf , the function H maps to
RNMnh+nhf . The resulting finite-dimensional optimization problem in
single shooting is thus given by

    minimize     F(s0 , q)
      s0 , q
    subject to   s0 − x0 = 0                                          (8.22)
                 H(s0 , q) ≤ 0

Of course, the trivial equality constraint s0 − x0 = 0 could easily be


eliminated, and this is often done in single-shooting implementations.
In the real-time optimization context, however, it is beneficial to in-
clude also the parameter x0 as a trivially constrained variable s0 of
the single-shooting optimization problem, as we do here. This simple
trick is called initial-value embedding, and allows one to initialize the
optimization procedure with the past initial value s0 , for which an ap-
proximately optimal solution already exists; it also allows one to easily
obtain a linearized feedback control for new values of x0 , as we dis-
cuss in the next section. Also, for moving horizon estimation (MHE)
problems, one has to keep the (unconstrained) initial value s0 as an
optimization variable in the single-shooting optimization problem for-
mulation.
In summary, the single-shooting method is a fully sequential ap-
proach that treats all intermediate state values computed in the numer-
ical integration routine as hidden variables, and solves the optimization
problem in the space of control parameters q ∈ Rnq and initial values
s0 ∈ Rn only.
There are many different ways to numerically solve the optimization
problem (8.22) in the single-shooting approach using standard meth-
ods from the field of nonlinear programming. At first sight, the opti-
mization problem in the single-shooting method is dense, and usually
problem (8.22) is solved by a dense NLP solver. However, some single-
shooting approaches use a piecewise control parameterization and are
able to exploit the intrinsic sparsity structure of the OCP in the NLP
solution, as discussed in Section 8.8.5.
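To make the sequential character of single shooting concrete, the following minimal Python sketch (not part of the original text) simulates the hidden state trajectory with a fixed-step RK4 integrator for a piecewise constant control vector q, accumulates a Riemann-sum objective on the fine grid, and passes the resulting dense function to a general-purpose NLP solver. The scalar dynamics, costs, and dimensions are invented purely for illustration; a practical implementation would use an adaptive integrator with derivative propagation and a structure-exploiting solver.

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical example data: scalar ODE x' = fc(x, u), quadratic costs.
    fc = lambda x, u: -x + u            # assumed system dynamics
    lc = lambda x, u: x**2 + 0.1*u**2   # assumed stage cost
    Vf = lambda x: 10.0*x**2            # assumed terminal cost
    T, N, M = 1.0, 10, 4                # horizon, control intervals, substeps
    x0bar = 1.0                         # current state (parameter of the NLP)

    def simulate_and_cost(w):
        # w = [s0, q_0, ..., q_{N-1}]: single-shooting variables
        s0, q = w[0], w[1:]
        h = T / (N * M)                 # substep length
        x, F = s0, 0.0
        for i in range(N):
            for _ in range(M):
                F += lc(x, q[i]) * h    # Riemann sum on the fine grid
                k1 = fc(x, q[i])        # one RK4 step of the hidden state
                k2 = fc(x + 0.5*h*k1, q[i])
                k3 = fc(x + 0.5*h*k2, q[i])
                k4 = fc(x + h*k3, q[i])
                x += (h/6)*(k1 + 2*k2 + 2*k3 + k4)
        return F + Vf(x)

    # Initial-value embedding: keep s0 as a variable, constrained to x0bar.
    cons = [{"type": "eq", "fun": lambda w: w[0] - x0bar}]
    w_init = np.zeros(N + 1)
    sol = minimize(simulate_and_cost, w_init, constraints=cons)
    print(sol.x[1:])                    # optimized piecewise constant controls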

8.5.2 Direct Multiple Shooting

The direct multiple-shooting method makes exactly the same dis-


cretization choices as the single-shooting method with piecewise con-
trol discretization, but it keeps the states si ≈ x(ti ) at the interval
boundary time points as decision variables in the finite-dimensional
optimization problem. This allows one to completely decouple the nu-
merical integrations on the separate intervals. For simplicity, we regard
again a piecewise constant control parameterization that uses the con-


stant control value qi ∈ Rm on the interval [ti , ti+1 ]. On the same
interval, we then define the N trajectory pieces x̃_i(t; s_i, q_i) that are the
numerical solutions of the initial-value problems

x̃_i(t_i; s_i, q_i) = s_i,   \frac{d x̃_i}{dt}(t; s_i, q_i) = f_c(x̃_i(t; s_i, q_i), q_i),   t ∈ [t_i, t_{i+1}]

for i = 0, 1, . . . , N − 1. Note that each trajectory piece only depends


on the artificial initial value si ∈ Rn and the local control parameter
qi ∈ Rm .
Using again a possibly refined grid on each interval, with time points
τi,j ∈ [ti , ti+1 ] for j = 0, . . . , M, we can formulate numerical approx-
imations of the objective integrals \int_{t_i}^{t_{i+1}} ℓ_c(x̃_i(t; s_i, q_i), q_i) dt on each
interval by

ℓ_i(s_i, q_i) := \sum_{j=0}^{M-1} ℓ_c(x̃_i(τ_{i,j}; s_i, q_i), q_i) (τ_{i,j+1} − τ_{i,j})

The overall objective is thus given by \sum_{i=0}^{N-1} ℓ_i(s_i, q_i) + V_f(s_N). Note
that the objective terms ℓ_i(s_i, q_i) each depend again only on the lo-
cal initial values s_i and local controls q_i, and can thus be evaluated
independently from each other. Likewise, we discretize the path con-
straints, for simplicity on the same refined grid, by defining the local
inequality constraint functions

H_i(s_i, q_i) := \begin{bmatrix} h(x̃_i(τ_{i,0}; s_i, q_i), q_i) \\ h(x̃_i(τ_{i,1}; s_i, q_i), q_i) \\ \vdots \\ h(x̃_i(τ_{i,M−1}; s_i, q_i), q_i) \end{bmatrix}

for i = 0, 1, . . . , N − 1. These are again independent functions, with
H_i : R^n × R^m → R^{(M n_h)}. Using these definitions, and the concatenations
s := (s_0, s_1, . . . , s_N) and q := (q_0, . . . , q_{N−1}), one can state the finite-
dimensional optimization problem that is formulated and solved in
the direct multiple-shooting method

minimize_{s, q}   \sum_{i=0}^{N-1} ℓ_i(s_i, q_i) + V_f(s_N)                       (8.23a)
subject to        s_0 = x_0                                                       (8.23b)
                  s_{i+1} = x̃_i(t_{i+1}; s_i, q_i),   for i = 0, . . . , N − 1    (8.23c)
                  H_i(s_i, q_i) ≤ 0,   for i = 0, . . . , N − 1                   (8.23d)
                  h_f(s_N) ≤ 0                                                    (8.23e)

By a straightforward definition of problem functions F, G, and H, and
optimization variables w = [s_0' q_0' s_1' q_1' · · · s_{N−1}' q_{N−1}' s_N']', the above
problem can be brought into the form (8.21).
Note that, due to the presence of s as optimization variables,
the problem dimension is higher than in the single-shooting method,
namely nw = (N + 1)n + Nm variables compared with only (n + Nm)
in the single-shooting method. On the other hand, the additional Nn
equality constraints (8.23c) eliminate the additional Nn degrees of free-
dom, and the problems (8.23) and (8.22) are fully equivalent if the same
integration routines are used. Also note that the multiple-shooting
NLP (8.23) has exactly the same form as the discrete time optimal con-
trol problem (8.1). From this perspective, the single-shooting prob-
lem (8.22) is thus identical to the sequential formulation, compare (8.3),
and the multiple-shooting problem is identical to the simultaneous for-
mulation, compare (8.1), of the same discrete time OCP.
When comparing the continuous time problem (8.20) with the non-
linear program (NLP) (8.23) in direct multiple shooting, it is interest-
ing to note that the terminal cost and terminal constraint function are
identical, while the cost integrals, the system dynamics, and the path
constraints are all numerically approximated in the multiple-shooting
NLP.
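The following sketch (again with invented scalar data, and a deliberately crude one-point approximation of each ℓ_i) illustrates how the multiple-shooting variables and the matching conditions (8.23c) are assembled: each shooting interval is integrated independently from its own initial value s_i, and the defects s_{i+1} − x̃_i(t_{i+1}; s_i, q_i) are handed to a generic NLP solver as equality constraints.

    import numpy as np
    from scipy.optimize import minimize

    # Same hypothetical scalar example as in the single-shooting sketch.
    fc = lambda x, u: -x + u
    lc = lambda x, u: x**2 + 0.1*u**2
    Vf = lambda x: 10.0*x**2
    T, N, M = 1.0, 10, 4
    x0bar = 1.0
    h = T / (N * M)

    def integrate(si, qi):
        # Numerical solution x~_i(t_{i+1}; s_i, q_i) on one interval (RK4).
        x = si
        for _ in range(M):
            k1 = fc(x, qi); k2 = fc(x + 0.5*h*k1, qi)
            k3 = fc(x + 0.5*h*k2, qi); k4 = fc(x + h*k3, qi)
            x += (h/6)*(k1 + 2*k2 + 2*k3 + k4)
        return x

    def unpack(w):      # w = [s_0, q_0, s_1, q_1, ..., s_{N-1}, q_{N-1}, s_N]
        return w[0::2], w[1::2]

    def objective(w):   # crude one-point approximation of each stage cost
        s, q = unpack(w)
        return sum(lc(s[i], q[i])*(T/N) for i in range(N)) + Vf(s[N])

    def matching(w):    # residuals of (8.23c)
        s, q = unpack(w)
        return np.array([s[i+1] - integrate(s[i], q[i]) for i in range(N)])

    cons = [{"type": "eq", "fun": lambda w: w[0] - x0bar},   # (8.23b)
            {"type": "eq", "fun": matching}]                  # (8.23c)
    sol = minimize(objective, np.zeros(2*N + 1), constraints=cons)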
Multiple versus single shooting. The advantages of multiple com-
pared to single shooting are the facts that the evaluation of the in-
tegrator calls can be performed in parallel on the different subinter-
vals, that the state values s can also be used for initialization of the
optimization solver, and that the contraction rate of Newton-type op-
timization iterations is often observed to be faster, in particular for
nonlinear and unstable systems. Its disadvantage for problems with-
out state constraints is that globalization strategies cannot simply rely
on the objective function as merit function, but have to also monitor
the residuals of the dynamic constraints (8.23c), which can become


cumbersome. Some people also prefer the single-shooting method for
the simple reason that, as a sequential approach, it shows “feasible,”
or more exactly, “physical” state trajectories in each optimization iter-
ation, i.e., trajectories that satisfy, up to numerical integration errors,
the system’s differential equation.
We argue here, however, that this reason is not valid, because if
one wants to see “physical” trajectories during an optimization run,
one could numerically simulate and plot the system evolution for the
currently best available guess of the control trajectory q in any simul-
taneous method at comparably low additional cost. On the other hand,
in the presence of state constraints, the iterates of both sequential and
simultaneous methods always lead to slightly infeasible state trajec-
tories, while simultaneous methods often converge even faster in this
case. Thus, “feasibility” is not really a reason to prefer one approach
over the other.
A theoretical comparison of sequential and simultaneous (“lifted”)
formulations in the context of Newton-type optimization (Albersmeyer
and Diehl, 2010) shows that both methods can be implemented with
nearly identical computational cost per iteration. Also, it can be
shown—and observed in practice—that simultaneous formulations
lead to faster contraction rates if the nonlinearities of the concate-
nated system dynamics reinforce each other, e.g., if an exponential
x1 = exp(x0 ) is concatenated with an exponential x2 = exp(x1 ), lead-
ing to x2 = exp(exp(x0 )). On the other hand, the sequential approach
would lead to faster contraction if the concatenated nonlinearities miti-
gate each other, e.g., if a logarithm x2 = log(x1 ) follows the exponential
x1 = exp(x0 ) and renders the concatenation x2 = log(exp(x0 )) = x0
the identity (a linear map). In optimal control, one often observes that
the concatenation reinforces the nonlinearities, which renders the si-
multaneous approach favorable.

Exact expressions for linear systems with quadratic costs. In the


special case of linear systems fc (x, u) = Ac x + Bc u with quadratic
costs ℓc (x, u) = x ′ Qc x + u′ Rc u, the exact multiple-shooting functions
x
e i (ti+1 ; si , qi ) and ℓi (si , qi ) also turn out to be linear and quadratic,
and it is possible to compute them explicitly. Specifically

x
e i (ti+1 ; si , qi ) = Asi + Bqi
538 Numerical Optimal Control

with
Z (ti+1 −ti )
A = exp (Ac (ti+1 − ti )) and B = exp (Ac τ) Bc dτ
0

and " #′ " #" #


si Q S si
ℓi (si , qi ) =
qi S′ R qi
with more complicated formulas for Q, R, and S that can be found
in Van Loan (1978) or Pannocchia, Rawlings, Mayne, and Mancuso
(2015). Note that approximations of the above matrices also can be ob-
tained from the differentiation of numerical integration routines that
are applied to the linear ODE system, augmented by the quadratic cost
integral. The first-order derivatives of the final states yield A and B,
and the second-order derivative of the cost gives Q, R, and S. Because
these numerical computations can be done before an actual MPC im-
plementation, they can be performed offline and with high accuracy.
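For instance, A and B above can be computed to high accuracy from a single matrix exponential of an augmented matrix (the construction used by Van Loan); the continuous time data below are invented purely for illustration.

    import numpy as np
    from scipy.linalg import expm

    # Hypothetical continuous time data.
    Ac = np.array([[0.0, 1.0], [-2.0, -0.5]])
    Bc = np.array([[0.0], [1.0]])
    h = 0.1                       # interval length t_{i+1} - t_i
    n, m = Ac.shape[0], Bc.shape[1]

    # One exponential of an augmented matrix yields
    # A = exp(Ac*h) and B = int_0^h exp(Ac*tau) Bc dtau simultaneously.
    Mtilde = np.zeros((n + m, n + m))
    Mtilde[:n, :n] = Ac
    Mtilde[:n, n:] = Bc
    E = expm(Mtilde * h)
    A, B = E[:n, :n], E[:n, n:]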

8.5.3 Direct Transcription and Collocation Methods

The idea of simultaneous optimal control can be extended even further


by keeping all ODE discretization variables as optimization variables.
This fully simultaneous approach is taken in the family of direct tran-
scription methods, which directly transcribe all data of the continuous
time OCP (8.20) into an NLP without making use of numerical integra-
tion routines. Instead, they directly formulate the numerical simula-
tion equations as equalities of the optimization problem. One example
of a direct transcription method was already given in the introduction
of this chapter, in (8.6), where an explicit Euler integration rule was
employed. Because the state equations are equality constraints of the
optimization problem, direct transcription methods often use implicit
integration rules; they offer higher orders for the same number of state
discretization variables, and come with better stability properties for
stiff systems. Probably the most popular class of direct transcription
methods are the direct collocation methods.
Direct transcription by collocation. In direct collocation, the time
horizon [0, T ] is first divided into a typically large number N of collo-
cation intervals [ti , ti+1 ], with 0 = t0 < t1 < . . . < tN = T . On each of
these intervals, an implicit Runge-Kutta integration rule of collocation
type is applied to transcribe the ODE ẋ = fc (x, u) to a finite set of non-
linear equations. For this aim, we first introduce the states si ≈ x(ti ) at
the time points ti , and then regard the implicit Runge-Kutta equations
with M stages on the interval with length hi := (ti+1 − ti ), which create
an implicit relation between si and si+1 . We introduce additional vari-
ables Ki := [k′i,1 · · · k′i,M ]′ ∈ RnM , where ki,j ∈ Rn corresponds to the
state derivative at the collocation time point ti + cj hi for j = 1, . . . , M.
These variables Ki are uniquely defined by the collocation equations if
si and the control value qi ∈ Rm are given. We summarize the colloca-
tion equations as G_i^{RK}(s_i, K_i, q_i) = 0 with

G_i^{RK}(s_i, K_i, q_i) := \begin{bmatrix}
k_{i,1} − f_c(s_i + h_i(a_{11} k_{i,1} + . . . + a_{1,M} k_{i,M}), q_i) \\
k_{i,2} − f_c(s_i + h_i(a_{21} k_{i,1} + . . . + a_{2,M} k_{i,M}), q_i) \\
\vdots \\
k_{i,M} − f_c(s_i + h_i(a_{M1} k_{i,1} + . . . + a_{M,M} k_{i,M}), q_i)
\end{bmatrix}          (8.24)

The transition to the next state is described by s_{i+1} = F_i^{RK}(s_i, K_i, q_i)
with

F_i^{RK}(s_i, K_i, q_i) := s_i + h_i(b_1 k_{i,1} + . . . + b_M k_{i,M})
In contrast to shooting methods, where the controls are often held con-
stant across several integration steps, in direct collocation one usu-
ally allows one new control value qi per collocation interval, as we do
here. Even a separate control parameter for every collocation time point
within the interval is possible. This would introduce the maximum
number of control degrees of freedom that is compatible with direct
collocation methods and could be interpreted as a piecewise polyno-
mial control parameterization of order (M − 1).
Derivative versus state representation. In most direct collocation
implementations, one uses a slightly different formulation, where the
intermediate stage derivative variables K_i = [k_{i,1}' · · · k_{i,M}']' ∈ R^{nM} are
replaced by the stage state variables S_i = [s_{i,1}' · · · s_{i,M}']' ∈ R^{nM} that
are related to s_i and K_i via the linear map

s_{i,j} = s_i + h_i(a_{j1} k_{i,1} + . . . + a_{j,M} k_{i,M})   for j = 1, . . . , M          (8.25)
If c1 > 0, then the relative time points (0, c1 , . . . , cM ) are all different,
such that the interpolation polynomial through the (M + 1) states (si ,
si,1 , . . . , si,M ) is uniquely defined, which renders the linear map (8.25)
from (si , Ki ) to (si , Si ) invertible. Concretely, the values ki,j can be
obtained as the time derivatives of the interpolation polynomial at the
collocation time points. The inverse map, for j = 1, . . . , M, is given by
k_{i,j} = \frac{1}{h_i} ( D_{j,1}(s_{i,1} − s_i) + . . . + D_{j,M}(s_{i,M} − s_i) )          (8.26)

Interestingly, the matrix (Djl ) is the inverse of the matrix (amj ) from
the Butcher tableau, such that \sum_{j=1}^{M} a_{mj} D_{jl} = δ_{ml}. Inserting this in-
verse map into G_i^{RK}(s_i, K_i, q_i) from Eq. (8.24) leads to the equivalent
root-finding problem G_i(s_i, S_i, q_i) = 0 with

G_i(s_i, S_i, q_i) := \begin{bmatrix}
\frac{1}{h_i} ( D_{1,1}(s_{i,1} − s_i) + . . . + D_{1,M}(s_{i,M} − s_i) ) − f_c(s_{i,1}, q_i) \\
\frac{1}{h_i} ( D_{2,1}(s_{i,1} − s_i) + . . . + D_{2,M}(s_{i,M} − s_i) ) − f_c(s_{i,2}, q_i) \\
\vdots \\
\frac{1}{h_i} ( D_{M,1}(s_{i,1} − s_i) + . . . + D_{M,M}(s_{i,M} − s_i) ) − f_c(s_{i,M}, q_i)
\end{bmatrix}          (8.27)

Likewise, inserting the inverse map into F_i^{RK}(s_i, K_i, q_i) leads to the lin-
ear expression

F_i(s_i, S_i, q_i) := s_i + b̃_1(s_{i,1} − s_i) + . . . + b̃_M(s_{i,M} − s_i)

where the coefficient vector b̃ ∈ R^M is obtained from the RK weight
vector b by the relation b̃ = D' b. In the special case that c_M = 1, for
example in Radau IIA collocation methods, the vector b̃ becomes a unit
vector and the simple relation Fi (si , Si , qi ) = si,M holds. Because the
transition from (si , Ki ) to (si , Si ) just amounts to a basis change, affine
invariant Newton-type methods lead to identical iterates independent
of the chosen parameterization. However, using either the derivative
variables Ki or the state variables Si leads to different sparsity patterns
in the Jacobians and higher-order derivatives of the problem functions.
In particular, the Hessian of the Lagrangian is typically sparser if the
node state variables Si are used. For this reason, the state represen-
tation is more often used than the derivative representation in direct
collocation codes.
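The relations D = (a_{mj})^{−1} and b̃ = D'b are easy to check numerically; the following snippet does so for the two-stage Radau IIA tableau (standard coefficient values assumed) and confirms that b̃ is a unit vector because c_M = 1.

    import numpy as np

    # Butcher tableau of the two-stage Radau IIA method (c_M = 1).
    A_rk = np.array([[5/12, -1/12],
                     [3/4,   1/4]])
    b_rk = np.array([3/4, 1/4])

    D = np.linalg.inv(A_rk)      # (D_jl) is the inverse of (a_mj)
    b_tilde = D.T @ b_rk         # b~ = D' b

    print(D)                     # derivative weights used in (8.26)-(8.27)
    print(b_tilde)               # -> [0., 1.], i.e., F_i(s_i, S_i, q_i) = s_{i,M}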

Direct collocation optimization problem. The objective integrals
\int_{t_i}^{t_{i+1}} ℓ_c(x̃(t), q_i) dt on each interval are canonically approximated by
a weighted sum of evaluations of ℓ_c on the collocation time points, as
follows

ℓ_i(s_i, S_i, q_i) := h_i \sum_{j=1}^{M} b_j ℓ_c(s_{i,j}, q_i)

Similarly, one might choose to impose the path constraints on all col-
location time points, leading to the stage inequality function
 
H_i(s_i, S_i, q_i) := \begin{bmatrix} h(s_{i,1}, q_i) \\ h(s_{i,2}, q_i) \\ \vdots \\ h(s_{i,M}, q_i) \end{bmatrix}

The finite-dimensional optimization problem to be solved in direct


collocation has as optimization variables the sequence of external
states s := (s0 , s1 , . . . , sN ), the sequence of the internal states S :=
(S_0, S_1, . . . , S_{N−1}), as well as the sequence of local control parameters
q := (q_0, q_1, . . . , q_{N−1}), and is formulated as follows

minimize_{s, S, q}   \sum_{i=0}^{N-1} ℓ_i(s_i, S_i, q_i) + V_f(s_N)                  (8.28a)
subject to           s_0 = x_0                                                       (8.28b)
                     s_{i+1} = F_i(s_i, S_i, q_i),   for i = 0, . . . , N − 1        (8.28c)
                     0 = G_i(s_i, S_i, q_i),   for i = 0, . . . , N − 1              (8.28d)
                     H_i(s_i, S_i, q_i) ≤ 0,   for i = 0, . . . , N − 1              (8.28e)
                     h_f(s_N) ≤ 0                                                    (8.28f)

One sees that the above nonlinear programming problem in direct


collocation is similar to the NLP (8.23) arising in the direct multiple-
shooting method, but is augmented by the intermediate state variables
S and the corresponding algebraic constraints (8.28d). Typically, it is
sparser, but has more variables than the multiple-shooting NLP, not
only because of the presence of S, but also because N is larger since
it equals the total number of collocation intervals, each of which cor-
responds to one integration step in a shooting method. Typically, one
chooses rather small stage orders M, e.g., two or three, and large num-
bers for N, e.g., 100 or 1000. The NLPs arising in the direct collocation
method are large but sparse. If the sparsity is exploited in the opti-
mization solver, direct collocation can be an extremely efficient optimal
control method. For this reason, it is widely used.
Pseudospectral methods. The pseudospectral optimal control
method can be regarded as a special case of the direct collocation
method, where only one collocation interval (N = 1) is chosen, but with
a high-order M. By increasing the order M, one can obtain arbitrarily
high solution accuracies in case of smooth trajectories. The state


trajectory is represented by one global polynomial of order M that
is uniquely determined by the initial value s0 and the M collocation
node values s0,1 , . . . , s0,M . In this approach, the controls are typically
parameterized by one parameter per collocation node, i.e., by M
distinct values q_{0,1}, . . . , q_{0,M}, such that the control trajectories can be
regarded as represented by global polynomials of order (M − 1).
One gains a high approximation order, but at the cost that the typical
sparsity of the direct collocation problem is lost.

8.6 Nonlinear Optimization


After the finite-dimensional optimization problem is formulated, it
needs to be solved. From now on, we assume that a nonlinear pro-
gram (NLP) of the form (8.21) is formulated, with variable w ∈ Rnw and
parameter x0 ∈ Rn , which we restate here for convenience.

minimize_{w ∈ R^{n_w}}   F(w)
subject to               G(x_0, w) = 0                         (8.29)
                         H(w) ≤ 0

As before, we call the above optimization problem PN (x0 ) to indicate


its dependence on the parameter x0 and on the horizon length N. The
aim of the optimization procedure is to reliably and efficiently find an
approximation of the solution w 0 (x0 ) of PN (x0 ) for a given value of
x0 . Inside the MPC loop, the optimization solver is confronted with
a sequence of related values of the parameter x0 , a fact that can be
exploited in online optimization algorithms to improve speed and reli-
ability compared to standard offline optimization algorithms.
Assumptions and definitions. In this chapter, we make only two as-
sumptions on PN (x0 ): first, that all problem functions are at least twice
continuously differentiable, and second, that the parameter x0 enters
the equalities G linearly, such that the Jacobian matrices Gx and Gw are
independent of x0 . This second assumption is satisfied for all problem
formulations from the previous sections, because the initial value en-
ters only via the initial-value constraint s0 − x0 = 0. If one would en-
counter a problem where the parametric dependence is nonlinear, one
could always use the same trick that we used in the single-shooting
method and introduce a copy of the parameter as an additional opti-
mization variable s0 —which becomes part of w—and constrain it by
the additional constraint s0 − x0 = 0. Throughout the section, we often


make use of the linearization HL (·; w̄) of a function H(·) at a point w̄,
i.e., its first-order Taylor series, as follows

HL (w; w̄) := H(w̄) + Hw (w̄) (w − w̄)

Due to the linear parameter dependence of G, its Jacobian does not


depend on x0 , such that we can write

GL (x0 , w; w̄) = G(x0 , w̄) + Gw (w̄) (w − w̄)

We also heavily use the Lagrangian function defined by

L(x0 , w, λ, µ) := F (w) + λ′ G(x0 , w) + µ ′ H(w) (8.30)

whose gradient and Hessian matrix with respect to w are often used.
Again, they do not depend on x0 , and can thus be written as ∇w L(w,
λ, µ) and ∇2w L(w, λ, µ). Note that the dimensions of the multipliers, or
dual variables λ and µ, equal the output dimensions of the functions
G and H, which we denote by nG and nH . We sometimes call w ∈ Rnw
the primal variable. At a feasible point w, we say that an inequality
with index i ∈ {1, . . . , nH } is active if and only if Hi (w) = 0. The
linear independence constraint qualification (LICQ) is satisfied if and
only if the gradients of all active inequalities, ∇w Hi (w) ∈ Rnw , and the
gradients of the equality constraints, ∇w Gj (w) ∈ Rnw for j ∈ {1, . . . ,
nG }, form a linearly independent set of vectors.

8.6.1 Optimality Conditions and Perturbation Analysis

The first-order necessary conditions for optimality of the above opti-


mization problem are known as the Karush-Kuhn-Tucker (KKT) condi-
tions, which are formulated as follows.

Theorem 8.14 (KKT conditions). If w 0 is a local minimizer of the opti-


mization problem PN (x0 ) defined in (8.29) and if LICQ holds at w 0 , then
there exist multiplier vectors λ0 and µ 0 such that

∇_w L(w^0, λ^0, µ^0) = 0                             (8.31a)
G(x_0, w^0) = 0                                      (8.31b)
0 ≥ H(w^0) ⊥ µ^0 ≥ 0                                 (8.31c)

Here, the last condition, known as the complementarity condition,


states not only that all components of H(w 0 ) are negative and all com-
ponents of µ 0 are positive, but also that the two vectors are orthogonal,
which implies that the products µi0 Hi (w 0 ) are zero for each i ∈ {1, . . . ,
nH }. Thus, each pair (Hi (w 0 ), µi0 ) ∈ R2 must be an element of a nons-
mooth, L-shaped subset of R2 that comprises only the negative x-axis,
the positive y-axis, and the origin.
Any triple (w 0 , λ0 , µ 0 ) that satisfies the KKT conditions (8.31) and
LICQ is called a KKT point, independent of local optimality.
In general, the existence of multipliers such that the KKT condi-
tions (8.31) hold is just a necessary condition for local optimality of a
point w 0 at which LICQ holds. Only in the special case that the opti-
mization problem is convex can the KKT conditions be shown to be
both a necessary and a sufficient condition for global optimality. For
the general case, we need to formulate additional conditions on the
second-order derivatives of the problem functions to arrive at suffi-
cient conditions for local optimality. This is only possible after making
a few definitions.
Strictly active constraints and null space basis. At a KKT point (w,
λ, µ), an active constraint with index i ∈ {1, . . . , nH } is called weakly
active if and only if µi = 0 and strictly active if µi > 0. Note that for
weakly active constraints, the pair (Hi (w), µi ) is located at the origin,
i.e., at the nonsmooth point of the L-shaped set. For KKT points without
weakly active constraints, i.e., when the inequalities are either strictly
active or inactive, we say that the strict complementarity condition is
satisfied.
Based on the division into weakly and strictly active constraints, one
can construct the linear space Z of directions in which the strictly active
constraints and the equality constraints remain constant up to first or-
der. This space Z plays an important role in the second-order sufficient
conditions for optimality that we state below, and can be defined as the
null space of the matrix that is formed by putting the transposed gra-
dient vectors of all equality constraints and all strictly active inequality
constraints on top of each other. To define this properly at a KKT point
(w, λ, µ), we reorder the inequality constraints such that
 + 
H (w)
 0 
H(w) =  H (w) 
H − (w)

In this reordered view on the function H(w), the strictly active inequal-
ity constraints H + (w) come first, then the weakly active constraints
H 0 (w), and finally the inactive constraints H − (w). Note that the out-
put dimensions of the three functions add to nH . The set Z ⊂ Rnw is
now defined as the null space of the matrix

A := \begin{bmatrix} G_w(w) \\ H_w^+(w) \end{bmatrix} ∈ R^{n_A × n_w}

One can regard an orthogonal basis matrix Z ∈ Rnw ×(nw −nA ) of Z that
satisfies AZ = 0 and Z ′ Z = I and whose columns span Z. This al-
lows us to compactly formulate the following sufficient conditions for
optimality.
Theorem 8.15 (Strong second-order sufficient conditions for optimal-
ity). If (w 0 , λ0 , µ 0 ) is a KKT point and if the Hessian of its Lagrangian
is positive definite on the corresponding space Z, i.e., if

Z ′ ∇2w L(w 0 , λ0 , µ 0 )Z > 0 (8.32)

then the point w 0 is a local minimizer of problem PN (x0 ).


We call a KKT point that satisfies the conditions of Theorem 8.15 a
strongly regular KKT point. We should mention that there exists also a
weaker form of second-order sufficient conditions. We prefer to work
with the stronger variant because it does not only imply optimality
but also existence of neighboring solutions w 0 (x0 ) as a function of
the parameter x0 . Moreover, the solution map w 0 (x0 ) is directionally
differentiable, and the directional derivative can be obtained by the
solution of a quadratic program, as stated in the following theorem that
summarizes standard results from parametric optimization (Robinson,
1980; Guddat, Vasquez, and Jongen, 1990) and is proven in the specific
form below in Diehl (2001).
Theorem 8.16 (Tangential predictor by quadratic program). If (w̄, λ̄,
µ̄) is a strongly regular KKT point for problem PN (x̄0 ) (i.e., it satisfies
the conditions of Theorem 8.15) then there is a neighborhood N ⊂ Rn
around x̄0 such that for each x0 ∈ N the problem PN (x0 ) has a lo-
cal minimizer and corresponding strongly regular KKT point (w 0 (x0 ),
λ0 (x0 ), µ 0 (x0 )). Moreover, the map from x0 ∈ N to (w 0 (x0 ), λ0 (x0 ),
µ 0 (x0 )) is directionally differentiable at x̄0 , and the directional deriva-
tive can be obtained by the solution of the following quadratic pro-
gram
minimize_{w ∈ R^{n_w}}   F_L(w; w̄) + \frac{1}{2}(w − w̄)' ∇^2_w L(w̄, λ̄, µ̄)(w − w̄)
subject to               G_L(x_0, w; w̄) = 0                                  (8.33)
                         H_L(w; w̄) ≤ 0

More specifically, the solution (w^{QP}(x_0), λ^{QP}(x_0), µ^{QP}(x_0)) of the above
QP satisfies

\begin{bmatrix} w^{QP}(x_0) − w^0(x_0) \\ λ^{QP}(x_0) − λ^0(x_0) \\ µ^{QP}(x_0) − µ^0(x_0) \end{bmatrix} = O(|x_0 − x̄_0|^2)

8.6.2 Nonlinear Optimization with Equalities

When we solve an optimization problem without inequalities, the KKT


conditions simplify to

∇_w L(w^0, λ^0) = 0
G(x_0, w^0) = 0                                            (8.34)

This is a smooth root-finding problem that can be summarized as R(x0 ,


z) = 0 with z = [w ′ λ′ ]′ . Interestingly, if one regards the Lagrangian L
as a function of x0 and z, we have R(x0 , z) = ∇z L(x0 , z). The classical
Newton-Lagrange method addresses the above root-finding problem by
a Newton iteration of the form

zk+1 = zk + ∆zk with Rz (zk )∆zk = −R(x0 , zk ) (8.35)

To simplify notation and avoid that the iteration index k interferes with
the indices of the optimization variables, we usually use the following
notation for the Newton step

z+ = z̄ + ∆z with Rz (z̄)∆z = −R(x0 , z̄) (8.36)

Here, the old iterate and linearization point is called z̄ and the new it-
erate z+ . The square Jacobian matrix Rz (z) that needs to be factorized
in each iteration to compute ∆z has a particular structure and is given
by " #
∇2w L(w, λ) Gw (w)′
Rz (z) =
Gw (w) 0
This matrix is called the KKT matrix and plays an important role in
all constrained optimization algorithms. The KKT matrix is invertible
at a point z if the LICQ condition holds, i.e., Gw (w) has rank nG , and
if the Hessian of the Lagrangian is positive definite on the null space
of Gw (w), i.e., if Z ′ ∇2w L(w, λ, µ)Z > 0, for Z being a null space basis.
The matrix Z ′ ∇2w L(w, λ, µ)Z is also called the reduced Hessian. Note
that the KKT matrix is invertible at a strongly regular point, as well
as in a neighborhood of it, such that Newton’s method is locally well


defined. The KKT matrix is the second derivative of the Lagrangian L
with respect to the primal-dual variables z, and is therefore symmetric.
For this reason, it has only real eigenvalues, but it is typically indefinite.
At strongly regular KKT points, it has nw positive and nG negative
eigenvalues.
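The following sketch (with a small invented equality constrained example) performs full Newton-Lagrange steps (8.36): in each iteration it assembles the Lagrangian gradient, the constraint residual, and the KKT matrix, and solves one indefinite linear system for the primal-dual step.

    import numpy as np

    # Small assumed example: minimize w1^2 + w2^2  s.t.  w1^2 + w2 - 1 = 0.
    def grad_F(w):  return np.array([2*w[0], 2*w[1]])
    def G(w):       return np.array([w[0]**2 + w[1] - 1.0])
    def G_w(w):     return np.array([[2*w[0], 1.0]])
    def hess_L(w, lam):          # exact Hessian of the Lagrangian
        return np.array([[2.0 + 2.0*lam[0], 0.0], [0.0, 2.0]])

    w   = np.array([1.0, 1.0])   # current primal iterate
    lam = np.array([0.0])        # current dual iterate
    for _ in range(10):          # full-step Newton-Lagrange iterations
        grad_L = grad_F(w) + G_w(w).T @ lam
        R = np.concatenate([grad_L, G(w)])         # residual R(x0, z)
        KKT = np.block([[hess_L(w, lam), G_w(w).T],
                        [G_w(w),         np.zeros((1, 1))]])
        dz = np.linalg.solve(KKT, -R)              # Newton step on the KKT system
        w, lam = w + dz[:2], lam + dz[2:]
    print(w, lam)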
Quadratic program interpretation and tangential predictors. A
particularly simple optimization problem arises if the objective func-
tion is linear quadratic, F (w) = b′ w + (1/2)w ′ Bw, and the constraint
linear, G(w) = a + Aw. In this case, we speak of a quadratic program
(QP), and the KKT conditions of the QP directly form a linear system in
the variables z = [w' λ']', namely

\begin{bmatrix} B & A' \\ A & 0 \end{bmatrix} \begin{bmatrix} w \\ λ \end{bmatrix} = − \begin{bmatrix} b \\ a \end{bmatrix}

Due to the equivalence of the KKT conditions of the QP with a linear
system one can show that the new point z+ = z̄ + ∆z in the Newton
iteration for the nonlinear problem (8.34) also can be obtained as the
solution of a QP
minimize_{w ∈ R^{n_w}}   F_L(w; w̄) + \frac{1}{2}(w − w̄)' B^{ex}(z̄)(w − w̄)
subject to               G_L(x_0, w; w̄) = 0                              (8.37)

with Bex (z̄) := ∇2w L(w̄, λ̄, µ̄). If the primal-dual solution of the above
QP is denoted by w QP and λQP , one can easily show that setting
w + := w QP and λ+ := λQP yields the same step as the Newton iteration.
The interpretation of the Newton step as a QP is not particularly rele-
vant for equality constrained problems, but becomes a powerful tool in
the context of inequality constrained optimization. It directly leads to
the family of sequential quadratic programming (SQP) methods, which
are treated in Section 8.7.1. One interesting observation is that the
QP (8.37) is identical to the QP (8.33) from Theorem 8.16, and thus its
solution cannot only be used as a Newton step for a fixed value of x0 ,
but it can also deliver a tangential predictor for changing values of x0 .
This property is used extensively in continuation methods for nonlin-
ear MPC, such as the real-time iteration presented in Section 8.9.2.

8.6.3 Hessian Approximations

Even though the reduced exact Hessian is guaranteed to be positive def-


inite at regular points, it can become indefinite at nonoptimal points.
In that case Newton's method would fail because the KKT matrix
would become singular in one iteration. Also, the evaluation of the ex-
act Hessian can be costly. For this reason, Newton-type optimization
methods approximate the exact Hessian matrix Bex (z̄) by an approxi-
mation B̄ that is typically positive definite or at least positive semidef-
inite, and solve the QP

minimize_{w ∈ R^{n_w}}   F_L(w; w̄) + \frac{1}{2}(w − w̄)' B̄(w − w̄)
subject to               G_L(x_0, w; w̄) = 0                              (8.38)

in each iteration. These methods can be generalized to the case of


inequality constrained optimization problems and then fall into the
class of sequential quadratic programming (SQP) methods.
The local convergence rate of Newton-type optimization methods
can be analyzed directly with the tools from Section 8.3.3. Since the
difference between the exact KKT matrix J(zk ) and the Newton-type
iteration matrix Mk is due only to the difference in the Hessian approx-
imation, Theorem 8.7 states that convergence can occur only if the dif-
ference Bex (zk )− B̄k is sufficiently small, and that the linear contraction
factor κmax directly depends on this difference and becomes zero if the
exact Hessian is used. Thus, the convergence rate for an exact Hessian
SQP method is quadratic, and superlinear convergence occurs if the dif-
ference between exact and approximate Hessian shrinks to zero in the
relevant directions. Note that the algorithms described in this and the
following sections only approximate the Hessian matrix, but evaluate
the exact constraint Jacobian Gw (w̄) in each iteration.
The constrained Gauss-Newton method. One particularly useful
Hessian approximation is possible if the objective function F (w) is a
sum of squared residuals, i.e., if

F (w) = (1/2) |M(w)|2

for a differentiable function M : Rnw → RnM . In this case, the exact


Hessian Bex (z̄) is given by
\underbrace{M_w(w̄)' M_w(w̄)}_{=: B^{GN}(w̄)} + \sum_{j=1}^{n_M} M_j(w̄) ∇^2 M_j(w̄) + \sum_{i=1}^{n_G} λ̄_i ∇^2 G_i(w̄)

By taking only the first part of this expression, one obtains the Gauss-
Newton Hessian approximation BGN (w̄), which is by definition always
a positive semidefinite matrix. In the case that Mw (w̄) ∈ RnM ×nw has
rank nw , i.e., if nM ≥ nw and the nw columns are linearly indepen-
dent, the Gauss-Newton Hessian BGN (w̄) is even positive definite. Note
that BGN (w̄) does not depend on the multipliers λ, but the error with
respect to the exact Hessian does. This error would be zero if both the
residuals Mj (w̄) and the multipliers λi are zero. Because both can be
shown to be small at a strongly regular solution with small objective
function (1/2) |M(w)|2 , the Gauss-Newton Hessian BGN (w̄) is a good
approximation for problems with small residuals |M(w)|.
When the Gauss-Newton Hessian BGN (w̄) is used within a con-
strained optimization algorithm, as we do here, the resulting algo-
rithm is often called the constrained or generalized Gauss-Newton
method (Bock, 1983). Newton-type optimization algorithms with
Gauss-Newton Hessian converge only linearly, but their contraction rate
can be surprisingly fast in practice, in particular for problems with
small residuals. The QP subproblem that is solved in each iteration of
the constrained Gauss-Newton method can be shown to be equivalent
to
minimize_{w ∈ R^{n_w}}   (1/2) |M_L(w; w̄)|^2
subject to               G_L(x_0, w; w̄) = 0                              (8.39)

A particularly simple instance of the constrained Gauss-Newton


method arises if the objective function is itself already a positive defi-
nite quadratic function, i.e., if F (w) = (1/2)(w − wref )′ B(w − wref ). In
this case, one could define M(w) := B^{1/2}(w − w_{ref}) to see that the QP
subproblem has the same objective as the NLP. Generalizing this ap-
proach to nonquadratic, but convex, objectives and convex constraint
sets, leads to the class of sequential convex programming methods as
discussed and analyzed in Tran-Dinh, Savorgnan, and Diehl (2012).
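As a small illustration (with an invented residual function), the Gauss-Newton Hessian below is obtained from the residual Jacobian alone, dropping the curvature terms that involve the residuals and multipliers.

    import numpy as np

    # Hypothetical least squares residual M(w) with nM = 3, nw = 2.
    def M(w):
        return np.array([w[0] - 1.0, w[1] - 2.0, 0.1*w[0]*w[1]])
    def M_w(w):                        # Jacobian of the residuals
        return np.array([[1.0, 0.0],
                         [0.0, 1.0],
                         [0.1*w[1], 0.1*w[0]]])

    w_bar = np.array([0.5, 0.5])
    B_GN = M_w(w_bar).T @ M_w(w_bar)   # Gauss-Newton Hessian: always >= 0
    # The neglected terms sum_j M_j(w)*Hess(M_j)(w) and the constraint
    # curvature are small when residuals and multipliers are small, which is
    # why the approximation works well for small-residual problems.
    print(B_GN)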

Hessian update methods. Another way to obtain a cheap and posi-


tive definite Hessian approximation B̄ for Newton-type optimization is
provided by Hessian update methods. In order to describe them, we
reintroduce the iteration index k on the primal-dual variables z_k = [w_k' λ_k']'
and the Hessian matrix Bk at the k-th iteration, such that the QP to be
solved in each iteration is described by

minimize_{w ∈ R^{n_w}}   F_L(w; w_k) + \frac{1}{2}(w − w_k)' B_k (w − w_k)
subject to               G_L(x_0, w; w_k) = 0                              (8.40)
In a full-step method, the primal-dual solution w_k^{QP} and λ_k^{QP} of the above
QP is used as the next iterate, i.e., w_{k+1} := w_k^{QP} and λ_{k+1} := λ_k^{QP}. A Hessian
update formula uses the previous Hessian approximation Bk and the
Lagrange gradient evaluations at wk and wk+1 to compute the next
Hessian approximation B_{k+1}. Inspired by the directional derivative of
the function ∇w L(·, λk+1 ) in the direction sk := (wk+1 − wk ), which,
up to first order, should be equal to the finite difference approximation
yk := ∇w L(wk+1 , λk+1 ) − ∇w L(wk , λk+1 ), all Hessian update formulas
require the secant condition

Bk+1 sk = yk

One particularly popular choice among the many ways to obtain a matrix


Bk+1 that satisfies the secant condition is given by the Broyden-Fletcher-
Goldfarb-Shanno (BFGS) formula, which sets

B_{k+1} := B_k − \frac{B_k s_k s_k' B_k}{s_k' B_k s_k} + \frac{y_k y_k'}{y_k' s_k}

One often starts the update procedure with a scaled unit matrix, i.e.,
sets B0 := αI with some α > 0. It can be shown that for a positive defi-
nite Bk and for yk′ sk > 0, the matrix Bk+1 resulting from the BFGS for-
mula is also positive definite. In a practical implementation, to ensure
positive definiteness of Bk+1 , the unmodified update formula is only
applied if yk′ sk is sufficiently large, say if the inequality yk′ sk ≥ βsk′ Bk sk
is satisfied with some β ∈ (0, 1), e.g., β = 0.2. If it is not satisfied, the
update can either be skipped, i.e., one sets Bk+1 := Bk , or the vector yk
is first modified and then the BFGS update is performed with this mod-
ified vector. An important observation is that the gradient difference
yk can be computed with knowledge of the first-order derivatives of F
and G at wk and wk+1 , which are needed to define the linearizations FL
and GL in the QP (8.40) at the current and next iteration point. Thus, a
Hessian update formula does not create any additional costs in terms
of derivative computations compared to a fixed Hessian method (like,
for example, steepest descent); but it typically improves the conver-
gence speed significantly. One can show that Hessian update methods
lead to superlinear convergence under mild conditions.
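A minimal implementation of the safeguarded BFGS update described above might look as follows (the test data are invented; β = 0.2 follows the suggestion in the text).

    import numpy as np

    def bfgs_update(B, s, y, beta=0.2):
        """BFGS update of a positive definite Hessian approximation B with
        step s = w_{k+1} - w_k and Lagrange-gradient difference y; the update
        is skipped when the curvature condition y's >= beta*s'Bs fails."""
        Bs = B @ s
        if y @ s < beta * (s @ Bs):
            return B.copy()                 # skip to preserve definiteness
        return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

    # Tiny usage example with assumed data.
    B0 = np.eye(2)
    s = np.array([0.1, -0.2])
    y = np.array([0.3, -0.1])
    B1 = bfgs_update(B0, s, y)
    print(B1 @ s, y)                        # secant condition B_{k+1} s = y holds
    print(np.linalg.eigvalsh(B1))           # remains positive definite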

8.7 Newton-Type Optimization with Inequalities


The necessary optimality conditions for an equality constrained opti-
mization problem form a smooth system of nonlinear equations in the
primal-dual variables, and can therefore directly be addressed by New-


ton’s method or its variants. In contrast to this, the KKT conditions for
inequality constrained problems contain the complementarity condi-
tions (8.31c), which define an inherently nonsmooth set in the primal-
dual variable space, such that Newton-type methods can be applied
only after some important modifications. In this section, we present
two widely used classes of methods, namely sequential quadratic pro-
gramming (SQP) and nonlinear interior point (IP) methods.

8.7.1 Sequential Quadratic Programming

Sequential quadratic programming (SQP) methods solve in each itera-


tion an inequality constrained quadratic program (QP) that is obtained
by linearizing all problem functions

minimize_{w ∈ R^{n_w}}   F_L(w; w_k) + \frac{1}{2}(w − w_k)' B_k (w − w_k)
subject to               G_L(x_0, w; w_k) = 0                              (8.41)
                         H_L(w; w_k) ≤ 0

The above QP is a quadratic approximation of the nonlinear problem


P_N(x_0), and is denoted by P_N^{QP}(x_0; w_k, B_k) to express its dependence
on the linearization point w_k and the choice of Hessian approximation
B_k. In the full-step SQP method, the primal-dual solution z_k^{QP} = (w_k^{QP},
λ_k^{QP}, µ_k^{QP}) of the QP P_N^{QP}(x_0; w_k, B_k) is directly taken as the next iter-
ate, z_{k+1} = (w_{k+1}, λ_{k+1}, µ_{k+1}), i.e., one sets z_{k+1} := z_k^{QP}. Note that the
multipliers (λk+1 , µk+1 ) only have an influence on the next QP via the
Hessian approximation Bk+1 , and can be completely discarded in case a
multiplier-free Hessian approximation such as a Gauss-Newton Hessian
is used.
The solution of an inequality constrained QP is a nontrivial task, but
for convex QP problems there exist efficient and reliable algorithms that
are just treated here as a black box. To render the QP subproblem con-
vex, one often chooses positive semidefinite Hessian approximations
Bk .
Active set detection and local convergence. A crucial property of
SQP methods is that the set of active inequalities (the active set, in
short) is discovered inside the QP solver, and that the active set can
change significantly from one SQP iteration to the next. However, one
can show that the QP solution discovers the correct active set when
the linearization point wk is close to a strongly regular solution of the


NLP (8.29) at which strict complementarity holds. Thus, in the vicinity
of the solution, the active set remains stable, and, therefore, the SQP
iterates become identical to the iterates of a Newton-type method for
equality constrained optimization applied to a problem where all active
constraints are treated as equalities, and where all other inequalities are
discarded. Therefore, the local convergence results for general Newton-
type methods can be applied; and the SQP method shows quadratic
convergence in case of an exact Hessian, superlinear convergence in
case of Hessian updates, and linear convergence in case of a Gauss-
Newton Hessian.

Generalized tangential predictors in SQP methods. An appealing


property of SQP methods for problems that depend on a parameter x0
is that they deliver a generalized tangential predictor, even at points
where the active set changes, i.e., where strict complementarity does
not hold. More precisely, it is easily seen that the QP P_N^{QP}(x_0; w̄, B̄)
formulated in an SQP method, with exact Hessian B̄ = ∇2 L(z̄) at a
strongly regular solution z̄ = (w̄, λ̄, µ̄) of problem PN (x̄0 ), delivers the
tangential predictor of Theorem 8.16 for neighboring problems PN (x0 )
with x0 ≠ x̄0 (Diehl, 2001). A disadvantage of SQP methods is that they
require in each iteration the solution of an inequality constrained QP,
which is more expensive than solution of a linear system.
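As an illustration of the full-step iteration, the following sketch solves a tiny invented NLP by SQP with a (Gauss-Newton) Hessian; the QP subproblem (8.41) is handed to a generic solver here purely for convenience, whereas a practical SQP code would use a dedicated convex QP solver.

    import numpy as np
    from scipy.optimize import minimize

    # Small made-up problem:
    #   minimize (w1-2)^2 + (w2-1)^2
    #   s.t.     G(w) = w1^2 - w2 = 0,   H(w) = w1 + w2 - 2 <= 0.
    def F_grad(w): return np.array([2*(w[0] - 2.0), 2*(w[1] - 1.0)])
    def G(w):      return np.array([w[0]**2 - w[1]])
    def G_w(w):    return np.array([[2*w[0], -1.0]])
    def H(w):      return np.array([w[0] + w[1] - 2.0])
    def H_w(w):    return np.array([[1.0, 1.0]])

    w = np.array([1.0, 0.0])
    Bk = 2.0*np.eye(2)            # Gauss-Newton Hessian of this objective
    for _ in range(15):
        # QP subproblem (8.41) in the step d = w - w_k.
        cons = [{"type": "eq",   "fun": lambda d: G(w) + G_w(w) @ d},
                {"type": "ineq", "fun": lambda d: -(H(w) + H_w(w) @ d)}]
        qp = minimize(lambda d: F_grad(w) @ d + 0.5*d @ Bk @ d,
                      np.zeros(2), constraints=cons)
        w = w + qp.x              # full step, no globalization
    print(w)                      # approaches a KKT point of the NLP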

8.7.2 Nonlinear Interior Point Methods

Nonlinear interior point (IP) methods remove the nonsmoothness of


the KKT conditions by formulating an approximate, but smooth root-
finding problem. This smooth problem corresponds to the necessary
optimality conditions of an equality constrained optimization problem
that is an approximation of the original problem. In a first and trivial
step, the nonlinear inequalities H(w) ≤ 0 are reformulated into equal-
ity constraints H(w)+s = 0 by introduction of a slack variable s ∈ RnH
that is required to be positive, such that the equivalent new problem
has bounds of the form s ≥ 0 as its only inequality constraints. In the
second and crucial step, these bounds are replaced by a barrier term
of the form −τ \sum_{i=1}^{n_H} \log s_i with τ > 0 that is added to the objective.
This leads to a different and purely equality constrained optimization
problem given by
minimize_{w, s}   F(w) − τ \sum_{i=1}^{n_H} \log s_i
subject to        G(x_0, w) = 0                                   (8.42)
                  H(w) + s = 0

For τ → 0, the barrier term −τ log si becomes zero for any strictly posi-
tive si > 0 while it always grows to infinity for si → 0, i.e., on the bound-
ary of the feasible set. Thus, for τ → 0, the barrier function would be a
perfect indicator function of the true feasible set and one can show that
the solution of the modified problem (8.42) tends to the solution of the
original problem (8.29) for τ → 0. For any positive τ > 0, the necessary
optimality conditions of problem (8.42) are a smooth set of equations,
and can, if we denote the multipliers for the equalities H(w) + s = 0 by
µ ∈ RnH and keep the original definition of the Lagrangian from (8.30),
be equivalently formulated as

∇w L(w, λ, µ) = 0 (8.43a)
G(x0 , w) = 0 (8.43b)
H(w) + s = 0 (8.43c)
µi s i = τ for i = 1, . . . , nH (8.43d)

Note that for τ > 0, the last condition (8.43d) is a smooth version of
the complementarity condition 0 ≤ s ⊥ µ ≥ 0 that would correspond
to the KKT conditions of the original problem after introduction of the
slack variable s.
A nonlinear IP method proceeds as follows: it first sets τ to a rather
large value, and solves the corresponding root-finding problem (8.43)
with a Newton-type method for equality constrained optimization. Dur-
ing these iterations, the implicit constraints si > 0 and µi > 0 are
strictly enforced by shortening the steps, if necessary, to avoid being
attracted by spurious solutions of µi si = τ. Then, it slowly reduces the
barrier parameter τ; for each new value of τ, the Newton-type iterations
are initialized with the solution of the previous problem.
Of course, with finitely many Newton-type iterations, the root-
finding problems for decreasing values of τ can only be solved ap-
proximately. In practice, one often performs only one Newton-type
iteration per problem, i.e., one iterates while one changes the problem.
Here, we have sketched the primal-dual IP method as it is for example
implemented in the NLP solver IPOPT (Wächter and Biegler, 2006); but
there exist many other variants of nonlinear interior point methods. IP
methods also exist in variants that are tailored to linear or quadratic
programs and IP methods also can be applied to other convex optimiza-
tion problems such as second-order cone programs or semidefinite pro-
grams (SDP). For these convex IP algorithms, one can establish polyno-
mial runtime bounds, which unfortunately cannot be established for
the more general case of nonlinear IP methods described here.
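To illustrate the smoothing, the following snippet evaluates the residual of the root-finding system (8.43) for a tiny invented problem with one equality and one inequality constraint; a nonlinear IP method drives this residual to zero with Newton-type steps while τ is decreased.

    import numpy as np

    # Smooth IP residual of (8.43) for a small assumed problem:
    #   minimize w1^2 + w2^2  s.t.  G = w1 + w2 - 1 = 0,  H = -w1 <= 0.
    def ip_residual(w, lam, mu, s, tau):
        grad_F = np.array([2*w[0], 2*w[1]])
        G_w = np.array([[1.0, 1.0]])
        H_w = np.array([[-1.0, 0.0]])
        grad_L = grad_F + G_w.T @ lam + H_w.T @ mu       # (8.43a)
        return np.concatenate([
            grad_L,
            np.array([w[0] + w[1] - 1.0]),               # (8.43b)
            np.array([-w[0]]) + s,                       # (8.43c)
            mu * s - tau,                                # (8.43d), smooth for tau > 0
        ])

    # For tau -> 0 the last block approaches the complementarity 0 <= s ⊥ mu >= 0.
    r = ip_residual(np.array([0.6, 0.6]), np.array([0.0]),
                    np.array([0.1]), np.array([0.6]), tau=1e-2)
    print(r)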

Nonlinear IP methods with fixed barrier parameter. Some variants


of nonlinear IP methods popular in the field of nonlinear MPC use a
fixed positive barrier parameter τ throughout all iterations, and there-
fore solve a modified MPC problem. The advantage of this approach is
that a simple and straightforward Newton-type framework for equality
constrained optimization can be used out of the box. The disadvantage
is that for a large value of τ, the modified MPC problem is a conservative
approximation of the original MPC problem; for a small value of τ, the
nonlinearity due to the condition (8.43d) is severe and slows down the
convergence of the Newton-type procedure. Interestingly, these non-
linear IP variants are sometimes based on different barrier functions
than the logarithmic barrier described above; they use slack formula-
tions that make violation of the implicit constraint si ≥ 0 impossible by
setting, for example, si = (ti )2 with new slacks ti . This last variant is
successfully used for nonlinear MPC by Ohtsuka (2004), and modifies
the original problem to a related problem of the form

minimize_{w, t}   F(w) − τ \sum_{i=1}^{n_H} t_i
subject to        G(x_0, w) = 0                                   (8.44)
                  H_i(w) + (t_i)^2 = 0,   i = 1, . . . , n_H

which is then solved by a tailored Newton-type method for equality


constrained optimization.

8.7.3 Comparison of SQP and Nonlinear IP Methods

While SQP methods need to solve a QP in each iteration, nonlinear IP


methods only solve a linear system of similar size in each iteration,
which is cheaper. Some SQP methods even solve the QP by an interior
point method, and then perform about 10-30 inner iterations—each of
which is as expensive as the linear system solution in a nonlinear IP


method.
On the other hand, the cost per iteration for both SQP and nonlin-
ear IP methods also comprises the evaluation of the problem functions
and their derivatives. The number of high-level iterations required to
reach a desired level of accuracy is often smaller for SQP methods than
for nonlinear IP methods. Also, SQP methods are better at warmstart-
ing, which is particularly important in the context of nonlinear MPC.
Roughly speaking, for an NLP with cheap function and derivative eval-
uations, as in direct collocation, and if no good initial guess is provided,
a nonlinear IP method is preferable. An SQP method would be favorable
in case of expensive function evaluations, as in direct single or multiple
shooting, and when good initial guesses can be provided, for example,
if a sequence of neighboring problems is solved.

8.8 Structure in Discrete Time Optimal Control

When a Newton-type optimization method is applied to an optimal con-


trol problem, the dynamic system constraints lead to a specific sparsity
structure in the KKT matrix. And the quadratic program (QP) in the
Newton-type iteration corresponds to a linear quadratic (LQ) optimal
control problem with time-varying matrices. To discuss this structure
in detail, consider an unconstrained discrete time OCP as it arises in
the direct multiple-shooting method

minimize_{w}   \sum_{i=0}^{N-1} ℓ_i(x_i, u_i) + V_f(x_N)
subject to     \bar{\bar{x}}_0 − x_0 = 0                                     (8.45)
               f_i(x_i, u_i) − x_{i+1} = 0   for i = 0, . . . , N − 1

Here, the vector w ∈ R(N+1)n+Nm of optimization variables is given by


w = [x_0' u_0' · · · x_{N−1}' u_{N−1}' x_N']'. The fixed vector \bar{\bar{x}}_0 is marked by
two bars to distinguish it from the optimization variable x0 , as well
as from a specific value x̄0 of x0 that is used as linearization point
in a Newton-type algorithm. We introduce also a partitioned vector of
Lagrange multipliers, λ = [λ′0 λ′1 . . . λ′N ]′ , with λ ∈ R(N+1)n , such that
the Lagrangian of the problem is given by

L(\bar{\bar{x}}_0, w, λ) = λ_0' (\bar{\bar{x}}_0 − x_0) + \sum_{i=0}^{N-1} [ ℓ_i(x_i, u_i) + λ_{i+1}' (f_i(x_i, u_i) − x_{i+1}) ] + V_f(x_N)

As before, we can combine w and λ to a vector z ∈ R2(N+1)n+Nm of all


primal-dual variables. Interestingly, the exact Hessian matrix Bex (z) =
∇2w L(z) is block diagonal (Bock and Plitt, 1984), because the Lagrangian
function L is a sum of independent terms that each depend only on a
small subset of the variables—a property called partial separability.
The exact Hessian is easily computed to be a matrix with the structure
 
B^{ex}(z̄) = \begin{bmatrix}
Q_0 & S_0' & & & & \\
S_0 & R_0 & & & & \\
& & \ddots & & & \\
& & & Q_{N−1} & S_{N−1}' & \\
& & & S_{N−1} & R_{N−1} & \\
& & & & & P_N
\end{bmatrix}                                                    (8.46)

where the blocks with index i only depend on the primal variables with
index i and the dual variables with index (i + 1). More specifically, for
i = 0, . . . , N − 1 the blocks are readily shown to be given by

\begin{bmatrix} Q_i & S_i' \\ S_i & R_i \end{bmatrix} = ∇^2_{(x_i, u_i)} [ ℓ_i(x_i, u_i) + λ_{i+1}' f_i(x_i, u_i) ]

8.8.1 Simultaneous Approach

Most simultaneous Newton-type methods for optimal control pre-


serve the block diagonal structure of the exact Hessian Bex (z̄) and
also of the Hessian approximation B̄. Thus, the linear quadratic
optimization problem (8.38) that is solved in one iteration of a
Newton-type optimization method for a given linearization point w̄ =
[x̄_0' ū_0' · · · x̄_{N−1}' ū_{N−1}' x̄_N']' and a given Hessian approximation B̄ is
identical to the following time-varying LQ optimal control problem

minimize_{w}   \sum_{i=0}^{N-1} ℓ_{QP,i}(x_i, u_i; w̄, B̄) + V_{QP,f}(x_N; w̄, B̄)
subject to     \bar{\bar{x}}_0 − x_0 = 0                                               (8.47)
               f_{L,i}(x_i, u_i; x̄_i, ū_i) − x_{i+1} = 0   for i = 0, . . . , N − 1

Here, the quadratic objective contributions ℓQP,i (xi , ui ; w̄, B̄) are given
by

ℓ_i(x̄_i, ū_i) + ∇_{(s,q)} ℓ_i(x̄_i, ū_i)' \begin{bmatrix} x_i − x̄_i \\ u_i − ū_i \end{bmatrix} + \frac{1}{2} \begin{bmatrix} x_i − x̄_i \\ u_i − ū_i \end{bmatrix}' \begin{bmatrix} Q̄_i & S̄_i' \\ S̄_i & R̄_i \end{bmatrix} \begin{bmatrix} x_i − x̄_i \\ u_i − ū_i \end{bmatrix}

the terminal cost V_{QP,f}(x_N; w̄, B̄) is given by

V_f(x̄_N) + ∇V_f(x̄_N)' [x_N − x̄_N] + (1/2)[x_N − x̄_N]' P̄_N [x_N − x̄_N]

and the linearized constraint functions f_{L,i}(x_i, u_i; x̄_i, ū_i) are simply
given by

f_i(x̄_i, ū_i) + \underbrace{\frac{∂f_i}{∂s}(x̄_i, ū_i)}_{=: Ā_i} [x_i − x̄_i] + \underbrace{\frac{∂f_i}{∂q}(x̄_i, ū_i)}_{=: B̄_i} [u_i − ū_i]

To create a banded structure, it is advantageous to order the primal-


dual variable vector as z = [λ_0' x_0' u_0' · · · λ_{N−1}' x_{N−1}' u_{N−1}' λ_N' x_N']';
then the solution of the above LQ optimal control problem at iterate
z̄ corresponds to the solution of a block-banded linear system M̄_{KKT} ·
(z − z̄) = −∇_z L(\bar{\bar{x}}_0, z̄), which we can write equivalently as

M̄_{KKT} · z = −r̄_{KKT}                                      (8.48)

where the residual vector is given by r̄_{KKT} := ∇_z L(\bar{\bar{x}}_0, z̄) − M̄_{KKT} z̄.
The matrix M̄KKT is an approximation of the block-banded KKT matrix
∇2z L(z̄) and given by
 
M̄_{KKT} = \begin{bmatrix}
0 & −I & & & & & & & \\
−I & Q̄_0 & S̄_0' & Ā_0' & & & & & \\
& S̄_0 & R̄_0 & B̄_0' & & & & & \\
& Ā_0 & B̄_0 & 0 & −I & & & & \\
& & & −I & \ddots & & & & \\
& & & & & Q̄_{N−1} & S̄_{N−1}' & Ā_{N−1}' & \\
& & & & & S̄_{N−1} & R̄_{N−1} & B̄_{N−1}' & \\
& & & & & Ā_{N−1} & B̄_{N−1} & 0 & −I \\
& & & & & & & −I & P̄_N
\end{bmatrix}                                                    (8.49)

Ignoring the specific block structure, this is a banded symmetric ma-


trix with bandwidth (2n + m) and total size N(2n+m) + 2n, and the
linear system can thus in principle be solved using a banded LDLT-


factorization routine at a cost that is linear in the horizon length N
and cubic in (2n + m). There exists a variety of even more efficient
solvers for this form of KKT systems with smaller runtime and smaller
memory footprint. Many of these solvers exploit the specific block-
banded structure of the LQ optimal control problem. Some of these
solvers are based on the backward Riccati recursion, as introduced in
Section 1.3.3 and Section 6.1.1, and described in Section 8.8.3 for the
time-varying case.

8.8.2 Linear Quadratic Problems (LQP)

Consider a time-varying LQ optimal control problem of the form

minimize_{x, u}   \sum_{i=0}^{N-1} \left( \begin{bmatrix} q̄_i \\ r̄_i \end{bmatrix}' \begin{bmatrix} x_i \\ u_i \end{bmatrix} + \frac{1}{2} \begin{bmatrix} x_i \\ u_i \end{bmatrix}' \begin{bmatrix} Q̄_i & S̄_i' \\ S̄_i & R̄_i \end{bmatrix} \begin{bmatrix} x_i \\ u_i \end{bmatrix} \right) + p̄_N' x_N + \frac{1}{2} x_N' P̄_N x_N
subject to        \bar{\bar{x}}_0 − x_0 = 0                                              (8.50)
                  b̄_i + Ā_i x_i + B̄_i u_i − x_{i+1} = 0   for i = 0, . . . , N − 1
Here, we use the bar above fixed quantities such as Āi , Q̄i to distin-
guish them from the optimization variables xi , ui , and the quantities
that are computed during the solution of the optimization problem.
This distinction makes it possible to directly interpret problem (8.50)
as the LQ approximation (8.47) of a nonlinear problem (8.45) at a given
linearization point z̄ = [λ̄_0' x̄_0' ū_0' · · · λ̄_{N−1}' x̄_{N−1}' ū_{N−1}' λ̄_N' x̄_N']' within
a Newton-type optimization method. We call the above problem the lin-
ear quadratic problem (LQP), and present different solution approaches
for the LQP in the following three subsections.

8.8.3 LQP Solution by Riccati Recursion

One band-structure-exploiting solution method for the above linear


quadratic optimization problem is called the Riccati recursion. It can
easily be derived by dynamic programming arguments. It is given by
three recursions—one expensive matrix and two cheaper vector recur-
sions.
First, and most important, we perform a backward matrix recursion
which is started at PN := P̄N , and goes backward through the indices
i = N − 1, . . . , 0 to compute PN−1 , . . . , P0 with the following formula

P_i := Q̄_i + Ā_i' P_{i+1} Ā_i
       − (S̄_i' + Ā_i' P_{i+1} B̄_i)(R̄_i + B̄_i' P_{i+1} B̄_i)^{−1} (S̄_i + B̄_i' P_{i+1} Ā_i)          (8.51)

The only condition for the above matrix recursion formula to be well
defined is that the matrix (R̄i + B̄i′ Pi+1 B̄i ) is positive definite, which
turns out to be equivalent to the optimization problem being well posed
(otherwise, problem (8.50) would be unbounded from below). Note that
the Riccati matrix recursion propagates symmetric matrices Pi , whose
symmetry can and should be exploited for efficient computations.
The second recursion is a vector recursion that also goes backward
in time and is based on the matrices P0 , . . . , PN resulting from the first
recursion, and can be performed concurrently. It starts with pN := p̄N
and then runs through the indices i = N − 1, . . . , 0 to compute

p_i := q̄_i + Ā_i' (P_{i+1} b̄_i + p_{i+1})
       − (S̄_i' + Ā_i' P_{i+1} B̄_i)(R̄_i + B̄_i' P_{i+1} B̄_i)^{−1} (r̄_i + B̄_i' (P_{i+1} b̄_i + p_{i+1}))          (8.52)

Interestingly, the result of the first and the second recursion together
yield the optimal cost-to-go functions V_i^0 for the states x_i that are given
by

V_i^0(x_i) = c_i + p_i' x_i + \frac{1}{2} x_i' P_i x_i

where the constants c_i are not of interest here. Also, one directly ob-
tains the optimal feedback control laws u_i^0 that are given by

u_i^0(x_i) = k_i + K_i x_i

with

Ki := −(R̄i + B̄i′ Pi+1 B̄i )−1 (S̄i + B̄i′ Pi+1 Āi ) and (8.53a)
ki := −(R̄i + B̄i′ Pi+1 B̄i )−1 (r̄i + B̄i′ (Pi+1 b̄i + pi+1 )) (8.53b)

Based on these data, the optimal solution to the optimal control prob-
lem is obtained by a forward vector recursion that is nothing other
than a forward simulation of the linear dynamics using the optimal
feedback control law. Thus, the third recursion starts with x0 := x̄¯0
and goes through i = 0, . . . , N − 1 computing

ui := ki + Ki xi (8.54a)
xi+1 := b̄i + Āi xi + B̄i ui (8.54b)
For completeness, one would simultaneously also compute the La-


grange multipliers λi , which are for i = 0, . . . , N given by the gradient
of the optimal cost-to-go function at the solution

λi := pi + Pi xi (8.54c)

The result of the three recursions of the Riccati algorithm is a vector


z = [λ_0' x_0' u_0' · · · λ_{N−1}' x_{N−1}' u_{N−1}' λ_N' x_N']' that solves the linear
system M̄_{KKT} · z = −r̄_{KKT} with a right-hand side that is given by
r̄_{KKT} = [\bar{\bar{x}}_0' q̄_0' r̄_0' b̄_0' · · · q̄_{N−1}' r̄_{N−1}' b̄_{N−1}' p̄_N']'.
The matrix recursion (8.51) can be interpreted as a factorization of
the KKT matrix M̄KKT , and in an efficient implementation it needs about
N((7/3)n^3 + 4n^2 m + 2nm^2 + (1/3)m^3) FLOPs, which is about one-third
the cost of a plain banded LDLT-factorization.
On the other hand, the two vector recursions (8.52) and (8.54a)-
(8.54c) can be interpreted as a linear system solve with the already fac-
torized matrix M̄KKT . In an efficient implementation, this linear system
solve needs about N(8n2 + 8nm + 2n2 ) FLOPs.
If care is taken to reduce the number of memory movements and
to optimize the linear algebra operations for full CPU usage, one can
obtain significant speedups in the range of one order of magnitude
compared to a standard implementation of the Riccati recursion—even
for small- and medium-scale dynamic systems (Frison, 2015). With only
minor modifications, the Riccati recursion can be used inside an interior
point method for inequality constrained optimal control problems.
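A compact sketch of the three recursions (8.51)-(8.54) might look as follows; the problem data at the bottom are randomly generated and serve only to show the calling convention.

    import numpy as np

    def lqp_riccati(A, B, b, Q, S, R, q, r, PN, pN, x0bar):
        """Solve the LQ problem (8.50) by the Riccati recursion; A, B, b, Q, S,
        R, q, r are lists of length N holding the time-varying data."""
        N = len(A)
        P, p = [None]*(N+1), [None]*(N+1)
        P[N], p[N] = PN, pN
        K, k = [None]*N, [None]*N
        for i in reversed(range(N)):              # backward sweeps (8.51), (8.52)
            Re = R[i] + B[i].T @ P[i+1] @ B[i]
            Se = S[i] + B[i].T @ P[i+1] @ A[i]
            re = r[i] + B[i].T @ (P[i+1] @ b[i] + p[i+1])
            K[i] = -np.linalg.solve(Re, Se)       # feedback gains (8.53a)
            k[i] = -np.linalg.solve(Re, re)       # feedforward terms (8.53b)
            P[i] = Q[i] + A[i].T @ P[i+1] @ A[i] + Se.T @ K[i]
            p[i] = q[i] + A[i].T @ (P[i+1] @ b[i] + p[i+1]) + Se.T @ k[i]
        x, u, lam = [x0bar], [], []
        for i in range(N):                        # forward sweep (8.54)
            lam.append(p[i] + P[i] @ x[i])
            u.append(k[i] + K[i] @ x[i])
            x.append(b[i] + A[i] @ x[i] + B[i] @ u[i])
        lam.append(p[N] + P[N] @ x[N])
        return x, u, lam

    # Tiny usage example with assumed data (n = 2, m = 1, N = 5).
    rng = np.random.default_rng(0)
    n, m, N = 2, 1, 5
    A = [np.eye(n) + 0.1*rng.standard_normal((n, n)) for _ in range(N)]
    B = [0.1*rng.standard_normal((n, m)) for _ in range(N)]
    b = [np.zeros(n) for _ in range(N)]
    Q = [np.eye(n) for _ in range(N)]; S = [np.zeros((m, n)) for _ in range(N)]
    R = [np.eye(m) for _ in range(N)]
    q = [np.zeros(n) for _ in range(N)]; r = [np.zeros(m) for _ in range(N)]
    x, u, lam = lqp_riccati(A, B, b, Q, S, R, q, r, np.eye(n), np.zeros(n),
                            np.array([1.0, 0.0]))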

8.8.4 LQP Solution by Condensing

A different way to exploit the block-sparse structure of the LQ op-


timal control problem (8.50) is to first eliminate the state trajectory
x = [x_0' x_1' · · · x_N']' as a function of the initial value \bar{\bar{x}}_0 and the con-
trol u = [u_0' u_1' · · · u_{N−1}']'. After subdivision of the variables into
states and controls, the equality constraints of the QP (8.50) can be ex-
pressed in the following form, where we omit the bar above the system
matrices and vectors for better readability

\underbrace{\begin{bmatrix} I & & & & \\ −A_0 & I & & & \\ & −A_1 & I & & \\ & & \ddots & \ddots & \\ & & & −A_{N−1} & I \end{bmatrix}}_{=: A} x =
\underbrace{\begin{bmatrix} 0 \\ b_0 \\ b_1 \\ \vdots \\ b_{N−1} \end{bmatrix}}_{=: b} +
\underbrace{\begin{bmatrix} I \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}}_{=: I} \bar{\bar{x}}_0 +
\underbrace{\begin{bmatrix} 0 & & & \\ B_0 & & & \\ & B_1 & & \\ & & \ddots & \\ & & & B_{N−1} \end{bmatrix}}_{=: B} u

It can easily be shown that the inverse of A is given by


 
A^{−1} = \begin{bmatrix}
I & & & & \\
A_0 & I & & & \\
A_1 A_0 & A_1 & I & & \\
\vdots & \vdots & \ddots & \ddots & \\
(A_{N−1} · · · A_0) & (A_{N−1} · · · A_1) & (A_{N−1} · · · A_2) & \cdots & I
\end{bmatrix}

and state elimination results in the affine map

x = A^{−1} b + A^{−1} I \bar{\bar{x}}_0 + A^{−1} B u

Using this explicit expression to eliminate all states in the objective


results in a condensed, unconstrained quadratic optimization problem
of the form

minimize_{u}   c + \begin{bmatrix} q \\ r \end{bmatrix}' \begin{bmatrix} \bar{\bar{x}}_0 \\ u \end{bmatrix} + \frac{1}{2} \begin{bmatrix} \bar{\bar{x}}_0 \\ u \end{bmatrix}' \begin{bmatrix} Q & S' \\ S & R \end{bmatrix} \begin{bmatrix} \bar{\bar{x}}_0 \\ u \end{bmatrix}          (8.55)

that is equivalent to the original optimal control problem (8.50). Con-


densing algorithms process the vectors and matrices of the sparse
problem (8.50) to yield the data of the condensed QP (8.55)—in par-
ticular the Hessian R—and come in different variants. One classical
condensing algorithm has a cost of about (1/3)N³ nm² FLOPs; a second
variant, which can be derived by applying reverse AD to the quadratic cost
function, has a different complexity and costs about N²(2n²m + nm²)
FLOPs. See Frison (2015) for a detailed overview of these and other
condensing approaches.
After condensing, the condensed QP still needs to be solved, and
the solution of the above unconstrained QP (8.55) is given by u0 =
−R−1 (r + S x̄¯0 ). Because the Hessian R is a dense symmetric and usu-
ally positive definite matrix of size (Nm), it can be factorized using a
Cholesky decomposition, which costs about (1/3)N³ m³ FLOPs. Inter-
estingly, the Cholesky factorization also could be computed simultane-
ously with the second condensing procedure mentioned above, which
results in an additional cost of only about Nm³ FLOPs (Frison, 2015),
resulting in a condensing based Cholesky factorization of quadratic
complexity in N, as discovered by Axehill and Morari (2012). The con-
densing approach can easily be extended to the case of additional con-
straints, and results in a condensed QP with Nm variables and some
additional equality and inequality constraints that can be addressed by
a dense QP solver.
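As an illustration of state elimination, the following sketch builds the affine map x = g + Gx x̄¯0 + Gu u by forward propagation and then solves the resulting condensed unconstrained QP. For brevity it assumes time-invariant data, no cross term (S = 0), and a terminal weight equal to the stage weight Q; all names are our own.

```python
import numpy as np

def condense_and_solve(A, B, b, Q, R, q, r, x0bar, N):
    """Sketch: eliminate the states of a time-invariant LQ problem and
    solve the condensed unconstrained QP (cf. (8.55))."""
    n, m = B.shape
    Gx = np.zeros(((N + 1) * n, n))         # x = g + Gx @ x0bar + Gu @ u
    Gu = np.zeros(((N + 1) * n, N * m))
    g = np.zeros((N + 1) * n)
    Gx[:n, :] = np.eye(n)
    for i in range(N):                      # forward propagation of the dynamics
        r0, r1 = i * n, (i + 1) * n
        Gx[r1:r1 + n, :] = A @ Gx[r0:r1, :]
        Gu[r1:r1 + n, :] = A @ Gu[r0:r1, :]
        Gu[r1:r1 + n, i * m:(i + 1) * m] = B
        g[r1:r1 + n] = A @ g[r0:r1] + b
    Qfull = np.kron(np.eye(N + 1), Q)       # stage and terminal weights
    Rfull = np.kron(np.eye(N), R)
    qfull = np.tile(q, N + 1)
    rfull = np.tile(r, N)
    x_aff = g + Gx @ x0bar                  # state response for u = 0
    H = Rfull + Gu.T @ Qfull @ Gu           # condensed Hessian
    grad = rfull + Gu.T @ (Qfull @ x_aff + qfull)
    u = np.linalg.solve(H, -grad)           # a Cholesky solve in practice
    x = x_aff + Gu @ u
    return u.reshape(N, m), x.reshape(N + 1, n)
```

With additional inequality constraints, the Hessian and gradient assembled this way would instead be handed to a dense QP solver, as described above.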

Is condensing a sequential approach? Condensing is similar in


spirit to a sequential approach that is applied to the LQ subproblem. To
distinguish the different algorithmic ingredients, we reserve the term
“sequential” for the nonlinear OCP only, while we speak of “condens-
ing” when we refer to an LQ optimal control problem. This distinction is
useful because all four combinations of sequential or simultaneous ap-
proaches with either the Riccati recursion or the condensing algorithm
are possible, and lead to different algorithms. For example, when the
simultaneous approach is combined with the condensing algorithm, it
leads to different Newton-type iterates than the plain sequential ap-
proach, even though the linear algebra operations in the quadratic sub-
problems are similar.

Comparing Riccati recursion and condensing. The Riccati recur-


sion, or, more generally, the banded-LDLT-factorization approaches,
have a runtime that is linear in the horizon length N; they are there-
fore always preferable to condensing for long horizons. They can easily
be combined with interior point methods and result in highly compet-
itive QP solution algorithms. On the other hand, condensing-based QP
solutions become more competitive than the Riccati approach for short
to moderate horizon lengths N—in particular if the state dimension n
is larger than the control dimension m, and if an efficient dense ac-
tive set QP solver is used for the condensed QPs. Interestingly, one
can combine the advantages of condensing and band structured linear
algebra to yield a partial condensing method (Axehill, 2015), which is
even more efficient than the plain Riccati approach on long horizons.

8.8.5 Sequential Approaches and Sparsity Exploitation

So far, we have only presented the solution of the unconstrained OCP


by Newton-type methods in the simultaneous approach, to highlight
the specific sparsity structure that is inherent in the resulting LQ prob-
lem. Many Newton-type algorithms also exist which are based on the
sequential approach, however, where the Newton-type iterations are
performed in the space of control sequences u = [u′0 · · · u′N−1 ]′ only.
We recall that one eliminates the state trajectory by a nonlinear forward
simulation in the sequential approach to maintain physically feasible
trajectories. The plain sequential approach does not exploit sparsity
and is not applicable to strongly unstable systems. Interestingly, some
sequential approaches exist that do exploit the sparsity structure of
the OCP and some—notably differential dynamic programming—even

incorporate feedback into the forward simulation to better deal with


unstable dynamic systems.
Plain dense sequential approach. We start by describing how the
plain sequential approach—the direct single-shooting method intro-
duced in Section 8.5.1—solves the unconstrained OCP (8.45) with a
Newton-type method. Here, all states are directly eliminated as a func-
tion of the controls by a forward simulation that starts at x0 := x̄¯0 and
recursively defines xi+1 := fi (xi , ui ) for i = 0, . . . , N − 1. The result is
that the objective function F (x̄¯0 , u) := Σ_{i=0}^{N−1} ℓi (xi , ui ) + Vf (xN )
directly depends on all optimization variables u = [u0′ · · · u′N−1 ]′ . The
task of optimization now is to find a root of the nonlinear equation
system ∇u F (x̄¯0 , u) = 0. At some iterate ū, after choosing a Hessian
approximation B̄ ≈ ∇²u F (x̄¯0 , ū), one has to solve linear systems of the
form
    B̄(u − ū) = −∇u F (x̄¯0 , ū)                    (8.56)

It is important to note that the exact Hessian ∇²u F (x̄¯0 , ū) is a dense
matrix of size Nm (where m is the control dimension), and that one
usually also chooses a dense Hessian approximation B̄ that is ideally
positive definite.
A Cholesky decomposition of a symmetric positive definite linear
system of size Nm has a computational cost of (1/3)(Nm)³ FLOPs, i.e.,
the iteration cost of the plain sequential approach grows cubically with
the horizon length N. In addition to the cost of the linear system solve,
one has to consider the cost of computing the gradient ∇u F (x̄¯0 , ū).
This is ideally done by a backward sweep equivalent to the reverse mode
of algorithmic differentiation (AD) as stated in (8.16), at a cost that
grows linearly in N. The cost of forming the Hessian approximation
depends on the chosen approximation, but is typically quadratic in N.
For example, an exact Hessian could be computed by performing Nm
forward derivatives of the gradient function ∇u F (x̄¯0 , u).
The plain dense sequential approach results in a medium-sized op-
timization problem without much sparsity structure but with expen-
sive function and derivative evaluations, and can thus be addressed
by a standard nonlinear programming method that does not exploit
sparsity, but converges with a limited number of function evaluations.
Typically, an SQP method in combination with a dense active set QP
solver is used.
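For illustration, the backward sweep that delivers the single-shooting gradient can be sketched as follows; the callables f, f_x, f_u, l_x, l_u, and Vf_x (our names, not from the text) are assumed to return the dynamics, its Jacobians, and the cost gradients.

```python
import numpy as np

def single_shooting_gradient(x0bar, u, f, f_x, f_u, l_x, l_u, Vf_x):
    """Sketch of the gradient of F(x0bar, u) in direct single shooting:
    one forward simulation followed by one backward (adjoint) sweep."""
    N = len(u)
    x = [x0bar]
    for i in range(N):                      # forward simulation
        x.append(f(x[i], u[i]))
    grad = [None] * N
    lam = Vf_x(x[N])                        # adjoint of the terminal cost
    for i in reversed(range(N)):            # backward sweep, cost linear in N
        grad[i] = l_u(x[i], u[i]) + f_u(x[i], u[i]).T @ lam
        lam = l_x(x[i], u[i]) + f_x(x[i], u[i]).T @ lam
    return np.concatenate(grad)
```

A Newton-type step then solves the dense system B̄(u − ū) = −∇u F (x̄¯0 , ū) as in (8.56).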
Sparsity-exploiting sequential approaches. Interestingly, one can
form and solve the same linear system as in (8.56) by using the sparse

linear algebra techniques described in the previous section for the si-
multaneous approach. To implement this, it would be easiest to start
with an algorithm for the simultaneous approach that computes the
full iterate in the vector z that contains as subsequences the controls
u = [u′0 · · · u′N−1 ]′ , the states x = [x0′ · · · x′N ]′ , and the multipliers
λ = [λ′0 · · · λ′N ]′ . After the linear system solve, one would simply
overwrite the states x by the result of a nonlinear forward simulation
for the given controls u.
The sparse sequential approach is particularly easy to implement
if a Gauss-Newton Hessian approximation is used (Sideris and Bobrow,
2005). To compute the exact Hessian blocks, one performs a second
reverse sweep identical to (8.16) to overwrite the values of the multipli-
ers λ. As in the simultaneous approach, the cost for each Newton-type
iteration would be linear in N with this approach, while one can show
that the resulting iterates would be identical to those of the dense se-
quential approach for both the exact and the Gauss-Newton Hessian
approximations.

8.8.6 Differential Dynamic Programming

The sequential approaches presented so far first compute the com-


plete control trajectory u in each iteration, and then simulate the non-
linear system open loop with this trajectory u to obtain the states x
for the next linearization point. In contrast, differential dynamic pro-
gramming (DDP) (Mayne, 1966; Jacobson and Mayne, 1970) uses the
time-varying affine feedback law u0i (xi ) = ki + Ki xi from the Riccati
recursion to simulate the nonlinear system forward in time. Like other
sequential approaches, the DDP algorithm starts with an initial guess
for the control trajectory—or the assumption of some feedback law—
and the corresponding state trajectory. But then in each DDP iteration,
starting at x0 := x̄¯0 , one recursively defines for i = 0, 1, . . . , N − 1

ui := ki + Ki xi (8.57a)
xi+1 := fi (xi , ui ) (8.57b)

with Ki and ki from (8.53a) and (8.53b), to define the next control and
state trajectory. Interestingly, DDP only performs the backward recur-
sions (8.51) and (8.52) from the Riccati algorithm. The forward simula-
tion of the linear system (8.54b) is replaced by the forward simulation
of the nonlinear system (8.57b). Note that both the states and the con-
trols in DDP are different from the standard sequential approach.
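The distinguishing feature of DDP, the closed-loop forward pass (8.57), can be sketched as follows; the gains ki and Ki are assumed to come from the backward Riccati sweep, and f is the nonlinear discrete time model.

```python
def ddp_forward_pass(x0bar, k, K, f):
    """Sketch of the DDP forward rollout (8.57): the nonlinear system is
    simulated in closed loop with the affine feedback from the Riccati
    backward sweep, instead of open loop as in other sequential methods."""
    N = len(K)
    x = [x0bar]; u = []
    for i in range(N):
        u.append(k[i] + K[i] @ x[i])        # (8.57a)
        x.append(f(x[i], u[i]))             # (8.57b): nonlinear f, not (8.54b)
    return x, u
```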

DDP with Gauss-Newton Hessian. Depending on the type of Hessian


approximation, different variants of DDP can be derived. Conceptu-
ally the easiest is DDP with a Gauss-Newton Hessian approximation,
because it has no need of the multipliers λi . In case of a quadratic
objective with positive semidefinite cost matrices, these matrices coin-
cide with the Gauss-Newton Hessian blocks, and the method becomes
particularly simple; one needs only to compute the system lineariza-
tion matrices Āi , B̄i for i = 0, . . . , N − 1 at the trajectory (x, u) from the
previous iteration to obtain all data for the LQ optimal control prob-
lem, and then perform the backward recursions (8.51) and (8.52) to
define Ki and ki in (8.53a) and (8.53b). This DDP variant is sometimes
called iterative linear quadratic regulator (LQR) (Li and Todorov, 2004)
and is popular in the field of robotics. Like any method based on the
Gauss-Newton Hessian, the iterative LQR algorithm has the advantage
that the Hessian approximation is always positive semidefinite, but the
disadvantage that its convergence rate is only linear.
DDP with exact Hessian. In contrast to the iterative LQR algorithm,
the DDP algorithm from Mayne (1966) uses an exact Hessian approxi-
mation and thus offers a quadratic rate of convergence. Like all exact
Hessian methods, it can encounter indefiniteness of the Hessian, which
can be addressed by algorithmic modifications that are beyond our in-
terest here. To compute the exact Hessian blocks

    [Q̄i  S̄i′ ; S̄i  R̄i ] := ∇²(xi ,ui ) [ℓi (x̄i , ūi ) + λ̄′i+1 fi (x̄i , ūi )]

the DDP algorithm needs not only the controls ūi , but also the states
x̄i and the Lagrange multipliers λ̄i+1 , which are not part of the mem-
ory of the algorithm. While the states x̄i are readily obtained by the
nonlinear forward simulation (8.57b), the Lagrange multipliers λ̄i+1 are
obtained simultaneously with the combined backward recursions (8.51)
and (8.52). They are chosen as the gradient of the quadratic cost-to-go
function Vi0 (xi ) = pi′ xi + (1/2) xi′ Pi xi at the corresponding state values, i.e.,
as
λ̄i := pi + Pi x̄i (8.58)

for i = N − 1, . . . , 0. The last Hessian block (which is needed first in the


backward recursion) is independent of the multipliers and just given
by the second derivative of the terminal cost and defined by P̄N :=
∇2 Vf (x̄N ). Because p̄N := ∇Vf (x̄N ) − P̄N x̄N , the last multiplier is given
by λ̄N := ∇Vf (x̄N ). Starting with these values for P̄N , p̄N , and λ̄N , the

backward Riccati recursions (8.51) and (8.52) can be started and the
Lagrange multipliers be computed simultaneously using (8.58).
The DDP algorithm in its original form is only applicable to uncon-
strained problems, but can easily be adapted to deal with control con-
straints. In order to deal with state constraints, a variety of heuristics
can be employed that include, for example, barrier methods; a similar
idea was presented in the more general context of constrained OCPs un-
der the name feasibility perturbed sequential quadratic programming
by Tenny, Wright, and Rawlings (2004).

8.8.7 Additional Constraints in Optimal Control

Most Newton-type methods for optimal control can be generalized to


problems with additional equality or inequality constraints. In nonlin-
ear MPC, these additional constraints could be terminal equality con-
straints of the form r (xN ) = 0, as in the case of a zero terminal con-
straint; or terminal inequality constraints of the form r (xN ) ≤ 0, as
in the case of a terminal region. They could also be path constraints
of the form ri (xi , ui ) = 0 or ri (xi , ui ) ≤ 0 for i = 0, . . . , N − 1. The
Lagrangian function then comprises additional contributions, but the
block diagonal structure of the exact Hessian in (8.46) and the general
sparsity of the problem is preserved.

Simultaneous approaches. If the multipliers for the extra constraints


are denoted by µi for i = 0, . . . , N, the Lagrangian in the simultaneous
approaches is given by

    L(x̄¯0 , w, λ, µ) = λ0′ (x̄¯0 − x0 ) + µ′N rN (xN ) + Vf (xN )
                     + Σ_{i=0}^{N−1} [ℓi (xi , ui ) + λ′i+1 (fi (xi , ui ) − xi+1 ) + µi′ ri (xi , ui )]

We can summarize all primal-dual variables in a vector z := [w ′ λ′ µ ′ ]′


and write the Lagrangian as L(x̄¯0 , z). In the purely equality-constrained
case, Newton-type optimization algorithms again just try to find a root
of the nonlinear equation system ∇z L(z) = 0 by solving at a given
iterate z̄ the linear system M̄(z − z̄) = −∇z L(z̄) where M̄ is an ap-
proximation of the exact KKT matrix ∇2z L(z̄). In the presence of in-
equalities, one can resort to SQP or nonlinear IP methods. In all cases,
the Lagrangian remains partially separable and the KKT matrix has a
similar sparsity structure as for the unconstrained OCP. Therefore, the

linear algebra operations again can be performed by band-structure-


exploiting algorithms that have a linear complexity in the horizon
length N, if desired, or by condensing based approaches.
One major difference with unconstrained optimal control is that the
overall feasibility of the optimization problem and the satisfaction of
the linear independence constraint qualification (LICQ) condition is no
longer guaranteed a priori, and thus, care needs to be taken in for-
mulating well-posed constrained OCPs. For example, one immediately
runs into LICQ violation problems if one adds a zero terminal con-
straint xN = 0 to a problem with a large state dimension n, but a small
control dimension m, and such a short time horizon N that the total
number of control degrees of freedom, Nm, is smaller than n. In these
unfortunate circumstances, the total number of equality constraints,
(N + 1)n + n, would exceed the total number of optimization variables,
(N + 1)n + Nm, making satisfaction of LICQ impossible.

Sequential approaches. Like the simultaneous approaches, most se-


quential approaches to optimal control—with the exception of DDP—
can easily be generalized to the case of extra equality constraints, with
some adaptations to the linear algebra computations in each iteration.
For the treatment of inequality constraints on states and controls, one
can again resort to SQP or nonlinear IP-based solution approaches. In
the presence of state constraints, however, the iterates violate in gen-
eral these state constraints; thus the iterates are infeasible points of
the optimization problem, and the main appeal of the sequential ap-
proach is lost. On the other hand, the disadvantages of the sequential
approach, i.e., the smaller region of convergence and slower contraction
rate, especially for nonlinear and unstable systems, remain or become
even more pronounced. For this reason, state constrained optimal con-
trol problems are most often addressed with simultaneous approaches.

8.9 Online Optimization Algorithms


Optimization algorithms for model predictive control need to solve not
only one OCP, but a sequence of problems PN (x0 ) for a sequence of
different values of x0 , and the time to work on each problem is limited
by the sampling time ∆t. Many different ideas can be used alone or in
combination to ensure that the numerical approximation errors do not
become too large and that the computation times remain bounded. In
this section, we first discuss some general algorithmic considerations,

then present the important class of continuation methods and discuss


in some detail the real-time iteration.

8.9.1 General Algorithmic Considerations

We next discuss one by one some general algorithmic ideas to adapt


standard numerical optimal control methods to the context of online
optimization for MPC.

Coarse discretization of control and state trajectories. The CPU


time per Newton-type iteration strongly depends on the number of opti-
mization variables in the nonlinear program (NLP), which itself depends
on the horizon length N, the number of free control parameters, and
on the state discretization method. To keep the size of the NLP small,
one would classically choose a relatively small horizon length N, and
employ a suitable terminal cost and constraint set that ensures recur-
sive feasibility and nominal stability in case of exact NLP solutions. The
total number of control parameters would then be Nm, and the state
discretization would be equally accurate on all N control intervals.
In the presence of modeling errors and unavoidably inexact NLP
solutions, however, one could also accept additional discretization er-
rors by choosing a coarser control or state discretization, in particular
in the end of the MPC horizon. Often, practitioners use move blocking
where only the first M ≪ N control moves in the MPC horizon have the
feedback sampling time ∆t. The remaining (N − M) control moves are
combined into blocks of size two or larger, such that the overall num-
ber of control parameters is less than Nm. In particular if a plain dense
single-shooting algorithm is employed, move blocking can significantly
reduce the CPU cost per iteration. Likewise, one could argue that the
state evolution need only be simulated accurately on the immediate fu-
ture, while a coarser state discretization could be used toward the end
of the horizon.
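A simple way to realize move blocking is to optimize over a reduced set of moves and expand them to the full horizon before simulating; a minimal sketch, where the block sizes and names are illustrative only:

```python
import numpy as np

def expand_blocked_controls(u_free, blocks):
    """Sketch of move blocking: hold the i-th free control move for
    blocks[i] sampling intervals, e.g., blocks = [1]*M + [2, 4, 8] with
    sum(blocks) = N, so that fewer than N*m parameters are optimized."""
    return np.concatenate([np.repeat(u_free[i:i + 1], ni, axis=0)
                           for i, ni in enumerate(blocks)])
```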
From the viewpoint of dynamic programming, one could argue that
only the first control interval of duration ∆t needs to be simulated ac-
curately using the exact discrete time model x1 = f (x0 , u0 ), while the
remaining (N − 1) intervals of the MPC horizon only serve the purpose
of providing an approximation of the gradient of the cost-to-go func-
tion, i.e., of the gradient of VN−1 (f (x0 , u0 )). Since the discrete time
dynamics usually originate from the approximation of a continuous
time system, one could even decide to use a different state and control
parameterization on the remaining time horizon. For example, after

the first interval of length ∆t, one could use one single long collocation
interval of length (N − 1)∆t with one global polynomial approximation
of states and controls, as in pseudospectral collocation, in the hope of
obtaining a cheaper approximation of VN−1 (f (x0 , u0 )).

Code generation and fixed matrix formats. Since MPC optimization


problems differ only in the value x0 , many problem functions, and
even some complete matrices in the Newton-type iterations, remain
identical across different optimization problems and iterations. This
allows for the code generation of optimization solvers that are tailored
to the specific system model and MPC formulation. While the user in-
terface can be in a convenient high-level language, the automatically
generated code is typically in a standardized lower-level programming
language such as plain C, which is supported by many embedded com-
puting systems. The generated code has fixed matrix and vector dimen-
sions, needs no online memory allocations, and contains no or very few
switches. As an alternative to code generation, one could also just fix
the matrix and vector dimensions in the most computationally inten-
sive algorithmic components, and use a fixed specific matrix storage
format that is optimized for the given computing hardware.

Delay compensations by prediction. Often, at a sampling instant t0 ,


one has a current state estimate x0 , but knows in advance that the MPC
optimization calculations take some time, e.g., a full sampling time ∆t.
In the meantime, i.e., on the time interval [t0 , t0 + ∆t], one usually has
to apply some previously computed control action u0 . As all this is
known before the optimization calculations start, one could first pre-
dict the expected state x1 := f (x0 , u0 ) at the time (t0 + ∆t) when the
MPC computations are finished, and directly let the optimization algo-
rithm address the problem PN (x1 ). Though this prediction approach
cannot eliminate the feedback delay of one sampling time ∆t in case
of unforeseen disturbances, it can alleviate its effect in the case that
model predictions and reality are close to each other.

Division into preparation and feedback phases. An additional idea


is to divide the computations during each sampling interval into a long
preparation phase, and a much shorter feedback phase that could,
for example, consist of only a matrix vector multiplication in case of
linear state feedback. We assume that the computations in the feed-
back phase take a computational time ∆tfb with ∆tfb ≪ ∆t, while the
preparation time takes the remaining duration of one sampling inter-
val. Thus, during the time interval [t0 , t0 +∆t−∆tfb ] one would perform

a preparation phase that presolves as much as possible the optimiza-


tion problem that one expects at time (t0 + ∆t), corresponding to a
predicted state x̄1 .
At time (t0 + ∆t − ∆tfb ), when the preparation phase is finished, one
uses the most current state estimate to predict the state at time (t0 +∆t)
more accurately than before. Denote this new prediction x1 . During
the short time interval [t0 + ∆t − ∆tfb , t0 + ∆t], one performs the com-
putations of the feedback phase to obtain an approximate feedback u1
that is based on x1 . In case of linear state feedback, one would, for ex-
ample, precompute a vector v and a matrix K in the preparation phase
that are solely based on x̄1 , and then evaluate u1 := v + K(x1 − x̄1 ) in
the feedback phase. Alternatively, more complex computations—such
as the solution of a condensed QP—can be performed in the feedback
phase. The introduction of the feedback phase reduces the delay to
unforeseen disturbances from ∆t to ∆tfb . One has to accept, however,
that the feedback is not the exact MPC feedback, but only an approxima-
tion. Some online algorithms, such as the real-time iteration discussed
in Section 8.9.2, achieve the division into preparation and feedback
phase by reordering the computational steps of a standard optimiza-
tion algorithm, without creating any additional overhead per iteration.
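In the simplest linear-feedback variant, the feedback phase reduces to a single matrix-vector operation; a minimal sketch, where v and K are assumed to have been prepared around the predicted state x̄1 :

```python
def feedback_phase(x1, x1_pred, v, K):
    """Sketch of a linear feedback phase: only the cheap evaluation
    u1 = v + K (x1 - x1_pred) remains once the new state estimate x1
    arrives; v and K were computed in the preparation phase."""
    return v + K @ (x1 - x1_pred)
```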
Tangential predictors. A particularly powerful way to obtain a cheap
approximation of the exact MPC feedback is based on the tangential
predictors from Theorem 8.16. In case of strict complementarity at the
solution w̄ of an expected problem PN (x̄1 ), one can show that for suffi-
ciently small distance |x1 − x̄1 |, the solution of the parametric QP (8.33)
corresponds to a linear map, i.e., w QP (x1 ) = w̄ +A(x1 − x̄1 ). The matrix
A can be precomputed based on knowledge of the exact KKT matrix at
the solution w̄, but before the state x1 is known.
Generalized tangential predictors are based on the (approximate)
solution of the full QP (8.33), which is more expensive than a matrix
vector multiplication, but is also applicable to the cases where strict
complementarity does not hold or where the active set changes. The
aim of all tangential predictors is to achieve a second-order approxi-
mation that satisfies w QP (x1 ) − w ∗ (x1 ) = O(|x1 − x̄1 |2 ), which is
only possible if the exact KKT matrix is known. If the exact KKT matrix
is not used in the underlying optimization algorithm, e.g., in case of a
Gauss-Newton Hessian approximation, one can alternatively compute
an approximate generalized tangential predictor w̃ QP (x1 ) ≈ w QP (x1 ),
which only approximates the exact tangential predictor, but can be ob-
tained without creating additional overhead compared to a standard

optimization iteration.
Warmstarting and shift. Another easy way to transfer solution infor-
mation from one MPC problem to the next is to use an existing solution
approximation as initial guess for the next MPC optimization problem,
in a procedure called warmstarting. In its simplest variant, one can
just use the existing solution guess without any modification. In the
shift initialization, one first shifts the current solution guess to account
for the advancement of time. The shift initialization can most easily be
performed if an equidistant grid is used for control and state discretiza-
tion, and is particularly advantageous for systems with time-varying
dynamics or objectives, e.g., if a sequence of future disturbances is
known, or one is tracking a time-varying trajectory.
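One of the simplest shift variants, repeating the last interval instead of evaluating an auxiliary control law, can be sketched as follows; array shapes (N, m) and (N + 1, n) are assumed.

```python
import numpy as np

def shift_initialization(u_prev, x_prev):
    """Sketch of a shift warmstart: drop the first interval, advance the
    guess by one step, and fill the freed last interval by repeating the
    previous final control and state (one simple variant among several)."""
    u_shift = np.vstack([u_prev[1:], u_prev[-1:]])
    x_shift = np.vstack([x_prev[1:], x_prev[-1:]])
    return u_shift, x_shift
```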
Iterating while the problem changes. Extending the idea of warm-
starting, some MPC algorithms do not separate between one opti-
mization problem and the next, but always iterate while the problem
changes. They only perform one iteration per sampling time, and they
never try to iterate the optimization procedure to convergence for any
fixed problem. Instead, they continue to iterate while the optimization
problem changes. When implemented with care, this approach ensures
that the algorithm always works with the most current information, and
never loses precious time by working on outdated information.

8.9.2 Continuation Methods and Real-Time Iterations

Several of the ideas mentioned above are related to the idea of contin-
uation methods, which we now discuss in more algorithmic detail. For
this aim, we first regard a parameter-dependent root-finding problem
of the form
R(x, z) = 0
with variable z ∈ Rnz , parameter x ∈ Rn , and a smooth function
R : Rn × Rnz → Rnz . This root-finding problem could originate from an
equality constrained MPC optimization problem with fixed barrier as
it arises in a nonlinear IP method. The parameter dependence on x is
due to the initial state value, which varies from one MPC optimization
problem to the next. In case of infinite computational resources, one
could just employ one of the Newton-type methods from Section 8.3.2
to converge to an accurate approximation of the exact solution z∗ (x)
that satisfies R(x, z∗ (x)) = 0. In practice, however, we only have lim-
ited computing power and finite time, and need to be satisfied with an
approximation of z∗ (x).

Fortunately, it is a realistic assumption that we have an approximate


solution of a related problem available, for the previous value of x. To
clarify notation, we introduce a problem index k, such that the aim of
the continuation method is to solve root-finding problems R(xk , z) = 0
for a sequence (xk )k∈I . For algorithmic simplicity, we assume that the
parameter x enters the function R linearly. This assumption means
that the Jacobian of R with respect to z does not depend on x but only
on z, and can thus be written as Rz (z). As a consequence, also the
linearization of R depends only on the linearization point z̄, i.e., it can
be written as RL (x, z; z̄) := R(x, z̄) + Rz (z̄)(z − z̄).
A simple full-step Newton iteration for a fixed parameter x would
iterate z+ = z̄ − Rz (z̄)−1 R(x, z̄). If we have a sequence of values xk ,
we could decide to perform only one Newton iteration for each value
xk and then proceed to the next one. Given a solution guess zk for the
parameter value xk , a continuation method would then generate the
next solution guess by the iteration formula

zk+1 := zk − Rz (zk )−1 R(xk+1 , zk )

Another viewpoint on this iteration is that zk+1 solves the linear equa-
tion system RL (xk+1 , zk+1 ; zk ) = 0. Interestingly, assuming only regu-
larity of Rz , one can show that if zk equals the exact solution z∗ (xk )
for the previous parameter xk , the next iterate zk+1 is a first-order ap-
proximation, or tangential predictor, for the exact solution z∗ (xk+1 ).
More generally, one can show that

    zk+1 − z∗ (xk+1 ) = O( |(zk − z∗ (xk ), xk+1 − xk )|² )              (8.59)

From this equation it follows that one can remain in the area of con-
vergence of the Newton method if one starts close enough to an ex-
act solution, zk ≈ z∗ (xk ), and if the parameter changes (xk+1 − xk )
are small enough. Interestingly, it also implies quadratic convergence
toward the solution in case the parameter values of xk remain con-
stant. Roughly speaking, the continuation method delivers tangential
predictors in case the parameters xk change a lot, and nearly quadratic
convergence in case they change little.
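A minimal sketch of this continuation iteration follows; R and Rz are assumed to be callables returning the residual and its Jacobian with respect to z.

```python
import numpy as np

def continuation(z0, x_sequence, R, Rz):
    """Sketch of a continuation method: for each new parameter value
    x_{k+1}, take one full Newton step from the current guess z_k, i.e.,
    z_{k+1} = z_k - Rz(z_k)^{-1} R(x_{k+1}, z_k)."""
    z = z0
    iterates = [z0]
    for x_next in x_sequence:
        z = z - np.linalg.solve(Rz(z), R(x_next, z))
        iterates.append(z)
    return iterates
```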
The continuation method idea can be extended to Newton-type it-
erations of the form

zk+1 := zk − Mk−1 R(xk+1 , zk )



with approximations Mk ≈ Rz (zk ). In this case, only approximate tan-


gential predictors are obtained.
Real-time iterations. To generalize the continuation idea to a se-
quence of inequality constrained optimization problems PN (xk ) of the
general form (8.29) with primal-dual solutions z∗ (xk ), one performs
SQP type iterations of the form (8.41), but use in each iteration a new
parameter value xk+1 . This idea directly leads to the real-time itera-
tion (Diehl, Bock, Schlöder, Findeisen, Nagy, and Allgöwer, 2002) that
determines the approximate solution zk+1 = (wk+1 , λk+1 , µk+1 ) of prob-
lem PN (xk+1 ) from the primal-dual solution of the following QP
    minimize    FL (w; wk ) + (1/2)(w − wk )′ Bk (w − wk )
     w∈Rnw
                                                                    (8.60)
    subject to  GL (xk+1 , w; wk ) = 0
                HL (w; wk ) ≤ 0

which we denote by PN^QP (xk+1 ; wk , Bk ). If one uses the exact Hessian,
Bk = ∇2w L(zk ), Theorem 8.16 ensures that the QP solution is a gener-
alized tangential predictor of the exact solution if zk was equal to an
exact and strongly regular solution z∗ (xk ). Conversely, if the values of
xk would remain constant, the exact Hessian SQP method would have
quadratic convergence.
More generally, the exact Hessian real-time iteration satisfies the
quadratic approximation formula (8.59), despite the fact that active set
changes lead to nondifferentiability in the solution map z∗ (·). Loosely
speaking, the SQP based real-time iteration is able to easily “jump”
across this nondifferentiability, and its prediction and convergence
properties are not directly affected by active set changes. If the Hessian
is not the exact one, the real-time iteration method delivers only ap-
proximate tangential predictors, and shows linear instead of quadratic
convergence. In practice, one often uses the Gauss-Newton Hessian in
conjunction with a simultaneous approach to optimal control, but also
sequential approaches were suggested in a similar framework (Li and
Biegler, 1989). One can generalize the SQP based real-time iteration
idea further by allowing the subproblems to be more general convex
optimization problems, and by approximating also the constraint Jaco-
bians, as proposed and investigated by Tran-Dinh et al. (2012).
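Structurally, a real-time iteration loop can be sketched as follows; linearize, solve_qp, and get_state are placeholders for problem-specific code (evaluating the data of QP (8.60) at the current iterate, solving the QP, and obtaining the newest state estimate), not any particular software interface.

```python
def real_time_iteration_loop(z0, n_samples, linearize, get_state, solve_qp):
    """Sketch of the real-time iteration: exactly one SQP-type QP (8.60) is
    solved per sample, always linearized at the latest iterate and always
    fed with the newest initial state value."""
    z = z0
    for _ in range(n_samples):
        qp_data = linearize(z)          # preparation phase, independent of x_{k+1}
        x_next = get_state()            # new state estimate becomes available
        z = solve_qp(qp_data, x_next)   # feedback phase; apply first control in z
    return z
```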
Shift initialization and shrinking horizon problems. If the paramet-
ric optimization problems originate from an MPC optimal control prob-
lem with time-varying dynamics or objectives, it can be beneficial to

employ a shift strategy that shifts every approximate solution by one


time step backward in time before the next QP problem is solved. For
notational correctness, we need to denote the MPC problem by PN (k,
xk ) in this case, to reflect the direct dependence on the time index k.
While most of the variable shift is canonical, the addition of an extra
control, state, and multiplier at the end of the prediction horizon is
not trivial, and different variants exist. Some are based on an auxiliary
control law and a forward simulation, but also a plain repetition of the
second-to-last interval, which needs no additional computations, is a
possibility.
The guiding idea of the shift initialization is that a shifted optimal
solution should ideally correspond to an optimal solution of the new
MPC problem, if the new initial value xk+1 originates from the nominal
system dynamics xk+1 = f (xk , uk ). But while recursive feasibility can
be obtained easily by a shift, recursive optimality can usually not be
obtained for receding horizon problems. Thus, a shift strategy perturbs
the contraction of the real-time iterations and needs to be applied with
care. In the special case of time-invariant MPC problems PN (xk ) with a
short horizon and tight terminal constraint or cost, a shift strategy is
not beneficial.
On the other hand, in the case of finite-time (batch) processes that
are addressed by MPC on shrinking horizons, recursive optimality can
easily be achieved by shrinking a previously optimal solution. More
concretely, if the initial horizon length was N, and at time k one would
have the solution to the problem PN−k (k, xk ) on the remaining time
horizon, the optimal solution to the problem PN−k−1 (k + 1, xk+1 ) is
easily obtained by dropping the first component of the controls, states,
and multipliers. Thus, the shrinking operation is canonical and should
be used if real-time iterations—or other continuation methods—are ap-
plied to shrinking horizon MPC problems.

8.10 Discrete Actuators


Optimal control problems with discrete actuators fall into the class of
mixed-integer optimal control problems, which are NP-hard and known
to be difficult to solve. If one is lucky and the system model and con-
straints are linear and the cost is linear or convex quadratic, the dis-
crete time optimal control problem turns out to be a mixed-integer
linear program (MILP) or mixed-integer quadratic program (MIQP). For
both classes there exist robust and reliable solvers that can be used

as a black-box for small to moderate problem dimensions. Another


lucky case arises if the sequence of switches happens to be known in
advance in a continuous time system, in which case switching-time op-
timization can be used to transform the problem into a standard non-
linear program (NLP). On the other hand, if we have a nonlinear system
model with unknown switching sequence, we have to confront a sig-
nificantly more difficult problem after discretization, namely a mixed-
integer nonlinear program (MINLP). To address this MINLP one has ba-
sically three options:

• One can use piecewise system linearizations and mixed logical


dynamics (MLD) to approximate the MINLP by a MILP or MIQP.

• One can try to solve the MINLP to global optimality using tech-
niques from the field of global optimization.

• One can use a heuristic to find an approximate solution of the


MINLP.

While the first two options can lead to viable solutions for relevant ap-
plications, they often lead to excessively large runtimes, so the MPC
practitioner may need to resort to the last option. Fortunately, the
optimal control structure of the problem allows us to use a powerful
heuristic that exploits the fact that the state of a (continuous time)
system is most strongly influenced by the time average of its controls
rather than their pointwise values, as illustrated in Figure 8.7. This
heuristic is based on a careful MINLP formulation, which is very similar
to a standard nonlinear MPC problem, but with special structure. First,
divide the input vector u = (uc , ub ) ∈ Rmc +mb into continuous inputs,
uc , and binary integer inputs, ub , such that the system is described by
x + = f (x, uc , ub ). Second, and without loss of generality, we restrict
ourselves to binary integers ub ∈ {0, 1}mb inside a convex polyhedron
P ⊂ [0, 1]mb , and assume that ub enters the system linearly.3 The poly-
hedral constraint ub ∈ P allows us to exclude some combinations, e.g.,
3 If necessary, this binary representation can be achieved by a technique called outer
convexification, which is applicable to any system x + = f̃ (x, uc , uI ) where the integer
vector uI has dimension mI and can take finitely many (nI ) values uI ∈ {uI,1 , . . . ,
uI,nI }. We set mb := nI and f (x, uc , ub ) := Σ_{i=1}^{mb} ub,i f̃ (x, uc , uI,i ) and
P := {ub ∈ [0, 1]^mb | Σ_{j=1}^{mb} ub,j = 1}. Due to exponential growth of nI in the number of original
integer decisions mI , this technique should be applied with care, e.g., only partially for
separate subsystems, or avoided altogether if the original system is already linear in
the integer controls.

if two machines cannot be operated simultaneously. The polyhedron


P can and should be chosen such that its vertices equal the admissible
binary values in each time step.
We might have additional combinatorial constraints that couple dif-
ferent time steps with each other. Typical examples are limits on the
total number of switches, or dwell-time constraints, which bound the
duration that a component of ub can be active without interruption.
We introduce the binary control trajectory ub := (ub (0), ub (1), . . . ,
ub (N − 1)) ∈ [0, 1]mb ×N and denote the set of combinatorially feasible
trajectories by B ⊂ {0, 1}mb ×N ∩ P N . The MINLP arising in MPC with
discrete actuators can then be formulated as follows

    minimize    Σ_{k=0}^{N−1} ℓ(x(k), uc (k), ub (k)) + Vf (x(N))
    x, uc , ub

    subject to  x(0) = x0
                x(k + 1) = f (x(k), uc (k), ub (k)),   k = 0, . . . , N − 1
                h(x(k), uc (k), ub (k)) ≤ 0,           k = 0, . . . , N − 1
                hf (x(N)) ≤ 0
                ub (k) ∈ P ,                           k = 0, . . . , N − 1
                ub ∈ B
                                                                     (8.61)
Without the last constraint, ub ∈ B, the above problem would be a
standard NLP with optimal control structure. Likewise, a standard NLP
arises if the binary controls ub are fixed. These two observations di-
rectly lead to the following three-step algorithm that is a heuristic to
find a good feasible solution of the MINLP (8.61).

1. Solve the relaxed NLP (8.61) without combinatorial constraints,


ub ∈ B, leading to a relaxed solution guess (x∗ , uc∗ , ub∗ ), possi-
bly with ub∗ ∉ B, with objective value VN∗ .

2. Find a binary trajectory ub∗∗ ∈ B that approximates ub∗ , e.g. by


minimizing the distance between ub∗ and ub∗∗ in a suitable norm.

3. Fix the binary controls to ub∗∗ and solve the restricted NLP (8.61)
in the variables (x, uc ) only, with solution (x∗∗∗ , uc∗∗∗ ) and ob-
jective value VN∗∗∗ .

The result of the algorithm is the triple (x∗∗∗ , uc∗∗∗ , ub∗∗ ) which is a

feasible, but typically not an optimal point of the MINLP (8.61).4 Note
that this feasible MINLP solution has an objective value VN∗∗∗ that is
larger than the unknown exact MINLP solution VN0 which in turn is larger
than the relaxed NLP objective VN∗ from Step 1 (if the global NLP solution
was found): VN∗ ≤ VN0 ≤ VN∗∗∗ . Thus, the objective values from Steps 1
and 3 help us to bound the optimality loss incurred by using the above
three-step heuristic.
The choice of the approximation in Step 2 affects both solution qual-
ity and computational complexity. One popular choice, that is taken in
the combinatorial integral approximation (CIA) algorithm (Sager, Jung,
and Kirches, 2011) is to minimize the distance in a specially scaled
maximum norm that compares integrals, and is given by

    ∥ub ∥CIA :=    max      | Σ_{k=0}^{n−1} ub,j (k) |
                j≤mb , n≤N

Thus, in Step 2 of the CIA algorithm, one has to find ub∗∗ =


arg minub ∈B ∥ub − ub∗ ∥CIA . This problem turns out to be a MILP (see
Exercise 8.11) with a special structure that can be exploited in tai-
lored algorithms, some of which are available in the open source tool
pycombina (Bürger, Zeile, Hahn, Altmann-Dieses, Sager, and Diehl,
2020).
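When no combinatorial constraints beyond the box are present, a cheap componentwise sum-up rounding pass already gives a binary trajectory whose control integrals track those of the relaxed solution; the exact CIA step is instead the MILP described above. The following sketch (our own naming) illustrates the idea.

```python
import numpy as np

def sum_up_rounding(ub_relaxed):
    """Sketch of componentwise sum-up rounding: each binary control is
    switched on whenever its accumulated relaxed value runs ahead of its
    accumulated binary value by at least one half; ub_relaxed has shape
    (N, mb). This is a heuristic, not the exact CIA minimizer."""
    N, mb = ub_relaxed.shape
    ub_bin = np.zeros((N, mb))
    gap = np.zeros(mb)                  # accumulated (relaxed - binary) value
    for k in range(N):
        gap += ub_relaxed[k]
        ub_bin[k] = (gap >= 0.5).astype(float)
        gap -= ub_bin[k]
    return ub_bin
```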
For the special case of continuous time problems with trivial com-
binatorial constraints, B = {0, 1}mb ×N ∩ P N , one can show under mild
conditions that the difference between the objectives VN∗ and VN∗∗∗ in
the three-step CIA algorithm shrinks linearly with the discretization
step size h = T /N if the length of the continuous time horizon T is
fixed while N grows (Sager, Bock, and Diehl, 2012). A more general
approximation result can be established in the presence of minimum
dwell-time constraints (Zeile, Robuschi, and Sager, 2020).

Example 8.17: MPC with discrete actuator


We regard a simple problem of the form (8.61) for a nonlinear and un-
stable system with one state x ∈ R and one binary control ub ∈ R. The
continuous time system is described by ẋ = x 3 − ub and transformed
to a discrete time system x + = f (x, ub ) by using one RK4 step with
step length h = 0.05. The aim is to track a reference xref = 0.7 starting
from the initial value x0 = 0.9 on a horizon of length N = 30, resulting
4 An important feature in practice is the relaxation of inequality constraints, e.g., by
using L1-penalties, in order to ensure feasible optimization problems in Steps 1 and 3.

Figure 8.7: Relaxed and binary feasible solution for Example 8.17
(top panel: x versus k, showing xref , x∗ , and x∗∗∗ ; bottom panel:
ub versus k, showing ub∗ and ub∗∗ ).

in the following MINLP


    minimize    Σ_{k=0}^{N} (x(k) − xref )²
     x, ub

    subject to  x(0) = x0
                x(k + 1) = f (x(k), ub (k)),   k = 0, . . . , N − 1      (8.62)
                ub (k) ∈ [0, 1],               k = 0, . . . , N − 1
                ub ∈ B

The combinatorial constraint set B imposes a minimum dwell-time con-


straint on the uptime that requires that ub remains active for at least
two consecutive time steps, i.e., we have B = {ub ∈ {0, 1}N | ub (k) ≥
ub (k − 1) − ub (k − 2), k = 0, . . . , N − 1}. The required initial val-
ues ub (−1) and ub (−2) are both set to zero. We solve the problem
using the described three-step procedure and the combinatorial inte-
gral approximation in Step 2. The relaxed solution (x∗ , ub∗ ) after Step
1 as well as the solution (x∗∗∗ , ub∗∗ ) after Step 3 are shown in Fig-
ure 8.7. Note that due to the absence of continuous controls, Step 3
just amounts to a system simulation. The objective values are given by

VN∗ = 0.166 and VN∗∗∗ = 0.1771. The true optimal cost, which can for
this simple example be found in a few seconds by an intelligent inves-
tigation of all 2³⁰ ≈ 10⁹ possibilities via branch-and-bound, is given by
VN0 = 0.176. □
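For concreteness, the discrete time model x + = f (x, ub ) used in this example can be generated from the continuous time dynamics by a single RK4 step; a minimal sketch:

```python
def f_rk4(x, ub, h=0.05):
    """One RK4 step for the scalar model xdot = x^3 - ub of Example 8.17,
    defining the discrete time system x+ = f(x, ub) (sketch)."""
    xdot = lambda xi, u: xi**3 - u
    k1 = xdot(x, ub)
    k2 = xdot(x + 0.5 * h * k1, ub)
    k3 = xdot(x + 0.5 * h * k2, ub)
    k4 = xdot(x + h * k3, ub)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
```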

8.11 Notes
The description of numerical optimal control methods in this chapter
is far from complete, and we have left out many details as well as many
methods that are important in practice. We mention some related lit-
erature and software links that could complement this chapter.
General numerical optimal control methods are described in the
textbooks by Bryson and Ho (1975); Betts (2001); Gerdts (2011); and
in particular by Biegler (2010). The latter reference focuses on di-
rect methods and also provides an in-depth treatment of nonlinear
programming. The overview articles by Binder, Blank, Bock, Bulirsch,
Dahmen, Diehl, Kronseder, Marquardt, Schlöder, and Stryk (2001);
and Diehl, Ferreau, and Haverbeke (2009); as well as a forthcoming text-
book on numerical optimal control (Gros and Diehl, 2020), have a similar
focus on online optimization for MPC as the current chapter.
General textbooks on numerical optimization are Bertsekas (1999);
Nocedal and Wright (2006). Convex optimization is covered by Ben-
Tal and Nemirovski (2001); Nesterov (2004); Boyd and Vandenberghe
(2004). The last book is particularly accessible for an engineering audi-
ence, and its PDF is freely available on the home page of its first author.
Newton’s method for nonlinear equations and its many variants are
described and analyzed in a textbook by Deuflhard (2011). An up-to-
date overview of optimization tools can be found at plato.asu.edu/
guide.html, many optimization solvers are available as source code
at www.coin-or.org, and many optimization solvers can be accessed
online via neos-server.org.
While the direct single-shooting method often is implemented by
coupling an efficient numerical integration solver with a general non-
linear program (NLP) solver such as SNOPT (Gill, Murray, and Saun-
ders, 2005), the direct multiple-shooting and direct collocation meth-
ods need to be implemented by using NLP solvers that fully exploit the
sparsity structure, such as IPOPT5 (Wächter and Biegler, 2006). There
exist many custom implementations of the direct multiple-shooting
5 This code is available to the public under a permissive open-source license.

method with their own structure-exploiting NLP solvers, such as, for
example, HQP5 (Franke, 1998); MUSCOD-II (Leineweber, Bauer, Schäfer,
Bock, and Schlöder, 2003); ACADO5 (Houska, Ferreau, and Diehl, 2011);
and FORCES-NLP (Zanelli, Domahidi, Jerez, and Morari, 2017).
Structure-exploiting QP solvers that can be used standalone for lin-
ear MPC or as subproblem solvers within SQP methods are, for example,
the dense code qpOASES5 (Ferreau, Kirches, Potschka, Bock, and Diehl,
2014), which is usually combined with condensing, or the sparse codes
FORCES (Domahidi, 2013); qpDUNES5 (Frasch, Sager, and Diehl, 2015);
and HPMPC5 (Frison, 2015). The latter is based on a CPU specific ma-
trix storage format that by itself leads to speedups in the range of one
order of magnitude, and which was made available to the public in the
BLASFEO5 library at github.com/giaf/blasfeo.
In Section 8.2 on numerical simulation methods, we have exclu-
sively treated Runge-Kutta methods because they play an important
role within a large variety of numerical optimal control algorithms, such
as shooting, collocation, or pseudospectral methods. Another popular
and important family of integration methods, however, are the linear
multistep methods; in particular, the implicit backward differentiation
formula (BDF) methods are widely used for simulation and optimization
of large stiff differential algebraic equations (DAEs). For an in-depth
treatment of general numerical simulation methods for ordinary dif-
ferential equations (ODEs) and DAEs, we recommend the textbooks by
Hairer, Nørsett, and Wanner (1993, 1996); as well as Brenan, Campbell,
and Petzold (1996); Ascher and Petzold (1998).
For derivative generation of numerical simulation methods, we refer
to the research articles Bauer, Bock, Körkel, and Schlöder (2000); Pet-
zold, Li, Cao, and Serban (2006); Kristensen, Jørgensen, Thomsen, and
Jørgensen (2004); Quirynen, Gros, Houska, and Diehl (2017a); Quirynen,
Houska, and Diehl (2017b); and the Ph.D. theses by Albersmeyer (2010);
Quirynen (2017). A collection of numerical ODE and DAE solvers with
efficient derivative computations are implemented in the SUNDIALS5
suite (Hindmarsh, Brown, Grant, Lee, Serban, Shumaker, and Wood-
ward, 2005).
Regarding Section 8.4 on derivatives, we refer to a textbook on al-
gorithmic differentiation (AD) by Griewank and Walther (2008), and
an overview of AD tools at www.autodiff.org. The AD framework
CasADi5 can in its latest form be found at casadi.org, and is de-
scribed in the article Andersson, Akesson, and Diehl (2012); and the
Ph.D. theses by Andersson (2013); Gillis (2015).

8.12 Exercises
Some of the exercises in this chapter were developed for courses on
numerical optimal control at the University of Freiburg, Germany. The
authors gratefully acknowledge Joel Andersson, Joris Gillis, Sébastien
Gros, Dimitris Kouzoupis, Jesus Lago Garcia, Rien Quirynen, Andrea
Zanelli, and Mario Zanon for contributions to the formulation of these
exercises; as well as Michael Risbeck, Nishith Patel, Douglas Allan, and
Travis Arnold for testing and writing solution scripts.

Exercise 8.1: Newton’s method for root finding


In this exercise, we experiment with a full-step Newton method and explore the depen-
dence of the iterates on the problem formulation and the initial guess.

(a) Write a computer program that performs Newton iterations in Rn that takes as
inputs a function F (z), its Jacobian J(z), and a starting point z[0] ∈ Rn . It
shall output the first 20 full-step Newton iterations. Test your program with
R(z) = z³² − 2 starting first at z[0] = 1 and then at different positive initial
guesses. How many iterations do you typically need in order to obtain a solution
that is exact up to machine precision?

(b) An equivalent problem to z³² − 2 = 0 can be obtained by lifting it to a higher
dimensional space (Albersmeyer and Diehl, 2010), as follows

        R(z) = [z2 − z1² , z3 − z2² , z4 − z3² , z5 − z4² , 2 − z5² ]′

Use your algorithm to implement Newton’s method for this lifted problem and
start it at z[0] = [1 1 1 1 1]′ (note that we use square brackets in the index to
denote the Newton iteration). Compare the convergence of the iterates for the
lifted problem with those of the equivalent unlifted problem from the previous
task, initialized at one.

(c) Consider now the root-finding problem R(z) = 0 with R : R → R, R(z) :=
tanh(z) − 1/2. Convergence of Newton’s method is sensitive to the chosen initial
value z0 . Plot R(z) and observe the nonlinearity. Implement Newton’s method
with full steps for it, and test if it converges or not for different initial values
z[0] .

(d) Regard the problem of finding a solution to the nonlinear equation system
2x = ey/4 and 16x 4 + 81y 4 = 4 in the two variables x, y ∈ R. Solve it with
your implementation of Newton’s method using different initial guesses. Does
it always converge, and, if it converges, does it always converge to the same
solution?

Exercise 8.2: Newton-type methods for a boundary-value problem


Regard the scalar discrete time system
    x(k + 1) = (1/10)(11x(k) + x(k)² + u),     k = 0, . . . , N − 1
with boundary conditions
x(0) = x0 x(N) = 0
We fix the initial condition to x0 = 0.1 and the horizon length to N = 30. The aim is to
find the control value u ∈ R—which is kept constant over the whole horizon—in order
to steer the system to the origin at the final time, i.e., to satisfy the constraint x(N) = 0.
This is a two-point boundary-value problem (BVP). In this exercise, we formulate this
BVP as a root-finding problem in two different ways: first, with the sequential approach,
i.e., with only the single control as decision variable; and second, with the simultaneous
approach, i.e., with all 31 states plus the control as decision variables.
(a) Formulate and solve the problem with the sequential approach, and solve it with
an exact Newton’s method initialized at u = 0. Plot the state trajectories in each
iteration. Also plot the residual values x(N) and the variable u as a function of
the Newton iteration index.

(b) Now formulate and solve the problem with the simultaneous approach, and solve
it with an exact Newton’s method initialized at u = 0 and the corresponding
state sequence that is obtained by forward simulation started at x0 . Plot the
state trajectories in each iteration.
Plot again the residual values x(N) and the variable u as a function of the Newton
iteration index, and compare with the results that you have obtained with the
sequential approach. Do you observe differences in the convergence speed?

(c) One feature of the simultaneous approach is that its states can be initialized with
any trajectory, even an infeasible one. Initialize the simultaneous approach with
the all-zero trajectory, and again observe the trajectories and the convergence
speed.

(d) Now solve both formulations with a Newton-type method that uses a constant
Jacobian. For both approaches, the constant Jacobian corresponds to the exact
Jacobian at the solution of the same problem for x0 = 0, where all states and the
control are zero. Start with implementing the sequential approach, and initialize
the iterates at u = 0. Again, plot the residual values x(N) and the variable u as
a function of iteration index.

(e) Now implement the simultaneous approach with a fixed Jacobian approxima-
tion. Again, the Jacobian approximation corresponds to the exact Jacobian at
the solution of the neighboring problem with x0 = 0, i.e., the all zero trajectory.
Start the Newton-type iterations with all states and the control set to zero, and
plot the residual values x(N) and the variable u as a function of iteration index.
Discuss the differences of convergence speed with the sequential approach and
with the exact Newton methods from before.

(f) The performance of the sequential approach can be improved if one introduces
the initial state x(0) as a second decision variable. This allows more freedom
for the initialization, and one can automatically profit from tangential solution

predictors. Adapt your exact Newton method, initialize the problem in the all-
zero solution and again observe the results.

(g) If u∗ is the exact solution that is found at the end of the iterations, plot the loga-
rithm of u − u∗ versus the iteration number for all six numerical experiments
(a)–(f), and compare.

(h) The linear system that needs to be solved in each iteration of the simultaneous
approach is large and sparse. We can use condensing in order to reduce the linear
system to size one. Implement a condensing-based linear system solver that only
uses multiplications and additions, and one division. Compare the iterations
with the full-space linear algebra approach, and discuss the differences in the
iterations, if any.

Exercise 8.3: Convex functions


Determine and explain whether the following functions are convex or not on their
respective domains.
(a) f (x) = c ′ x + x ′ A′ Ax on Rn

(b) f (x) = −c ′ x − x ′ A′ Ax on Rn

(c) f (x) = log(c ′ x) + exp(b′ x) on {x ∈ Rn | c ′ x > 0}

(d) f (x) = − log(c ′ x) − exp(b′ x) on {x ∈ Rn | c ′ x > 0}

(e) f (x) = 1/(x1 x2 ) on R2++

(f) f (x) = x1 /x2 on R2++

Exercise 8.4: Convex sets


Determine and explain whether the following sets are convex or not.
(a) A ball, i.e., a set of the form
Ω = {x | |x − xc | ≤ r }

(b) A sublevel set of a convex function f : Rn → R for a constant c ∈ R


Ω = {x ∈ Rn | f (x) ≤ c}

(c) A superlevel set of a convex function f : Rn → R for a constant c ∈ R


Ω = {x ∈ Rn | f (x) ≥ c}

(d) The set


Ω = {x ∈ Rn | x ′ B ′ Bx ≤ b′ x}

(e) The set


Ω = {x ∈ Rn | x ′ B ′ Bx ≥ b′ x}

(f) A cone, i.e., a set of the form


Ω = {(x, α) ∈ Rn × R | |x| ≤ α}

(g) A wedge, i.e., a set of the form

{x ∈ Rn | a′1 x ≤ b1 , a′2 x ≤ b2 }

(h) A polyhedron
{x ∈ Rn | Ax ≤ b}

(i) The set of points closer to one set than another

Ω = {x ∈ Rn | dist(x, S) ≤ dist(x, T )}

where dist(x, S) := inf{|x − z|2 | z ∈ S}.

Exercise 8.5: Finite differences: theory of optimal perturbation size


Assume we have a twice continuously differentiable function f : R → R and we want
to evaluate its derivative f ′ (x0 ) at x0 with finite differences. Further assume that for
all x ∈ [x0 − δ, x0 + δ] holds that

    |f (x)| ≤ fmax        |f ′′ (x)| ≤ fmax′′        |f ′′′ (x)| ≤ fmax′′′

We assume δ > t for any perturbation size t in the following finite difference approx-
imations. Due to finite machine precision ϵmach that leads to truncation errors, the
computed function f̃ (x) = f (x)(1 + ϵ(x)) is perturbed by noise ϵ(x) that satisfies the
bound
|ϵ(x)| ≤ ϵmach

(a) Compute a bound on the error of the forward difference approximation

        f̃ ′fd,t (x0 ) := (f̃ (x0 + t) − f̃ (x0 ))/t

    namely, a function ψ(t; fmax , fmax′′ , ϵmach ) that satisfies

        |f̃ ′fd,t (x0 ) − f ′ (x0 )| ≤ ψ(t; fmax , fmax′′ , ϵmach )

(b) Which value t ∗ minimizes this bound and which value ψ∗ has the bound at t ∗ ?

(c) Perform a similar error analysis for the central difference quotient

        f̃ ′cd,t (x0 ) := (f̃ (x0 + t) − f̃ (x0 − t))/(2t)

    that is, compute a bound

        |f̃ ′cd,t (x0 ) − f ′ (x0 )| ≤ ψcd (t; fmax , fmax′′ , fmax′′′ , ϵmach )


(d) For central differences, what is the optimal perturbation size tcd and what is the

size ψcd of the resulting bound on the error?

Figure 8.8: A hanging chain at rest (y (m) on the horizontal axis, z (m)
on the vertical axis). See Exercise 8.6(b).

Exercise 8.6: Finding the equilibrium point of a hanging chain using CasADi
Consider an elastic chain attached to two supports and hanging in-between. Let us
discretize it with N mass points connected by N − 1 springs. Each mass i has position
(yi , zi ), i = 1, . . . , N.
Our task is to minimize the total potential energy, which is made up by potential
energy in each spring and potential energy of each mass according to

    J(y1 , z1 , . . . , yN , zN ) =
        (1/2) Σ_{i=1}^{N−1} Di ((yi − yi+1 )² + (zi − zi+1 )²)  +  g0 Σ_{i=1}^{N} mi zi        (8.63)

    where the first sum is the spring potential energy and the second the gravitational
    potential energy,

subject to constraints modeling the ground.


(a) CasADi is an open-source software tool for solving optimization problems in
general and optimal control problems (OCPs) in particular. In its most typical
usage, it is used to formulate and solve constrained optimization problems of
the form
    minimize_x    f (x)
    subject to    x̲ ≤ x ≤ x̄                                        (8.64)
                  g̲ ≤ g(x) ≤ ḡ
where x ∈ Rnx is the decision variable, f : Rnx → R is the objective function,
and g : Rnx → Rng is the constraint function. For equality constraints, the

upper and lower bounds are equal.


If you have not already done so, go to casadi.org and locate the installation
instructions. On most platforms, installing CasADi amounts to downloading a
binary installation and placing it somewhere in your path. Version 3.3 of CasADi
on Octave/MATLAB was used in this edition, so make sure that you are not using a
version older than this and keep an eye on the text website for incompatibilities
with future versions of CasADi. Locate the CasADi user guide and, with an Octave
or MATLAB interpreter in front of you, read Chapters 1 through 4. These chapters
give you an overview of the scope and syntax of CasADi.

(b) We assume that f is a convex quadratic function and g is a linear function. In


this case we refer to (8.64) as a convex quadratic program (QP). To solve a QP
with CasADi, we construct symbolic expressions for x, f , and g, and use this
to construct a solver object that can be called one or more times with different
values for x̲, x̄, g̲, and ḡ. An initial guess for x can also be provided, but this is
less important for convex QPs, where the solution is unique.
Figure 8.8 shows the solution of the unconstrained problem using the open-
source QP solver qpOASES with N = 40, mi = 40/N kg, Di = 70N N/m, and
g0 = 9.81 m/s2 . The first and last mass points are fixed to (−2, 1) and (2, 1),
respectively. Go through the code for the figure and make sure you understand
the steps; a minimal Python sketch of the same construction (including the ground
constraints of part (c)) is given at the end of this exercise.

(c) Now introduce ground constraints: zi ≥ 0.5 and zi ≥ 0.5 + 0.1 yi , for i = 2, · · · ,
N − 2. Resolve the QP and compare with the unconstrained solution.

(d) We now want to formulate and solve a nonlinear program (NLP). Since an NLP is a
generalization of a QP, we can solve the above problem with an NLP solver. This
can be done by simply changing casadi.qpsol in the script to casadi.nlpsol
and the solver plugin ’qpoases’ with ’ipopt’, corresponding to the open-
source NLP solver IPOPT. Are the solutions of the NLP and QP solver the same?

(e) Now, replace the linear equalities by nonlinear ones that are given by zi ≥ 0.5 +
0.1 yi2 for i = 2, · · · , N − 2. Modify the expressions from before to formulate
and solve the NLP, and visualize the solution. Is the NLP convex?

(f) Now, by modifications of the expressions from before, formulate and solve an
NLP where the inequality constraints are replaced by zi ≥ 0.8 + 0.05 yi − 0.1 yi2
for i = 2, · · · , N − 2. Is this NLP convex?
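
The construction described in parts (b) and (c) can be sketched as follows. This is a minimal
Python sketch, not the book's accompanying script (which uses Octave/MATLAB; the CasADi
calls are analogous). It assumes the interpretation Di = 70·N N/m for the spring constants
and applies the ground constraints to the interior points i = 2, . . . , N − 2 as in part (c).

import casadi as ca
import numpy as np

N = 40
m_i = 40.0 / N       # mass of each point [kg]
D_i = 70.0 * N       # spring constant [N/m]
g0 = 9.81            # gravitational acceleration [m/s^2]

y = ca.SX.sym('y', N)
z = ca.SX.sym('z', N)
w = ca.vertcat(y, z)

# objective (8.63): spring plus gravitational potential energy
V = 0.5 * D_i * (ca.sumsqr(y[1:] - y[:-1]) + ca.sumsqr(z[1:] - z[:-1])) \
    + g0 * m_i * ca.sum1(z)

# linear ground expressions for part (c): z_i and z_i - 0.1*y_i, i = 2,...,N-2
g = ca.vertcat(z[1:N-2], z[1:N-2] - 0.1 * y[1:N-2])

# variable bounds; the first and last mass points are fixed to (-2, 1) and (2, 1)
lbw = -np.inf * np.ones(2 * N)
ubw = np.inf * np.ones(2 * N)
lbw[0] = ubw[0] = -2.0
lbw[N - 1] = ubw[N - 1] = 2.0
lbw[N] = ubw[N] = 1.0
lbw[2 * N - 1] = ubw[2 * N - 1] = 1.0

qp = {'x': w, 'f': V, 'g': g}
solver = ca.qpsol('solver', 'qpoases', qp)   # or ca.nlpsol('solver', 'ipopt', qp)

# unconstrained chain of Figure 8.8: ground bounds inactive; for part (c) set lbg = 0.5
sol = solver(lbx=lbw, ubx=ubw, lbg=-np.inf, ubg=np.inf)
print(sol['x'])

For the nonlinear ground constraints of parts (e) and (f), the expressions collected in g
become nonlinear in the decision variables, and the same structure is passed to
casadi.nlpsol with the 'ipopt' plugin instead.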

Exercise 8.7: Direct single shooting versus direct multiple shooting

Consider the following OCP, corresponding to driving a Van der Pol oscillator to the
origin, on a time horizon with length T = 10

Figure 8.9 (plot of x1 (t), x2 (t), and u(t) versus t): Direct single shooting solution for
(8.65) without path constraints.

    minimize_{x(·), u(·)}    ∫_0^T ( x1 (t)² + x2 (t)² + u(t)² ) dt

    subject to    ẋ1 (t) = (1 − x2 (t)²) x1 (t) − x2 (t) + u(t)
                  ẋ2 (t) = x1 (t)                                        (8.65)
                  −1 ≤ u(t) ≤ 1,        t ∈ [0, T ]
                  x1 (0) = 0,  x1 (T ) = 0
                  x2 (0) = 1,  x2 (T ) = 0
                  −0.25 ≤ x1 (t),       t ∈ [0, T ]
We will solve this problem using direct single shooting and direct multiple shooting
using IPOPT as the NLP solver.

(a) Figure 8.9 shows the solution to the above problem using a direct single shooting
approach, without enforcing the constraint −0.25 ≤ x1 (t). Go through the code
for the figure step by step. The code begins with a modeling step, where sym-
bolic expressions for the continuous-time model are constructed. Thereafter,
the problem is transformed into discrete time by formulating an object that
integrates the system forward in time using a single step of the RK4 method.
This function also calculates the contribution to the objective function for the
same interval using the same integrator method. In the next part of the code, a

symbolic representation of the NLP is constructed, starting with empty lists of


variables and constraints. This symbolic representation of the NLP is used to
define an NLP solver object using IPOPT as the underlying solver. Finally, the
solver object is evaluated to obtain the optimal solution. (A compact Python sketch
of the modeling and RK4 discretization steps is given at the end of this exercise.)

(b) Modify the code so that the path constraint on x1 (t) is respected. You
only need to enforce this constraint at the end of each control interval. This
should result in additional components to the NLP constraint function G(w),
which will now have upper and lower bounds similar to the decision variable w.
Resolve the modified problem and compare the solution.

(c) Modify the code to implement the direct multiple-shooting method instead of
direct single shooting. This means introducing decision variables corresponding
to not only the control trajectory, but also the state trajectory. The added deci-
sion variables will be matched with an equal number of new equality constraints,
enforcing that the NLP solution corresponds to a continuous state trajectory.
The initial and terminal conditions on the state can be formulated as upper and
lower bounds on the corresponding elements of w. Use x(t) = 0 as the initial
guess for the state trajectory.

(d) Compare the IPOPT output for both transcriptions. How did the change from
direct single shooting to direct multiple shooting influence

• The number of iterations?


• The number of nonzeros in the Jacobian of the constraints?
• The number of nonzeros in the Hessian of the Lagrangian?

(e) Generalize the RK4 method so that it takes M = 4 steps instead of just one. This
corresponds to a higher-accuracy integration of the model dynamics. Approxi-
mately how much smaller discretization error can we expect from this change?

(f) Replace the RK4 integrator with the variable-order, variable-step size code
CVODES from the SUNDIALS suite, available as the ’cvodes’ plugin for
casadi.integrator. Use 10−8 for the relative and absolute tolerances. Consult
CasADi’s user guide for syntax. What are the advantages and disadvantages of
using this integrator over the fixed-step RK4 method used until now?
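
The modeling and RK4 discretization steps described in part (a) have roughly the following
shape. This is a compact Python sketch rather than the book's Octave/MATLAB script, and
the number of control intervals N = 20 is an assumption of the sketch, not a value
prescribed by the exercise.

import casadi as ca

T, N = 10.0, 20                  # horizon length and number of control intervals
x = ca.SX.sym('x', 2)
u = ca.SX.sym('u')

# Van der Pol dynamics and integrand of the objective in (8.65)
xdot = ca.vertcat((1 - x[1]**2) * x[0] - x[1] + u, x[0])
L = x[0]**2 + x[1]**2 + u**2
f = ca.Function('f', [x, u], [xdot, L])

# one RK4 step of length h = T/N, propagating the state and the stage cost together
h = T / N
k1, k1_q = f(x, u)
k2, k2_q = f(x + h / 2 * k1, u)
k3, k3_q = f(x + h / 2 * k2, u)
k4, k4_q = f(x + h * k3, u)
x_next = x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
q_next = h / 6 * (k1_q + 2 * k2_q + 2 * k3_q + k4_q)
F = ca.Function('F', [x, u], [x_next, q_next], ['x0', 'u'], ['xf', 'qf'])

print(F(x0=[0, 1], u=0.5))       # one discrete-time step from x(0) = (0, 1)

In single shooting, F is applied recursively N times starting from the fixed initial state,
so that the controls u0, . . . , uN−1 are the only NLP variables; in the multiple-shooting
transcription of part (c), the intermediate states become variables as well and F supplies
the continuity (matching) constraints.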

Exercise 8.8: Direct collocation


Collocation, in its most basic sense, refers to a way of solving initial-value problems
by approximating the state trajectory with piecewise polynomials. For each step of the
integrator, corresponding to an interval of time, we choose the coefficients of these
polynomials to ensure that the ODE becomes exactly satisfied at a given set of time
points. In the following, we choose the Gauss-Legendre collocation integrator of sixth
order, which has d = 3 collocation points. Together with the point 0 at the start of the
interval [0, 1], we have four time points to define the collocation polynomial
    τ0 = 0      τ1 = 1/2 − √15/10      τ2 = 1/2      τ3 = 1/2 + √15/10
Using these time points, we define the corresponding Lagrange polynomials
    Lj (τ) = ∏_{r=0, r≠j}^{d}  (τ − τr )/(τj − τr )

Introducing a uniform time grid tk = kh, k = 0, . . . , N, with the corresponding state
values xk := x(tk ), we can approximate the state trajectory inside each interval [tk ,
tk+1 ] as a linear combination of these basis functions

    x̃k (t) = Σ_{r=0}^{d}  Lr ( (t − tk )/h )  xk,r

By differentiation, we get an approximation of the time derivative at each collocation
point, for j = 1, . . . , 3

    (d/dt) x̃k (tk,j ) = (1/h) Σ_{r=0}^{d} L̇r (τj ) xk,r := (1/h) Σ_{r=0}^{d} Cr,j xk,r

We also can get an expression for the state at the end of the interval

    x̃k+1,0 = Σ_{r=0}^{d} Lr (1) xk,r := Σ_{r=0}^{d} Dr xk,r

Finally, we also can integrate our approximation over the interval, giving a formula for
the quadratures

    ∫_{tk}^{tk+1} x̃k (t) dt = h Σ_{r=0}^{d} ( ∫_0^1 Lr (τ) dτ ) xk,r := h Σ_{r=1}^{d} br xk,r
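
The coefficients Cr,j , Dr , and br defined above can be computed directly from the Lagrange
basis. The short sketch below is an illustration using numpy, not the book's code; it
evaluates the coefficients numerically for the sixth-order Gauss-Legendre points.

import numpy as np

d = 3
tau = np.array([0.0,
                0.5 - np.sqrt(15) / 10,
                0.5,
                0.5 + np.sqrt(15) / 10])

C = np.zeros((d + 1, d + 1))   # C[r, j] = derivative of L_r at tau_j
D = np.zeros(d + 1)            # D[r]    = L_r(1)
b = np.zeros(d + 1)            # b[r]    = integral of L_r over [0, 1]
for r in range(d + 1):
    # Lagrange polynomial L_r: equal to 1 at tau_r and 0 at the other points
    p = np.poly1d([1.0])
    for s in range(d + 1):
        if s != r:
            p = p * np.poly1d([1.0, -tau[s]]) / (tau[r] - tau[s])
    D[r] = p(1.0)
    C[r, :] = np.polyder(p)(tau)
    b[r] = np.polyint(p)(1.0)

# b[0] vanishes (to machine precision), consistent with the quadrature sum above
# starting at r = 1
print(C, D, b, sep='\n')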

(a) Figure 8.10 shows an open-loop simulation for the ODE in (8.65) using Gauss-
Legendre collocation of order 2, 4, and 6. A constant control u(t) = 0.5 was
applied and the initial conditions were given by x(0) = [0, 1]′ . The figure on
the left shows the first state x1 (t) for the three methods as well as a high-
accuracy solution obtained from CVODES, which uses a backward differentia-
tion formula (BDF) method. In the figure on the right we see the discretization
error, as compared with CVODES. Go through the code for the figure and make
sure you understand it. Using this script as a template, replace the integrator
in the direct multiple-shooting method from Exercise 8.7 with this collocation
integrator. Make sure that you obtain the same solution. The structure of the
NLP should remain unchanged—you are still implementing the direct multiple-
shooting approach, only with a different integrator method.

(b) In the NLP transcription step, replace the embedded function call with additional
degrees of freedom corresponding to the state at all the collocation points. En-
force the collocation equations at the NLP level instead of the integrator level.
Enforce upper and lower bounds on the state at all collocation points. Compare
the solution time and number of nonzeros in the Jacobian and Hessian matrices
with the direct multiple-shooting method.

Exercise 8.9: Gauss-Newton SQP iterations for optimal control


Consider a nonlinear pendulum defined by

    ẋ(t) = f (x(t), u(t)) = [ v(t) ; −C sin(p(t)/C) ] + [ 0 ; 1 ] u(t)

with state x = [p, v]′ and C := (180/π)/10. The goal is to solve an OCP for this system
using a direct multiple-shooting method and a self-written sequential quadratic
programming (SQP) solver with a Gauss-Newton Hessian.

Figure 8.10 (left: x1 (t) for GL2, GL4, GL6, and a BDF reference; right: discretization
error |x1 (t) − x1∗ (t)| with x1∗ (t) from CVODES): Open-loop simulation for (8.65) using
collocation.

(a) Starting with the pendulum at x̄0 = [10 0]′ , we aim to minimize the required
    controls to bring the pendulum to xN = [0 0]′ in a time horizon T = 10 s.
    Respecting bounds on p, v, and u, namely pmax = 10, vmax = 10, and umax = 3,
    the required controls can be obtained as the solution of the following OCP

        minimize_{x0 ,u0 ,x1 ,...,uN−1 ,xN}    (1/2) Σ_{k=0}^{N−1} |uk |₂²

        subject to    x̄0 − x0 = 0
                      Φ(xk , uk ) − xk+1 = 0,    k = 0, . . . , N − 1
                      xN = 0
                      −xmax ≤ xk ≤ xmax ,        k = 0, . . . , N − 1
                      −umax ≤ uk ≤ umax ,        k = 0, . . . , N − 1

Formulate the discrete dynamics xk+1 = Φ(xk , uk ) using an RK4 integrator with
a time step ∆t = 0.2 s. Encapsulate the code in a single CasADi function object,
as in Exercise 8.7. Simulate the system forward
in time and plot the result.

(b) Using w = (x0 , u0 , . . . , uN−1 , xN ) as the NLP decision variable, we can formulate
the equality constraint function G(w), the least squares function M(w), and the

bounds vector wmax so that the above OCP can be written

        minimize_w    (1/2) |M(w)|₂²
        subject to    G(w) = 0
                      −wmax ≤ w ≤ wmax

The SQP method with Gauss-Newton Hessian solves a linearized version of this
problem in each iteration. More specifically, if the current iterate is w̄, the next
iterate is given by w̄ + ∆w, where ∆w is the solution of the following QP

        minimize_{∆w}    (1/2) ∆w ′ JM (w̄)′ JM (w̄) ∆w + M(w̄)′ JM (w̄) ∆w
        subject to       G(w̄) + JG (w̄) ∆w = 0                            (8.66)
                         −wmax − w̄ ≤ ∆w ≤ wmax − w̄

In order to implement the Gauss-Newton method, we need the Jacobians
JG (w) = ∂G/∂w (w) and JM (w) = ∂M/∂w (w), both of which can be efficiently obtained
using CasADi’s jacobian command. In this case the Gauss-Newton Hessian
H = JM (w̄)′ JM (w̄) can readily be obtained by pen and paper. Define what Hx
and Hu need to be in the block diagonal Hessian

        H = diag( Hx , Hu , Hx , Hu , . . . , Hx )

(A sketch of how the QP (8.66) can be set up and solved with CasADi is given at the
end of this exercise.)

(c) Figure 8.11 shows the control trajectory after 0, 1, 2, and 6 iterations of the
Gauss-Newton method applied to a direct multiple-shooting transcription of
(8.65). Go through the code for the figure step by step. You should recog-
nize much of the code from the solution to Exercise 8.7. The code represents a
simplified, yet efficient way of using CasADi to solve OCPs.
Modify the code to solve the pendulum problem. Note that the sparsity patterns
of the linear and quadratic terms of the QP are printed out at the beginning of
the execution. JG (w) is a block sparse matrix with blocks being either identity
matrices I or partial derivatives Ak = ∂Φ/∂x (xk , uk ) and Bk = ∂Φ/∂u (xk , uk ).
Initialize the Gauss-Newton procedure at w = 0, and stop the iterations when
|wk+1 − wk | gets smaller than 10⁻⁴ . Plot the iterates as well as the vector G
during the iterations. How many iterations do you need?

Figure 8.11 (plot of u(t) versus t): Gauss-Newton iterations for a direct multiple-shooting
transcription of (8.65); u(t) after 0, 1, 2, and 6 Gauss-Newton iterations.
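
One Gauss-Newton step of the kind used for Figure 8.11 can be sketched as follows. The
sketch assumes that w, M, and G are available as CasADi symbolic expressions as in part (b)
and that w_max stacks the variable bounds; it is an illustration in Python, not the book's
script. Since (8.66) is a convex QP, it is solved here with IPOPT through nlpsol for
simplicity; casadi.qpsol with, e.g., the 'qpoases' plugin can be used in the same way.

import casadi as ca
import numpy as np

def make_sqpstep(w, M, G, w_max):
    # Gauss-Newton SQP step for: min 0.5|M(w)|^2  s.t.  G(w) = 0, -w_max <= w <= w_max
    JM = ca.jacobian(M, w)
    JG = ca.jacobian(G, w)
    dw = ca.SX.sym('dw', w.shape[0])
    # QP (8.66) in the step dw, parameterized by the current iterate w_bar
    f_qp = 0.5 * ca.mtimes([dw.T, JM.T, JM, dw]) + ca.mtimes([M.T, JM, dw])
    g_qp = G + ca.mtimes(JG, dw)
    solver = ca.nlpsol('qp_step', 'ipopt', {'x': dw, 'p': w, 'f': f_qp, 'g': g_qp})

    def sqpstep(w_bar):
        sol = solver(p=w_bar, lbg=0, ubg=0,
                     lbx=-w_max - w_bar, ubx=w_max - w_bar)
        return w_bar + np.array(sol['x']).flatten()

    return sqpstep

Iterating w̄ ← sqpstep(w̄) from w̄ = 0 until |w̄k+1 − w̄k | drops below 10⁻⁴ reproduces the
procedure requested in part (c).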

Exercise 8.10: Real-time iterations and nonlinear MPC


We return to the OCP from Exercise 8.9
    minimize_{x0 ,u0 ,x1 ,...,uN−1 ,xN}    (1/2) Σ_{k=0}^{N−1} |uk |₂²

    subject to    x̄0 − x0 = 0
                  Φ(xk , uk ) − xk+1 = 0,    k = 0, . . . , N − 1
                  xN = 0
                  −xmax ≤ xk ≤ xmax ,        k = 0, . . . , N − 1
                  −umax ≤ uk ≤ umax ,        k = 0, . . . , N − 1

In this problem, we regard x̄0 as a parameter and modify the simultaneous Gauss-
Newton algorithm from Exercise 8.9. In particular, we modify this algorithm to per-
form real-time iterations for different values of x̄0 , so that we can use the algorithm
to perform closed-loop nonlinear MPC simulations for stabilization of the nonlinear
pendulum.
(a) Modify the function sqpstep from the solution of Exercise 8.9 so that it accepts
the parameter x̄0 . You would need to update the upper and lower bounds on w
accordingly. Test it and make sure that it works.

(b) In order to visualize the generalized tangential predictor, call the sqpstep
method with different values for x̄0 while resetting the variable vector w̄ to its
initial value (zero) between each call. Use a linear interpolation for x̄0 with 100
points between zero and the value (10, 0)′ , i.e., set x̄0 = λ[10 0]′ for λ ∈ [0, 1].
Plot the first control u0 as a function of λ and keep your plot.

(c) To compute the exact solution manifold with relatively high accuracy, perform
now the same procedure for the same 100 increasing values of λ, but this time
perform for each value of λ multiple Gauss-Newton iterations, i.e., replace each
call to sqpstep with, e.g., 10 calls without changing x̄0 . Plot the obtained values
for u0 and compare with the tangential predictor from the previous task by
plotting them in the same plot.

(d) In order to see how the real-time iterations work in a more realistic setting, let
the values of λ jump faster from 0 to 1, e.g., by doing only 10 steps, and plot the
result again into the same plot.

(e) Modify the previous algorithm as follows: after each change of λ by 0.1, keep it
constant for nine iterations, before you do the next jump. This results in about
100 consecutive real-time iterations. Interpret what you see.

(f) Now we do the first closed-loop simulation: set the value of x̄0^[1] to [10 0]′ and ini-
    tialize w^[0] at zero, and perform the first real-time iteration by calling sqpstep.
    This iteration yields the new solution guess w^[1] and the corresponding control u0^[1] .
    Use this control at the “real plant,” i.e., generate the next value of x̄0 , which we
    denote x̄0^[2] , by calling the one-step simulation function, x̄0^[2] := Φ(x̄0^[1] , u0^[1] ).
    Close the loop by calling sqpstep using w^[1] and x̄0^[2] , etc., and perform 100
    iterations. For better observation, plot after each real-time iteration the control
    and state variables on the whole prediction horizon. (It is interesting to note
    that the state trajectory is not necessarily feasible.) A minimal sketch of this
    closed-loop structure is given at the end of this exercise.
    Also observe what happens with the states x̄0 during the scenario, and plot
    them in another plot against the time index. Do they converge, and if yes, to
    what value?

(g) Now we make the control problem more difficult by treating the pendulum in an
    upright position, which is unstable. This is simply done by changing the sign in
    front of the sine in the differential equation, i.e., our model is now

        f (x(t), u(t)) = [ v(t) ; C sin(p(t)/C) ] + [ 0 ; 1 ] u(t)                (8.67)

    Start your real-time iterations again at w^[0] = 0 and set x̄0^[1] to [10 0]′ , and
    perform the same closed-loop simulation as before. Explain what happens.
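
The closed-loop structure of part (f) is summarized by the following minimal sketch. The
names sqpstep (your SQP-step function, modified in part (a) to accept x̄0) and Phi (the
one-step simulation function from Exercise 8.9) refer to your own implementations, and the
horizon length N = 50 is only a placeholder, so the snippet is a template rather than a
self-contained script.

import numpy as np

nx, nu, N = 2, 1, 50                 # dimensions and horizon length (placeholder)
x0bar = np.array([10.0, 0.0])        # current state of the "real plant"
w = np.zeros(nx + N * (nu + nx))     # w = (x0, u0, x1, ..., u_{N-1}, xN)

states = [x0bar]
for k in range(100):
    w = sqpstep(w, x0bar)            # one real-time iteration at the current state
    u0 = w[nx:nx + nu]               # first control of the predicted trajectory
    x0bar = np.array(Phi(x0bar, u0)).flatten()   # apply it to the plant
    states.append(x0bar)
# plot the predictions stored in w after each iteration and the sequence in states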

Exercise 8.11: CIA norm and MILP


One of the heuristics discussed in Section 8.10 for approximating the solution of mixed-
integer nonlinear optimal control problems is the combinatorial integral approximation

(CIA) (Sager et al., 2011). The CIA step solves the following optimization problem

    min_{ub}   max_{j ∈ I1:nb , k ∈ I0:N−1}   | Σ_{i=0}^{k} ( ub,j (i) − u∗b,j (i) ) |

in which ub is the discrete control sequence that approximates u∗b , the real-valued
solution of a nonlinear program in the heuristic. Additional constraints can be included
in this optimization such as rate-of-change constraints, dwell-time constraints, etc.
Consider the standard form of a mixed-integer linear program (MILP)

    min_{x,y}     c ′ x + d′ y
    subject to    Ax + Ey ≤ b
                  y ∈ Bs

with real x ∈ Rq and b ∈ Rr , and binary y ∈ Bs . State the CIA step in the standard
form of an MILP, i.e., give the MILP variables x, y, c, d, A, E, b, q, r , s for solving the CIA
step.
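
As an illustration of the target format (not part of the exercise, and with arbitrary
problem data), the following sketch solves a small MILP in exactly this standard form with
scipy.optimize.milp; once the CIA step has been written in this form, the same call applies
to its data x, y, c, d, A, E, b.

import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp

# tiny instance: q = 2 continuous variables x, s = 1 binary variable y, r = 2 rows
c = np.array([1.0, 2.0])
d = np.array([-3.0])
A = np.array([[1.0, 1.0], [1.0, -1.0]])
E = np.array([[2.0], [1.0]])
b = np.array([4.0, 2.0])

obj = np.concatenate([c, d])                              # objective over v = (x, y)
cons = LinearConstraint(np.hstack([A, E]), -np.inf, b)    # A x + E y <= b
integrality = np.array([0, 0, 1])                         # the last variable is integer
# bounds on x only keep this toy instance bounded; y in {0, 1} via bounds [0, 1]
bounds = Bounds([-5.0, -5.0, 0.0], [5.0, 5.0, 1.0])
res = milp(obj, constraints=cons, integrality=integrality, bounds=bounds)
print(res.x, res.fun)
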
Bibliography

J. Albersmeyer. Adjoint-based algorithms and numerical methods for sensitivity


generation and optimization of large scale dynamic systems. PhD thesis,
University of Heidelberg, 2010.

J. Albersmeyer and M. Diehl. The lifted Newton method and its application in
optimization. SIAM J. Optim., 20(3):1655–1684, 2010.

J. Andersson. A General-Purpose Software Framework for Dynamic Optimiza-


tion. PhD thesis, KU Leuven, October 2013.

J. Andersson, J. Akesson, and M. Diehl. CasADi – a symbolic package for auto-


matic differentiation and optimal control. In Recent Advances in Algorith-
mic Differentiation, volume 87 of Lecture Notes in Computational Science
and Engineering, pages 297–307. Springer, 2012.

U. M. Ascher and L. R. Petzold. Computer Methods for Ordinary Differential


Equations and Differential–Algebraic Equations. SIAM, Philadelphia, 1998.

D. Axehill. Controlling the level of sparsity in MPC. Sys. Cont. Let., 76:1–7,
2015.

D. Axehill and M. Morari. An alternative use of the Riccati recursion for efficient
optimization. Sys. Cont. Let., 61(1):37–40, 2012.

I. Bauer, H. G. Bock, S. Körkel, and J. P. Schlöder. Numerical methods for


optimum experimental design in DAE systems. J. Comput. Appl. Math., 120
(1-2):1–15, 2000.

A. Ben-Tal and A. Nemirovski. Lectures on Modern Convex Optimization: Anal-


ysis, Algorithms, and Engineering Applications. MPS-SIAM Series on Opti-
mization. MPS-SIAM, Philadelphia, 2001.

D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, MA, sec-


ond edition, 1999.

J. T. Betts. Practical Methods for Optimal Control Using Nonlinear Program-


ming. SIAM, Philadelphia, 2001.

L. T. Biegler. Nonlinear Programming. MOS-SIAM Series on Optimization. SIAM,


2010.


T. Binder, L. Blank, H. G. Bock, R. Bulirsch, W. Dahmen, M. Diehl, T. Kronseder,


W. Marquardt, J. P. Schlöder, and O. V. Stryk. Introduction to model based
optimization of chemical processes on moving horizons. Online Optimiza-
tion of Large Scale Systems: State of the Art, Springer, pages 295–340, 2001.

H. G. Bock. Numerical treatment of inverse problems in chemical reaction ki-


netics. In K. H. Ebert, P. Deuflhard, and W. Jäger, editors, Modelling of Chem-
ical Reaction Systems, volume 18 of Springer Series in Chemical Physics,
pages 102–125. Springer, Heidelberg, 1981.

H. G. Bock. Recent advances in parameter identification techniques for ODE. In


Numerical Treatment of Inverse Problems in Differential and Integral Equa-
tions, pages 95–121. Birkhäuser, 1983.

H. G. Bock and K. J. Plitt. A multiple shooting algorithm for direct solution of


optimal control problems. In Proceedings of the IFAC World Congress, pages
242–247. Pergamon Press, 1984.

S. P. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University


Press, 2004.

K. E. Brenan, S. L. Campbell, and L. R. Petzold. Numerical solution of initial-value


problems in differential-algebraic equations. Classics in Applied Mathemat-
ics 14. SIAM, Philadelphia, 1996.

A. E. Bryson and Y. Ho. Applied Optimal Control. Hemisphere Publishing, New


York, 1975.

A. Bürger, C. Zeile, M. Hahn, A. Altmann-Dieses, S. Sager, and M. Diehl.
pycombina: An open-source tool for solving combinatorial approximation
problems arising in mixed-integer optimal control. In Proceedings of the
IFAC World Congress, 2020.

A. R. Curtis, M. J. D. Powell, and J. K. Reid. On the estimation of sparse Jacobian


matrices. J. Inst. Math. Appl., 13:117–119, 1974.

P. Deuflhard. Newton methods for nonlinear problems: affine invariance and


adaptive algorithms, volume 35. Springer, 2011.

M. Diehl. Real-Time Optimization for Large Scale Nonlinear Processes. PhD


thesis, Universität Heidelberg, 2001.

M. Diehl, H. G. Bock, J. P. Schlöder, R. Findeisen, Z. Nagy, and F. Allgöwer.


Real-time optimization and nonlinear model predictive control of processes
governed by differential-algebraic equations. J. Proc. Cont., 12(4):577–585,
2002.

M. Diehl, H. J. Ferreau, and N. Haverbeke. Efficient numerical methods for


nonlinear MPC and moving horizon estimation. In L. Magni, M. D. Raimondo,
and F. Allgöwer, editors, Nonlinear model predictive control, volume 384 of
Lecture Notes in Control and Information Sciences, pages 391–417. Springer,
2009.

A. Domahidi. Methods and Tools for Embedded Optimization and Control. PhD
thesis, ETH Zürich, 2013.

H. J. Ferreau, C. Kirches, A. Potschka, H. G. Bock, and M. Diehl. qpOASES: a


parametric active-set algorithm for quadratic programming. Mathematical
Programming Computation, 6(4):327–363, 2014.

R. Franke. Integrierte dynamische Modellierung und Optimierung von Syste-


men mit saisonaler Wärmespeicherung. PhD thesis, Technische Universität
Ilmenau, Germany, 1998.

J. V. Frasch, S. Sager, and M. Diehl. A parallel quadratic programming method


for dynamic optimization problems. Mathematical Programming Computa-
tions, 7(3):289–329, 2015.

G. Frison. Algorithms and Methods for High-Performance Model Predictive Con-


trol. PhD thesis, Technical University of Denmark (DTU), 2015.

A. H. Gebremedhin, F. Manne, and A. Pothen. What color is your Jacobian?


Graph coloring for computing derivatives. SIAM Review, 47:629–705, 2005.

M. Gerdts. Optimal Control of ODEs and DAEs. Berlin, Boston: De Gruyter,


2011.

P. Gill, W. Murray, and M. Saunders. SNOPT: An SQP algorithm for large-scale


constrained optimization. SIAM Review, 47(1):99–131, 2005.

J. Gillis. Practical methods for approximate robust periodic optimal control of


nonlinear mechanical systems. PhD thesis, KU Leuven, 2015.

A. Griewank and A. Walther. Evaluating Derivatives. SIAM, 2 edition, 2008.

S. Gros and M. Diehl. Numerical Optimal Control. 2020. (In preparation).

J. Guddat, F. G. Vasquez, and H. T. Jongen. Parametric Optimization: Singular-


ities, Pathfollowing and Jumps. Teubner, Stuttgart, 1990.

E. Hairer, S. P. Nørsett, and G. Wanner. Solving Ordinary Differential Equa-


tions I. Springer Series in Computational Mathematics. Springer, Berlin, 2nd
edition, 1993.

E. Hairer, S. P. Nørsett, and G. Wanner. Solving Ordinary Differential Equations


II – Stiff and Differential-Algebraic Problems. Springer Series in Computa-
tional Mathematics. Springer, Berlin, 2nd edition, 1996.

A. C. Hindmarsh, P. N. Brown, K. E. Grant, S. L. Lee, R. Serban, D. E. Shumaker,


and C. S. Woodward. SUNDIALS: Suite of nonlinear and differential/algebraic
equation solvers. ACM Trans. Math. Soft., 31:363–396, 2005.

B. Houska, H. J. Ferreau, and M. Diehl. ACADO toolkit – an open source frame-


work for automatic control and dynamic optimization. Optimal Cont. Appl.
Meth., 32(3):298–312, 2011.

D. H. Jacobson and D. Q. Mayne. Differential dynamic programming, volume 24


of Modern Analytic and Computational Methods in Science and Mathematics.
American Elsevier Pub. Co., 1970.

M. R. Kristensen, J. B. Jørgensen, P. G. Thomsen, and S. B. Jørgensen. An ESDIRK


method with sensitivity analysis capabilities. Comput. Chem. Eng., 28:2695–
2707, 2004.

D. B. Leineweber, I. Bauer, A. A. S. Schäfer, H. G. Bock, and J. P. Schlöder.


An efficient multiple shooting based reduced SQP strategy for large-scale
dynamic process optimization (Parts I and II). Comput. Chem. Eng., 27:157–
174, 2003.

W. Li and E. Todorov. Iterative linear quadratic regulator design for nonlin-


ear biological movement systems. In Proceedings of the 1st International
Conference on Informatics in Control, Automation and Robotics, 2004.

W. C. Li and L. T. Biegler. Multistep, Newton-type control strategies for con-


strained nonlinear processes. Chem. Eng. Res. Des., 67:562–577, 1989.

D. Q. Mayne. A second-order gradient method for determining optimal trajec-


tories of non-linear discrete-time systems. Int. J. Control, 3(1):85–96, 1966.

Y. Nesterov. Introductory lectures on convex optimization: a basic course, vol-


ume 87 of Applied Optimization. Kluwer Academic Publishers, 2004.

J. Nocedal and S. J. Wright. Numerical Optimization. Springer, New York,


second edition, 2006.

T. Ohtsuka. A continuation/GMRES method for fast computation of nonlinear


receding horizon control. Automatica, 40(4):563–574, 2004.

G. Pannocchia, J. B. Rawlings, D. Q. Mayne, and G. Mancuso. Whither discrete


time model predictive control? IEEE Trans. Auto. Cont., 60(1):246–252, Jan-
uary 2015.

L. R. Petzold, S. Li, Y. Cao, and R. Serban. Sensitivity analysis of differential-


algebraic equations and partial differential equations. Comput. Chem. Eng.,
30:1553–1559, 2006.

R. Quirynen. Numerical Simulation Methods for Embedded Optimization. PhD


thesis, KU Leuven and University of Freiburg, 2017.

R. Quirynen, S. Gros, B. Houska, and M. Diehl. Lifted collocation integrators


for direct optimal control in ACADO toolkit. Math. Prog. Comp., pages 1–45,
2017a.

R. Quirynen, B. Houska, and M. Diehl. Efficient symmetric Hessian propagation


for direct optimal control. J. Proc. Cont., 50:19–28, 2017b.

S. M. Robinson. Strongly regular generalized equations. Mathematics of Oper-
ations Research, 5(1):43–62, 1980.

S. Sager, M. Jung, and C. Kirches. Combinatorial integral approximation. Math.


Method. Oper. Res., 73(3):363, 2011.

S. Sager, H. G. Bock, and M. Diehl. The integer approximation error in mixed-


integer optimal control. Math. Prog., 133:1–23, 2012.

A. Sideris and J. Bobrow. An efficient sequential linear quadratic algorithm for


solving unconstrained nonlinear optimal control problems. IEEE Transac-
tions on Automatic Control, 50(12):2043–2047, 2005.

M. J. Tenny, S. J. Wright, and J. B. Rawlings. Nonlinear model predictive control


via feasibility-perturbed sequential quadratic programming. Comp. Optim.
Appl., 28:87–121, 2004.

Q. Tran-Dinh, C. Savorgnan, and M. Diehl. Adjoint-based predictor-corrector


sequential convex programming for parametric nonlinear optimization.
SIAM J. Optim., 22(4):1258–1284, 2012.

C. F. Van Loan. Computing integrals involving the matrix exponential. IEEE


Trans. Automat. Control, 23(3):395–404, 1978.

A. Wächter and L. T. Biegler. On the implementation of an interior-point filter


line-search algorithm for large-scale nonlinear programming. Math. Prog.,
106(1):25–57, 2006.

A. Zanelli, A. Domahidi, J. Jerez, and M. Morari. FORCES NLP: An efficient im-


plementation of interior-point methods for multistage nonlinear nonconvex
programs. Int. J. Control, 2017.

C. Zeile, N. Robuschi, and S. Sager. Mixed-integer optimal control under mini-


mum dwell time constraints. Math. Prog., pages 1–42, 2020.
Author Index

Afonso, P. A. F. N. A., 306 Bartle, R. G., 704


Agarwal, M., 303 Başar, T., 382, 426
Aguilera, R. P., 171 Bates, C. N., 148, 150, 151, 170, 211,
Aguirre, L. A., 311 258, 314–316, 696
Akesson, J., 580 Battistelli, G., 319
Alamo, T., 142, 145–147, 169, 259, Bauer, I., 580
260, 359 Bayes, T., 674
Albersmeyer, J., 537, 580, 581 Bellman, R. E., 14, 729
Alessandri, A., 319 Bemporad, A., 171, 476, 477
Alessio, A., 477 Ben-Tal, A., 579
Allan, D. A., 148, 150, 151, 169, 170, Bequette, B. W., 303
211, 258, 273, 275–277, 279, Bernstein, D. S., 311
283, 287, 297, 298, 314–316, Berntorp, K., 302
320, 324, 696, 722 Bertram, J. E., 701
Allgöwer, F., 168–170, 256, 258, 260, Bertsekas, D. P., 14, 25, 258, 292,
261, 347, 358, 573 358, 380, 417, 426, 579, 729,
Allwright, J. C., 259 750, 753, 755
Altmann-Dieses, A., 577 Betts, J. T., 579
Alvarado, I., 169, 359 Bicchi, A., 171
Amrit, R., 156, 157, 170, 184 Biegler, L. T., 170, 243, 257, 528, 554,
Anderson, B. D. O., 319 573, 579
Anderson, T. W., 662 Binder, T., 579
Andersson, J., 580 Blanchini, F., 109, 259, 339
Angeli, D., 156, 170, 260 Blank, L., 579
Apostol, T. M., 638 Bobrow, J., 564
Artstein, Z., 359 Bock, H. G., 259, 512, 526, 549, 556,
Ascher, U. M., 580 573, 577, 579, 580
Åström, K. J., 318 Bordons, C., 1
Athans, M., 427 Borrelli, F., 476, 477
Aubin, J. P., 258, 358 Boyd, S. P., 579, 624, 767–769
Axehill, D., 561, 562 Braatz, R., 259
Bravo, J. M., 259
Baglietto, M., 319 Brenan, K. E., 580
Balakrishnan, V., 258 Broustail, J. P., 167
Bank, B., 477 Brown, P. N., 580
Baotic, M., 171 Bruyninckx, H., 305
Bard, Y., 682 Bryson, A. E., 366, 579
Barić, M., 477 Bulirsch, R., 579


Bürger, A., 577 de Souza, C. E., 326


Desoer, C. A., 291, 677
Cai, C., 169, 313, 722 Deuflhard, P., 512, 579
Calafiore, G. C., 253, 254 Deyst, J., 290, 319
Callier, F. M., 291, 677 Di Cairano, S., 171
Camacho, E. F., 1, 145–147, 169, 258, Diehl, M., 157, 170, 184, 259, 320,
259, 359 537, 545, 549, 552, 573, 577,
Campbell, S. L., 580 579–581
Campi, M. C., 254 Dieudonne, J., 638
Cannon, M., 256, 259–261 Domahidi, A., 580
Cao, Y., 580 Doucet, A., 305
Casavola, A., 260 Dreyfus, S. E., 14
Castro, J. A. A. M., 306 Dua, V., 477
Chatterjee, D., 248, 249, 260, 261 Dunbar, W. B., 427
Chaves, M., 303 Durand, H., 170
Chen, C. C., 168 Durrant-Whyte, H. F., 305
Chen, H., 168, 170, 258
Cheng, Q., 260, 261 Eaton, J. W., 104
Chisci, L., 234, 250, 256, 259, 260 Ellis, M., 170
Chmielewski, D., 168, 169
Christofides, P. D., 170 Fagiano, L., 169
Christophersen, F. J., 171 Falugi, P., 142, 169, 249, 259, 261,
Chryssochoos, I., 259, 260 359
Clarke, D. W., 167 Feller, C., 476
Clarke, F., 761, 762 Ferramosca, A., 359
Coddington, E. A., 649 Ferreau, H. J., 579, 580
Columbano, S., 476 Filippova, T. F., 358
Copp, D. A., 320, 359 Findeisen, R., 259, 347, 358, 573
Cui, H., 427 Fletcher, R., 66
Curtis, A. R., 516, 527 Folkman, J., 477
Cutler, C. R., 167 Foss, B. A., 311, 358
Francis, B. A., 50
Dabbene, F., 253, 254, 256, 260, 261 Franke, R., 580
Dahmen, W., 579 Franz̀e, G., 260
Damborg, M. J., 168 Frasch, J. V., 580
Dantzig, G. B., 477 Frison, G., 560, 561, 580
David, H. A., 686 Fukudu, K., 476
Davison, E. J., 50
De Doná, J. A., 1, 476, 477 Gal, T., 477
de Freitas, N., 305 Garcı́a, C. E., 167
De Keyser, R. M. C., 167 Gass, S. I., 477
de la Peña, D. M., 142, 169, 259, 260 Gebremedhin, A. H., 516
De Nicolao, G., 168, 169, 257, 258 Gerdts, M., 579
De Schutter, J., 305 Gevers, M. R., 326

Gilbert, E. G., 168, 172, 173, 224, 225, Hu, W., 320
259, 280, 339 Huang, R., 170
Gill, P., 579
Gillette, R. D., 167 Imsland, L., 311, 358
Gillis, J., 580
Glover, J. D., 358 Jacobsen, E. W., 427
Golub, G. H., 368, 629, 630 Jacobson, D. H., 564
Goodwin, G. C., 1, 70, 171, 261, 326, Jadbabaie, A., 169
476, 477 Jazwinski, A. H., 38, 290, 318, 319
Goulart, P. J., 259, 260 Jerez, J., 580
Grant, K. E., 580 Ji, L., 278, 319, 320, 323, 696
Gray, M., 303 Jia, D., 427
Griewank, A., 527, 580 Jiang, Z.-P., 314, 321, 693, 699, 702,
Grimm, G., 169, 258, 722 703, 713, 717–719
Gros, S., 170, 579, 580 Johansen, T. A., 476
Grover, P., 302 Johnson, C. R., 23, 64, 273, 629, 681
Grüne, L., 169, 170 Jones, C. N., 142, 169, 476, 477
Guddat, J., 477, 545 Jongen, H. T., 545
Gudi, R., 303 Jørgensen, J. B., 580
Günther, S., 303 Jørgensen, S. B., 580
Julier, S. J., 304, 305, 311
Hager, W. W., 479 Jung, M., 577, 594
Hahn, J., 311
Hahn, M., 577 Kailath, T., 318, 319
Hairer, E., 580 Kalman, R. E., 21, 26, 701
Hale, J., 649 Kameswaran, S., 243
Harinath, E., 170 Kandepu, R., 311
Hartman, P., 649 Keerthi, S. S., 168, 172, 280
Hauser, J., 169 Kellett, C. M., 208, 693
Hautus, M. L. J., 24, 42 Kerrigan, E. C., 231, 259, 260, 339,
Haverbeke, N., 579 477
Heemels, W. P. M. H., 170, 171, 260, Khalil, H. K., 693, 695, 726
313, 314, 701, 726 Khurzhanski, A. B., 258, 358
Henson, M. A., 104 Kirches, C., 577, 580, 594
Herceg, M., 476 Klatte, D., 477
Hespanha, J. P., 320, 359 Kleinman, D. L., 167
Hicks, G. A., 243 Knüfer, S., 320
Hillar, C. J., 681 Kobayshi, K., 171
Hindmarsh, A. C., 580 Kolås, S., 311
Hiraishi, K., 171 Kolmanovsky, I., 224, 225, 259, 339
Ho, Y., 366, 579 Kolmogorov, A. N., 318
Hokayem, P., 260 Körkel, S., 580
Horn, R. A., 23, 64, 273, 629 Kostina, E., 259
Houska, B., 580 Kothare, M. V., 258

Kouramas, K. I., 231, 259, 339 Mayne, D. Q., 92, 142, 147, 167–170,
Kouvaritakis, B., 256, 258–261 231, 249, 250, 256–261, 300,
Krichman, M., 719 302, 319, 323, 339, 347, 358,
Kristensen, M. R., 580 359, 371, 462, 477, 538, 564,
Krogh, B. H., 427 565
Kronseder, T., 579 McShane, E. J., 649
Kummer, B., 477 Meadows, E. S., 104, 257, 319, 479
Kurzhanski, A. B., 358 Mehta, P. G., 302
Kvasnica, M., 476 Mesarović, M., 427
Kwakernaak, H., 50, 319 Messina, M. J., 169, 258, 722
Kwon, W. H., 1, 167 Meyn, S. P., 302
Miani, S., 109, 259
Langson, W., 250, 256, 259, 260 Michalska, H., 168, 260, 319, 358
Larsson, T., 427 Middlebrooks, S. A., 303
LaSalle, J. P., 693 Mohtadi, C., 167
Lazar, M., 170, 171, 260, 313, 314, Moitié, R., 358
701, 726 Moore, J. B., 319
Ledyaev, Y. S., 761, 762 Morari, M., 167, 171, 258, 476, 477,
Lee, E. B., 167 561, 580
Lee, J. H., 258, 300, 319 Morgenstern, O., 426
Lee, S. L., 580 Morshedi, A. M., 167
Lefebvre, T., 305 Mosca, E., 260
Leineweber, D. B., 580 Motee, N., 427
Levinson, N., 649 Müller, M. A., 169, 170, 320
Li, S., 580 Murray, R. M., 427
Li, W., 565 Murray, W., 579
Li, W. C., 573 Muske, K. R., 168, 172, 319
Limon, D., 142, 145–147, 169, 258,
259, 359 Nagy, Z., 259, 573
Lorenzen, M., 256, 260, 261 Narasimhan, S., 310, 311
Løvaas, C., 261 Nash, J., 426
Lunze, J., 427 Nedic, A., 729
Lygeros, J., 248, 249, 260, 261 Nedoma, J., 477
Negenborn, R. R., 428
Maciejowski, J. M., 1, 259, 260 Nemirovski, A., 579
Macko, D., 427 Nesterov, Y., 579
Maestre, J. M., 428 Nevistić, V., 169
Magni, L., 168, 169, 257, 258 Nijmeijer, H., 170, 313
Mancuso, G., 92, 538 Nocedal, J., 66, 514, 579, 624, 768
Manne, F., 516 Nørgaard, M., 305
Manousiouthakis, V., 168, 169 Nørsett, S. P., 580
Markus, L., 167
Marquardt, W., 579 Odelson, B. J., 51
Marquis, P., 167 Ohtsuka, T., 554

Olaru, S., 476 Rao, C. V., 70, 162, 167, 169, 257,
Olsder, G. J., 382, 426 300, 319, 323
Ozdaglar, A. E., 729 Rault, A., 167
Ravn, O., 305
Pancanti, S., 171 Rawlings, J. B., 51–53, 70, 92, 104,
Pannek, J., 169 129, 142, 147, 148, 150, 151,
Pannocchia, G., 52, 53, 92, 147, 170, 156, 157, 162, 167–172, 184,
258, 347, 352, 421, 538 211, 257, 258, 273, 275–279,
Panos, C., 259 283, 287, 297, 298, 300, 302,
Papon, J., 167 303, 314–316, 319, 320, 323,
Papoulis, A., 657, 658 347, 352, 371, 416, 418, 421,
Parisini, T., 168, 169 423, 427, 428, 479, 538, 566,
Pearson, A. E., 167 696, 707, 709, 722
Pereira, M., 142, 169 Ray, W. H., 243
Peressini, A. L., 769 Reble, M., 170, 258
Peterka, V., 167 Reid, J. K., 516, 527
Petzold, L. R., 580 Reif, K., 303
Picasso, B., 171 Rengaswamy, R., 310, 311
Pistikopoulos, E. N., 477 Rhodes, I. B., 258, 358
Plitt, K. J., 556 Rice, M. J., 258
Polak, E., 624, 757, 759, 760 Richalet, J., 167
Pothen, A., 516 Rippin, D. W. T., 303
Potschka, A., 580 Risbeck, M. J., 129, 142, 148, 150,
Poulsen, N. K., 305 151, 169–171, 211, 258, 279,
Powell, M. J. D., 516, 527 314–316, 696, 707, 709
Praly, L., 357 Robertson, D. G., 319
Prasad, V., 303 Robinson, S. M., 545
Prett, D. M., 167 Robuschi, N., 577
Price, C., 290, 319 Rockafellar, R. T., 212, 439, 641, 646,
Primbs, J. A., 169 739, 740, 746, 748–750, 757,
Propoi, A. I., 167 759
Romanenko, A., 306
Qiu, L., 50 Roset, B. J. P., 170, 313
Qu, C. C., 311 Rossiter, J. A., 1, 234, 250, 256, 258–
Quevedo, D. E., 171 260
Quincampoix, M., 358 Russo, L. P., 303
Quirynen, R., 580
Saaty, T. L., 477
Raimondo, D. M., 259 Safonov, M., 427
Rajamani, M. R., 51 Sager, S., 577, 580, 594
Raković, S. V., 231, 259–261, 339, Salas, F., 145–147
347, 358, 359, 462, 477 Sandell Jr., N. R., 427
Ralph, D., 259 Santos, L. O., 257, 306
Ramaker, B. L., 167 Saunders, M., 579

Savorgnan, C., 549, 573 Teixeira, B. O. S., 311


Sayyar-Rodsari, B., 427 Tempo, R., 253, 254, 256, 260, 261
Scattolini, R., 168, 169, 257, 258, 428 Tenny, M. J., 300, 566
Schäfer, A. A. S., 580 Testud, J. L., 167
Schei, T. S., 311 Thomas, Y. A., 167
Schley, M., 303 Thomsen, P. G., 580
Schlöder, J. P., 573, 580 Todorov, E., 565
Schur, I., 629 Tôrres, L. A. B., 311
Schweppe, F. C., 358 Tran-Dinh, Q., 549, 573
Scokaert, P. O. M., 147, 167–170, 257, Tsitsiklis, J. N., 380, 426
258, 371, 479 Tuffs, P. S., 167
Selby, S. M., 73 Tuna, S. E., 169, 258, 722
Sepulchre, R., 257
Serban, R., 580 Uhl, Jr., J. J., 769
Serón, M. M., 1, 259, 261, 476, 477 Uhlmann, J. K., 304, 305, 311
Shah, S., 303 Unbehauen, R., 303
Shapiro, N. Z., 477
Shaw, L., 168 Vachhani, P., 310, 311
Shein, W. W., 171 Valyi, I., 258, 358
Sherbert, D. R., 704 Van Cauwenberghe, A. R., 167
Shumaker, D. E., 580 van der Merwe, R., 305
Sideris, A., 564 Van Loan, C. F., 538
Šiljak, D. D., 427 van Wyk, E. J., 259
Sin, K. S., 70 Vandenberghe, L., 579, 624, 767–769
Sivan, R., 50, 319 Varaiya, P., 427
Skogestad, S., 427 Vasquez, F. G., 545
Smith, H. W., 50 Veliov, V. M., 358
Sontag, E. D., 23, 41, 275, 276, 303, Venkat, A. N., 421, 427, 428
321, 705–707, 713, 717, 719 Vidyasagar, M., 370
Stengel, R. F., 303, 319 von Neumann, J., 426
Stern, R. J., 761, 762
Stewart, B. T., 416, 418, 421, 423, 427 Wächter, A., 528, 554, 579
Strang, G., 23, 42, 625, 626 Walther, A., 527, 580
Stryk, O. V., 579 Wan, E., 305
Sullivan, F. E., 769 Wang, L., 1
Sznaier, M., 168 Wang, Y., 275, 276, 314, 321, 693,
699, 702, 703, 713, 717–719
Takahara, Y., 427 Wanner, G., 580
Tan, K. T., 168, 173 Wets, R. J.-B., 212, 646, 739, 740, 746,
Tanner, K., 477 748–750, 757, 759
Teel, A. R., 169, 204–209, 257, 258, Wiener, N., 318
275–277, 313, 314, 357, 693, Wilson, D. I., 303
701, 709, 710, 712, 714, 722, Wolenski, P. R., 761, 762
726 Wonham, W. M., 50

Woodward, C. S., 580


Wright, S. J., 66, 147, 170, 258, 366,
416, 418, 421, 423, 427, 428,
514, 566, 579, 624, 768
Wynn, A., 320

Xie, L., 320

Yang, T., 302


Yaz, E., 303
Ydstie, B. E., 167
You, K., 320
Yu, J., 169
Yu, S., 170, 258
Yu, Z., 258

Zanelli, A., 580


Zanon, M., 170
Zappa, G., 234, 250, 256, 259, 260
Zeile, C., 577
Zeilinger, M. N., 142, 169
Zoppoli, R., 168, 169
Citation Index

Aguilera and Quevedo (2013), 171, Bard (1974), 682, 691


186 Bartle and Sherbert (2000), 704, 727
Albersmeyer (2010), 580, 595 Başar and Olsder (1999), 382, 426,
Albersmeyer and Diehl (2010), 537, 442
581, 595 Bauer et al. (2000), 580, 595
Alessandri et al. (2008), 319, 327 Bayes (1763), 674, 691
Alessio and Bemporad (2009), 477, Bellman (1957), 14, 86, 729, 770
483 Bellman and Dreyfus (1962), 14, 86
Allan (2020), 279, 287, 320, 324, 327 Bemporad and Morari (1999), 171,
Allan and Rawlings (2018), 169, 186, 186
722, 727 Bemporad et al. (2002), 477, 483
Allan and Rawlings (2019), 277, 279, Ben-Tal and Nemirovski (2001), 579,
283, 327 595
Allan and Rawlings (2020), 273, 279, Berntorp and Grover (2018), 302,
287, 297, 298, 327 327
Allan et al. (2017), 148, 150, 151, Bertsekas (1987), 14, 25, 86, 292, 327
170, 186, 211, 258, 264, 314– Bertsekas (1999), 417, 442, 579, 595,
316, 327, 696, 727 750, 753, 755, 770
Allan et al. (2020), 275–277, 327 Bertsekas and Rhodes (1971), 358,
Amrit et al. (2011), 170, 186 361
Anderson (2003), 661, 691 Bertsekas and Rhodes (1971a), 258,
Anderson and Moore (1981), 319, 264
327 Bertsekas and Rhodes (1971b), 258,
Andersson (2013), 580, 595 264
Andersson et al. (2012), 580, 595 Bertsekas and Tsitsiklis (1997), 380,
Angeli et al. (2008), 260, 264 426, 442
Angeli et al. (2012), 156, 170, 186 Bertsekas et al. (2001), 729, 770
Apostol (1974), 638, 691 Betts (2001), 579, 595
Artstein and Raković (2008), 359, Biegler (2010), 579, 595
361 Binder et al. (2001), 579, 595
Ascher and Petzold (1998), 580, 595 Blanchini (1999), 259, 264, 339, 361
Åström (1970), 318, 327 Blanchini and Miani (2008), 109, 186,
Aubin (1991), 258, 264, 358, 361 259, 264
Axehill (2015), 562, 595 Bock (1981), 526, 596
Axehill and Morari (2012), 561, 595 Bock (1983), 512, 549, 596
Bock and Plitt (1984), 556, 596
Bank et al. (1983), 477, 483 Borrelli (2003), 477, 483
Baotic et al. (2006), 171, 186 Borrelli et al. (2017), 476, 483


Boyd and Vandenberghe (2004), 579, de Souza et al. (1986), 326, 328
596, 624, 691, 767–770 Deuflhard (2011), 512, 579, 596
Brenan et al. (1996), 580, 596 Deyst and Price (1968), 290, 319, 328
Bryson and Ho (1975), 366, 442, 579, Di Cairano et al. (2014), 171, 187
596 Diehl (2001), 545, 552, 596
Bürger et al. (2020), 577, 596 Diehl et al. (2002), 573, 596
Diehl et al. (2006), 259, 264
Cai and Teel (2008), 169, 186, 313, Diehl et al. (2009), 579, 596
327, 722, 727 Diehl et al. (2011), 157, 170, 184, 187
Calafiore and Campi (2006), 254, 264 Dieudonne (1960), 638, 691
Callier and Desoer (1991), 291, 327, Domahidi (2013), 580, 597
677, 691 Dunbar (2007), 427, 442
Camacho and Bordons (2004), 1, 86 Dunbar and Murray (2006), 427, 442
Chatterjee and Lygeros (2015), 248,
249, 260, 261, 264 Ellis et al. (2014), 170, 187
Chatterjee et al. (2011), 260, 264
Chaves and Sontag (2002), 303, 327
Fagiano and Teel (2012), 169, 187
Chen and Allgöwer (1998), 168, 186,
Falugi and Mayne (2011), 261, 265
258, 264
Falugi and Mayne (2013), 359, 361
Chen and Shaw (1982), 168, 186
Falugi and Mayne (2013a), 169, 187
Chisci et al. (2001), 234, 250, 256,
Falugi and Mayne (2013b), 142, 169,
259, 260, 264
187
Chmielewski and Manousiouthakis
Feller et al. (2013), 476, 483
(1996), 168, 169, 186
Ferramosca et al. (2009), 359, 361
Clarke et al. (1987), 167, 186
Clarke et al. (1998), 761, 762, 770 Ferreau et al. (2014), 580, 597
Coddington and Levinson (1955), Findeisen et al. (2003), 358, 361
649, 691 Fletcher (1987), 66, 86
Columbano et al. (2009), 476, 483 Francis and Wonham (1976), 50, 86
Copp and Hespanha (2014), 359, 361 Franke (1998), 580, 597
Copp and Hespanha (2017), 320, 328 Frasch et al. (2015), 580, 597
Cui and Jacobsen (2002), 427, 442 Frison (2015), 560, 561, 580, 597
Curtis et al. (1974), 516, 527, 596
Cutler and Ramaker (1980), 167, 187 Gal and Nedoma (1972), 477, 483
Garcı́a and Morshedi (1986), 167,
Dantzig et al. (1967), 477, 483 187
David (1981), 686, 691 Garcı́a et al. (1989), 167, 187
Davison and Smith (1971), 50, 86 Gass and Saaty (1955), 477, 483
Davison and Smith (1974), 50, 86 Gebremedhin et al. (2005), 516, 597
De Keyser and Van Cauwenberghe Gelb (1974), 303, 328
(1985), 167, 187 Gerdts (2011), 579, 597
De Nicolao et al. (1996), 168, 187, Gilbert and Tan (1991), 168, 173, 187
257, 264 Gill et al. (2005), 579, 597
De Nicolao et al. (1998), 169, 187 Gillis (2015), 580, 597

Glover and Schweppe (1971), 358, Jiang and Wang (2002), 693, 699,
361 702, 703, 713, 718, 727
Golub and Van Loan (1996), 368, 442, Johnson (1970), 681, 691
629, 630, 691 Johnson and Hillar (2002), 681, 691
Goodwin and Sin (1984), 70, 86 Jones (2017), 477, 483
Goodwin et al. (2005), 1, 86 Jones et al. (2007), 477, 484
Goulart et al. (2006), 259, 260, 265 Julier and Uhlmann (1997), 311, 328
Goulart et al. (2008), 259, 265 Julier and Uhlmann (2002), 305, 328
Griewank and Walther (2008), 527, Julier and Uhlmann (2004a), 303–
580, 597 305, 328
Grimm et al. (2005), 169, 188, 722, Julier and Uhlmann (2004b), 304,
727 329
Grimm et al. (2007), 258, 265 Julier et al. (2000), 305, 329
Gros and Diehl (2020), 579, 597
Grüne and Pannek (2017), 169, 188 Kailath (1974), 318, 319, 329
Guddat et al. (1990), 545, 597 Kalman (1960a), 26, 86
Gudi et al. (1994), 303, 328 Kalman (1960b), 21, 86
Kalman and Bertram (1960), 701,
Hager (1979), 479, 483 727
Hairer et al. (1993), 580, 597 Kameswaran and Biegler (2006), 243,
Hairer et al. (1996), 580, 597 265
Hale (1980), 649, 691 Kandepu et al. (2008), 311, 329
Hartman (1964), 649, 691 Keerthi and Gilbert (1985), 280, 329
Hautus (1972), 24, 42, 86 Keerthi and Gilbert (1987), 172, 188
Herceg et al. (2013), 476, 483 Keerthi and Gilbert (1988), 168, 188
Hicks and Ray (1971), 243, 265 Kellett and Teel (2004), 208, 265
Hindmarsh et al. (2005), 580, 598 Kellett and Teel (2004a), 693, 727
Horn and Johnson (1985), 23, 64, 86, Kellett and Teel (2004b), 693, 727
273, 328, 629, 691 Khalil (2002), 693, 695, 726, 727
Houska et al. (2011), 580, 598 Khurzhanski and Valyi (1997), 258,
Hu (2017), 320, 328 265, 358, 361
Hu et al. (2015), 320, 328 Kleinman (1970), 167, 188
Huang et al. (2011), 170, 188 Knüfer and Müller (2018), 320, 329
Kobayshi et al. (2014), 171, 188
Imsland et al. (2003), 358, 361 Kolås et al. (2009), 311, 329
Kolmanovsky and Gilbert (1995),
Jacobson and Mayne (1970), 564, 598 259, 265
Jadbabaie et al. (2001), 169, 188 Kolmanovsky and Gilbert (1998),
Jazwinski (1970), 38, 86, 290, 318, 224, 225, 265, 339, 361
319, 328 Kolmogorov (1941), 318, 329
Ji et al. (2016), 320, 328 Kothare et al. (1996), 258, 265
Jia and Krogh (2002), 427, 442 Kouramas et al. (2005), 259, 265
Jiang and Wang (2001), 314, 321, Kouvaritakis and Cannon (2016),
328, 693, 717–719, 727 256, 266

Kouvaritakis et al. (2010), 260, 261, Mayne (1966), 564, 565, 598
266 Mayne (1995), 258, 266
Krichman et al. (2001), 719, 727 Mayne (1997), 258, 267
Kristensen et al. (2004), 580, 598 Mayne (2000), 169, 189
Kurzhanski and Filippova (1993), Mayne (2013), 147, 169, 189
358, 361 Mayne (2016), 260, 261, 267
Kwakernaak and Sivan (1972), 50, 87, Mayne and Falugi (2016), 169, 189
319, 329 Mayne and Falugi (2019), 249, 267
Kwon (2005), 1, 87 Mayne and Langson (2001), 250, 256,
Kwon and Pearson (1977), 167, 188 259, 260, 267
Mayne and Michalska (1990), 168,
Langson et al. (2004), 259, 260, 266 189
Larsson and Skogestad (2000), 427, Mayne and Raković (2002), 477, 484
442 Mayne and Raković (2003), 462, 477,
LaSalle (1986), 693, 727 484
Lazar and Heemels (2009), 170, 188 Mayne et al. (2000), 167, 169, 189,
Lazar et al. (2008), 260, 266 257, 267
Lazar et al. (2009), 701, 726, 727 Mayne et al. (2005), 259, 267
Lazar et al. (2013), 314, 329 Mayne et al. (2006), 358, 362
Lee and Markus (1967), 167, 188 Mayne et al. (2007), 477, 484
Lee and Yu (1997), 258, 266 Mayne et al. (2009), 347, 358, 362
Lefebvre et al. (2002), 305, 329 Mayne et al. (2011), 259, 267
Leineweber et al. (2003), 580, 598 McShane (1944), 649, 691
Li and Biegler (1989), 573, 598 Meadows et al. (1993), 319, 329
Li and Todorov (2004), 565, 598 Meadows et al. (1995), 104, 189
Limon et al. (2002), 258, 266 Mesarović et al. (1970), 427, 442
Limon et al. (2006), 145–147, 188 Michalska and Mayne (1993), 168,
Limon et al. (2008), 169, 188, 259, 189, 260, 267
266, 359, 362 Michalska and Mayne (1995), 319,
Limon et al. (2010), 169, 189 329, 358, 362
Limon et al. (2012), 142, 169, 189 Middlebrooks and Rawlings (2006),
Lorenzen et al. (2016), 256, 260, 261, 303, 329
266 Moitié et al. (2002), 358, 362
Løvaas et al. (2008), 261, 266 Motee and Sayyar-Rodsari (2003),
Lunze (1992), 427, 442 427, 442
Müller (2017), 320, 330
Maciejowski (2002), 1, 87 Müller and Allgöwer (2014), 169, 189
Maestre and Negenborn (2014), 428, Müller and Grüne (2015), 170, 189
442 Müller et al. (2015), 170, 190
Magni and Sepulchre (1997), 257, Muske and Rawlings (1993), 172, 190
266 Muske et al. (1993), 319, 330
Magni et al. (2003), 258, 266
Marquis and Broustail (1988), 167, Nagy and Braatz (2004), 259, 267
189 Nash (1951), 426, 443

Nesterov (2004), 579, 598 Rao and Rawlings (1999), 70, 87, 162,
Nocedal and Wright (2006), 66, 87, 190
514, 579, 598, 624, 691, 768, Rao et al. (2001), 300, 330
770 Rao et al. (2003), 319, 323, 330
Nørgaard et al. (2000), 305, 330 Rawlings and Amrit (2009), 170, 190
Rawlings and Ji (2012), 278, 319,
Odelson et al. (2006), 51, 87 323, 330, 696, 728
Ohtsuka (2004), 554, 598 Rawlings and Mayne (2009), 300,
302, 330
Rawlings and Muske (1993), 168, 190
Pannocchia and Rawlings (2003), 52,
Rawlings and Risbeck (2015), 129,
53, 87, 347, 352, 362
191, 279, 330, 709, 728
Pannocchia et al. (2011), 147, 170,
Rawlings and Risbeck (2017), 142,
190, 258, 267
169–171, 191, 707, 728
Pannocchia et al. (2015), 92, 190,
Rawlings and Stewart (2008), 427,
538, 598
443
Papoulis (1984), 657, 658, 692
Reif and Unbehauen (1999), 303, 330
Parisini and Zoppoli (1995), 168,
Reif et al. (1999), 303, 330
169, 190
Reif et al. (2000), 303, 331
Peressini et al. (1988), 769, 770
Richalet et al. (1978a), 167, 191
Peterka (1984), 167, 190
Richalet et al. (1978b), 167, 191
Petzold et al. (2006), 580, 598
Robertson and Lee (2002), 319, 331
Picasso et al. (2003), 171, 190
Robinson (1980), 545, 599
Polak (1997), 624, 692, 757, 759,
Rockafellar (1970), 439, 443, 641,
760, 770
692
Prasad et al. (2002), 303, 330
Rockafellar and Wets (1998), 212,
Prett and Gillette (1980), 167, 190
268, 646, 692, 739, 740, 746,
Primbs and Nevistić (2000), 169, 190
748–750, 757, 759, 770
Propoi (1963), 167, 190
Romanenko and Castro (2004), 306,
331
Qiu and Davison (1993), 50, 87 Romanenko et al. (2004), 306, 331
Qu and Hahn (2009), 311, 330 Roset et al. (2008), 170, 191, 313, 331
Quevedo et al. (2004), 171, 190 Rossiter (2004), 1, 87
Quirynen (2017), 580, 599 Rossiter et al. (1998), 258, 268
Quirynen et al. (2017a), 580, 599
Quirynen et al. (2017b), 580, 599 Sager et al. (2011), 577, 594, 599
Sager et al. (2012), 577, 599
Raković (2012), 259, 267 Sandell Jr. et al. (1978), 427, 443
Raković et al. (2003), 259, 267 Santos and Biegler (1999), 257, 268
Raković et al. (2005), 339, 362 Scattolini (2009), 428, 443
Raković et al. (2005a), 231, 259, 267 Schur (1909), 629, 692
Raković et al. (2005b), 259, 268 Scokaert and Mayne (1998), 258, 268
Raković et al. (2012), 259, 268 Scokaert and Rawlings (1998), 168,
Rao (2000), 300, 319, 330 169, 191

Scokaert et al. (1997), 257, 268, 479, Wächter and Biegler (2006), 528, 554,
484 579, 599
Scokaert et al. (1999), 147, 170, 191, Wang (2009), 1, 87
371, 443 Wiener (1949), 318, 331
Selby (1973), 73, 87 Wilson et al. (1998), 303, 332
Serón et al. (2000), 476, 477, 484 Wright (1997), 366, 444
Sideris and Bobrow (2005), 564, 599
Šiljak (1991), 427, 443 Yang et al. (2013), 302, 332
Sontag (1998), 23, 41, 87, 321, 331 Ydstie (1984), 167, 191
Sontag (1998a), 706, 707, 713, 728 Yu et al. (2011), 258, 268
Sontag (1998b), 705, 728 Yu et al. (2014), 170, 191
Sontag and Wang (1995), 717, 728
Sontag and Wang (1997), 275, 276, Zanelli et al. (2017), 580, 599
321, 331, 719, 728 Zanon et al. (2013), 170, 191
Stengel (1994), 303, 319, 331 Zeile et al. (2020), 577, 599
Stewart et al. (2010), 421, 443
Stewart et al. (2011), 416, 418, 421,
423, 443
Strang (1980), 23, 42, 87, 625, 626,
692
Sznaier and Damborg (1987), 168,
191

Teel (2004), 204–209, 257, 268, 709,


710, 712, 714, 728
Teel and Praly (1994), 357, 362
Teixeira et al. (2008), 311, 331
Tempo et al. (2013), 253, 254, 268
Tenny and Rawlings (2002), 300, 331
Tenny et al. (2004), 566, 599
Thomas (1975), 167, 191
Tran-Dinh et al. (2012), 549, 573, 599

Vachhani et al. (2006), 310, 311, 331


van der Merwe et al. (2000), 305, 331
Van Loan (1978), 538, 599
Venkat (2006), 428, 443
Venkat et al. (2006a), 428, 443
Venkat et al. (2006b), 428, 443
Venkat et al. (2007), 427, 443
Vidyasagar (1993), 370, 444
von Neumann and Morgenstern
(1944), 426, 444
Subject Index

A-stable integration methods, 501 Back propagation algorithm, 524


Accumulation point, see Sequence Bar quantities, 518
Active Bayes’s theorem, 672, 674
constraint, 743 Bellman-Gronwall lemma, 651
set, 552, 743 BFGS, 550
AD, 514, 516, 561, 580 Bolzano-Weierstrass theorem, 111,
forward mode, 520 632
reverse mode, 522 Boundary-value problem, see BVP
Adaptive stepsize, 507 Bounded
Adjoint operator, 676 locally, 209, 693, 696, 710, 711
Bounded estimate error, 313
Admissible, 97
Broyden-Fletcher-Goldfarb-Shanno,
control, 464, 465, 469
see BFGS
control sequence, 108, 210, 732
Butcher tableau, 498
disconnected region, 161
BVP, 493, 582
disturbance sequence, 198, 204,
215, 227
C-set, 338
policy, 197, 198
Caratheodory conditions, 651
set, 137, 166, 168
CasADi, vii, xii, xiii, 527, 580, 585,
Affine, 446, 447, 488
586, 591
function, 447
Cayley-Hamilton theorem, 22, 64
hull, 450, 632
Central limit theorem, 657
invariance, 513
Centralized control, 363, 376
piecewise, 104, 448, 450, 452, 458,
Certainty equivalence, 194
462, 468, 763
Chain rule, 61, 638
set, 632
Cholesky factorization, 508, 561
Algebraic states, 506 CIA, 577, 594
Algorithmic (or automatic) differen- CLF, 131, 134, 714–716
tiation, see AD constrained, 716
Arrival cost, 33, 70, 79, 80 global, 714
full information, 296 Closed-loop control, 92
AS, 112, 423, 698 Code generation, 569
Asymptotically stable, see AS Collocation, 502
Attraction direct, 540
domain of, 700 methods, 502
global, 697 points, 502
region of, 113, 700 Combinatorial integral approxima-
Attractivity, 710 tion, see CIA


Combining MHE and MPC, 312 Control Lyapunov function, see CLF
stability, 314 Control vector parameterization,
Complementarity condition, 543 532
strict, 544 Controllability, 23
Concave function, 647 canonical form, 68
Condensing, 491, 560 duality with observability, 291
Cone matrix, 23
convex, 644 weak, 116
normal, 439, 737, 743, 746, 748 Controllable, 23
polar, 453, 455, 644, 740 Converse theorem
tangent, 737, 743, 746, 748 asymptotic stability, 705
Constrained Gauss-Newton method, exponential stability, 374, 725
549 Convex, 646
Constraint qualification, 479, 750, cone, 644
751 function, 488, 583, 646
Constraints, 6 hull, 641
active, 543, 743 optimization problem, 487, 741
coupled input, 405 optimality condition, 453
hard, 7, 94 set, 338, 583, 641
input, 6, 94 Cooperative control, 363, 386
integrality, 8 algorithm, 421
output, 6 distributed nonlinear, 419
polyhedral, 743 Correlation, 668
probabilistic, 254 Cost function, 11, 95, 369
soft, 7, 132
state, 6, 94 DAE, 505
terminal, 96, 144–147, 212 semiexplicit DAE of index one, 506
tightened, 202, 223, 230, 242, 346, Damping, 514
357 DARE, 25, 69, 136
trust region, 514 DDP, 564
uncoupled input, 402 exact Hessian, 565
Continuation methods, 571 Gauss-Newton Hessian, 565
Continuity, 633 Decentralized control, 363, 377
lower semicontinuous, 634 Decreasing, see Sequence
uniform, 634 Derivatives, 636
upper semicontinuous, 634 Detectability, 50, 120, 275, 319, 321,
Control law, 90, 200, 445 322, 719
continuity, 104 duality with stabilizability, 291
discontinuity, 104 exponential, 285
explicit, 446 Detectable, 26, 68, 72, 73, 325
implicit, 100, 210, 446 Determinant, 27, 628, 659, 666
offline, 89, 236 Deterministic problem, 91
online, 89 Difference equation, 5
time-invariant, 100 linear, 5

nonlinear, 93, 237 disturbance models, 408


uncertain systems, 203, 211 nonlinear, 414, 422
Difference inclusion, 203, 711 state estimation, 399
asymptotic stability, 150 target problem, 410
discontinuous systems, 206 zero offset, 411
uncertain systems, 203 Disturbances, 49
Differential algebraic equation, see additive, 193, 224, 228
DAE bounded, 336
Differential dynamic programming, integrating, 50
see DDP measurement, 269
Differential equation, 91 process, 269
Differential equations, 648 random, 198
Differential states, 506 stability, 712
Differentiation Dot quantities, 518
algorithmic, 516 DP, 14, 107, 195, 364, 367, 469, 729
numerical, 515 backward, 14, 18
symbolic, 514 forward, 14, 33, 296
Direct collocation, 540, 588 robust control, 214
Direct methods, 493 Dual dynamic system, 677
Direct multiple shooting, 534, 586, Duality
589 of linear estimation and regula-
Direct single shooting, 532, 586 tion, 290
Direct transcription methods, 538 strong, 184, 769
Directional derivatives, 639 weak, 184, 769
forward, 518 Dynamic programming, see DP
reverse, 518
Discrete actuators, 8, 160 Economic MPC, 153
Discrete algebraic Riccati equation, asymptotic average performance,
see DARE 155
Discretization, 531 asymptotic stability, 156
Dissipativity, see Economic MPC comparison with tracking MPC,
Distance 158
Hausdorff, set to set, 224, 339 dissipativity, 156
point to set, 207, 208, 224 strict dissipativity, 157, 160
Distributed EKF, 302–304
gradient algorithm, 417 END, 525
nonconvex optimization, 417 Epigraph, 647
nonlinear cooperative control, Equilibrium point, 694
419 Estimation, 26, 269, 349
stability, 422 convergence, 43
optimization, 426 distributed, 399
state estimation, 399 duality with regulation, 290
target problem, 410 full information, see FIE
Distributed MPC, 363 least squares, 33

linear optimal, 29 Gauss-Legendre methods, see GL


moving horizon, see MHE Gauss-Newton Hessian, 548, 589
stability, 288 Gaussian distribution, see Normal
Euler integration method, 494, 497 density
Expectation, 655 Gaussian elimination, 508
Explicit MPC, 445 Generalized Gauss-Newton method,
Exponential stability, see Stability 549
Extended Kalman filter, see EKF Generalized predictive control, see
External numerical differentiation, GPC
see END Generalized tangential predictors,
552, 570
Farkas’s lemma, 453 GES, 698
Feasibility GL, 505
recursive, 112, 132, 356 Global error, 496
Feasible set, 487 Global solutions, 741
Feedback control, 49, 195, 340 Globalization techniques, 514
Feedback MPC, 200 Globally asymptotically stable, see
Feedback particle filtering, 302 GAS
Feedforward control, 341 Globally exponentially stable, see
FIE, 269 GES
Final-state observability, see FSO GPC, 167
Finite horizon, 21, 89 Gramian
Floating point operation, see FLOP observability, 684
FLOP, 367, 508, 560, 561 reachability, 683
Forward mode, see AD
Fritz-John necessary conditions, 753 Hamilton-Jacobi-Bellman equation,
FSO, 294 see HJB
Full information estimation, see FIE Hausdorff metric, see Distance Haus-
Fundamental theorem of linear alge- dorff
bra, 23, 42, 625 Hautus lemma
existence, 23, 625 controllability, 24
uniqueness, 42, 625 detectability, 72, 437, 441
observability, 42
Game stabilizability, 68
M-player game, 412 Hessian approximations, 547
constrained two-player, 400 BFGS, 550
cooperative, 386 Gauss-Newton, 548
noncooperative, 378 secant condition, 550
theory, 426 update methods, 549
two-player nonconvex, 418 HJB, 493
unconstrained two-player, 374 Hurwitz matrix, 220, 706
GAS, 112, 408, 433, 698 Hyperplane, 472, 642, 643
Gauss divergence theorem, 61 support, 644
Gauss-Jacobi iteration, 380 Hyperstate, 194, 333, 334

i-IOSS, 275, 285, 321, 323, 722 extended, 306–311


Implicit integrators, 500 unscented, 304–311
Increasing, see Sequence KKT, 543, 755
Incrementally, uniformly matrix, 546
input/output-to-state-stable, strongly regular, 545
see i-UIOSS
IND, 526 L-stable integration methods, 505
Independent, see Random variable Lagrange basis polynomials, 503
Indirect methods, 493 Lagrange multipliers, 66, 67, 365,
Infinite horizon, 21, 89 369, 430
Initial-value embedding, 534 Laplace transform, 3
Initial-value problem, 495 LAR, 179, 475–476
Innovation, 194, 305, 334 LDLT-factorization, 508
Input-to-state-stability, see ISS plain banded, 560
Input/output-to-state-stability, see Least squares estimation, see Estima-
IOSS tion
Integral control, see Offset-free con- Leibniz formula, 61
trol Level set, 16, 137, 648
Interior point methods, see IP LICQ, 543, 755
Internal model principle, 49 Limit, see Sequence
Internal numerical differentiation, Line search, 417, 514
see IND Linear
Invariance MPC, 131–139, 488
control, 110 quadratic MPC, 11, 99, 461–470
positive, 110, 339, 694, 712 space, 624
robust control, 217 subspace, 624
robust positive, 212, 217, 313, system, 27, 131–139
339, 350 Linear absolute regulator, see LAR
sequential control, 125 Linear independence constraint
sequential positive, 125, 707 qualification, see LICQ
IOSS, 121, 322, 323, 721 Linear multistep methods, 580
IP, 552, 580 Linear optimal state estimation, see
IPOPT, 528, 554, 580 KF
ISS, 718 Linear program, see LP
i-UIOSS, 312, 325 Linear quadratic Gaussian, see LQG
Linear quadratic problems, see LQP
K functions, 112, 275, 285, 694 Linear quadratic regulator, see LQR
upper bounding, 709 Lipschitz continuous, 374, 406, 461,
K∞ functions, 112, 694 495, 637, 761, 766
KL functions, 112, 275, 285, 694 Local error, 496
Kalman filter, see KF Local solutions, 489, 741
Karush-Kuhn-Tucker conditions, see
KKT LP, 448, 451
KF, 26, 33, 43, 51, 78, 79, 334 parametric, 470

LQG, 194, 335 Minimum theorem, 760


LQP, 429, 430, 558 Minkowski set subtraction, see Set al-
condensing, 560 gebra
Riccati recursion, 558 MINLP, 575
LQR, 11, 24, 364, 429, 430, 565, 736 MIQP, 575
constrained, 461–470 Mixed continuous/discrete actua-
convergence, 24 tors, 162
DP solution for constrained, 469 Mixed-integer optimization, 161
infinite horizon, 21 Models, 1
unconstrained, 132 continuous time, 492
LU-factorization, 508 deterministic, 2, 9
Luenberger observer, 338 discrete time, 5, 486
Lyapunov equation, 137, 706 distributed, 4
Lyapunov function, 113, 701 disturbance, 49, 408
control, see CLF input-output, 3
global, 208 linear dynamic, 2
IOSS, 721 stochastic, 9
ISS, 314, 718 time-invariant, 2, 10
local, 239 time-varying, 2
OSS, 720 Monotonicity, 118, 435
Lyapunov stability, 370, 432 Monte Carlo optimization, 223
uniform, 371 Move blocking, 568
Lyapunov stability constraint, 404 Moving horizon estimation, see MHE
Lyapunov stability theorem, 113, MPCTools, vii, xii
700 Multipliers, 543
KL version, 703 Multistage optimization, 12

M-player game Nash equilibrium, 382–386


constrained, 412 Newton-Lagrange method, 546
MATLAB, 22, 64, 65, 68, 508, 528 Newton-Raphson method, 509
Mean value theorem, 638 Newton-type methods, 507, 510
Merit function, 514 local convergence, 511
MHE, 39, 292 Newton-type optimization with in-
as conditional density, 40 equalities, 550
as least squares, 40 NLP, 534, 542
combining with MPC, 312 Noise, 10
comparison with EKF and UKF, Gaussian, 287
306 measurement, 10, 26, 269
convergence, 296 process, 26, 269
existence, 293 Nominal stability, see Stability
nonzero prior weighting, 296 Nonconvex
zero prior weighting, 293 optimization problem, 487
MILP, 575, 594 Nonconvex optimization problem,
Min-max optimal control, 214 745

Nonconvexity, 166, 415 Optimality conditions, 543, 737


Noncooperative control, 363, 378 convex program, 453
Nonlinear KKT, 543
MPC, 139–144, 488 linear inequalities, 744
Nonlinear interior point methods, nonconvex problems, 752
see IP normal cone, 742
Nonlinear optimization, 542 parametric LP, 472
Nonlinear program, see NLP tangent cone, 743
Nonlinear root-finding problems, Ordinary differential equation, see
508 ODE
Norm, 631, 690, 696, 717 OSS, 321, 323, 719
Normal cone, see Cone Outer-bounding tube, see Tube
Normal density, 27, 656 Output MPC, 312–318, 333
conditional, 28, 674, 675 stability, 314, 345
degenerate, 661 Output-to-state-stability, see OSS
Fourier transform of, 658
linear transformation, 28, 75 Parameter, 97, 446
multivariate, 659 Parametric optimization, 97
singular, 661 Parametric programming, 97, 446
Normal distribution, see Normal den- computation, 476
sity continuity of V 0 (·) and u0 (·), 460
Nullspace, 53, 624 linear, 470, 472, 473
Numerical differentiation, 515 piecewise quadratic, 463
forward difference, 515 quadratic, 451, 456, 458
Numerical integration, 495 Partial condensing, 562
Numerical optimal control, 485 Partial separability, 556
Particle filtering, 302
Observability, 41, 293, 722 feedback, 302
canonical form, 72 Partitioned matrix inversion theo-
duality with controllability, 291 rem, 16, 65, 628
Gramian, 684 Peano’s existence theorem, 651
matrix, 42 Picard-Lindelöf theorem, 495
Observable, 41, 293 PID control, 49, 84
OCP, 490, 585, 586, 589, 592, 731 Pivoting, 508
continuous time, 492 Plantwide control, 363, 409, 418
discrete time, 486, 555 optimal, 376, 421
Octave, 22, 64, 65, 68, 528 subsystems, 364, 374, 414
ODE, 495–507, 528 Polyhedral, 446, 447, 450, 743, 761
Offset-free control, 48–59 Polytope, 203, 450, 461, 462, 464–
Offset-free MPC, 347 466, 468, 761, 765, 766
One-step integration methods, 497 Pontryagin set subtraction, see Set al-
Online optimization algorithms, 567 gebra
Open-loop control, 195 Positive definite, 121, 629, 695
Optimal control problem, see OCP Positive semidefinite, 121, 629

Principle of optimality, 734 Region of attraction, see Attraction


Probability Regularization, 206–209
conditional density, 27, 672 Regulation, 89, 350
density, 27, 654 combining with MHE, 312
distribution, 27, 654 duality with estimation, 290
marginal density, 27, 659 Relative gain array, see RGA
moments, 655 Reverse mode, see AD
multivariate density, 27, 659 RGA, 385
noninvertible transformations, RGAS, 207, 272, 710
666 convolution maximization form,
Projection, 97, 111, 447, 731, 756, 273
763, 765, 767 RGES, 285
Proportional-integral-derivative, see RHC, 108, 109, 135, 163, 217
PID control Riccati equation, 20, 68, 69, 71, 136,
Pseudo-inverse, 625 291, 369
Pseudospectral method, 541 Riccati recursion, 558
Python, 528 RK, 498
classical (RK4), 497
Q-convergence explicit, 496
q-linearly, 511 implicit, 501
q-quadratically, 511 Robust min-max MPC, 220
q-superlinearly, 511 Robust MPC, 193, 200
Q-function, 279, 281–283 min-max, 220
QP, 100, 364, 437, 449, 451, 547 tube-based, 223
parametric, 451 Robustly asymptotically stable, see
parametric piecewise, 463 RAS
Quadratic Robustly globally asymptotically sta-
piecewise, 104, 450, 452, 458, 463, ble, see RGAS
464, 468, 761 Robustly globally exponentially sta-
Quadratic program, see QP ble, see RGES
Quadrature state, 532 Robustness
inherent, 204
Radau IIA collocation methods, 505, nominal, 204, 709
540 of nominal MPC, 209
Random variable, 654 Runge-Kutta method, see RK
independent, 27
Range, 624 Scenario optimization, 254
RAS, 229, 313 Schur decomposition, 629
Reachability Gramian, 683 real, 401, 630
Real-time iterations, 573 Semicontinuity
Receding horizon control, see RHC inner, 757
Recursive feasibility, see Feasibility outer, 757
Recursive least squares, 38, 75 Sequence, 632
Reduced Hessian, 547 accumulation point, 632, 759

convergence, 632 asymptotic, 112, 423, 698


limit, 632, 679, 680, 759 constrained, 699
monotone, 632 exponential, 120, 698
nondecreasing, 44 global, 112, 698
nonincreasing, 25 global asymptotic, 112, 126, 408,
subsequence, 632 433, 698
Sequential optimal control, 491, 562 global asymptotic (KL version),
plain dense, 563 699
sparsity-exploiting, 563 global attractive, 112
Sequential quadratic programming, global exponential, 120, 698
see SQP inherent, 91
Set local, 112, 698
affine, 632 nominal, 91
algebra, 224 robust asymptotic, see RAS
boundary, 631 robust exponential, 235
bounded, 631 robust global asymptotic, see
closed, 631 RGAS
compact, 631 time-varying systems, 125
complement, 631 with disturbances, 712
interior, 631 Stabilizability, 68, 120
level, 16, 137 duality with detectability, 291
open, 631 Stabilizable, 26, 46, 68, 73, 136, 140,
quasiregular, 751 714
regular, 749 Stage cost, 18, 153
relative interior, 632 economic, 153
sublevel, 137 State estimation, see Estimation
Set-valued function, 99, 472, 755– Statistical independence, 668
757 Steady-state target, 48, 352
Setpoint distributed, 410
nonzero, 46, 349 Stiff equations, 500
Shift initialization, 571 Stochastic MPC, 193, 200, 246
Short horizon syndrome, 311 stabilizing conditions, 248
Sigma points, 305 tightened constraints, 253
Simultaneous optimal control, 490 tube-based, 250
Singular-value decomposition, see Storage function, 156
SVD Strong duality, see Duality
Space Subgradient, 640
linear, 624 convex function, 762
vector, 624 Sublevel set, 137, 648
Sparsity, 491 Suboptimal MPC, 147, 369
SQP, 551, 589 asymptotic stability, 151
feasibility perturbed, 566 distributed, 369
local convergence, 552 exponential stability, 372
Stability, 112 Subspace

linear, 624 unconstrained, 374


Supply rate, 156 uncoupled input constraints, 402
Support function, 648
SVD, 627 UKF, 304–306
System Uncertainty, 193
composite, 343, 345 parametric, 194
deterministic, 9, 196, 333 Uncontrollable, 22
discontinuous, 206 Unit ball, 631
linear, 2, 133, 224, 228, 338 Unscented Kalman filter, see UKF
noisy, 269
nominal, 238 Value function, 13, 92, 204, 240
nonlinear, 2, 93, 123, 139, 236 continuity, 104, 208, 759
periodic, 133 discontinuity, 104
time-invariant, 3, 5, 10, 93, 338 Lipschitz continuity, 760
time-varying, 2, 123, 141, 347, 437 Variable
uncertain, 193, 195, 196, 333, 334, controlled, 47
338 disturbance, 49
dual, 543
Tangent cone, see Cone input, 2
Taylor series, 3, 64 output, 2
Taylor’s theorem, 143 primal, 543
Terminal constraint, see Constraints random, 27, 654
Terminal region, 93 independent, 27, 654
Time to go, 108, 109, 196, 217, 469 state, 2
Trace, 73, 681 Vertex, 471
Tracking, 46
periodic target, 142 Warm start, 148, 183, 221, 370, 391,
Transfer function, 4, 6, 179, 383 404, 413, 433, 555
Trust region, 514 shift initialization, 571
Tube, 202, 335 Weak controllability, see Controlla-
bounding, 226 bility
outer-bounding, 224 Weak duality, see Duality
Tube-based robust MPC, 223 Weierstrass theorem, 97, 98, 294,
feedback controller, 228 372, 636
improved, 234
linear systems, 228 Z-transform, 5
model predictive controller, 238
nominal controller, 228
nominal trajectory, 238
nonlinear systems, 236
tightened constraints, 230, 242
Two-player game
constrained, 400
coupled input constraints, 405
Note: Appendices A, B, and C can be found at
www.chemengr.ucsb.edu/~jbraw/mpc
A
Mathematical Background

Version: April 25, 2022


Copyright © 2022 by Nob Hill Publishing, LLC

A.1 Introduction
In this appendix we give a brief review of some concepts that we need.
It is assumed that the reader has had at least a first course on lin-
ear systems and has some familiarity with linear algebra and analy-
sis. The appendices of Polak (1997); Nocedal and Wright (2006); Boyd
and Vandenberghe (2004) provide useful summaries of the results we
require. The material presented in Sections A.2–A.14 follows closely
Polak (1997) and earlier lecture notes of Professor Polak.

A.2 Vector Spaces


The Euclidean space Rn is an example of a vector space that satisfies a
set of axioms the most significant being: if x and z are two elements
of a vector space V , then αx + βz is also an element of V for all α,
β ∈ R. This definition presumes addition of two elements of V and
multiplication of any element of V by a scalar are defined. Similarly,
S ⊂ V is a linear subspace of V if any two elements x and z of S
satisfy αx + βz ∈ S for all α, β ∈ R. (All of the subspaces used in this
text are linear subspaces, so we often omit the adjective linear.) Thus,
in R3 , the origin, a line or a plane passing through the origin, the whole
set R3 , and even the empty set are all subspaces.

A.3 Range and Nullspace of Matrices


Suppose A ∈ Rm×n . Then R(A), the range of A, is the set {Ax | x ∈
Rn }; R(A) is a subspace of Rm and its dimension, i.e., the number
of linearly independent vectors that span R(A), is the rank of A. For
example, if A is the column vector [1 1]′ , then R(A) is the subspace
spanned by the vector [1 1]′ and the rank of A is 1. The nullspace N (A)
is the set of vectors in Rn that are mapped to zero by A so that N (A) =
{x | Ax = 0}. The nullspace N (A) is a subspace of Rn . For the
example above, N (A) is the subspace spanned by the vector [1 −1]′ . It
is an important fact that R(A′ ) ⊕ N (A) = Rn or, equivalently, that
N (A) = (R(A′ ))⊥ where A′ ∈ Rn×m is the transpose of A and S ⊥
denotes the orthogonal complement of any subspace S; a consequence
is that the sum of the dimensions of R(A) and N (A) is n. If A is square
and invertible, then n = m and the dimension of R(A) is n so that
the dimension of N (A) is 0, i.e., the nullspace contains only the zero
vector, N (A) = {0}.
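These facts are easy to check numerically. The following sketch (not part of the original text) uses NumPy on an arbitrary rank-deficient matrix; the numerical rank and a basis for the nullspace are read off the SVD.

```python
import numpy as np

# An arbitrary 3x5 matrix whose third row is the sum of the first two, so rank(B) = 2.
B = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
              [0.0, 1.0, 0.0, 1.0, 0.0],
              [1.0, 3.0, 3.0, 5.0, 5.0]])
U, s, Vt = np.linalg.svd(B)
r = int(np.sum(s > 1e-10))        # numerical rank = dim R(B)
N = Vt[r:].T                      # columns form an orthonormal basis for N(B)
print(r, N.shape)                 # 2 (5, 3): dim R(B') + dim N(B) = n = 5
assert np.allclose(B @ N, 0.0)
```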

A.4 Linear Equations — Existence and Uniqueness


Let A ∈ Rm×n be a real-valued matrix with m rows and n columns. We
are often interested in solving linear equations of the type

Ax = b

in which b ∈ Rm is given, and x ∈ Rn is the unknown. The fundamen-


tal theorem of linear algebra gives a complete characterization of the
existence and uniqueness of solutions to Ax = b (Strang, 1980, pp.87–
88). Every matrix A decomposes the spaces Rn and Rm into the four
fundamental subspaces depicted in Figure A.1. A solution to Ax = b
exists for every b if and only if the rows of A are linearly independent. A
solution to Ax = b is unique if and only if the columns of A are linearly
independent.

A.5 Pseudo-Inverse
The solution of Ax = y when A is invertible is x = A−1 y where A−1 is
the inverse of A. Often an approximate inverse of y = Ax is required
when A is not invertible. This is yielded by the pseudo-inverse A† of A;
if A ∈ Rm×n , then A† ∈ Rn×m . The properties of the pseudo-inverse
are illustrated in Figure A.2 for the case when A ∈ R2×2 where both
R(A) and N (A) have dimension 1. Suppose we require a solution to
the equation Ax = y. Since every x ∈ R2 is mapped into R(A), we
see that a solution may only be obtained if y ∈ R(A). Suppose this is
not the case, as in Figure A.2. Then the closest point, in the Euclidean
sense, to y in R(A) is the point y ∗ which is the orthogonal projection
Figure A.1: The four fundamental subspaces of matrix A (after
Strang, 1980, p.88). The dimension of the range of
A and A′ is r , the rank of matrix A. The nullspace of A
and range of A′ are orthogonal as are the nullspace of A′
and range of A. Solutions to Ax = b exist for all b if and
only if m = r (rows independent). A solution to Ax = b
is unique if and only if n = r (columns independent).

of y onto R(A), i.e., y − y ∗ is orthogonal to R(A). Since y ∗ ∈ R(A),


there exists a point in R2 that A maps into y ∗ . Now A maps any point
of the form x + h where h ∈ N (A) into A(x + h) = Ax + Ah = Ax
so that there must exist a point x ∗ ∈ (N (A))⊥ = R(A′ ) such that
Ax ∗ = y ∗ , as shown in Figure A.2. All points of the form x = x ∗ + h
where h ∈ N (A) are also mapped into y ∗ ; x ∗ is the point of least
norm that satisfies Ax ∗ = y ∗ where y ∗ is that point in R(A) closest,
in the Euclidean sense, to y.
The pseudo-inverse A† of a matrix A ∈ Rm×n is a matrix in Rn×m
that maps every y ∈ Rm to that point x ∈ R(A′ ) of least Euclidean
norm that minimizes |y − Ax|2 . The operation of A† is illustrated in
Figure A.2: Matrix A maps into R(A).

Figure A.3. Hence AA† projects any point y ∈ Rm orthogonally onto


R(A), i.e., AA† y = y ∗ , and A† A projects any x ∈ Rn orthogonally
onto R(A′ ), i.e., A† Ax = x ∗ .

Figure A.3: Pseudo-inverse of A maps into R(A′ ).

If A ∈ Rm×n where m < n has maximal rank m, then AA′ ∈ Rm×m


is invertible and A† = A′ (AA′ )−1 ; in this case, R(A) = Rm and every
y ∈ Rm lies in R(A). Similarly, if n < m and A has maximal rank
n, then A′ A ∈ Rn×n is invertible and A† = (A′ A)−1 A′ ; in this case,
R(A′ ) = Rn and every x ∈ Rn lies in R(A′ ). More generally, if A ∈
Rm×n has rank r , then A has the singular-value decomposition A =
U ΣV ′ where U ∈ Rm×r and V ∈ Rn×r have orthonormal columns, i.e.,
U ′ U = Ir and V ′ V = Ir , and Σ = diag(σ1 , σ2 , . . . , σr ) ∈ Rr ×r where
σ1 ≥ σ2 ≥ · · · ≥ σr > 0. The pseudo-inverse of A is then

A† = V Σ−1 U ′
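As a rough numerical check of these formulas, the following sketch (not from the text) compares the full-row-rank expression for A† with NumPy's SVD-based pinv and verifies the projection properties:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))            # m < n with full row rank (almost surely)

Adag = A.T @ np.linalg.inv(A @ A.T)        # A'(AA')^{-1}
assert np.allclose(Adag, np.linalg.pinv(A))

y = rng.standard_normal(3)
x = Adag @ y                               # least-norm solution of Ax = y
assert np.allclose(A @ x, y)

P = Adag @ A                               # A+ A projects orthogonally onto R(A')
assert np.allclose(P, P.T) and np.allclose(P @ P, P)
```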

A.6 Partitioned Matrix Inversion Theorem

Let matrix Z be partitioned into

Z = [ B  C ]
    [ D  E ]

and assume Z −1 , B −1 and E −1 exist. Performing row elimination gives

Z −1 = [ B −1 + B −1 C(E − DB −1 C)−1 DB −1     −B −1 C(E − DB −1 C)−1 ]
       [ −(E − DB −1 C)−1 DB −1                 (E − DB −1 C)−1        ]

Note that this result is still valid if E is singular. Performing column
elimination gives

Z −1 = [ (B − CE −1 D)−1                  −(B − CE −1 D)−1 CE −1              ]
       [ −E −1 D(B − CE −1 D)−1           E −1 + E −1 D(B − CE −1 D)−1 CE −1  ]

Note that this result is still valid if B is singular. A host of other useful
control-related inversion formulas follow from these results. Equating
the (1,1) or (2,2) entries of Z −1 gives the identity

(A + BCD)−1 = A−1 − A−1 B(DA−1 B + C −1 )−1 DA−1

A useful special case of this result is

(I + X −1 )−1 = I − (I + X)−1

Equating the (1,2) or (2,1) entries of Z −1 gives the identity

(A + BCD)−1 BC = A−1 B(DA−1 B + C −1 )−1

Determinants. We require some results on determinants of partitioned
matrices when using normal distributions in the discussion of probability.
If E is nonsingular

det(Z) = det(E) det(B − CE −1 D)

If B is nonsingular

det(Z) = det(B) det(E − DB −1 C)
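A quick numerical verification of the block-inverse and determinant formulas (a sketch using NumPy; the block sizes and the diagonal shift used to keep the blocks invertible are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2 = 3, 2
B = rng.standard_normal((n1, n1)) + 3.0 * np.eye(n1)
C = rng.standard_normal((n1, n2))
D = rng.standard_normal((n2, n1))
E = rng.standard_normal((n2, n2)) + 3.0 * np.eye(n2)

Z = np.block([[B, C], [D, E]])
Zi = np.linalg.inv(Z)

# (1,1) block of Z^{-1} from the column-elimination form.
assert np.allclose(Zi[:n1, :n1], np.linalg.inv(B - C @ np.linalg.inv(E) @ D))
# det(Z) = det(E) det(B - C E^{-1} D)
assert np.isclose(np.linalg.det(Z),
                  np.linalg.det(E) * np.linalg.det(B - C @ np.linalg.inv(E) @ D))
```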



A.7 Quadratic Forms


Positive definite and positive semidefinite matrices show up often in
LQ problems. Here are some basic facts about them. In the following
Q is real and symmetric and R is real.
The matrix Q is positive definite (Q > 0), if

x ′ Qx > 0, ∀ nonzero x ∈ Rn

The matrix Q is positive semidefinite (Q ≥ 0), if

x ′ Qx ≥ 0, ∀x ∈ Rn

You should be able to prove the following facts.

1. Q > 0 if and only if λ > 0, λ ∈ eig(Q).

2. Q ≥ 0 if and only if λ ≥ 0, λ ∈ eig(Q).

3. Q ≥ 0 ⇒ R ′ QR ≥ 0 ∀R.

4. Q > 0 and R nonsingular ⇒ R ′ QR > 0.

5. Q > 0 and R full column rank ⇒ R ′ QR > 0.

6. Q1 > 0, Q2 ≥ 0 ⇒ Q = Q1 + Q2 > 0.

7. Q > 0 ⇒ z∗ Qz > 0 ∀ nonzero z ∈ Cn .

8. Given Q ≥ 0, x ′ Qx = 0 if and only if Qx = 0.
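Facts 1 and 2 give a simple numerical test for definiteness; a Cholesky factorization gives another. The following sketch with NumPy (not from the text) illustrates both on two arbitrary symmetric matrices:

```python
import numpy as np

def is_pos_def(Q):
    """Q > 0 iff all eigenvalues of the symmetric matrix Q are positive (fact 1)."""
    return bool(np.all(np.linalg.eigvalsh(Q) > 0.0))

Q1 = np.array([[2.0, -1.0], [-1.0, 2.0]])   # eigenvalues 1 and 3
Q2 = np.array([[1.0,  2.0], [ 2.0, 1.0]])   # eigenvalues -1 and 3
print(is_pos_def(Q1), is_pos_def(Q2))       # True False

np.linalg.cholesky(Q1)                       # succeeds exactly when the matrix is > 0
try:
    np.linalg.cholesky(Q2)
except np.linalg.LinAlgError:
    print("Q2 is not positive definite")
```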

You may want to use the Schur decomposition (Schur, 1909) of a matrix
in establishing some of these eigenvalue results. Golub and Van Loan
(1996, p.313) provide the following theorem

Theorem A.1 (Schur decomposition). If A ∈ Cn×n then there exists a


unitary Q ∈ Cn×n such that

Q∗ AQ = T

in which T is upper triangular.

Note that because T is upper triangular, its diagonal elements are


the eigenvalues of A. Even if A is a real matrix, T can be complex be-
cause the eigenvalues of a real matrix may come in complex conjugate
pairs. Recall a matrix Q is unitary if Q∗ Q = I. You should also be able
to prove the following facts (Horn and Johnson, 1985).

1. If A ∈ Cn×n and BA = I for some B ∈ Cn×n , then

(a) A is nonsingular
(b) B is unique
(c) AB = I

2. The matrix Q is unitary if and only if

(a) Q is nonsingular and Q∗ = Q−1


(b) QQ∗ = I
(c) Q∗ is unitary
(d) The rows of Q form an orthonormal set
(e) The columns of Q form an orthonormal set

3. If A is real and symmetric, then T is real and diagonal and Q can be


chosen real and orthogonal. It does not matter if the eigenvalues
of A are repeated.

For real, but not necessarily symmetric, A you can restrict yourself
to real matrices, by using the real Schur decomposition (Golub and
Van Loan, 1996, p.341), but the price you pay is that you can achieve
only block upper triangular T , rather than strictly upper triangular T .

Theorem A.2 (Real Schur decomposition). If A ∈ Rn×n then there exists


an orthogonal Q ∈ Rn×n such that
 
Q′ AQ =
    [ R11  R12  · · ·  R1m ]
    [  0   R22  · · ·  R2m ]
    [  ⋮    ⋮     ⋱     ⋮   ]
    [  0    0   · · ·  Rmm ]

in which each Rii is either a real scalar or a 2×2 real matrix having com-
plex conjugate eigenvalues; the eigenvalues of Rii are the eigenvalues
of A.

If the eigenvalues of Rii are disjoint (i.e., the eigenvalues are not re-
peated), then R can be taken block diagonal instead of block triangular
(Golub and Van Loan, 1996, p.366).
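For computation, the real Schur form is available directly in standard libraries; a sketch (assuming SciPy's scipy.linalg.schur with output='real') is:

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, -2.0, 1.0],
              [1.0,  0.0, 3.0],
              [0.0,  0.0, 0.5]])       # eigenvalues: a complex pair and 0.5
T, Q = schur(A, output='real')         # A = Q T Q', Q orthogonal, T quasi-triangular
assert np.allclose(Q @ T @ Q.T, A)
print(np.round(T, 3))                  # a 2x2 block for the complex pair, then 0.5
```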

A.8 Norms in Rn
A norm in Rn is a function |·| : Rn → R≥0 such that

(a) |x| = 0 if and only if x = 0;

(b) |λx| = |λ| |x| , for all λ ∈ R, x ∈ Rn ;

(c) |x + y| ≤ |x| + |y| , for all x, y ∈ Rn .

Let B := {x | |x| ≤ 1} denote the closed ball of radius 1 centered at


the origin. For any x ∈ Rn and ρ > 0, we denote by x ⊕ ρB or B(x, ρ)
the closed ball {z | |z − x| ≤ ρ} of radius ρ centered at x. Similarly
{x | |x| < 1} denotes the open ball of radius 1 centered at the origin
and {z | |z − x| < ρ} the open ball of radius ρ centered at x; closed
and open sets are defined below.

A.9 Sets in Rn
The complement of S ⊂ Rn in Rn , is the set S c := {x ∈ Rn | x ̸∈ S}. A
set X ⊂ Rn is said to be open, if for every x ∈ X, there exists a ρ > 0
such that B(x, ρ) ⊆ X. A set X ⊂ Rn is said to be closed if X c , its
complement in Rn , is open.
A set X ⊂ Rn is said to be bounded if there exists an M < ∞ such that
|x| ≤ M for all x ∈ X. A set X ⊂ Rn is said to be compact if X is closed
and bounded. An element x ∈ S ⊆ Rn is an interior point of the set S if
there exists a ρ > 0 such that z ∈ S, for all |z − x| < ρ. The interior of a
set S ⊂ Rn , int(S), is the set of all interior points of S; int(S) is an open
set, the largest 2 open subset of S. For example, if S = [a, b] ⊂ R, then
int(S) = (a, b); as another example, int(B(x, ρ)) = {z | |z − x| < ρ}.
The closure of a set S ⊂ Rn , denoted S̄, is the smallest 3 closed set
containing S. For example, if S = (a, b] ⊂ R, then S̄ = [a, b]. The
boundary of S ⊂ Rn , is the set δS := S̄ \ int(S) = {s ∈ S̄ | s ∉ int(S)}.
For example, if S = (a, b] ⊂ R, then int(S) = (a, b), S̄ = [a, b], ∂S = {a,
b}.
An affine set S ⊂ Rn is a set that can be expressed in the form
S = {x} ⊕ V := {x + v | v ∈ V } for some x ∈ Rn and some subspace
V of Rn . An example is a line in Rn not passing through the origin.
The affine hull of a set S ⊂ Rn , denoted aff(S), is the smallest4 affine
set that contains S. That is equivalent to the intersection of all affine
sets containing S.
Some sets S, such as a line in Rn , n ≥ 2, do not have an interior,
but do have an interior relative to the smallest affine set in which S
lies, which is aff(S) defined above. The relative interior of S is the
set {x ∈ S | ∃ρ > 0 such that int(B(x, ρ)) ∩ aff(S) ⊂ S}. Thus the
line segment S := {x ∈ R2 | x = λ[1 0]′ + (1 − λ)[0 1]′ , λ ∈ [0, 1]}
does not have an interior, but does have an interior relative to the line
containing it, aff(S). The relative interior of S is the open line segment
{x ∈ R2 | x = λ[1 0]′ + (1 − λ)[0 1]′ , λ ∈ (0, 1)}.

2 Largest in the sense that every open subset of S is a subset of int(S).
3 Smallest in the sense that S̄ is a subset of any closed set containing S.
4 In the sense that aff(S) is a subset of any other affine set containing S.

A.10 Sequences
Let the set of nonnegative integers be denoted by I≥0 . A sequence is a
function from I≥0 into Rn . We denote a sequence by its values, (xi )i∈I≥0 .
A subsequence of (xi )i∈I≥0 is a sequence of the form (xi )i∈K , where K
is an infinite subset of I≥0 .
A sequence (xi )i∈I≥0 in Rn is said to converge to a point x̂ if
limi→∞ |xi − x̂| = 0, i.e., if, for all δ > 0, there exists an integer k such
that |xi − x̂| ≤ δ for all i ≥ k; we write xi → x̂ as i → ∞ to denote the
fact that the sequence (xi ) converges to x̂. The point x̂ is called a limit
of the sequence (xi ). A point x ∗ is said to be an accumulation point
of a sequence (xi )i∈I≥0 in Rn , if there exists an infinite subset K ⊂ I≥0
K
such that xi → x ∗ as i → ∞, i ∈ K in which case we say xi → x ∗ .5
Let (xi ) be a bounded infinite sequence in R and let S be the set
of all accumulation points of (xi ). Then S is compact and lim sup xi is
the largest and lim inf xi the smallest accumulation point of (xi ):

lim sup i→∞ xi := max{x | x ∈ S}, and

lim inf i→∞ xi := min{x | x ∈ S}

Theorem A.3 (Bolzano-Weierstrass). Suppose X ⊂ Rn is compact and


(xi )i∈I≥0 takes its values in X. Then (xi )i∈I≥0 must have at least one
accumulation point.
From Exercise A.7, it follows that the accumulation point postulated
by Theorem A.3 lies in X. In proving asymptotic stability we need the
following property of monotone sequences.
5 Be aware of inconsistent usage of the term limit point. Some authors use limit point

as synonymous with limit. Others use limit point as synonymous with accumulation
point. For this reason we avoid the term limit point.

Proposition A.4 (Convergence of monotone sequences). Suppose that


(xi )i∈I≥0 is a sequence in R such that x0 ≥ x1 ≥ x2 ≥ . . ., i.e., suppose
the sequence is monotone nonincreasing. If (xi ) has an accumulation
point x ∗ , then xi → x ∗ as i → ∞, i.e., x ∗ is a limit.

Proof. For the sake of contradiction, suppose that (xi )i∈I≥0 does not
converge to x ∗ . Then, for some ρ > 0, there exists a subsequence
(xi )i∈K such that xi ̸∈ B(x ∗ , ρ) for all i ∈ K, i.e., |xi − x ∗ | > ρ for all
i ∈ K. Since x ∗ is an accumulation point, there exists a subsequence
K∗
(xi )i∈K ∗ such that xi → x ∗ . Hence there is an i1 ∈ K ∗ such that
|xi − x ∗ | ≤ ρ/2, for all i ≥ i1 , i ∈ K ∗ . Let i2 ∈ K be such that i2 > i1 .
Then we must have that xi2 ≤ xi1 and |xi2 − x ∗ | > ρ, which leads
to the conclusion that xi2 < x ∗ − ρ. Now let i3 ∈ K ∗ be such that
i3 > i2 . Then we must have that xi3 ≤ xi2 and hence that xi3 < x ∗ − ρ
which implies that |xi3 − x ∗ | > ρ. But this contradicts the fact that
|xi3 − x ∗ | ≤ ρ/2, and hence we conclude that xi → x ∗ as i → ∞. ■

It follows from Proposition A.4 that if (xi )i∈I≥0 is a monotone de-


creasing sequence in R bounded below by b, then the sequence (xi )i∈I≥0
converges to some x ∗ ∈ R where x ∗ ≥ b.

A.11 Continuity
We now summarize some essential properties of continuous functions.

1. A function f : Rn → Rm is said to be continuous at a point x ∈ Rn ,


if for every δ > 0 there exists a ρ > 0 such that

|f (x ′ ) − f (x)| < δ ∀x ′ ∈ int(B(x, ρ))

A function f : Rn → Rm is said to be continuous if it is continuous


at all x ∈ Rn .

2. Let X be a closed subset of Rn . A function f : X → Rm is said to


be continuous at a point x in X if for every δ > 0 there exists a
ρ > 0 such that

|f (x ′ ) − f (x)| < δ ∀x ′ ∈ int(B(x, ρ)) ∩ X

A function f : Rn → Rm is said to be continuous on X if it is


continuous at all x in X.

3. A function f : Rn → Rm is said to be upper semicontinuous at a


point x ∈ Rn , if for every δ > 0 there exists a ρ > 0 such that

f (x ′ ) − f (x) < δ ∀x ′ ∈ int(B(x, ρ))

A function f : Rn → Rm is said to be upper semicontinuous if it


is upper semicontinuous at all x ∈ Rn .

4. A function f : Rn → Rm is said to be lower semicontinuous at a


point x ∈ Rn , if for every δ > 0 there exists a ρ > 0 such that

f (x ′ ) − f (x) > −δ ∀x ′ ∈ int(B(x, ρ))

A function f : Rn → Rm is said to be lower semicontinuous if it


is lower semicontinuous at all x ∈ Rn .

5. A function f : Rn → Rm is said to be uniformly continuous on a


subset X ⊂ Rn if for any δ > 0 there exists a ρ > 0 such that for
any x ′ , x ′′ ∈ X satisfying |x ′ − x ′′ | < ρ,

|f (x ′ ) − f (x ′′ )| < δ

Proposition A.5 (Uniform continuity). Suppose that f : Rn → Rm is


continuous and that X ⊂ Rn is compact. Then f is uniformly continuous
on X.

Proof. For the sake of contradiction, suppose that f is not uniformly


 
continuous on X. Then, for some δ > 0, there exist sequences (xi′ ) and
(xi′′ ) in X such that

|xi′ − xi′′ | < (1/i), for all i ∈ I≥0

but
|f (xi′ ) − f (xi′′ )| > δ, for all i ∈ I≥0 (A.1)
 
Since X is compact, there must exist a subsequence (xi′ )i∈K such that
xi′ → x ∗ ∈ X as i → ∞, i ∈ K. Furthermore, because of (A.1), xi′′ → x ∗
(i ∈ K) also holds. Hence, since f (·) is continuous, we must have
f (xi′ ) → f (x ∗ ) and f (xi′′ ) → f (x ∗ ) as i → ∞, i ∈ K. Therefore, there
exists an i0 ∈ K such that for all i ∈ K, i ≥ i0

|f (xi′ ) − f (xi′′ )| ≤ |f (xi′ ) − f (x ∗ )| + |f (x ∗ ) − f (xi′′ )| < δ/2

contradicting (A.1). This completes our proof. ■



Proposition A.6 (Compactness of continuous functions of compact


sets). Suppose that X ⊂ Rn is compact and that f : Rn → Rm is contin-
uous. Then the set
f (X) := {f (x) | x ∈ X}
is compact.

Proof.
(a) First we show that f (X) is closed. Thus, let (f (xi ) | i ∈ I≥0 ), with
xi ∈ X, be any sequence in f (X) such that f (xi ) → y as i → ∞. Since
(xi ) is in a compact set X, there exists a subsequence (xi )i∈K such
that xi → x ∗ ∈ X as i → ∞, i ∈ K. Since f (·) is continuous, f (xi ) → f (x ∗ )
as i → ∞. But y is the limit of (f (xi ))i∈I≥0 and hence it is the limit of
any subsequence of (f (xi )) . We conclude that y = f (x ∗ ) and hence
that y ∈ f (X), i.e., f (X) is closed.

(b) Next, we prove that f (X) is bounded. Suppose f (X) is not


bounded. Then there exists a sequence (xi ) such that |f (xi )| ≥ i for
all i ∈ I≥0 . Now, since (xi ) is in a compact set, there exists a subse-
quence (xi )i∈K such that xi → x ∗ with x ∗ ∈ X, and f (xi ) → f (x ∗ ) by
continuity of f (·). Hence there exists an i0 such that for any j > i > i0 ,
j, i ∈ K

|f (xj ) − f (xi )| ≤ |f (xj ) − f (x ∗ )| + |f (xi ) − f (x ∗ )| < 1/2 (A.2)

Let i ≥ i0 be given. By hypothesis there exists a j ∈ K, j ≥ i such that


|f (xj )| ≥ j ≥ |f (xi )| + 1. Hence

|f (xj ) − f (xi )| ≥ |f (xj )| − |f (xi )| ≥ 1

which contradicts (A.2). Thus f (X) must be bounded, which completes


the proof. ■

Let Y ⊂ R. Then inf(Y ), the infimum of Y , is defined to be the


greatest lower bound6 of Y . If inf(Y ) ∈ Y , then min(Y ) := min{y |
y ∈ Y }, the minimum of the set Y , exists and is equal to inf(Y ). The
infimum of a set Y always exists if Y is not empty and is bounded from

below, in which case there always exist sequences yi ∈ Y such that
yi ↘ β := inf(Y ) as i → ∞. Note that β := inf(Y ) does not necessarily
lie in the set Y .
6 The value α ∈ R is the greatest lower bound of Y if y ≥ α for all y ∈ Y , and β > α
implies that β is not a lower bound for Y .

Proposition A.7 (Weierstrass). Suppose that f : Rn → R is continuous


and that X ⊂ Rn is compact. Then there exists an x̂ ∈ X such that

f (x̂) = inf x∈X f (x)

i.e., minx∈X f (x) is well defined.

Proof. Since X is compact, f (X) is bounded. Hence inf x∈X f (x) = α


is finite. Let (xi ) be an infinite sequence in X such that f (xi ) ↘ α
as i → ∞. Since X is compact, there exists a converging subsequence
(xi )i∈K such that xi → x̂ ∈ X, i ∈ K. By continuity, f (xi ) → f (x̂) as i → ∞.
Because (f (xi )) is a monotone nonincreasing sequence that has an
accumulation point f (x̂), it follows from Proposition A.4 that f (xi ) →
f (x̂) as i → ∞. Since the limit of the sequence (f (xi )) is unique, we
conclude that f (x̂) = α. ■

A.12 Derivatives

We first define some notation. If f : Rn → R, then (∂/∂x)f (x) is a row


vector defined by

(∂/∂x)f (x) := [(∂/∂x1 )f (x), . . . , (∂/∂xn )f (x)]

provided the partial derivatives (∂/∂xi )f (x), i = 1, 2, . . . , n exist. Sim-


ilarly, if f : Rn → Rm , (∂/∂x)f (x) is defined to be the matrix
 
(∂/∂x)f (x) :=
    [ (∂/∂x1 )f1 (x)   (∂/∂x2 )f1 (x)   · · ·   (∂/∂xn )f1 (x) ]
    [ (∂/∂x1 )f2 (x)   (∂/∂x2 )f2 (x)   · · ·   (∂/∂xn )f2 (x) ]
    [        ⋮                 ⋮           ⋱            ⋮       ]
    [ (∂/∂x1 )fm (x)   (∂/∂x2 )fm (x)   · · ·   (∂/∂xn )fm (x) ]

where xi and fi denote, respectively, the ith component of the vectors


x and f . We sometimes use fx (x) in place of (∂/∂x)f (x). If f : Rn →
R, then its gradient ∇f (x) is a column vector defined by
 
∇f (x) :=
    [ (∂/∂x1 )f (x) ]
    [ (∂/∂x2 )f (x) ]
    [       ⋮       ]
    [ (∂/∂xn )f (x) ]

and its Hessian is ∇2 f (x) = (∂ 2 /∂x 2 )f (x) = fxx (x), defined by

∇2 f (x) :=
    [ ∂²f (x)/∂x1²        ∂²f (x)/∂x1 ∂x2    · · ·   ∂²f (x)/∂x1 ∂xn ]
    [ ∂²f (x)/∂x2 ∂x1     ∂²f (x)/∂x2²       · · ·   ∂²f (x)/∂x2 ∂xn ]
    [         ⋮                   ⋮              ⋱             ⋮       ]
    [ ∂²f (x)/∂xn ∂x1     ∂²f (x)/∂xn ∂x2    · · ·   ∂²f (x)/∂xn²    ]

We note that ∇f (x) = [(∂/∂x)f (x)]′ = fx′ (x).


We now define what we mean by the derivative of f (·). Let f : Rn → Rm
be a continuous function with domain Rn . We say that f (·) is
differentiable at x̂ if there exists a matrix Df (x̂) ∈ Rm×n (the Jacobian)
such that

lim h→0 |f (x̂ + h) − f (x̂) − Df (x̂)h| / |h| = 0
in which case Df (·) is called the derivative of f (·) at x̂. When f (·) is
differentiable at all x ∈ Rn , we say that f is differentiable.
We note that the affine function h , f (x̂) + Df (x̂)h is a first order
approximation of f (x̂ + h). The Jacobian can be expressed in terms of
the partial derivatives of f (·).

Proposition A.8 (Derivative and partial derivative). Suppose that the


function f : Rn → Rm is differentiable at x̂. Then its derivative Df (x̂)
satisfies
Df (x̂) = fx (x̂) := ∂f (x̂)/∂x

Proof. From the definition of Df (x̂) we deduce that for each i ∈ {1, 2,
. . . , m}
lim h→0 |fi (x̂ + h) − fi (x̂) − Dfi (x̂)h| / |h| = 0
where fi is the ith element of f and (Df )i the ith row of Df . Set
h = tej , where ej is the j-th unit vector in Rn so that |h| = t. Then
(Df )i (x̂)h = t(Df )i (x̂)ej = (Df )ij (x̂), the ijth element of the matrix
Df (x̂). It then follows that

lim t↘0 (fi (x̂ + tej ) − fi (x̂) − t(Df )ij (x̂)) / t = 0
which shows that (Df )ij (x̂) = ∂fi (x̂)/∂xj . ■

A function f : Rn → Rm is locally Lipschitz continuous at x̂ if there


exist L ∈ [0, ∞), ρ̂ > 0 such that

|f (x) − f (x ′ )| ≤ L |x − x ′ | , for all x, x ′ ∈ B(x̂, ρ̂)



The function f is globally Lipschitz continuous if the inequality holds


for all x, x ′ ∈ Rn . The constant L is called the Lipschitz constant of
f . It should be noted that the existence of partial derivatives of f (·)
does not ensure the existence of the derivative Df (·) of f (·); see e.g.
Apostol (1974, p.103). Thus consider the function

f (x, y) = x + y if x = 0 or y = 0

f (x, y) = 1 otherwise

In this case

∂f (0, 0)/∂x = lim t→0 (f (t, 0) − f (0, 0))/t = 1

∂f (0, 0)/∂y = lim t→0 (f (0, t) − f (0, 0))/t = 1

but the function is not even continuous at (0, 0). In view of this, the
following result is relevant.

Proposition A.9 (Continuous partial derivatives). Consider a function


f : Rn → Rm such that the partial derivatives ∂fi (x)/∂xj exist in a
neighborhood of x̂, for i = 1, 2, . . . , m, j = 1, 2, . . . , n. If these partial
derivatives are continuous at x̂, then the derivative Df (x̂) exists and is
equal to fx (x̂).

The following chain rule holds.

Proposition A.10 (Chain rule). Suppose that f : Rn → Rm is defined by


f (x) = h(g(x)) with both h : Rl → Rm and g : Rn → Rl differentiable.
Then
∂f (x̂)/∂x = (∂h(g(x̂))/∂y)(∂g(x̂)/∂x)
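The chain rule is easy to check by finite differences; the following sketch (not from the text) uses NumPy with arbitrary choices of g and h:

```python
import numpy as np

def g(x):                     # g: R^2 -> R^3
    return np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])

def h(y):                     # h: R^3 -> R^2
    return np.array([y[0] + 2.0 * y[1] * y[2], y[0] * y[2]])

def jac(f, x, eps=1e-6):      # forward-difference Jacobian of f at x
    fx = f(x)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (f(xp) - fx) / eps
    return J

x = np.array([0.7, -1.3])
lhs = jac(lambda z: h(g(z)), x)          # d(h∘g)/dx
rhs = jac(h, g(x)) @ jac(g, x)           # (dh/dy)(dg/dx)
assert np.allclose(lhs, rhs, atol=1e-4)
```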

The following result Dieudonne (1960), replaces, inter alia, the mean
value theorem for functions f : Rn → Rm when m > 1.

Proposition A.11 (Mean value theorem for vector functions).


(a) Suppose that f : Rn → Rm has continuous partial derivatives at each
point x of Rn . Then for any x, y ∈ Rn ,
f (y) = f (x) + ∫₀¹ fx (x + s(y − x))(y − x) ds

(b) Suppose that f : Rn → Rm has continuous partial derivatives of


order two at each point x of Rn . Then for any x, y ∈ Rn ,
f (y) = f (x) + fx (x)(y − x) + ∫₀¹ (1 − s)(y − x)′ fxx (x + s(y − x))(y − x) ds

Proof.
(a) Consider the function g(s) = f (x + s(y − x)) where f : Rn → Rm .
Then g(1) = f (y), g(0) = f (x) and
g(1) − g(0) = ∫₀¹ g ′ (s) ds = ∫₀¹ Df (x + s(y − x))(y − x) ds

which completes the proof for p = 1.

(b) Consider the function g(s) = f (x + s(y − x)) where f : Rn → R.


Then
(d/ds)[g ′ (s)(1 − s) + g(s)] = g ′′ (s)(1 − s)
Integrating from 0 to 1 yields
g(1) − g(0) − g ′ (0) = ∫₀¹ (1 − s)g ′′ (s) ds

But g ′′ (s) = (y − x)′ fxx (x + s(y − x))(y − x) so that the last equation
yields
f (y) − f (x) = fx (x)(y − x) + ∫₀¹ (1 − s)(y − x)′ fxx (x + s(y − x))(y − x) ds

when g(s) is replaced by f (x + s(y − x)).


Finally, we define directional derivatives which may exist even when


a function fails to have a derivative. Let f : Rn → Rm . We define the
directional derivative of f at a point x̂ ∈ Rn in the direction h ∈ Rn (h ̸=
0) by
df (x̂; h) := lim t↘0 (f (x̂ + th) − f (x̂)) / t
if this limit exists (note that t > 0 is required). The directional deriva-
tive is positively homogeneous, i.e., df (x; λh) = λdf (x; h) for all
λ > 0.

Not all the functions we discuss are differentiable everywhere. Ex-


amples include the max function ψ(·) defined by ψ(x) := max i {f i (x) |
i ∈ I} where each function f i : Rn → R is continuously differentiable
everywhere. The function ψ(·) is not differentiable at those x for which
the active set I 0 (x) := {i ∈ I | f i (x) = ψ(x)} has more than one
element. The directional derivative dψ(x; h) exists for all x, h in Rn ,
however, and is given by

dψ(x; h) = max i {dfi (x; h) | i ∈ I 0 (x)} = max i {⟨∇fi (x), h⟩ | i ∈ I 0 (x)}

When, as in this example, the directional derivative exists for all x, h in


Rn we can define a generalization, called the subgradient, of the con-
ventional gradient. Suppose that f : Rn → R has a directional derivative
for all x, h in Rn . Then f (·) has a subgradient ∂f (·) defined by

∂f (x) := {g ∈ Rn | df (x; h) ≥ ⟨g, h⟩ ∀h ∈ Rn }

The subgradient at a point x is, unlike the ordinary gradient, a set.


For our max example (f (x) = ψ(x) = max i {fi (x) | i ∈ I}) we have
dψ(x; h) = max i {⟨∇f i (x), h⟩ | i ∈ I 0 (x)}. In this case, it can be
shown that
∂ψ(x) = co{∇f i (x) | i ∈ I 0 (x)}

If the directional derivative h , df (x; h) is convex, then the subgradi-


ent ∂f (x) is nonempty and the directional derivative df (x; h) may be
expressed as
df (x; h) = max g {⟨g, h⟩ | g ∈ ∂f (x)}

Figure A.4 illustrates this for the case when ψ(x) := max{f1 (x), f2 (x)}
and I 0 (x) = {1, 2}.


Figure A.4: Subgradient.
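These max-function formulas can be checked numerically. In the sketch below (not from the text) f1 and f2 are arbitrary smooth functions chosen so that both are active at the test point, and a one-sided difference approximates dψ(x; h):

```python
import numpy as np

f1 = lambda x: x[0] ** 2 + x[1] ** 2           # both active at x = (1, 0):
f2 = lambda x: (x[0] - 2.0) ** 2 + x[1] ** 2   # f1(x) = f2(x) = 1 there
grad_f1 = lambda x: 2.0 * x
grad_f2 = lambda x: 2.0 * (x - np.array([2.0, 0.0]))
psi = lambda x: max(f1(x), f2(x))

x = np.array([1.0, 0.0])
h = np.array([0.3, 1.0])
t = 1e-7
d_numeric = (psi(x + t * h) - psi(x)) / t             # one-sided difference
d_formula = max(grad_f1(x) @ h, grad_f2(x) @ h)       # max over the active set
print(d_numeric, d_formula)                           # both close to 0.6
```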



A.13 Convex Sets and Functions


Convexity is an enormous subject. We collect here only a few essential
results that we will need in our study of optimization; for further details
see Rockafellar (1970). We begin with convex sets.

A.13.1 Convex Sets

Definition A.12 (Convex set). A set S ∈ Rn is said to be convex if, for


any x ′ , x ′′ ∈ S and λ ∈ [0, 1], (λx ′ + (1 − λ)x ′′ ) ∈ S.
Let S be a subset of Rn . We say that co(S) is the convex hull of S if
it is the smallest convex set containing S, smallest in the sense that
any other convex set containing S also contains co(S).
Theorem A.13 (Caratheodory). Let S be a subset of Rn . If x̄ ∈ co(S),
then it may be expressed as a convex combination of no more than n + 1
points in S, i.e., there exist m ≤ n + 1 distinct points, {xi }m i=1 , in S such
that x̄ = Σ_{i=1}^{m} µ i xi , µ i > 0, Σ_{i=1}^{m} µ i = 1.
Proof. Consider the set

Cs := {x | x = Σ_{i=1}^{kx} µ i xi , xi ∈ S, µ i ≥ 0, Σ_{i=1}^{kx} µ i = 1, kx ∈ I≥0 }

First, it is clear that S ⊂ Cs . Next, since for any x ′ , x ′′ ∈ Cs , λx ′ +


(1 − λx ′′ ) ∈ Cs , for λ ∈ [0, 1], it follows that Cs is convex. Hence
we must have that co(S) ⊂ Cs . Because Cs consists of all the convex
combinations of points in S, however, we must also have that Cs ⊂
co(S). Hence Cs = co(S). Now suppose that

x̄ = Σ_{i=1}^{k̄} µ̄ i xi

with µ̄ i ≥ 0, i = 1, 2, . . . , k̄, Σ_{i=1}^{k̄} µ̄ i = 1. Then the following system of
equations is satisfied

Σ_{i=1}^{k̄} µ̄ i [xi ; 1] = [x̄ ; 1]    (A.3)

with µ̄ i ≥ 0. Suppose that k̄ > n + 1. Then there exist coefficients
α j , j = 1, 2, . . . , k̄, not all zero, such that

Σ_{i=1}^{k̄} α i [xi ; 1] = 0    (A.4)

Adding (A.4) multiplied by θ to (A.3) we get

Σ_{i=1}^{k̄} (µ̄ i + θα i )[xi ; 1] = [x̄ ; 1]

Suppose, without loss of generality, that at least one αi < 0. Then there
exists a θ̄ > 0 such that µ̄ j + θ̄αj = 0 for some j while µ̄ i + θ̄αi ≥ 0
for all other i. Thus we have succeeded in expressing x̄ as a convex
combination of k̄ − 1 vectors in S. Clearly, these reductions can go on
as long as x̄ is expressed in terms of more than (n + 1) vectors in S.
This completes the proof. ■

Let S1 , S2 be any two sets in Rn . We say that the hyperplane

H = {x ∈ Rn | ⟨x, v⟩ = α}

separates S1 and S2 if

⟨x, v⟩ ≥ α for all x ∈ S1


⟨y, v⟩ ≤ α for all y ∈ S2

The separation is said to be strong if there exists an ε > 0 such that

⟨x, v⟩ ≥ α + ε for all x ∈ S1


⟨y, v⟩ ≤ α − ε for all y ∈ S2

Figure A.5: Separating hyperplane.

Theorem A.14 (Separation of convex sets). Let S1 , S2 be two convex


sets in Rn such that S1 ∩ S2 = ∅. Then there exists a hyperplane which
separates S1 and S2 . Furthermore, if S1 and S2 are closed and either S1
or S2 is compact, then the separation can be made strict.

Theorem A.15 (Separation of convex set from zero). Suppose that S ⊂


Rn is closed and convex and 0 ̸∈ S. Let

x̂ = arg min{|x|2 | x ∈ S}

Then
H = {x | ⟨x̂, x⟩ = |x̂|2 }
separates S from 0, i.e., ⟨x̂, x⟩ ≥ |x̂|2 for all x ∈ S.

Proof. Let x ∈ S be arbitrary. Then, since S is convex, [x̂+λ(x−x̂)] ∈ S


for all λ ∈ [0, 1]. By definition of x̂, we must have

0 < |x̂|2 ≤ |x̂ + λ(x − x̂)|2


= |x̂|2 + 2λ⟨x̂, x − x̂⟩ + λ2 |x − x̂|2

Hence, for all λ ∈ (0, 1],

0 ≤ 2 ⟨x̂, x − x̂⟩ + λ |x − x̂|2

Letting λ → 0 we get the desired result. ■
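Theorem A.15 can be illustrated numerically when the minimum-norm point of S is available in closed form, e.g., for a ball that does not contain the origin (a sketch with NumPy; the center and radius are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
c = np.array([3.0, 1.0])                    # S = ball of radius 1 centered at c; 0 not in S
radius = 1.0
xhat = c - radius * c / np.linalg.norm(c)   # arg min {|x|^2 : x in S} for a ball

# Check the separation inequality <xhat, x> >= |xhat|^2 on random points of S.
for _ in range(1000):
    d = rng.standard_normal(2)
    x = c + radius * rng.uniform() * d / np.linalg.norm(d)   # a point in S
    assert xhat @ x >= xhat @ xhat - 1e-9
```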

Theorem A.15 can be used to prove the following special case of


Theorem A.14:

Corollary A.16 (Existence of separating hyperplane). Let S1 , S2 be two


compact convex sets in Rn such that S1 ∩ S2 = ∅. Then there exists a
hyperplane which separates S1 and S2 .

Proof. Let C = S1 − S2 := {x1 − x2 | x1 ∈ S1 , x2 ∈ S2 }. Then C is convex


and compact and 0 ̸∈ C. Let x̂ = (x̂1 − x̂2 ) = arg min{|x|2 | x ∈ C},
where x̂1 ∈ S1 and x̂2 ∈ S2 . Then, by Theorem A.15

⟨x − x̂, x̂⟩ ≥ 0, for all x ∈ C (A.5)

Let x = x1 − x̂2 , with x1 ∈ S1 . Then (A.5) leads to

⟨x1 − x̂2 , x̂⟩ ≥ |x̂|2 (A.6)

for all x1 ∈ S1 . Similarly, letting x = x̂1 − x2 , in (A.5) yields

⟨x̂1 − x2 , x̂⟩ ≥ |x̂|2 (A.7)

for all x2 ∈ S2 . The inequality in (A.7) implies that

⟨x̂1 − x̂2 + x̂2 − x2 , x̂⟩ ≥ |x̂|2



Since x̂1 − x̂2 = x̂, we obtain

⟨x2 − x̂2 , x̂⟩ ≤ 0 (A.8)

for all x2 ∈ S2 . The desired result follows from (A.6) and (A.8), the
separating hyperplane H being {x ∈ Rn | ⟨x̂, x − x̂2 ⟩ = 0}. ■

Definition A.17 (Support hyperplane). Suppose S ⊂ Rn is convex. We


say that H = {x | ⟨x − x̄, v⟩ = 0} is a support hyperplane to S through
x̄ with inward (outward) normal v if x̄ ∈ S and

⟨x − x̄, v⟩ ≥ 0 (≤ 0) for all x ∈ S

Theorem A.18 (Convex set and halfspaces). A closed convex set is equal
to the intersection of the halfspaces which contain it.

Proof. Let C be a closed convex set and A the intersection of halfspaces


containing C. Then clearly C ⊂ A. Now suppose x̄ ̸∈ C. Then there
exists a support hyperplane H which separates strictly x̄ and C so that
x̄ does not belong to one halfspace containing C. It follows that x̄ ̸∈ A.
Hence C c ⊂ Ac which leads to the conclusion that A ⊂ C. ■

An important example of a convex set is a convex cone.

Definition A.19 (Convex cone). A subset C of Rn , C ≠ ∅, is called a


cone if x ∈ C implies λx ∈ C for all λ ≥ 0. A cone C is pointed if
C ∩ −C = {0}. A convex cone is a cone that is convex.

An example of a cone is a halfspace whose boundary is a hy-


perplane passing through the origin; an example of a pointed cone is
the positive orthant. A polyhedron C defined by C := {x | ⟨ai , x⟩ ≤ 0,
i ∈ I} is a convex cone that is pointed

Definition A.20 (Polar cone). Given a cone C ⊂ Rn , the cone C ∗ defined


by
C ∗ := {h | ⟨h, x⟩ ≤ 0 ∀x ∈ C}

is called the polar cone of C.

An illustration of this definition when C is a polyhedron containing


the origin is given in Figure A.6. In this figure, H is the hyperplane with
normal h passing through the origin.

Figure A.6: Polar cone.

Definition A.21 (Cone generator). A cone K is said to be generated by


a set {ai | i ∈ I} where I is an index set if
 
K = { Σ_{i∈I} µi ai | µi ≥ 0, i ∈ I }

in which case we write K = cone{ai | i ∈ I}.

We make use of the following result:

Proposition A.22 (Cone and polar cone generator).


(a) Suppose C is a convex cone containing the origin and defined by

C := {x ∈ Rn | ⟨ai , x⟩ ≤ 0, i ∈ I}

Then
C ∗ = cone{ai | i ∈ I}

(b) If C is a closed convex cone, then (C ∗ )∗ = C.

(c) If C1 ⊂ C2 , then C2∗ ⊂ C1∗ .

Proof.
(a) Let the convex set K be defined by

K := cone{ai | i ∈ I}

We wish to prove C ∗ = K. To prove K ⊂ C ∗ , suppose h is an arbitrary


point in K := cone{ai | i ∈ I}. Then h = Σ_{i∈I} µi ai where µi ≥ 0 for all
i ∈ I. Let x be an arbitrary point in C so that ⟨ai , x⟩ ≤ 0 for all i ∈ I.
Hence

⟨h, x⟩ = ⟨Σ_{i∈I} µi ai , x⟩ = Σ_{i∈I} µi ⟨ai , x⟩ ≤ 0

so that h ∈ C ∗ . This proves that K ⊂ C ∗ . To prove that C ∗ ⊂ K,


assume that h ∈ C ∗ but that, contrary to what we wish to prove, h ̸∈ K.
Hence h = Σ_{i∈I} µi ai + h̃ where either µj < 0 for at least one j ∈ I,
or h̃, which is orthogonal to ai , i ∈ I, is not zero, or both. If µj < 0,
let x ∈ C be such that ⟨ai , x⟩ = 0 for all i ∈ I, i ≠ j and ⟨aj , x⟩ < 0;
if h̃ ≠ 0, let x ∈ C be such that ⟨h̃, x⟩ > 0 (both conditions can be
satisfied). Then

⟨h, x⟩ = ⟨µj aj , x⟩ + ⟨h̃, x⟩ = µj ⟨aj , x⟩ + ⟨h̃, x⟩ > 0

since either both µj and ⟨aj , x⟩ are strictly negative or h̃ ≠ 0 or both.

This contradicts the fact that x ∈ C and h ∈ C ∗ (so that ⟨h, x⟩ ≤ 0).
Hence h ∈ K so that C ∗ ⊂ K. It follows that C ∗ = cone{ai | i ∈ I}.

(b) That (C ∗ )∗ = C when C is a closed convex cone is given in Rock-


afellar and Wets (1998), Corollary 6.21.

(c) This result follows directly from the definition of a polar cone.

A.13.2 Convex Functions

Next we turn to convex functions. For an example see Figure A.7.

Figure A.7: A convex function.

A function f : Rn → R is said to be convex if for any x ′ , x ′′ ∈ Rn


and λ ∈ [0, 1],

f (λx ′ + (1 − λ)x ′′ ) ≤ λf (x ′ ) + (1 − λ)f (x ′′ )



A function f : Rn → R is said to be concave if −f is convex.


The epigraph of a function f : Rn → R is defined by

epi(f ) := {(x, y) ∈ Rn × R | y ≥ f (x)}

Theorem A.23 (Convexity implies continuity). Suppose f : Rn → R is


convex. Then f is continuous in the interior of its domain.

The following property is illustrated in Figure A.7.

Theorem A.24 (Differentiability and convexity). Suppose f : Rn → R is


differentiable. Then f is convex if and only if

f (y) − f (x) ≥ ⟨∇f (x), y − x⟩ for all x, y ∈ Rn (A.9)

Proof. ⇒ Suppose f is convex. Then for any x, y ∈ Rn , and λ ∈ [0, 1]

f (x + λ(y − x)) ≤ (1 − λ)f (x) + λf (y) (A.10)

Rearranging (A.10) we get

(f (x + λ(y − x)) − f (x)) / λ ≤ f (y) − f (x) for all λ ∈ (0, 1]
Taking the limit as λ → 0 we get (A.9).
⇐ Suppose (A.9) holds. Let x and y be arbitrary points in Rn and
let λ be an arbitrary point in [0, 1]. Let z = λx + (1 − λ)y. Then

f (x) ≥ f (z) + f ′ (z)(x − z), and


f (y) ≥ f (z) + f ′ (z)(y − z)

Multiplying the first equation by λ and the second by (1 − λ), adding


the resultant equations, and using the fact that z = λx +(1−λ)y yields

λf (x) + (1 − λ)f (y) ≥ f (z) = f (λx + (1 − λ)y)

Since x and y in Rn and λ in [0, 1] are all arbitrary, the convexity of


f (·) is established. ■

Theorem A.25 (Second derivative and convexity). Suppose that f :


Rn → R is twice continuously differentiable. Then f is convex if and only
if the Hessian (second derivative) matrix ∂ 2 f (x)/∂x 2 is positive semidef-
inite for all x ∈ Rn , i.e., ⟨y, (∂ 2 f (x)/∂x 2 )y⟩ ≥ 0 for all x, y ∈ Rn .

Proof. ⇒ Suppose f is convex. Then for any x, y ∈ Rn , because of


Theorem A.24 and Proposition A.11

0 ≤ f (y) − f (x) − ⟨∇f (x), y − x⟩
  = ∫₀¹ (1 − s)⟨y − x, (∂ 2 f (x + s(y − x))/∂x 2 )(y − x)⟩ ds (A.11)

Hence, dividing by |y − x|2 and letting y → x, we obtain that
∂ 2 f (x)/∂x 2 is positive semidefinite.
⇐ Suppose that ∂ 2 f (x)/∂x 2 is positive semidefinite for all x ∈ Rn .
Then it follows directly from the equality in (A.11) and Theorem A.24
that f is convex. ■
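Both characterizations of convexity are easy to check numerically for a simple example; the sketch below (not from the text) uses an arbitrary convex quadratic:

```python
import numpy as np

Q = np.array([[2.0, 0.5], [0.5, 1.0]])        # symmetric with positive eigenvalues
f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x

# Theorem A.25: f is convex because its (constant) Hessian Q is positive semidefinite.
assert np.all(np.linalg.eigvalsh(Q) >= 0.0)

# Theorem A.24: first-order inequality f(y) - f(x) >= <grad f(x), y - x>.
rng = np.random.default_rng(3)
for _ in range(1000):
    x, y = rng.standard_normal(2), rng.standard_normal(2)
    assert f(y) - f(x) >= grad(x) @ (y - x) - 1e-12
```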

Definition A.26 (Level set). Suppose f : Rn → R. A level set of f is a


set of the form {x | f (x) = α}, α ∈ R.

Definition A.27 (Sublevel set). Suppose f : Rn → R. A sublevel set X


of f is a set of the form X = {x | f (x) ≤ α}, α ∈ R. We also write the
sublevel set as X = levα f .

Definition A.28 (Support function). Suppose Q ⊂ Rn . The support


function σQ : Rn → Re = R ∪ {+∞} is defined by:

σQ (p) = sup x { ⟨p, x⟩ | x ∈ Q}

σQ (p) measures how far Q extends in direction p.

Proposition A.29 (Set membership and support function). Suppose Q ⊂


Rn is a closed and convex set. Then x ∈ Q if and only if σQ (p) ≥ ⟨p, x⟩
for all p ∈ Rn .

Proposition A.30 (Lipschitz continuity of support function). Suppose


Q ⊂ Rn is bounded. Then σQ is bounded and Lipschitz continuous
|σQ (p) − σQ (q)| ≤ K |p − q| for all p, q ∈ Rn , where K := sup{|x| |
x ∈ Q} < ∞.
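For simple sets the support function can be evaluated in closed form; the sketch below (not from the text) uses an arbitrary box and also checks the Lipschitz bound of Proposition A.30:

```python
import numpy as np

lb = np.array([-1.0, -2.0])
ub = np.array([ 1.0,  2.0])

def sigma_box(p):
    """Support function of the box {x : lb <= x <= ub}: pick the best corner."""
    return np.sum(np.where(p >= 0.0, p * ub, p * lb))

p = np.array([3.0, -1.0])
print(sigma_box(p))                      # 3*1 + (-1)*(-2) = 5

# |sigma(p) - sigma(q)| <= K |p - q| with K = sup {|x| : x in box} (farthest corner).
K = np.linalg.norm(ub)
q = np.array([2.5, -0.5])
assert abs(sigma_box(p) - sigma_box(q)) <= K * np.linalg.norm(p - q) + 1e-12
```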

A.14 Differential Equations


Although difference equation models are employed extensively in this
book, the systems being controlled are most often described by differ-
ential equations. Thus, if the system being controlled is described by

the differential equation ẋ = fc (x, u), as is often the case, and if it


is decided to control the system using piecewise constant control with
period ∆, then, at sampling instants k∆ where k ∈ I, the system is
described by the difference equation

x + = f (x, u)

where f (·) may be derived from fc (·) as follows

f (x, u) = x + ∫₀^∆ fc (φc (s; x, u), u) ds

where φc (s; x, u) is the solution of ẋ = fc (x, u) at time s if its initial


state at time 0 is x and the control has a constant value u in the interval
[0, ∆]. Thus x in the difference equation is the state at time k, say, u
is the control in the interval [0, ∆], and x + is the state at time k + 1.
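In practice the map from fc (·) to f (·) is evaluated numerically. The following sketch (not from the text) uses a fixed-step RK4 rule with NumPy; the vector field, sample time, and number of substeps are arbitrary choices:

```python
import numpy as np

def fc(x, u):
    """Continuous time model dx/dt = fc(x, u); an arbitrary example."""
    return np.array([x[1], -x[0] + (1.0 - x[0] ** 2) * x[1] + u])

def f(x, u, Delta=0.05, steps=10):
    """Sampled-data model x+ = f(x, u): integrate fc over [0, Delta] with u held constant."""
    h = Delta / steps
    for _ in range(steps):
        k1 = fc(x, u)
        k2 = fc(x + 0.5 * h * k1, u)
        k3 = fc(x + 0.5 * h * k2, u)
        k4 = fc(x + h * k3, u)
        x = x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return x

x, u = np.array([1.0, 0.0]), 0.2
print(f(x, u))          # state at the next sampling instant
```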
Because the discrete time system is most often obtained from a contin-
uous time system, we must be concerned with conditions which guaran-
tee the existence and uniqueness of solutions of the differential equa-
tion describing the continuous time system. For excellent expositions
of the theory of ordinary differential equations see the books by Hale
(1980), McShane (1944), Hartman (1964), and Coddington and Levinson
(1955).
Consider, first, the unforced system described by

(d/dt)x(t) = f (x(t), t) or ẋ = f (x, t) (A.12)

with initial condition


x(t0 ) = x0 (A.13)
Suppose f : D → Rn , where D is an open set in Rn × R, is continuous.
A function x : T → Rn , where T is an interval in R, is said to be a
(conventional) solution of (A.12) with initial condition (A.13) (or passing
through (x0 , t0 )) if:

(a) x is continuously differentiable and x satisfies (A.12) on T ,

(b) x(t0 ) = x0 ,

and (x(t), t) ∈ D for all t in T . It is easily shown, when f is continuous,


that x satisfies (A.12) and (A.13) if and only if:

x(t) = x0 + ∫_{t0}^{t} f (x(s), s) ds (A.14)

Peano’s existence theorem states that if f is continuous on D, then,


for all (x0 , t0 ) ∈ D there exists at least one solution of (A.12) passing
through (x0 , t0 ). The solution is not necessarily unique - a counter-
example being ẋ = x^{1/2} for x ≥ 0. To proceed we need to be able to
deal with systems for which f (·) is not necessarily continuous for the
following reason. If the system is described by ẋ = f (x, u, t) where
f : Rn × Rm → Rn is continuous, and the control u : R → Rm is
continuous, then, for given u(·), the function f u : Rn ×R → Rn defined
by:
f u (x, t) := f (x, u(t), t)
is continuous in t. We often encounter controls that are not continuous,
however, in which case f u (·) is also not continuous. We need a richer
class of controls. A suitable class is the class of measurable functions
which, for the purpose of this book, we may take to be a class rich
enough to include all controls, such as those that are merely piecewise
continuous, that we may encounter. If the control u(·) is measurable
and f (·) is continuous, then f u (·), defined above, is continuous in x
but measurable in t, so we are forced to study such functions. Suppose,
as above, D is an open set in Rn × R. The function f : D → Rn is said
to satisfy the Caratheodory conditions in D if:

(a) f is measurable in t for each fixed x,

(b) f is continuous in x for each fixed t,

(c) for each compact set F in D there exists a measurable function


t , mF (t) such that

|f (x, t)| ≤ mF (t)

for all (x, t) ∈ F . We now make use of the fact that if t ↦ h(t) is mea-
surable, its integral t ↦ H(t) := ∫_{t0}^{t} h(s) ds is absolutely continuous and,
therefore, has a derivative almost everywhere. Where H(·) is differen-
tiable, its derivative is equal to h(·). Consequently, if f (·) satisfies the
Caratheodory conditions, then the solution of (A.14), i.e., a function
φ(·) satisfying (A.14) everywhere does not satisfy (A.12) everywhere
but only almost everywhere, at the points where φ(·) is differentiable.
In view of this, we may speak either of a solution of (A.14) or of a
solution of (A.12) provided we interpret the latter as an absolutely con-
tinuous function which satisfies (A.12) almost everywhere. The ap-
propriate generalization of Peano’s existence theorem is the following
result due to Caratheodory:

Theorem A.31 (Existence of solution to differential equations). If D is


an open set in Rn × R and f (·) satisfies the Caratheodory conditions on
D, then, for any (x0 , t0 ) in D, there exists a solution of (A.14) or (A.12)
passing through (x0 , t0 ).

Two other classical theorems on ordinary differential equations that


are relevant are:

Theorem A.32 (Maximal interval of existence). If D is an open set in


Rn × R, f (·) satisfies the Caratheodory conditions on D, and φ(·) is a
solution of (A.12) on some interval, then there is a continuation φ′ (·) of
φ(·) to a maximal interval (ta , tb ) of existence. The solution φ′ (·), the
continuation of φ(·), tends to the boundary of D as t ↘ ta and t ↗ tb .

Theorem A.33 (Continuity of solution to differential equation). Sup-


pose D is an open set in Rn × R, f satisfies the Caratheodory condition
and, for each compact set U in D, there exists an integrable function
t ↦ kU (t) such that

|f (x, t) − f (y, t)| ≤ kU (t) |x − y|

for all (x, t), (y, t) in U . Then, for any (x0 , t0 ) in U there exists a unique
solution φ(·; x0 , t0 ) passing through (x0 , t0 ). The function (t, x0 , t0 ) ↦
φ(t; x0 , t0 ) : R × Rn × R → Rn is continuous in its domain E which is
open.

Note that D is often Rn × R, in which case Theorem A.32 states that


a solution x(·) of (A.14) escapes, i.e., |x(t)| → ∞ as t ↘ ta or t ↗ tb
if ta and tb are finite; ta and tb are the escape times. An example of
a differential equation with finite escape time is ẋ = x 2 which has, if
x0 > 0, t0 = 0, a solution x(t) = x0 [1 − (t − t0 )x0 ]−1 and the maximal
interval of existence is (ta , tb ) = (−∞, t0 + 1/x0 ).
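The finite escape time is easy to visualize numerically. The following minimal sketch (not part of the text) is written in Python with an arbitrarily chosen Euler step size; it integrates ẋ = x^2 from x0 = 1, t0 = 0 and compares the numerical solution with the closed form above as t approaches the escape time tb = t0 + 1/x0 = 1.

    x0, t0 = 1.0, 0.0
    tb = t0 + 1.0 / x0                  # predicted escape time
    h = 1e-4                            # Euler step size (arbitrary choice)
    t, x = t0, x0
    while t < tb - 0.01:                # stop just short of the escape time
        x += h * x**2                   # explicit Euler step for xdot = x^2
        t += h
    exact = x0 / (1.0 - (t - t0) * x0)  # closed-form solution
    print(t, x, exact)                  # both values grow rapidly as t -> tb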
These results, apart from absence of a control u which is trivially
corrected, do not go far enough. We require solutions on an interval [t0 ,
tf ] given a priori. Further assumptions are needed for this. A useful
tool in developing the required results is the Bellman-Gronwall lemma:

Theorem A.34 (Bellman-Gronwall). Suppose that c ∈ (0, ∞) and that


α : [0, 1] → R+ is a bounded, integrable function, and that the integrable
function y : [0, 1] → R satisfies the inequality
y(t) ≤ c + ∫_0^t α(s)y(s)ds     (A.15)

for all t ∈ [0, 1]. Then

y(t) ≤ c e^{∫_0^t α(s)ds}     (A.16)

for all t ∈ [0, 1].

Note that, if the inequality in (A.15) were replaced by an equality,


(A.15) could be integrated to yield (A.16).

Proof. Let the function Y : [0, 1] → R be defined by


Y (t) = ∫_0^t α(s)y(s)ds     (A.17)

so that Ẏ (t) = α(t)y(t) almost everywhere on [0, 1]. It follows from


(A.15) and (A.17) that:

y(t) ≤ c + Y (t) ∀t ∈ [0, 1]

Hence
(d/dt)[ e^{−∫_0^t α(s)ds} Y (t) ] = e^{−∫_0^t α(s)ds} (Ẏ (t) − α(t)Y (t))
                                 = e^{−∫_0^t α(s)ds} α(t)(y(t) − Y (t))
                                 ≤ c e^{−∫_0^t α(s)ds} α(t)     (A.18)

almost everywhere on [0, 1]. Integrating both sides of (A.18) from 0 to


t yields

e^{−∫_0^t α(s)ds} Y (t) ≤ c[ 1 − e^{−∫_0^t α(s)ds} ]

for all t ∈ [0, 1]. Hence

Y (t) ≤ c[ e^{∫_0^t α(s)ds} − 1 ]

and

y(t) ≤ c e^{∫_0^t α(s)ds}
for all t ∈ [0, 1]. ■

The interval [0, 1] may, of course, be replaced by [t0 , tf ] for arbi-


trary t0 , tf ∈ (−∞, ∞). Consider now the forced system described by

ẋ(t) = f (x(t), u(t), t)   a.e.     (A.19)

with initial condition

x(0) = x0

The period of interest is now T := [0, 1] and “a.e.” denotes “almost


everywhere on T .” Admissible controls u(·) are measurable and satisfy
the control constraint
u(t) ∈ Ω a.e.
where Ω ⊂ Rm is compact. For convenience, we denote the set of
admissible controls by

U := {u : T → Rm | u(·) is measurable, u(t) ∈ Ω a.e.}

Clearly U is a subset of L∞ . For simplicity we assume, in the sequel,


that f is continuous; this is not restrictive. For each u in U, x in Rn ,
the function t ↦ f u (x, t) := f (x, u(t), t) is measurable so that f u satisfies
the Caratheodory conditions and our previous results, Theorems
A.31–A.33, apply. Our concern now is to show that, with additional
assumptions, for each u in U, a solution to (A.19) exists on T ,
rather than on some maximal interval that may be a subset of T , and
that this solution is unique and bounded.
Theorem A.35 (Existence of solutions to forced systems). Suppose:
(a) f is continuous and

(b) there exists a positive constant c such that

|f (x′ , u, t) − f (x, u, t)| ≤ c |x′ − x|

for all (x, u, t) ∈ Rn × Ω × T . Then, for each u in U, there exists a


unique, absolutely continuous solution x u : T → Rn of (A.19) on the
interval T passing through (x0 , 0). Moreover, there exists a constant K
such that
|x u (t)| ≤ K
for all t ∈ T , all u ∈ U.
Proof. A direct consequence of (b) is the existence of a constant which,
without loss of generality, we take to be c, satisfying
(c) |f (x, u, t)| ≤ c(1 + |x|) for all (x, u, t) ∈ Rn × Ω × T .
Assumptions (a) and (b) and their corollary (c), a growth condition on
f (·), ensure that f u (·) satisfies the Caratheodory conditions stated
earlier. Hence, our previous results apply, and there exists a maximal
interval [0, tb ) on which a unique solution x u (·) exists; moreover, if tb
is finite, |x u (t)| → ∞ as t ↗ tb . Since x u (·) satisfies
x u (t) = x0 + ∫_0^t f (x u (s), u(s), s)ds

it follows from the growth condition that


|x u (t)| ≤ |x0 | + ∫_0^t |f (x u (s), u(s), s)| ds
          ≤ |x0 | + c ∫_0^t (1 + |x u (s)|)ds
          ≤ (|x0 | + c) + c ∫_0^t |x u (s)| ds

Applying the Bellman-Gronwall lemma yields

|x u (t)| ≤ (c + |x0 |)e^{ct}

for all t ∈ [0, tb ), u ∈ U. It follows that the escape time tb cannot be
finite, so that, for all u in U, there exists a unique absolutely continuous
solution x u (·) on T passing through (x0 , 0). Moreover, for all u in
U, all t ∈ T
|x u (t)| ≤ K
where K := (c + |x0 |)e^{c} . ■

A.15 Random Variables and the Probability Density


Let ξ be a random variable taking values in the field of real numbers,
and let the function Fξ (x) denote the probability distribution function
of the random variable so that

Fξ (x) = Pr(ξ ≤ x)

i.e., Fξ (x) is the probability that the random variable ξ takes on a value
less than or equal to x. Fξ is obviously a nonnegative, nondecreasing
function and has the following properties due to the axioms of proba-
bility
Fξ (x1 ) ≤ Fξ (x2 ) if x1 < x2

lim Fξ (x) = 0
x→−∞

lim Fξ (x) = 1
x→∞

We next define the probability density function, denoted pξ (x),


such that

Fξ (x) = ∫_{−∞}^{x} pξ (s)ds,   −∞ < x < ∞     (A.20)

We can allow discontinuous Fξ if we are willing to accept generalized


functions (delta functions and the like) for pξ . Also, we can define the
density function for discrete as well as continuous random variables if
we allow delta functions. Alternatively, we can replace the integral in
(A.20) with a sum over a discrete density function. The random variable
may be a coin toss or a dice game, which takes on values from a discrete
set, in contrast to a temperature or concentration measurement, which
takes on values from a continuous set. The density function has the
following properties
pξ (x) ≥ 0

∫_{−∞}^{∞} pξ (x)dx = 1

and the interpretation in terms of probability

Pr(x1 ≤ ξ ≤ x2 ) = ∫_{x1}^{x2} pξ (x)dx

The mean or expectation of a random variable ξ is defined as

E(ξ) = ∫_{−∞}^{∞} x pξ (x)dx

The moments of a random variable are defined by

E(ξ^n ) = ∫_{−∞}^{∞} x^n pξ (x)dx

and it is clear that the mean is the first moment. Moments of ξ about
the mean are defined by

E((ξ − E(ξ))^n ) = ∫_{−∞}^{∞} (x − E(ξ))^n pξ (x)dx

and the variance is defined as the second moment about the mean

var(ξ) = E((ξ − E(ξ))^2 ) = E(ξ^2 ) − E^2 (ξ)

The standard deviation is the square root of the variance

σ (ξ) = (var(ξ))1/2

Normal distribution. The normal or Gaussian distribution is ubiqui-


tous in applications. It is characterized by its mean, m, and variance,
σ^2 , and is given by

pξ (x) = (1/√(2π σ^2)) exp( −(1/2)(x − m)^2 /σ^2 )     (A.21)

We proceed to check that the mean of this distribution is indeed m and


the variance is σ 2 as claimed and that the density is normalized so that
its integral is one. We require the definite integral formulas
∫_{−∞}^{∞} e^{−x^2} dx = √π

∫_{0}^{∞} x^{1/2} e^{−x} dx = Γ(3/2) = √π /2

The first formula may also be familiar from the error function in transport
phenomena

erf(x) = (2/√π) ∫_0^x e^{−u^2} du

erf(∞) = 1

We calculate the integral of the normal density as follows


∫_{−∞}^{∞} pξ (x)dx = (1/√(2π σ^2)) ∫_{−∞}^{∞} exp( −(1/2)(x − m)^2 /σ^2 ) dx

Define the change of variable

u = (1/√2) (x − m)/σ

which gives

∫_{−∞}^{∞} pξ (x)dx = (1/√π) ∫_{−∞}^{∞} exp(−u^2 ) du = 1

and (A.21) does have unit area. Computing the mean gives
E(ξ) = (1/√(2π σ^2)) ∫_{−∞}^{∞} x exp( −(1/2)(x − m)^2 /σ^2 ) dx

using the same change of variables as before yields

E(ξ) = (1/√π) ∫_{−∞}^{∞} (√2 σ u + m) e^{−u^2} du

The first term in the integral is zero because u e^{−u^2} is an odd function, and
the second term produces

E(ξ) = m

as claimed. Finally the definition of the variance of ξ gives


var(ξ) = (1/√(2π σ^2)) ∫_{−∞}^{∞} (x − m)^2 exp( −(1/2)(x − m)^2 /σ^2 ) dx

Changing the variable of integration as before gives

var(ξ) = (2σ^2 /√π) ∫_{−∞}^{∞} u^2 e^{−u^2} du

and because the integrand is an even function,

var(ξ) = (4σ^2 /√π) ∫_{0}^{∞} u^2 e^{−u^2} du

Now changing the variable of integration again using s = u^2 gives

var(ξ) = (2σ^2 /√π) ∫_{0}^{∞} s^{1/2} e^{−s} ds
The second integral formula then gives

var(ξ) = σ 2

Shorthand notation for the random variable ξ having a normal distri-


bution with mean m and variance σ 2 is

ξ ∼ N(m, σ 2 )
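These three calculations are easy to confirm numerically. The following is a minimal sketch (not from the text) in Python with SciPy quadrature; the values of m and σ are chosen arbitrarily.

    import numpy as np
    from scipy import integrate

    m, sigma = 1.0, 2.0                # arbitrary mean and standard deviation
    p = lambda x: np.exp(-0.5*((x - m)/sigma)**2) / np.sqrt(2*np.pi*sigma**2)

    area, _ = integrate.quad(p, -np.inf, np.inf)
    mean, _ = integrate.quad(lambda x: x*p(x), -np.inf, np.inf)
    var, _  = integrate.quad(lambda x: (x - mean)**2*p(x), -np.inf, np.inf)
    print(area, mean, var)             # approximately 1, m, and sigma**2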

Figure A.8 shows the normal distribution with a mean of one and
standard deviations of 1/2, 1, and 2. Notice that a large variance implies that
the random variable is likely to take on values far from the mean. As the variance
shrinks to zero, the probability density becomes a delta function and
the random variable approaches a deterministic value.
Central limit theorem.
The central limit theorem states that if a set of n random
variables xi , i = 1, 2, . . . , n are independent, then under gen-
eral conditions the density py of their sum

y = x1 + x2 + · · · + xn

properly normalized, tends to a normal density as n → ∞.


(Papoulis, 1984, p. 194).
Figure A.8: Normal distribution, pξ (x) = (1/√(2π σ^2)) exp( −(1/2)(x − m)^2 /σ^2 ).
Mean is one and standard deviations are 1/2, 1, and 2.

Notice that we require only mild restrictions on how the xi themselves


are distributed for the sum y to tend to a normal. See Papoulis (1984,
p. 198) for one set of sufficient conditions and a proof of this theorem.
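The theorem is easy to observe by simulation. The sketch below (not from the text; Python with NumPy, with the number of terms and samples chosen arbitrarily) sums n independent uniform random variables and standardizes the sum; the result behaves like an N(0, 1) random variable.

    import numpy as np

    rng = np.random.default_rng(0)
    n, samples = 12, 100_000                 # arbitrary choices
    x = rng.uniform(0.0, 1.0, size=(samples, n))
    y = x.sum(axis=1)                        # sum of n uniform random variables
    z = (y - n*0.5) / np.sqrt(n/12.0)        # standardize: mean n/2, variance n/12
    print(z.mean(), z.var())                 # approximately 0 and 1
    # a histogram of z closely follows the N(0, 1) density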
Fourier transform of the density function. It is often convenient to
handle the algebra of density functions, particularly normal densities,
by using the Fourier transform of the density function rather than the
density itself. The transform, which we denote as ϕξ (u), is often called
the characteristic function or generating function in the statistics liter-
ature. From the definition of the Fourier transform
ϕξ (u) = ∫_{−∞}^{∞} e^{iux} pξ (x)dx

The transform has a one-to-one correspondence with the density func-


tion, which can be seen from the inverse transform formula
pξ (x) = (1/2π) ∫_{−∞}^{∞} e^{−iux} ϕξ (u)du

Example A.36: Fourier transform of the normal density

Show that the Fourier transform of the normal density is

ϕξ (u) = exp( ium − (1/2)u^2 σ^2 )     □
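One way to check this result (not part of the text; a Python/SciPy sketch with arbitrarily chosen m, σ, and u) is to evaluate the defining integral by quadrature and compare with exp( ium − u^2 σ^2 /2 ).

    import numpy as np
    from scipy import integrate

    m, sigma, u = 1.0, 0.7, 2.3        # arbitrary parameters
    p = lambda x: np.exp(-0.5*((x - m)/sigma)**2) / np.sqrt(2*np.pi*sigma**2)

    re, _ = integrate.quad(lambda x: np.cos(u*x)*p(x), -np.inf, np.inf)
    im, _ = integrate.quad(lambda x: np.sin(u*x)*p(x), -np.inf, np.inf)
    print(re + 1j*im)                             # quadrature value of phi(u)
    print(np.exp(1j*u*m - 0.5*u**2*sigma**2))     # closed-form value; they agree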

A.16 Multivariate Density Functions


In applications we normally do not have a single random variable but
a collection of random variables. We group these variables together
in a vector and let random variable ξ now take on values in Rn . The
probability density function is still a nonnegative scalar function

pξ (x) : Rn → R+

which is sometimes called the joint density function. As in the scalar


case, the probability that the n-dimensional random variable ξ takes
on values between a and b is given by
Pr(a ≤ ξ ≤ b) = ∫_{an}^{bn} · · · ∫_{a1}^{b1} pξ (x) dx1 · · · dxn

Marginal density functions. We are often interested in only some


subset of the random variables in a problem. Consider two vectors
of random variables, ξ ∈ Rn and η ∈ Rm . We can consider the joint
distribution of both of these random variables pξ,η (x, y) or we may
only be interested in the ξ variables, in which case we can integrate out
the m η variables to obtain the marginal density of ξ
pξ (x) = ∫_{−∞}^{∞} · · · ∫_{−∞}^{∞} pξ,η (x, y) dy1 · · · dym

Analogously, to produce the marginal density of η we use

pη (y) = ∫_{−∞}^{∞} · · · ∫_{−∞}^{∞} pξ,η (x, y) dx1 · · · dxn

Multivariate normal density. We define the multivariate normal den-


sity of the random variable ξ ∈ Rn as

pξ (x) = (1/((2π)^{n/2} (det P )^{1/2})) exp( −(1/2)(x − m)′ P^{−1} (x − m) )     (A.22)
in which m ∈ Rn is the mean and P ∈ Rn×n is the covariance matrix.
The notation det P denotes determinant of P . As noted before, P is a
real, symmetric matrix.

Figure A.9: Multivariate normal in two dimensions, p(x) = exp( −(1/2)(3.5x1^2 + 2(2.5)x1 x2 + 4.0x2^2 ) ).

The multivariate normal density is well-defined


only for P > 0. The singular, or degenerate, case P ≥ 0 is discussed
subsequently. Shorthand notation for the random variable ξ having a
normal distribution with mean m and covariance P is

ξ ∼ N(m, P )

The matrix P is a real, symmetric matrix. Figure A.9 displays a mul-


tivariate normal for

P^{−1} = [3.5 2.5; 2.5 4.0]        m = [0; 0]

As displayed in Figure A.9, lines of constant probability in the multi-


variate normal are lines of constant

(x − m)′ P −1 (x − m)

To understand the geometry of lines of constant probability (ellipses in


two dimensions, ellipsoids or hyperellipsoids in three or more dimen-
sions) we examine the eigenvalues and eigenvectors of the P −1 matrix.
Figure A.10: The geometry of quadratic form x′Ax = b. The ellipse axes are aligned
with the eigenvectors vi of A (Avi = λi vi ), scaled by √(b/λi ), and the tangent
bounding box has sides determined by √(b Ã11 ) and √(b Ã22 ).

Consider the quadratic function x ′ Ax depicted in Figure A.10. Each


eigenvector of A points along one of the axes of the ellipse x ′ Ax = b.
The eigenvalues show us how stretched the ellipse is in each eigenvec-
tor direction. If we want to put simple bounds on the ellipse, then we
draw a box around it as shown in Figure A.10. Notice the box contains
much more area than the corresponding ellipse and we have lost the
correlation between the elements of x. This loss of information means
we can put different tangent ellipses of quite different shapes inside
the same box. The size of the bounding box is given by
length of ith side = √(b Ãii )

in which

Ãii = (i, i) element of A^{−1}

See Exercise A.45 for a derivation of the size of the bounding box. Fig-
ure A.10 displays these results: the eigenvectors are aligned with the
ellipse axes and the eigenvalues scale the lengths. The lengths of the
sides of the box that are tangent to the ellipse are proportional to the
square root of the diagonal elements of A−1 .
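These geometric facts are easy to reproduce numerically. The sketch below (not from the text; Python with NumPy) uses the A and b of Figure A.10, which are also listed in Exercise A.46, and computes the eigenpairs and the bounding-box lengths √(b Ãii ).

    import numpy as np

    A = np.array([[3.5, 2.5], [2.5, 4.0]])     # A and b from Figure A.10 / Exercise A.46
    b = 1.0

    lam, V = np.linalg.eigh(A)                 # eigenpairs A v_i = lambda_i v_i
    axis_scale = np.sqrt(b / lam)              # sqrt(b/lambda_i): scaling along each eigenvector
    Atilde = np.linalg.inv(A)
    box_extent = np.sqrt(b * np.diag(Atilde))  # sqrt(b*Atilde_ii): the bounding-box formula
    print(lam)            # eigenvalues of A
    print(V)              # eigenvectors (columns), aligned with the ellipse axes
    print(axis_scale)
    print(box_extent)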
Singular or degenerate normal distributions. It is often convenient
to extend the definition of the normal distribution to admit positive
semidefinite covariance matrices. The distribution with a semidefinite
covariance is known as a singular or degenerate normal distribution (An-

derson, 2003, p. 30). Figure A.11 shows a nearly singular normal dis-
tribution.
To see how the singular normal arises, let the scalar random variable
ξ be distributed normally with zero mean and positive definite covari-
ance, ξ ∼ N(0, Px ), and consider the simple linear transformation

η = Aξ        A = [1; 1]

in which we have created two identical copies of ξ for the two compo-
nents η1 and η2 of η. Now consider the density of η. If we try to use
the standard formulas for transformation of a normal, we would have

η ∼ N(0, Py )        Py = APx A′ = [Px Px; Px Px]

and Py is singular since its rows are linearly dependent. Therefore one
of the eigenvalues of Py is zero and Py is positive semidefinite and not
positive definite. Obviously we cannot use (A.22) for the density in this
case because the inverse of Py does not exist. To handle these cases, we
first provide an interpretation that remains valid when the covariance
matrix is singular and semidefinite.

Definition A.37 (Density of a singular normal). A singular joint normal


density of random variables (ξ1 , ξ2 ), ξ1 ∈ Rn1 , ξ2 ∈ Rn2 , is denoted

[ξ1 ; ξ2 ] ∼ N( [m1 ; m2 ], [Λ1 0; 0 0] )

with Λ1 > 0. The density is defined by

pξ (x1 , x2 ) = (1/((2π)^{n1/2} (det Λ1 )^{1/2})) exp( −(1/2) |x1 − m1 |^2_{Λ1^{−1}} ) δ(x2 − m2 )     (A.23)

In this limit, the “random” variable ξ2 becomes deterministic and


equal to its mean m2 . For the case n1 = 0, we have the completely
degenerate case in which pξ2 (x2 ) = δ(x2 − m2 ), which describes the
completely deterministic case ξ2 = m2 and there is no random com-
ponent ξ1 . This expanded definition enables us to generalize the im-
portant result that the linear transformation of a normal is normal,
so that it holds for any linear transformation, including rank deficient
transformations such as the A matrix given above in which the rows

are not independent (see Exercise 1.40). Starting with the definition of
a singular normal, we can obtain the density for ξ ∼ N(mx , Px ) for any
positive semidefinite Px ≥ 0. The result is

pξ (x) = (1/((2π)^{r/2} (det Λ1 )^{1/2})) exp( −(1/2) |x − mx |^2_{Q1 Λ1^{−1} Q1′} ) δ(Q2′ (x − mx ))     (A.24)

in which the matrices Λ1 and orthonormal Q ∈ R^{n×n} are obtained
from the eigenvalue decomposition of Px

Px = QΛQ′ = [Q1 Q2 ] [Λ1 0; 0 0] [Q1′ ; Q2′ ]

and Λ1 > 0 ∈ Rr ×r , Q1 ∈ Rn×r , Q2 ∈ Rn×(n−r ) . This density is nonzero


for x satisfying Q2′ (x − mx ) = 0. If we let N(Q2′ ) denote the r di-
mensional nullspace of Q2′ , we have that the density is nonzero for
x ∈ N(Q2′ ) ⊕ {mx } in which ⊕ denotes set addition.

Example A.38: Marginal normal density


Given that ξ and η are jointly normally distributed with mean

m = [mx ; my ]

and covariance matrix

P = [Px Pxy ; Pyx Py ]     

show that the marginal density of ξ is normal with the following pa-
rameters
ξ ∼ N(mx , Px ) (A.25)

Solution
As a first approach to establish (A.25), we directly integrate the y vari-
ables. Let x̄ = x − mx and ȳ = y − my , and nx and ny be the dimen-
sion of the ξ and η variables, respectively, and n = nx + ny . Then the
definition of the marginal density gives

pξ (x) = (1/((2π)^{n/2} (det P )^{1/2})) ∫_{−∞}^{∞} exp( −(1/2) [x̄; ȳ]′ [Px Pxy ; Pyx Py ]^{−1} [x̄; ȳ] ) dȳ
Let the inverse of P be denoted as P̃ and partition P̃ as follows

[Px Pxy ; Pyx Py ]^{−1} = [P̃x P̃xy ; P̃yx P̃y ]     (A.26)

Substituting (A.26) into the definition of the marginal density and ex-
panding the quadratic form in the exponential yields

(2π)^{n/2} (det P )^{1/2} pξ (x) = exp( −(1/2) x̄′ P̃x x̄ ) ∫_{−∞}^{∞} exp( −(1/2)(2ȳ′ P̃yx x̄ + ȳ′ P̃y ȳ) ) dȳ

We complete the square on the term in the integral by noting that


(ȳ + P̃y^{−1} P̃yx x̄)′ P̃y (ȳ + P̃y^{−1} P̃yx x̄) = ȳ′ P̃y ȳ + 2ȳ′ P̃yx x̄ + x̄′ P̃yx′ P̃y^{−1} P̃yx x̄

Substituting this relation into the previous equation gives

(2π)^{n/2} (det P )^{1/2} pξ (x) = exp( −(1/2) x̄′ (P̃x − P̃yx′ P̃y^{−1} P̃yx ) x̄ ) ∫_{−∞}^{∞} exp( −(1/2)(ȳ + a)′ P̃y (ȳ + a) ) dȳ

in which a = P̃y^{−1} P̃yx x̄. Using (A.22) to evaluate the integral gives

pξ (x) = (1/((2π)^{nx/2} (det P det P̃y )^{1/2})) exp( −(1/2) x̄′ (P̃x − P̃yx′ P̃y^{−1} P̃yx ) x̄ )

From the matrix inversion formula we conclude


P̃x − P̃xy P̃y^{−1} P̃yx = Px^{−1}

and

det(P ) = det(Px ) det(Py − Pyx Px^{−1} Pxy ) = det Px det P̃y^{−1} = det Px / det P̃y

Substituting these results into the previous equation gives


pξ (x) = (1/((2π)^{nx/2} (det Px )^{1/2})) exp( −(1/2) x̄′ Px^{−1} x̄ )
Therefore
ξ ∼ N(mx , Px )
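A quick Monte Carlo check of this result (not from the text; a Python/NumPy sketch with an arbitrarily chosen mean and covariance): sample the joint normal and confirm that the ξ components alone have sample mean approximately mx and sample covariance approximately Px.

    import numpy as np

    rng = np.random.default_rng(1)
    mx, my = np.array([1.0, -1.0]), np.array([0.5])   # arbitrary values
    Px  = np.array([[2.0, 0.3], [0.3, 1.0]])
    Pxy = np.array([[0.4], [0.2]])
    Py  = np.array([[1.5]])
    m = np.concatenate((mx, my))
    P = np.block([[Px, Pxy], [Pxy.T, Py]])

    z = rng.multivariate_normal(m, P, size=200_000)
    x = z[:, :2]                              # keep only the xi components
    print(x.mean(axis=0))                     # approximately mx
    print(np.cov(x, rowvar=False))            # approximately Px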

Figure A.11: A nearly singular normal density in two dimensions,
p(x) = exp( −(1/2)(27.2x1^2 + 2(−43.7)x1 x2 + 73.8x2^2 ) ).

Functions of random variables. In stochastic dynamical systems we


need to know how the density of a random variable is related to the
density of a function of that random variable. Let f : Rn → Rn be
a mapping of the random variable ξ into the random variable η and
assume that the inverse mapping also exists

η = f (ξ), ξ = f −1 (η)

Given the density of ξ, pξ (x), we wish to compute the density of η,


pη (y), induced by the function f . Let S denote an arbitrary region of
the field of the random variable ξ and define the set S ′ as the transform
of this set under the function f

S ′ = {y | y = f (x), x ∈ S}

Then we seek a function pη (y) such that


∫_S pξ (x)dx = ∫_{S′} pη (y)dy     (A.27)

for every admissible set S. Using the rules of calculus for transforming
a variable of integration we can write
∫_S pξ (x)dx = ∫_{S′} pξ (f^{−1}(y)) |det( ∂f^{−1}(y)/∂y )| dy     (A.28)

in which |det( ∂f^{−1}(y)/∂y )| is the absolute value of the determinant
of the Jacobian matrix of the transformation from η to ξ. Subtracting
(A.28) from (A.27) gives

∫_{S′} [ pη (y) − pξ (f^{−1}(y)) |det( ∂f^{−1}(y)/∂y )| ] dy = 0     (A.29)

Because (A.29) must be true for any set S ′ , we conclude (a proof by


contradiction is immediate)8

pη (y) = pξ (f^{−1}(y)) |det( ∂f^{−1}(y)/∂y )|     (A.30)

Example A.39: Nonlinear transformation


Show that

pη (y) = (1/(3 √(2π) σ y^{2/3})) exp( −(1/2)((y^{1/3} − m)/σ)^2 )

is the density function of the random variable η under the transforma-


tion
η = ξ^3

for ξ ∼ N(m, σ 2 ). Notice that the density pη is singular at y = 0. □
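The formula can be checked by simulation (not from the text; a Python/NumPy sketch with arbitrary m and σ): sample ξ ∼ N(m, σ^2 ), form η = ξ^3 , and compare a normalized histogram of η with the density above at a few test points.

    import numpy as np

    rng = np.random.default_rng(2)
    m, sigma = 1.0, 0.5                     # arbitrary parameters
    eta = rng.normal(m, sigma, size=500_000)**3

    def p_eta(y):
        r = np.cbrt(y)                      # real cube root, so y^{1/3}; y^{2/3} = r**2
        return np.exp(-0.5*((r - m)/sigma)**2) / (3*np.sqrt(2*np.pi)*sigma*r**2)

    hist, edges = np.histogram(eta, bins=400, density=True)
    centers = 0.5*(edges[:-1] + edges[1:])
    ytest = np.array([-0.5, 0.5, 1.0, 2.0, 5.0])
    print(np.interp(ytest, centers, hist))  # empirical density at the test points
    print(p_eta(ytest))                     # analytic density; the two are close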

Noninvertible transformations. Given n random variables ξ = (ξ1 ,


ξ2 , . . . , ξn ) with joint density pξ and k random variables η = (η1 , η2 ,
. . . , ηk ) defined by the transformation η = f (ξ)

η1 = f1 (ξ) η2 = f2 (ξ) ··· ηk = fk (ξ)

We wish to find pη in terms of pξ . Consider the region generated in Rn


by the vector inequality
f (x) ≤ c
8 Some care should be exercised if one has generalized functions in mind for the
conditional density.
Figure A.12: The region X(c) for y = max(x1 , x2 ) ≤ c.

Call this region X(c), which is by definition


X(c) = {x | f (x) ≤ c}
Note X is not necessarily simply connected. The probability distribu-
tion (not density) for η then satisfies
Pη (y) = ∫_{X(y)} pξ (x)dx     (A.31)

If the density pη is of interest, it can be obtained by differentiating Pη .

Example A.40: Maximum of two random variables


Given two independent random variables, ξ1 , ξ2 and the new random
variable defined by the noninvertible, nonlinear transformation
η = max(ξ1 , ξ2 )
Show that η’s density is given by
pη (y) = pξ1 (y) ∫_{−∞}^{y} pξ2 (x)dx + pξ2 (y) ∫_{−∞}^{y} pξ1 (x)dx

Solution
The region X(c) generated by the inequality y = max(x1 , x2 ) ≤ c is
sketched in Figure A.12. Applying (A.31) then gives
Pη (y) = ∫_{−∞}^{y} ∫_{−∞}^{y} pξ (x1 , x2 )dx1 dx2 = Pξ (y, y) = Pξ1 (y)Pξ2 (y)

which has a clear physical interpretation. It says the probability that


the maximum of two independent random variables is less than some
value is equal to the probability that both random variables are less
than that value. To obtain the density, we differentiate

pη (y) = pξ1 (y)Pξ2 (y) + Pξ1 (y)pξ2 (y)


       = pξ1 (y) ∫_{−∞}^{y} pξ2 (x)dx + pξ2 (y) ∫_{−∞}^{y} pξ1 (x)dx
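The result is easy to confirm by simulation (not from the text; a Python sketch using NumPy and SciPy with two arbitrarily chosen normal densities): the empirical distribution of η = max(ξ1 , ξ2 ) matches Pξ1 (y)Pξ2 (y).

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    xi1 = rng.normal(0.0, 1.0, size=300_000)   # arbitrary distributions
    xi2 = rng.normal(1.0, 2.0, size=300_000)
    eta = np.maximum(xi1, xi2)

    y = np.linspace(-4.0, 8.0, 25)
    P_emp = np.array([(eta <= yi).mean() for yi in y])        # empirical distribution
    P_formula = stats.norm.cdf(y, 0.0, 1.0) * stats.norm.cdf(y, 1.0, 2.0)
    print(np.max(np.abs(P_emp - P_formula)))                  # small sampling error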

A.16.1 Statistical Independence and Correlation

We say two random variables ξ, η are statistically independent or sim-


ply independent if

pξ,η (x, y) = pξ (x)pη (y), all x, y

The covariance of two random variables ξ, η is defined as

cov(ξ, η) = E ((ξ − E(ξ)) (η − E(η)))

The covariance of the vector-valued random variable ξ with compo-


nents ξi , i = 1, . . . n can be written as

Pij = cov(ξi , ξj )

P = [ var(ξ1 )        cov(ξ1 , ξ2 )    · · ·   cov(ξ1 , ξn )
      cov(ξ2 , ξ1 )   var(ξ2 )         · · ·   cov(ξ2 , ξn )
      · · ·
      cov(ξn , ξ1 )   cov(ξn , ξ2 )    · · ·   var(ξn ) ]

We say two random variables, ξ and η, are uncorrelated if

cov(ξ, η) = 0

Example A.41: Independent implies uncorrelated


Prove that if ξ and η are statistically independent, then they are uncor-
related.

Solution
The definition of covariance gives

cov(ξ, η) = E((ξ − E(ξ))(η − E(η)))


= E(ξη − ξE(η) − ηE(ξ) + E(ξ)E(η))
= E(ξη) − E(ξ)E(η)

Taking the expectation of the product ξη and using the fact that ξ and
η are independent gives

E(ξη) = ∫∫_{−∞}^{∞} x y pξ,η (x, y) dx dy
      = ∫∫_{−∞}^{∞} x y pξ (x) pη (y) dx dy
      = ∫_{−∞}^{∞} x pξ (x) dx  ∫_{−∞}^{∞} y pη (y) dy
      = E(ξ)E(η)

Substituting this fact into the covariance equation gives

cov(ξ, η) = 0

Example A.42: Does uncorrelated imply independent?


Let ξ and η be jointly distributed random variables with probability
density function
pξ,η (x, y) = (1/4)[1 + xy(x^2 − y^2 )]  for |x| < 1, |y| < 1,  and pξ,η (x, y) = 0 otherwise

(a) Compute the marginals pξ (x) and pη (y). Are ξ and η indepen-
dent?

(b) Compute cov(ξ, η). Are ξ and η uncorrelated?

(c) What is the relationship between independent and uncorrelated?


Are your results on this example consistent with this relationship?
Why or why not?
Figure A.13: A joint density function for the two uncorrelated random variables in
Example A.42, pξ,η (x, y) = (1/4)[1 + xy(x^2 − y^2 )].

Solution
The joint density is shown in Figure A.13.
(a) Direct integration of the joint density produces

pξ (x) = (1/2), |x| < 1        E(ξ) = 0

pη (y) = (1/2), |y| < 1        E(η) = 0

and we see that both marginals are zero mean, uniform densities.
Obviously ξ and η are not independent because the joint density
is not the product of the marginals.

(b) Performing the double integral for the expectation of the product
term gives
E(ξη) = (1/4) ∫_{−1}^{1} ∫_{−1}^{1} [ xy + (xy)^2 (x^2 − y^2 ) ] dx dy = 0

and the covariance of ξ and η is therefore

cov(ξ, η) = E(ξη) − E(ξ)E(η)


=0

and ξ and η are uncorrelated.

(c) We know independent implies uncorrelated. This example does


not contradict that relationship. This example shows uncorre-
lated does not imply independent, in general, but see the next
example for normals.
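Both computations in this example can be repeated numerically (not from the text; a Python/SciPy sketch): integrate x y pξ,η (x, y) over the square to confirm E(ξη) = 0, and evaluate the joint density and the product of the marginals at one point to confirm they differ.

    import numpy as np
    from scipy import integrate

    p = lambda y, x: 0.25*(1.0 + x*y*(x**2 - y**2))   # joint density on |x|<1, |y|<1

    Exy, _ = integrate.dblquad(lambda y, x: x*y*p(y, x), -1, 1, -1, 1)
    print(Exy)                               # approximately 0: uncorrelated

    x0, y0 = 0.5, 0.25
    print(p(y0, x0), 0.5*0.5)                # joint density != product of marginals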

Example A.43: Independent and uncorrelated are equivalent for nor-


mals
If two random variables are jointly normally distributed,

[ξ; η] ∼ N( [mx ; my ], [Px Pxy ; Pyx Py ] )

Prove ξ and η are statistically independent if and only if ξ and η are


uncorrelated, or, equivalently, P is block diagonal.

Solution
We have already shown that independent implies uncorrelated for any
density, so we now show that, for normals, uncorrelated implies inde-
pendent. Given cov(ξ, η) = 0, we have


Pxy = Pyx =0 det P = det Px det Py

so the density can be written

pξ,η (x, y) = (1/((2π)^{(nx+ny)/2} (det Px det Py )^{1/2})) exp( −(1/2) [x̄; ȳ]′ [Px 0; 0 Py ]^{−1} [x̄; ȳ] )     (A.32)

For any joint normal, we know the marginals are simply

ξ ∼ N(mx , Px ) η ∼ N(my , Py )

so we have
1  
′ −1
pξ (x) = exp −(1/2) x̄ P x x̄
(2π )nx /2 (det Px )1/2
1  
′ −1
pη (y) = exp −(1/2) ȳ Py ȳ
(2π )ny /2 (det Py )1/2

Forming the product and combining terms gives

pξ (x)pη (y) = (1/((2π)^{(nx+ny)/2} (det Px det Py )^{1/2})) exp( −(1/2) [x̄; ȳ]′ [Px^{−1} 0; 0 Py^{−1} ] [x̄; ȳ] )

Comparing this equation to (A.32), and using the inverse of a block-


diagonal matrix, we have shown that ξ and η are statistically indepen-
dent. □

A.17 Conditional Probability and Bayes’s Theorem


Let ξ and η be jointly distributed random variables with density pξ,η (x,
y). We seek the density function of ξ given a specific realization y of
η has been observed. We define the conditional density function as

pξ|η (x|y) = pξ,η (x, y) / pη (y)

Consider a roll of a single die in which η takes on values E or O to


denote whether the outcome is even or odd and ξ is the integer value
of the die. The twelve values of the joint density function are simply
computed

pξ,η (1, E) = 0 pξ,η (1, O) = 1/6


pξ,η (2, E) = 1/6 pξ,η (2, O) = 0
pξ,η (3, E) = 0 pξ,η (3, O) = 1/6
(A.33)
pξ,η (4, E) = 1/6 pξ,η (4, O) = 0
pξ,η (5, E) = 0 pξ,η (5, O) = 1/6
pξ,η (6, E) = 1/6 pξ,η (6, O) = 0

The marginal densities are then easily computed; we have for ξ


pξ (x) = Σ_{y ∈ {O, E}} pξ,η (x, y)

which gives by summing across rows of (A.33)

pξ (x) = 1/6, x = 1, 2, . . . 6

Similarly, we have for η

pη (y) = Σ_{x=1}^{6} pξ,η (x, y)

which gives by summing down the columns of (A.33)

pη (y) = 1/2, y = E, O

These are both in accordance with our intuition on the rolling of the die:
uniform probability for each value 1 to 6 and equal probability for an
even or an odd outcome. Now the conditional density is a different
concept. The conditional density pξ|η (x|y) tells us the density of x
given that η = y has been observed. So consider the value of this
function
pξ|η (1|O)

which tells us the probability that the die has a 1 given that we know
that it is odd. We expect that the additional information on the die
being odd causes us to revise our probability that it is 1 from 1/6 to
1/3. Applying the defining formula for conditional density indeed gives

pξ|η (1|O) = pξ,η (1, O)/pη (O) = (1/6)/(1/2) = 1/3

Consider the reverse question, the probability that we have an odd


given that we observe a 1. The definition of conditional density gives

pη|ξ (O|1) = pη,ξ (O, 1)/pξ (1) = (1/6)/(1/6) = 1

i.e., we are sure the die is odd if it is 1. Notice that the arguments to
the conditional density do not commute as they do in the joint density.
This fact leads to a famous result. Consider the definition of condi-
tional density, which can be expressed as

pξ,η (x, y) = pξ|η (x|y)pη (y)

or
pη,ξ (y, x) = pη|ξ (y|x)pξ (x)

Because pξ,η (x, y) = pη,ξ (y, x), we can equate the right-hand sides
and deduce
pξ|η (x|y) = pη|ξ (y|x) pξ (x) / pη (y)
which is known as Bayes’s theorem (Bayes, 1763). Notice that this re-
sult comes in handy whenever we wish to switch the variable that is
known in the conditional density, which we will see is a key step in
state estimation problems.
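The die example above can be reproduced with a few lines of enumeration (not from the text; a Python sketch) that builds the joint density of (A.33), forms the marginals, and applies the definition of conditional density and Bayes's theorem.

    from fractions import Fraction

    # joint density of (value, parity) for a fair die, as in (A.33)
    p = {(x, 'E' if x % 2 == 0 else 'O'): Fraction(1, 6) for x in range(1, 7)}

    def p_xi(x):  return sum(v for (xx, yy), v in p.items() if xx == x)
    def p_eta(y): return sum(v for (xx, yy), v in p.items() if yy == y)

    p_1_given_O = p.get((1, 'O'), 0) / p_eta('O')       # conditional density: 1/3
    p_O_given_1 = p_1_given_O * p_eta('O') / p_xi(1)    # Bayes's theorem: 1
    print(p_1_given_O, p_O_given_1)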

Example A.44: Conditional normal density


Show that if ξ and η are jointly normally distributed as

[ξ; η] ∼ N( [mx ; my ], [Px Pxy ; Pyx Py ] )

then the conditional density of ξ given η is also normal

(ξ|η) ∼ N(m, P )

in which the mean is

m = mx + Pxy Py−1 (y − my ) (A.34)

and the covariance is

P = Px − Pxy Py−1 Pyx (A.35)

Solution
The definition of conditional density gives

pξ,η (x, y)
pξ|η (x|y) =
pη (y)

Because (ξ, η) is jointly normal, we know from Example A.38

pη (y) = (1/((2π)^{nη/2} (det Py )^{1/2})) exp( −(1/2)(y − my )′ Py^{−1} (y − my ) )

and therefore

pξ|η (x|y) = ( (det Py )^{1/2} / ( (2π)^{nξ/2} ( det [Px Pxy ; Pyx Py ] )^{1/2} ) ) exp( −(1/2)a )     (A.36)

in which the argument of the exponent is

a = [x − mx ; y − my ]′ [Px Pxy ; Pyx Py ]^{−1} [x − mx ; y − my ] − (y − my )′ Py^{−1} (y − my )

If we use P = Px − Pxy Py−1 Pyx as defined in (A.35) then we can use the
partitioned matrix inversion formula to express the matrix inverse in
the previous equation as

[Px Pxy ; Pyx Py ]^{−1} = [ P^{−1}   −P^{−1} Pxy Py^{−1} ;  −Py^{−1} Pyx P^{−1}   Py^{−1} + Py^{−1} Pyx P^{−1} Pxy Py^{−1} ]

Substituting this expression and multiplying out terms yields

a = (x − mx )′ P^{−1} (x − mx ) − 2(y − my )′ (Py^{−1} Pyx P^{−1} )(x − mx ) + (y − my )′ (Py^{−1} Pyx P^{−1} Pxy Py^{−1} )(y − my )

which is the expansion of the following quadratic term


a = [ (x − mx ) − Pxy Py^{−1} (y − my ) ]′ P^{−1} [ (x − mx ) − Pxy Py^{−1} (y − my ) ]

in which we use the fact that Pxy = Pyx′ . Substituting (A.34) into this
expression yields
a = (x − m)′ P −1 (x − m) (A.37)

Finally noting that for the partitioned matrix

det [Px Pxy ; Pyx Py ] = det Py det P     (A.38)

and substitution of equations (A.38) and (A.37) into (A.36) yields

pξ|η (x|y) = (1/((2π)^{nξ/2} (det P )^{1/2})) exp( −(1/2)(x − m)′ P^{−1} (x − m) )

which is the desired result. □
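Equations (A.34) and (A.35) can be checked by simulation (not from the text; a Python/NumPy sketch with an arbitrarily chosen joint mean and covariance): sample the joint normal, retain the samples whose η value lies in a narrow band around a chosen y, and compare the sample mean and covariance of the retained ξ values with the formulas.

    import numpy as np

    rng = np.random.default_rng(4)
    mx, my = np.array([0.0, 1.0]), np.array([2.0])      # arbitrary values
    Px  = np.array([[1.0, 0.4], [0.4, 2.0]])
    Pxy = np.array([[0.6], [-0.3]])
    Py  = np.array([[1.5]])
    z = rng.multivariate_normal(np.concatenate((mx, my)),
                                np.block([[Px, Pxy], [Pxy.T, Py]]), size=1_000_000)

    y_obs = 2.5
    x = z[np.abs(z[:, 2] - y_obs) < 0.02, :2]           # crude conditioning on eta ~ y_obs
    m_cond = mx + Pxy @ np.linalg.solve(Py, np.array([y_obs]) - my)   # (A.34)
    P_cond = Px - Pxy @ np.linalg.solve(Py, Pxy.T)                    # (A.35)
    print(x.mean(axis=0), m_cond)                       # approximately equal
    print(np.cov(x, rowvar=False), P_cond)              # approximately equal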

Example A.45: More normal conditional densities


Let the joint conditional of random variables a and b given c be a nor-
mal distribution with

p(a, b|c) ∼ N( [ma ; mb ], [Pa Pab ; Pba Pb ] )     (A.39)

Then the conditional density of a given b and c is also normal

p(a|b, c) ∼ N(m, P )

in which the mean is

m = ma + Pab Pb−1 (b − mb )

and the covariance is


P = Pa − Pab Pb−1 Pba

Solution
From the definition of joint density we have

p(a|b, c) = p(a, b, c) / p(b, c)

Multiplying the top and bottom of the fraction by p(c) yields

p(a|b, c) = ( p(a, b, c) / p(c) ) ( p(c) / p(b, c) )

or

p(a|b, c) = p(a, b|c) / p(b|c)

Substituting the distribution given in (A.39) and using the result in Ex-
ample A.38 to evaluate p(b|c) yields

p(a|b, c) = N( [ma ; mb ], [Pa Pab ; Pba Pb ] ) / N(mb , Pb )

And now applying the methods of Example A.44 this ratio of normal
distributions reduces to the desired expression. □

Adjoint operator. Given a linear operator G : U → V and inner prod-


ucts for the spaces U and V, the adjoint of G, denoted by G ∗ is the
linear operator G ∗ : V → U such that

⟨u, G ∗ v⟩ = ⟨Gu, v⟩, ∀u ∈ U, v ∈ V (A.40)



Dual dynamic system (Callier and Desoer, 1991). The dynamic sys-
tem

x(k + 1) = Ax(k) + Bu(k), k = 0, . . . , N − 1


y(k) = Cx(k) + Du(k)

maps an initial condition and input sequence (x(0), u(0), . . . , u(N −1))
into a final condition and an output sequence (x(N), y(0), . . . , y(N −
1)). Call this linear operator G

[x(N); y(0); . . . ; y(N − 1)] = G [x(0); u(0); . . . ; u(N − 1)]
The dual dynamic system represents the adjoint operator G∗

[x(0); y(1); . . . ; y(N)] = G∗ [x(N); u(1); . . . ; u(N)]
We define the usual inner product, ⟨a, b⟩ = a′ b, and substitute into
(A.40) to obtain

[x(0)′ x(0) + u(0)′ y(1) + · · · + u(N − 1)′ y(N)] − [x(N)′ x(N) + y(0)′ u(1) + · · · + y(N − 1)′ u(N)] = 0

in which the first bracketed term is ⟨u, G∗ v⟩ and the second is ⟨Gu, v⟩.

If we express the y(k) in terms of x(0) and u(k) and collect terms we
obtain
0 = x(0)′ [ x(0) − C′u(1) − A′C′u(2) − · · · − A′^N x(N) ]
  + u(0)′ [ y(1) − D′u(1) − B′C′u(2) − · · · − B′A′^{(N−2)} C′u(N) − B′A′^{(N−1)} x(N) ]
  + · · ·
  + u(N − 2)′ [ y(N − 1) − D′u(N − 1) − B′C′u(N) − B′A′ x(N) ]
  + u(N − 1)′ [ y(N) − D′u(N) − B′ x(N) ]
Since this equation must hold for all (x(0), u(0), . . . , u(N − 1)), each
term in brackets must vanish. From the u(N − 1) term we conclude

y(N) = B ′ x(N) + D ′ u(N)



Using this result, the u(N − 2) term gives



B′ [ x(N − 1) − (A′ x(N) + C′ u(N)) ] = 0

From which we find the state recursion for the dual system

x(N − 1) = A′ x(N) + C ′ u(N)

Passing through each term then yields the dual state space description
of the adjoint operator G ∗

x(k − 1) = A′ x(k) + C′ u(k),     k = N, . . . , 1
y(k) = B′ x(k) + D′ u(k)

So the primal and dual dynamic systems change matrices in the follow-
ing way
(A, B, C, D) → (A′ , C′ , B′ , D′ )
Notice this result produces the duality variables listed in Table A.1 if
we first note that we have also renamed the regulator’s input matrix B
to G in the estimation problem. We also note that time runs in the op-
posite directions in the dynamic system and the dual dynamic system,
which corresponds to the fact that the Riccati equation iterations run
in opposite directions in the regulation and estimation problems.
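The adjoint relation (A.40) and the dual recursion are easy to verify numerically (not from the text; a Python/NumPy sketch with random matrices and a short horizon): simulate the primal system forward and the dual system backward in time, and check that the two inner products agree.

    import numpy as np

    rng = np.random.default_rng(5)
    n, m_in, p_out, N = 3, 2, 2, 4
    A, B = rng.standard_normal((n, n)), rng.standard_normal((n, m_in))
    C, D = rng.standard_normal((p_out, n)), rng.standard_normal((p_out, m_in))

    # primal system: (x(0), u(0..N-1)) -> (x(N), y(0..N-1))
    x0, u = rng.standard_normal(n), rng.standard_normal((N, m_in))
    x, y = x0.copy(), []
    for k in range(N):
        y.append(C @ x + D @ u[k]); x = A @ x + B @ u[k]
    xN_primal, y = x, np.array(y)

    # dual system: (x(N), v(1..N)) -> (x(0), w(1..N)), run backward in time
    xN, v = rng.standard_normal(n), rng.standard_normal((N, p_out))   # v[k-1] is v(k)
    xd, w = xN.copy(), []
    for k in range(N, 0, -1):
        w.append(B.T @ xd + D.T @ v[k-1]); xd = A.T @ xd + C.T @ v[k-1]
    x0_dual, w = xd, np.array(w[::-1])                # w[k-1] is the dual output y(k)

    lhs = x0 @ x0_dual + np.sum(u * w)                # <u, G* v>
    rhs = xN_primal @ xN + np.sum(y * v)              # <G u, v>
    print(lhs, rhs)                                   # equal up to roundoff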

A.18 Exercises

Exercise A.1: Norms in Rn


Show that the following three functions are all norms in Rn

|x|2 := ( Σ_{i=1}^{n} (x^i )^2 )^{1/2}

|x|∞ := max{ |x^1 |, |x^2 |, . . . , |x^n | }

|x|1 := Σ_{j=1}^{n} |x^j |

where x j denotes the jth component of the vector x.

Exercise A.2: Equivalent norms


Show that there are finite constants Kij , i, j = 1, 2, ∞ such that
|x|i ≤ Kij |x|j , for all i, j ∈ {1, 2, ∞}.

This result shows that the norms are equivalent and may be used interchangeably for
establishing that sequences are convergent, sets are open or closed, etc.

Regulator            Estimator
A                    A′
B                    C′
C                    G′
k                    l = N − k
Π(k)                 P−(l)
Π(k − 1)             P−(l + 1)
Π                    P−
Q                    Q
R                    R
Q(N)                 Q(0)
K                    −L̃
A + BK               (A − L̃C)′
x                    ε

Regulator                    Estimator
R > 0, Q > 0                 R > 0, Q > 0
(A, B) stabilizable          (A, C) detectable
(A, C) detectable            (A, G) stabilizable

Table A.1: Duality variables and stability conditions for linear quadratic regulation
and linear estimation.

Exercise A.3: Open and closed balls


Let x ∈ Rn and ρ > 0 be given. Show that {z | |z − x| < ρ} is open and that B(x, ρ) is
closed.

Exercise A.4: Condition for closed set


Show that X ⊂ Rn is closed if and only if int(B(x, ρ)) ∩ X ̸= ∅ for all ρ > 0 implies
x ∈ X.

Exercise A.5: Convergence


Suppose that xi → x̂ as i → ∞; show that for every ρ > 0 there exists an ip ∈ I≥0 such
that xi ∈ B(x̂, ρ) for all i ≥ ip .

Exercise A.6: Limit is unique



Suppose that x̂, x̂′ are limits of a sequence {xi }_{i∈I≥0} . Show that x̂ = x̂′ .

Exercise A.7: Open and closed sets


(a) Show that a set X ⊂ Rn is open if and only if, for any x̂ ∈ X and any sequence
{xi } ⊂ Rn such that xi → x̂ as i → ∞, there exists a q ∈ I≥0 such that xi ∈ X
for all i ≥ q.

(b) Show that a set X ⊂ Rn is closed if and only if for all {xi } ⊂ X, if xi → x̂ as
i → ∞, then x̂ ∈ X, i.e., a set X is closed if and only if it contains the limit of
every convergent sequence lying in X.

Exercise A.8: Decreasing and bounded below


Prove the observation at the end of Section A.10 that a monotone decreasing sequence
that is bounded below converges.

Exercise A.9: Continuous function



Show that f : Rn → Rm is continuous at x̂ implies f (xi ) → f (x̂) for any sequence xi
satisfying xi → x̂ as i → ∞.

Exercise A.10: Alternative proof of existence of minimum of continuous


function on compact set
Prove Proposition A.7 by making use of the fact that f (X) is compact.

Exercise A.11: Differentiable implies Lipschitz


Suppose that f : Rn → Rm has a continuous derivative fx (·) in a neighborhood of x̂.
Show that f is locally Lipschitz continuous at x̂.

Exercise A.12: Continuous, Lipschitz continuous, and differentiable


Provide examples of functions meeting the following conditions.
1. Continuous but not Lipschitz continuous.
2. Lipschitz continuous but not differentiable.

Exercise A.13: Differentiating quadratic functions and time-varying matrix


inverses
(a) Show that ∇f (x) = Qx if f (x) = (1/2)x ′ Qx and Q is symmetric.

(b) Show that (d/dt)A−1 (t) = −A−1 (t)Ȧ(t)A−1 (t) if A : R → Rn×n , A(t) is invert-
ible for all t ∈ R, and Ȧ(t) := (d/dt)A(t).

Exercise A.14: Directional derivative


Suppose that f : Rn → Rm has a derivative fx (x̂) at x̂. Show that for any h, the
directional derivative df (x̂; h) exists and is given by
df (x̂; h) = fx (x̂)h = (∂f (x)/∂x)h.

Exercise A.15: Convex combination


Suppose S ⊂ Rn is convex. Let {xi }_{i=1}^{k} be points in S and let {µ^i }_{i=1}^{k} be scalars such
that µ^i ≥ 0 for i = 1, 2, . . . , k and Σ_{i=1}^{k} µ^i = 1. Show that

( Σ_{i=1}^{k} µ^i xi ) ∈ S.

Exercise A.16: Convex epigraph


Show that f : Rn → R is convex if and only if its epigraph is convex.

Exercise A.17: Bounded second derivative and minimum


Suppose that f : Rn → R is twice continuously differentiable and that for some ∞ >
M ≥ m > 0, M|y|^2 ≥ ⟨y, (∂^2 f /∂x^2 )(x)y⟩ ≥ m|y|^2 for all x, y ∈ Rn . Show that the
sublevel sets of f are convex and compact and that f (·) attains its infimum.

Exercise A.18: Sum and max of convex functions are convex


Suppose that fi : Rn → R, i = 1, 2, . . . , m are convex. Show that
ψ1 (x) := max_i {fi (x) | i ∈ {1, 2, . . . , m}},

ψ2 (x) := Σ_{i=1}^{m} fi (x)

are both convex.

Exercise A.19: Einige kleine Mathprobleme


(a) Prove that if λ is an eigenvalue and v is an eigenvector of A (Av = λv), then λ
is also an eigenvalue of T in which T is upper triangular and given by the Schur
decomposition of A
Q∗ AQ = T
What is the corresponding eigenvector?

(b) Prove statement 1 on positive definite matrices (from Section A.7). Where is this
fact needed?

(c) Prove statement 6 on positive definite matrices. Where is this fact needed?

(d) Prove statement 5 on positive definite matrices.

(e) Prove statement 8 on positive semidefinite matrices.

(f) Derive the two expressions for the partitioned A−1 .

Exercise A.20: Positive definite but not symmetric matrices


Consider redefining the notation A > 0 for A ∈ Rn×n to mean x′Ax > 0 for all x ∈
Rn , x ≠ 0. In other words, the restriction that A is symmetric in the usual definition of
positive definiteness is removed. Consider also B := (A + A′ )/2. Show the following
hold for all A. (a) A > 0 if and only if B is positive definite. (b) tr(A) = tr(B). (Johnson,
1970; Johnson and Hillar, 2002)

Exercise A.21: Trace of a matrix function


Derive the following formula for differentiating the trace of a function of a square
matrix
d tr(f (A))/dA = g(A′ )        g(x) = df (x)/dx
in which g is the usual scalar derivative of the scalar function f . This result proves
useful in evaluating the change in the expectation of the stage cost in stochastic control
problems.

Exercise A.22: Some matrix differentiation


Derive the following formulas (Bard, 1974). A, B ∈ Rn×n , a, x ∈ Rn .
(a)
∂(x′Ax)/∂x = Ax + A′x

(b)
∂(Axa′Bx)/∂x′ = (a′Bx)A + Axa′B

(c)
∂(a′Ab)/∂A = ab′

Exercise A.23: Partitioned matrix inversion formula


In deriving the partitioned matrix inversion formula we assumed A is partitioned into

A = [B C; D E]

and that A−1 , B −1 and E −1 exist. In the final formula, the term

(E − DB −1 C)−1

appears, but we did not assume this matrix is invertible. Did we leave out an assump-
tion or can the existence of this matrix inverse be proven given the other assumptions?
If we left out an assumption, provide an example in which this matrix is not invertible.
If it follows from the other assumptions, prove this inverse exists.

Exercise A.24: Partitioned positive definite matrices


Consider the partitioned positive definite, symmetric matrix

H = [H11 H12 ; H21 H22 ]

Prove that the following matrices are also positive definite


1. H11
2. H22
3. H in which H = [H11 −H12 ; −H21 H22 ]

4. H11 − H12 H22^{−1} H21 and H22 − H21 H11^{−1} H12

Exercise A.25: Properties of the matrix exponential


Prove that the following properties of the matrix exponential, which are useful for
dealing with continuous time linear systems. The matrix A is a real-valued n × n
matrix, and t is real.
(a) rank( e^{At} ) = n   ∀t

(b) rank( ∫_0^t e^{Aτ} dτ ) = n   ∀t > 0

Exercise A.26: Controllability in continuous time


A linear, time-invariant, continuous time system
dx/dt = Ax + Bu,     x(0) = x0     (A.41)
is controllable if there exists an input u(t), 0 ≤ t ≤ t1 , t1 > 0 that takes the system
from any x0 at time zero to any x1 at some finite time t1 .
(a) Prove that the system in (A.41) is controllable if and only if
rank(C) = n

in which C is, remarkably, the same controllability matrix that was defined for
discrete time systems (1.16)

C = [B  AB  · · ·  A^{n−1}B]

(b) Describe a calculational procedure for finding this required input.

Exercise A.27: Reachability Gramian in continuous time


Consider the symmetric, n × n matrix W defined by
W (t) = ∫_0^t e^{(t−τ)A} BB′ e^{(t−τ)A′} dτ
The matrix W is known as the reachability Gramian of the linear, time-invariant system.
The reachability Gramian proves useful in analyzing controllability and reachability.
Prove the following important properties of the reachability Gramian.
(a) The reachability Gramian satisfies the following matrix differential equation
dW /dt = BB′ + AW + W A′,     W (0) = 0
which provides one useful way to calculate its values.

(b) The reachability Gramian W (t) is full rank for all t > 0 if and only if the system
is controllable.

Exercise A.28: Differences in continuous time and discrete time systems


Consider the definition that a system is controllable if there exists an input that takes
the system from any x0 at time zero to any x1 at some finite time t1 .
(a) Show that x1 can be taken as zero without changing the meaning of controlla-
bility for a linear continuous time system.

(b) In linear discrete time systems, x1 cannot be taken as zero without changing the
meaning of controllability. Why not? Which A require a distinction in discrete
time. What are the eigenvalues of the corresponding A in continuous time?

Exercise A.29: Observability in continuous time


Consider the linear time-invariant continuous time system
dx/dt = Ax,     x(0) = x0 ,     y = Cx     (A.42)

and let y(t; x0 ) represent the solution to (A.42) as a function of time t given starting
state value x0 at time zero. Consider the output from two different initial conditions
y(t; w), y(t; z) on the time interval 0 ≤ t ≤ t1 with t1 > 0.
The system in (A.42) is observable if

y(t; w) = y(t; z), 0 ≤ t ≤ t1 =⇒ w = z

In other words, if two output measurement trajectories agree, the initial conditions
that generated the output trajectories must agree, and hence, the initial condition is
unique. This uniqueness of the initial condition allows us to consider building a state
estimator to reconstruct x(0) from y(t; x0 ). After we have found the unique x(0),
solving the model provides the rest of the state trajectory x(t). We will see later that
this procedure is not the preferred way to build a state estimator; it simply shows that
if the system is observable, the goal of state estimation is reasonable.
Show that the system in (A.42) is observable if and only if

rank (O) = n

in which O is, again, the same observability matrix that was defined for discrete time
systems (1.36)

O = [C; CA; . . . ; CA^{n−1}]

Hint: what happens if you differentiate y(t; w) − y(t; z) with respect to time? How
many times is this function differentiable?

Exercise A.30: Observability Gramian in continuous time


Consider the symmetric, n × n matrix Wo defined by
Wo (t) = ∫_0^t e^{A′τ} C′C e^{Aτ} dτ

The matrix Wo is known as the observability Gramian of the linear, time-invariant sys-
tem. Prove the following important properties of the observability Gramian.
(a) The observability Gramian Wo (t) is full rank for all t > 0 if and only if the system
is observable.

(b) Consider an observable linear time invariant system with u(t) = 0 so that y(t) =
CeAt x0 . Use the observability Gramian to solve this equation for x0 as a function
of y(t), 0 ≤ t ≤ t1 .

(c) Extend your result from the previous part to find x0 for an arbitrary u(t).

Exercise A.31: Detectability of (A, C) and output penalty


Given a system

x(k + 1) = Ax(k) + Bu(k)


y(k) = Cx(k)

Suppose (A, C) is detectable and an input sequence has been found such that

u(k) → 0 y(k) → 0

Show that x(k) → 0.

Exercise A.32: Prove your favorite Hautus lemma


Prove the Hautus lemma for controllability, Lemma 1.2, or observability, Lemma 1.4.

Exercise A.33: Positive semidefinite Q penalty and its square root


Consider the linear quadratic problem with system

x(k + 1) = Ax(k) + Bu(k)


y(k) = Q1/2 x(k)

and infinite horizon cost function

Φ = Σ_{k=0}^{∞} x(k)′ Qx(k) + u(k)′ Ru(k) = Σ_{k=0}^{∞} y(k)′ y(k) + u(k)′ Ru(k)

with Q ≥ 0, R > 0, and (A, B) stabilizable. In Exercise A.31 we showed that if (A, Q1/2 )
is detectable and an input sequence has been found such that

u(k) → 0 y(k) → 0

then x(k) → 0.
(a) Show that if Q ≥ 0, then Q1/2 is a well defined, real, symmetric matrix and
Q1/2 ≥ 0.
Hint: apply Theorem A.1 to Q, using the subsequent fact 3.

(b) Show that (A, Q1/2 ) is detectable (observable) if and only if (A, Q) is detectable
(observable). So we can express one of the LQ existence, uniqueness, and stability
conditions using detectability of (A, Q) instead of (A, Q1/2 ).

Exercise A.34: Probability density of the inverse function


Consider a scalar random variable ξ ∈ R and let the random variable η be defined by
the inverse function
η = ξ −1
(a) If ξ is distributed uniformly on [a, 1] with 0 < a < 1, what is the density of η?

(b) Is η’s density well defined if we allow a = 0? Explain your answer.



Exercise A.35: Expectation as a linear operator


(a) Consider the random variable x to be defined as a linear combination of the
random variables a and b
x =a+b
Show that
E(x) = E(a) + E(b)
Do a and b need to be statistically independent for this statement to be true?

(b) Next consider the random variable x to be defined as a scalar multiple of the
random variable a
x = αa
Show that
E(x) = αE(a)

(c) What can you conclude about E(x) if x is given by the linear combination
x = Σ_i αi vi

in which vi are random variables and αi are scalars.

Exercise A.36: Minimum of two random variables


Given two independent random variables, ξ1 , ξ2 and the random variable defined by
the minimum operator
η = min(ξ1 , ξ2 )
(a) Sketch the region X(c) for the inequality min(x1 , x2 ) ≤ c.

(b) Find η’s probability density in terms of the probability densities of ξ1 , ξ2 .

Exercise A.37: Maximum of n normally distributed random variables


Given n independent, identically distributed normal random variables, ξ1 , ξ2 , . . . , ξn
and the random variable defined by the maximum operator

η = max(ξ1 , ξ2 , . . . ξn )

(a) Derive a formula for η’s density.

(b) Plot pη for ξi ∼ N(0, 1) and n = 1, 2, . . . 5. Describe the trend in pη as n in-


creases.

Exercise A.38: Another picture of mean


Consider a scalar random variable ξ with probability distribution Pξ shown in Fig-
ure A.14. Consider the inverse probability distribution, Pξ−1 , also shown in Figure A.14.

(a) Show that the expectation of ξ is equal to the following integral of the probability
distribution (David, 1981, p. 38)
E(ξ) = − ∫_{−∞}^{0} Pξ (x)dx + ∫_{0}^{∞} (1 − Pξ (x))dx     (A.43)
Figure A.14: The probability distribution Pξ (x) and inverse distribution Pξ^{−1} (w) for
random variable ξ. The mean of ξ is given by the difference in the hatched areas,
E(ξ) = A2 − A1 .

(b) Show that the expectation of ξ is equal to the following integral of the inverse
probability distribution
E(ξ) = ∫_0^1 Pξ^{−1} (w)dw     (A.44)
These interpretations of mean are shown as the hatched areas in Figure A.14,
E(ξ) = A2 − A1 .

Exercise A.39: Ordering random variables


We can order two random variables A and B if they obey an inequality such as A ≥ B.
The frequency interpretation of the probability distribution, PA (c) = Pr(A ≤ c), then
implies that PA (c) ≤ PB (c) for all c.
If A ≥ B, show that
E(A) ≥ E(B)

Exercise A.40: Max of the mean and mean of the max


Given two random variables A and B, establish the following inequality

max(E(A), E(B)) ≤ E(max(A, B))

In other words, the max of the mean is an underbound for the mean of the max.

Exercise A.41: Observability


Consider the linear system with zero input

x(k + 1) = Ax(k)
y(k) = Cx(k)
with

A = [1 0 0; 0 1 0; 2 1 1],     C = [1 0 0; 0 1 0]

(a) What is the observability matrix for this system? What is its rank?

(b) Consider a string of data measurements

y(0) = y(1) = · · · = y(n − 1) = 0

Now x(0) = 0 is clearly consistent with these data. Is this x(0) unique? If yes,
prove it. If no, characterize the set of all x(0) that are consistent with these
data.

Exercise A.42: Nothing is revealed


An agitated graduate student shows up at your office. He begins, “I am afraid I have
discovered a deep contradiction in the foundations of systems theory.” You ask him
to calm down and tell you about it. He continues, “Well, we have the pole placement
theorem that says if (A, C) is observable, then there exists a matrix L such that the
eigenvalues of an observer
A − ALC
can be assigned arbitrarily.”
You reply, “Well, they do have to be conjugate pairs because the matrices A, L, C
are real-valued, but yeah, sure, so what?”
He continues, “Well we also have the Hautus lemma that says (A, C) is observable
if and only if

rank [λI − A; C] = n   ∀λ ∈ C
“You know, the Hautus lemma has always been one of my favorite lemmas; I don’t
see a problem,” you reply.
“Well,” he continues, “isn’t the innovations form of the system, (A − ALC, C), ob-
servable if and only if the original system, (A, C), is observable?”
“Yeah . . . I seem to recall something like that,” you reply, starting to feel a little
uncomfortable.
“OK, how about if I decide to put all the observer poles at zero?” he asks, innocently.
You object, “Wait a minute, I guess you can do that, but that’s not going to be a
very good observer, so I don’t think it matters if . . . .”
“Well,” he interrupts, “how about we put all the eigenvalues of A − ALC at zero,
like I said, and then we check the Hautus condition at λ = 0? I get

rank [λI − (A − ALC); C] = rank [0; C]     λ = 0

“So tell me, how is that matrix on the right ever going to have rank n with that big, fat
zero sitting there?” At this point, you start feeling a little dizzy.
What’s causing the contradiction here: the pole placement theorem, the Hautus
lemma, the statement about equivalence of observability in innovations form, some-
thing else? How do you respond to this student?

Exercise A.43: The sum of throwing two dice


Using (A.30), what is the probability density for the sum of throwing two dice? On what
number do you want to place your bet? How often do you expect to win if you bet on
this outcome?
Make the standard assumptions: the probability density for each die is uniform
over the integer values from one to six, and the outcome of each die is independent of
the other die.

Exercise A.44: The product of throwing two dice


Using (A.30), what is the probability density for the product of throwing two dice? On
what number do you want to place your bet? How often do you expect to win if you
bet on this outcome?
Make the standard assumptions: the probability density for each die is uniform
over the integer values from one to six, and the outcome of each die is independent of
the other die.

Exercise A.45: The size of an ellipse’s bounding box


Here we derive the size of the bounding box depicted in Figure A.10. Consider a real,
positive definite, symmetric matrix A ∈ Rn×n and a real vector x ∈ Rn . The set of x
for which the scalar x ′ Ax is constant are n-dimensional ellipsoids. Find the length of
the sides of the smallest box that contains the ellipsoid defined by
x ′ Ax = b
Hint: Consider the equivalent optimization problem to minimize the value of x ′ Ax
such that the ith component of x is given by xi = c. This problem defines the ellipsoid
that is tangent to the plane xi = c, and can be used to answer the original question.

Exercise A.46: The tangent points of an ellipse’s bounding box


Find the tangent points of an ellipsoid defined by x ′ Ax = b, and its bounding box
as depicted in Figure A.10 for n = 2. For n = 2, draw the ellipse, bounding box and
compute the tangent points for the following parameters taken from Figure A.10

A = [3.5 2.5; 2.5 4.0]        b = 1

Exercise A.47: Let’s make a deal!


Consider the following contest of the American television game show of the 1960s, Let’s
Make a Deal. In the show’s grand finale, a contestant is presented with three doors.
Behind one of the doors is a valuable prize such as an all-expenses-paid vacation to
Hawaii or a new car. Behind the other two doors are goats and donkeys. The contestant
selects a door, say door number one. The game show host, Monty Hall, then says,
“Before I show you what is behind your door, let’s reveal what is behind door num-
ber three!” Monty always chooses a door that has one of the booby prizes behind it.
As the goat or donkey is revealed, the audience howls with laughter. Then Monty asks
innocently,
“Before I show you what is behind your door, I will allow you one chance to change
your mind. Do you want to change doors?” While the contestant considers this option,
the audience starts screaming out things like,

“Stay with your door! No, switch, switch!” Finally the contestant chooses again,
and then Monty shows them what is behind their chosen door.
Let’s analyze this contest to see how to maximize the chance of winning. Define

p(i, j, y), i, j, y = 1, 2, 3

to be the probability that you chose door i, the prize is behind door j and Monty showed
you door y (named after the data!) after your initial guess. Then you would want to

max_j p(j|i, y)

for your optimal choice after Monty shows you a door.


(a) Calculate this conditional density and give the probability that the prize is behind
door i, your original choice, and door j ≠ i.

(b) You will need to specify a model of Monty’s behavior. Please state the one that
is appropriate to Let’s Make a Deal.

(c) For what other model of Monty’s behavior is the answer that it doesn’t matter if
you switch doors. Why is this a poor model for the game show?

Exercise A.48: Norm of an extended state


Consider x ∈ Rn with a norm denoted |·|α , and u ∈ Rm with a norm denoted |·|β .
Now consider a proposed norm for the extended state (x, u)

|(x, u)|γ := |x|α + |u|β

Show that this proposal satisfies the definition of a norm given in Section A.8.
If the α and β norms are chosen to be p-norms, is the γ norm also a p-norm? Show
why or why not.

Exercise A.49: Distance of an extended state to an extended set


Let x ∈ Rn and X a set of elements in Rn , and u ∈ Rm and U a set of elements in Rm .
Denote distances from elements to their respective sets as

|x|_X := inf_{y∈X} |x − y|_α        |u|_U := inf_{v∈U} |u − v|_β

|(x, u)|_{X×U} := inf_{(y,v)∈X×U} |(x, u) − (y, v)|_γ

Use the norm of the extended state defined in Exercise A.48 to show that

|(x, u)|X×U = |x|X + |u|U


B Stability Theory


B.1 Introduction
In this appendix we consider stability properties of discrete time sys-
tems. A good general reference for stability theory of continuous time
systems is Khalil (2002). There are not many texts for stability theory
of discrete time systems; a useful reference is LaSalle (1986). Recently
stability theory for discrete time systems has received more attention
in the literature. In the notes below we draw on Jiang and Wang (2001,
2002); Kellett and Teel (2004a,b).
We consider systems of the form

x + = f (x, u)

where the state x lies in Rn and the control (input) u lies in Rm ; in


this formulation x and u denote, respectively, the current state and
control, and x + the successor state. We assume in the sequel that
the function f : Rn × Rm → Rn is continuous. Let φ(k; x, u) denote
the solution of x + = f (x, u) at time k if the initial state is x(0) = x
and the control sequence is u = (u(0), u(1), u(2), . . .); the solution
exists and is unique. If a state-feedback control law u = κ(x) has been
chosen, the closed-loop system is described by x + = f (x, κ(x)), which
has the same form x + = fc (x) where fc (·) is defined by fc (x) := f (x,
κ(x)). Let φ(k; x, κ(·)) denote the solution of this difference equation
at time k if the initial state at time 0 is x(0) = x; the solution exists
and is unique (even if κ(·) is discontinuous). If κ(·) is not continuous,
as may be the case when κ(·) is an implicit model predictive control
(MPC) law, then fc (·) may not be continuous. In this case we assume
that fc (·) is locally bounded.1
1 A function f : X → X is locally bounded if, for any x ∈ X, there exists a neighbor-
hood N of x such that f (N ) is a bounded set, i.e., if there exists an M > 0 such that
|f (x)| ≤ M for all x ∈ N .


We would like to be sure that the controlled system is “stable”, i.e.,


that small perturbations of the initial state do not cause large variations
in the subsequent behavior of the system, and that the state converges
to a desired state or, if this is impossible due to disturbances, to a
desired set of states. These objectives are made precise in Lyapunov
stability theory; in this theory, the system x + = f (x) is assumed given
and conditions ensuring the stability, or asymptotic stability of a spec-
ified state or set are sought; the terms stability and asymptotic stability
are defined below. If convergence to a specified state, x ∗ say, is sought,
it is desirable for this state to be an equilibrium point:

Definition B.1 (Equilibrium point). A point x ∗ is an equilibrium point


of x + = f (x) if x(0) = x ∗ implies x(k) = φ(k; x ∗ ) = x ∗ for all k ≥ 0.
Hence x ∗ is an equilibrium point if it satisfies

x ∗ = f (x ∗ )

An equilibrium point x ∗ is isolated if there are no other equilib-


rium points in a sufficiently small neighborhood of x ∗ . A linear system
x + = Ax + b has a single equilibrium point x ∗ = (I − A)−1 b if I − A is
invertible; if not, the linear system has a continuum {x | (I − A)x = b}
of equilibrium points. A nonlinear system, unlike a linear system, may
have several isolated equilibrium points.
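For the linear case, the unique equilibrium point is obtained by solving a linear set of equations. A minimal Python sketch (the particular A and b are arbitrary illustrations) is

import numpy as np

A = np.array([[0.5, 0.1],
              [0.0, 0.8]])          # eigenvalues strictly inside the unit circle
b = np.array([1.0, 2.0])

xstar = np.linalg.solve(np.eye(2) - A, b)      # x* = (I - A)^{-1} b
print("equilibrium x* =", xstar)
print("f(x*) - x* =", A @ xstar + b - xstar)   # approximately zero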
In other situations, for example when studying the stability proper-
ties of an oscillator, convergence to a specified closed set A ⊂ Rn is
sought. In the case of a linear oscillator with state dimension 2, this
set is an ellipse. If convergence to a set A is sought, it is desirable for
the set A to be positive invariant :

Definition B.2 (Positive invariant set). A closed set A is positive invari-


ant for the system x + = f (x) if x ∈ A implies f (x) ∈ A.

Clearly, any solution of x + = f (x) with initial state in A, remains


in A. The closed set A = {x ∗ } consisting of a (single) equilibrium
point is a special case; x ∈ A (x = x ∗ ) implies f (x) ∈ A (f (x) = x ∗ ).
Define |x|A := inf z∈A |x − z| to be the distance of a point x from the
set A; if A = {x ∗ }, then |x|A = |x − x ∗ | which reduces to |x| when
x ∗ = 0.
Before introducing the concepts of stability and asymptotic stability
and their characterization by Lyapunov functions, it is convenient to
make a few definitions.

Definition B.3 (K, K∞ , KL, and PD functions). A function σ : R≥0 →


R≥0 belongs to class K if it is continuous, zero at zero, and strictly
increasing; σ : R≥0 → R≥0 belongs to class K∞ if it is a class K function and
unbounded (σ (s) → ∞ as s → ∞). A function β : R≥0 × I≥0 → R≥0
belongs to class KL if it is continuous and if, for each t ≥ 0, β(·, t)
is a class K function and for each s ≥ 0, β(s, ·) is nonincreasing and
satisfies limt→∞ β(s, t) = 0. A function γ : R → R≥0 belongs to class
PD (is positive definite) if it is zero at zero and positive everywhere
else.2

The following useful properties of these functions are established


in Khalil (2002, Lemma 4.2): if α1 (·) and α2 (·) are K functions (K∞
functions), then α1−1 (·) and (α1 ◦ α2 )(·)3 are K functions4 (K∞ func-
tions). Moreover, if α1 (·) and α2 (·) are K functions and β(·) is a KL
function, then σ (r , s) = α1 (β(α2 (r ), s)) is a KL function.
The following properties prove useful when analyzing the robust-
ness of perturbed systems.

1. For γ(·) ∈ K, the following holds for all ai ∈ R≥0 , i ∈ I1:n

   (1/n)[γ(a1 ) + · · · + γ(an )] ≤ γ(a1 + · · · + an ) ≤ γ(na1 ) + · · · + γ(nan )    (B.1)

2. Similarly, for β(·) ∈ KL, the following holds for all ai ∈ R≥0 , i ∈ I1:n , and t ∈ R≥0

   (1/n)[β(a1 , t) + · · · + β(an , t)] ≤ β((a1 + · · · + an ), t) ≤ β(na1 , t) + β(na2 , t) + · · · + β(nan , t)    (B.2)

3. If αi (·) ∈ K(K∞ ) for i ∈ I1:n , then

   min_i {αi (·)} =: α(·) ∈ K(K∞ )    (B.3)

   max_i {αi (·)} =: ᾱ(·) ∈ K(K∞ )    (B.4)
2 Be aware that the existing stability literature sometimes includes continuity in the

definition of a positive definite function. We used such a definition in the first edition
of this text, for example. But in the second edition, we remove continuity and retain
only the requirement of positivity in the definition of positive definite function.
3 (α1 ◦ α2 )(·) is the composition of the two functions α1 (·) and α2 (·) and is defined
by (α1 ◦ α2 )(s) := α1 (α2 (s)).
4 Note, however, that the domain of α−1 (·) may be restricted from R≥0 to [0, a) for
some a > 0.

4. Let vi ∈ Rni for i ∈ I1:n , and v := (v1 , . . . , vn ) ∈ RΣni . If αi (·) ∈ K(K∞ )
and βi (·) ∈ KL for i ∈ I1:n , then there exist α(·), ᾱ(·) ∈ K(K∞ ) and β(·), β̄(·) ∈ KL
such that

   α(|v|) ≤ α1 (|v1 |) + · · · + αn (|vn |) ≤ ᾱ(|v|)    (B.5)

and, for all t ∈ R≥0

   β(|v| , t) ≤ β1 (|v1 | , t) + · · · + βn (|vn | , t) ≤ β̄(|v| , t)    (B.6)

5. Let vi , v, αi (·), βi (·) be defined as in 4. Then there exist α(·), ᾱ(·) ∈ K(K∞ )
and β(·), β̄(·) ∈ KL such that

   α(|v|) ≤ α1 (|v1 |) ⊕ · · · ⊕ αn (|vn |) ≤ ᾱ(|v|)    (B.7)

and, for all t ∈ R≥0

   β(|v| , t) ≤ β1 (|v1 | , t) ⊕ · · · ⊕ βn (|vn | , t) ≤ β̄(|v| , t)    (B.8)

See (Rawlings and Ji, 2012) for short proofs of (B.1) and (B.2), and (Allan,
Bates, Risbeck, and Rawlings, 2017, Proposition 23) for a short proof
of (B.3). The result (B.4) follows similarly to (B.3). Results (B.5) and (B.7)
follow from (B.1) and (B.3)–(B.4), and (B.6) and (B.8) follow from (B.5)
and (B.7), respectively. See also Exercises B.9 and B.10.
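Inequality (B.1) is easy to spot check numerically for a particular K∞ function, say γ(s) = s². The following Python sketch (the function and sample values are arbitrary illustrations) evaluates the three expressions in (B.1) at random nonnegative arguments:

import numpy as np

rng = np.random.default_rng(0)
gamma = lambda s: s**2          # a K-infinity (hence K) function
n = 3
for _ in range(5):
    a = rng.uniform(0.0, 2.0, size=n)
    lower = sum(gamma(ai) for ai in a) / n
    middle = gamma(a.sum())
    upper = sum(gamma(n*ai) for ai in a)
    assert lower <= middle <= upper
    print(f"{lower:.3f} <= {middle:.3f} <= {upper:.3f}")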

B.2 Stability and Asymptotic Stability


In this section we consider the stability properties of the autonomous
system x + = f (x); we assume that f (·) is locally bounded, and that the
set A is closed and positive invariant for x + = f (x) unless otherwise
stated.

Definition B.4 (Local stability). The (closed, positive invariant) set A is


locally stable for x + = f (x) if, for all ε > 0, there exists a δ > 0 such
that |x|A < δ implies |φ(i; x)|A < ε for all i ∈ I≥0 .

See Figure B.1 for an illustration of this definition when A = {0}; in


this case we speak of stability of the origin.

Remark. Stability of the origin, as defined above, is equivalent to con-
tinuity of the map x ↦ x := (x, φ(1; x), φ(2; x), . . .), Rn → ℓ∞ , at the
origin so that ∥x∥ → 0 as x → 0 (a small perturbation in the initial state
causes a small perturbation in the subsequent motion).

Figure B.1: Stability of the origin. B denotes the unit ball.

Definition B.5 (Global attraction). The (closed, positive invariant) set


A is globally attractive for the system x + = f (x) if |φ(i; x)|A → 0 as
i → ∞ for all x ∈ Rn .

Definition B.6 (Global asymptotic stability). The (closed, positive in-


variant) set A is globally asymptotically stable (GAS) for x + = f (x) if
it is locally stable and globally attractive.

It is possible for the origin to be globally attractive but not locally


stable. Consider a second order system

x + = Ax + φ(x)

where A has eigenvalues λ1 = 0.5 and λ2 = 2 with associated eigen-


vectors w1 and w2 , shown in Figure B.2; w1 is the “stable” and w2 the
“unstable” eigenvector; the smooth function φ(·) satisfies φ(0) = 0
and (∂/∂x)φ(0) = 0 so that x + = Ax + φ(x) behaves like x + = Ax
near the origin. If φ(x) ≡ 0, the motion corresponding to an initial
state αw1 , α ≠ 0, converges to the origin, whereas the motion corre-
sponding to an initial state αw2 diverges. If φ(·) is such that it steers
nonzero states toward the horizontal axis, we get trajectories of the
form shown in Figure B.2. All trajectories converge to the origin but
the motion corresponding to an initial state αw2 , no matter how small,
is similar to that shown in Figure B.2 and cannot satisfy the ε, δ defini-
tion of local stability. The origin is globally attractive but not stable. A

Figure B.2: An attractive but unstable origin.

trajectory that joins an equilibrium point to itself, as in Figure B.2, is


called a homoclinic orbit.
We collect below a set of useful definitions:

Definition B.7 (Various forms of stability). The (closed, positive invari-


ant) set A is
(a) locally stable if, for each ε > 0, there exists a δ = δ(ε) > 0 such
that |x|A < δ implies |φ(i; x)|A < ε for all i ∈ I≥0 .

(b) unstable, if it is not locally stable.

(c) locally attractive if there exists η > 0 such that |x|A < η implies
|φ(i; x)|A → 0 as i → ∞.

(d) globally attractive if |φ(i; x)|A → 0 as i → ∞ for all x ∈ Rn .

(e) locally asymptotically stable if it is locally stable and locally attrac-


tive.

(f) globally asymptotically stable if it is locally stable and globally at-


tractive.

(g) locally exponentially stable if there exist η > 0, c > 0, and γ ∈ (0, 1)
such that |x|A < η implies |φ(i; x)|A ≤ c |x|A γ i for all i ∈ I≥0 .

(h) globally exponentially stable if there exists a c > 0 and a γ ∈ (0, 1)


such that |φ(i; x)|A ≤ c |x|A γ i for all x ∈ Rn , all i ∈ I≥0 .

The following stronger definition of GAS has recently started to be-


come popular.

Definition B.8 (Global asymptotic stability (KL version)). The (closed,


positive invariant) set A is globally asymptotically stable (GAS) for x + =
f (x) if there exists a KL function β(·) such that, for each x ∈ Rn

|φ(i; x)|A ≤ β(|x|A , i)    ∀i ∈ I≥0    (B.9)

Proposition B.9 (Connection of classical and KL global asymptotic sta-


bility). Suppose A is compact (and positive invariant) and that f (·) is
continuous. Then the classical and KL definitions of global asymptotic
stability of A for x + = f (x) are equivalent.

The KL version of global asymptotic stability implies the classical


version from (B.9) and the definition of a KL function. The converse
is harder to prove but is established in Jiang and Wang (2002) where
Proposition 2.2 establishes the equivalence of the existence of a KL
function satisfying (2) with UGAS (uniform global asymptotic stabil-
ity), and Corollary 3.3 which establishes the equivalence, when A is
compact, of uniform global asymptotic stability and global asymptotic
stability. Note that f (·) must be continuous for the two definitions to
be equivalent. See Exercise B.8 for an example with discontinuous f (·)
where the system is GAS in the classical sense but does not satisfy (B.9),
i.e., is not GAS in the KL sense.
For a KL version of exponential stability, one simply restricts the
form of the KL function β(·) of asymptotic stability to β(|x|A , i) =
c |x|A λi with c > 0 and λ ∈ (0, 1), but, as we see, that is exactly the
classical definition so there is no distinction between the two forms for
exponential stability.
In practice, global asymptotic stability of A often cannot be
achieved because of state constraints. Hence we have to extend slightly
the definitions given above. In the following, let B denote a unit ball in
Rn with center at the origin.

Definition B.10 (Various forms of stability (constrained)). Suppose


X ⊂ Rn is positive invariant for x + = f (x), that A ⊆ X is closed
and positive invariant for x + = f (x). Then A is
(a) locally stable in X if, for each ε > 0, there exists a δ = δ(ε) > 0
such that x ∈ X ∩ (A ⊕ δB) implies |φ(i; x)|A < ε for all i ∈ I≥0 .

(b) locally attractive in X if there exists a η > 0 such that x ∈ X ∩ (A ⊕


ηB) implies |φ(i; x)|A → 0 as i → ∞.

(c) attractive in X if |φ(i; x)|A → 0 as i → ∞ for all x ∈ X.



(d) locally asymptotically stable in X if it is locally stable in X and


locally attractive in X.

(e) asymptotically stable in X if it is locally stable in X and attractive


in X.

(f) locally exponentially stable in X if there exist η > 0, c > 0, and


γ ∈ (0, 1) such that x ∈ X ∩ (A ⊕ ηB) implies |φ(i; x)|A ≤ c |x|A γ i
for all i ∈ I≥0 .

(g) exponentially stable in X if there exists a c > 0 and a γ ∈ (0, 1)


such that |φ(i; x)|A ≤ c |x|A γ i for all x ∈ X, all i ∈ I≥0 .

The assumption that X is positive invariant for x + = f (x) ensures


that φ(i; x) ∈ X for all x ∈ X, all i ∈ I≥0 . The KL version of asymptotic
stability in X is the following.

Definition B.11 (Asymptotic stability (constrained, KL version)). Sup-


pose that X is positive invariant and the set A ⊆ X is closed and pos-
itive invariant for x + = f (x). The set A is asymptotically stable in X
for x + = f (x) if there exists a KL function β(·) such that, for each
x∈X
|φ(i; x)|A ≤ β(|x|A , i)    ∀i ∈ I≥0    (B.10)

Finally, we define the domain of attraction of an asymptotically sta-


ble set A for the system x + = f (x) to be the set of all initial states x
such that |φ(i; x)|A → 0 as i → ∞. We use the term region of attrac-
tion to denote any set of initial states x such that |φ(i; x)|A → 0 as
i → ∞. From these definitions, if A is attractive in X, then X is a region
of attraction of set A for the system x + = f (x).

B.3 Lyapunov Stability Theory


Energy in a passive electrical or mechanical system provides a useful
analogy to Lyapunov stability theory. In a lumped mechanical system,
the total mechanical energy is the sum of the potential and kinetic en-
ergies. As time proceeds, this energy is dissipated by friction into heat
and the total mechanical energy decays to zero at which point the sys-
tem is in equilibrium. To establish stability or asymptotic stability,
Lyapunov theory follows a similar path. If a real-valued function can
be found that is positive and decreasing if the state does not lie in the
set A, then the state converges to this set as time tends to infinity. We
now make this intuitive idea more precise.

B.3.1 Time-Invariant Systems

First we consider the time-invariant (autonomous) model x + = f (x).

Definition B.12 (Lyapunov function (unconstrained and constrained)).


Suppose that X is positive invariant and the set A ⊆ X is closed and
positive invariant for x + = f (x), and f (·) is locally bounded. A func-
tion V : X → R≥0 is said to be a Lyapunov function in X for the system
x + = f (x) and set A if there exist functions α1 , α2 ∈ K∞ , and contin-
uous function α3 ∈ PD such that for any x ∈ X

V (x) ≥ α1 (|x|A ) (B.11)


V (x) ≤ α2 (|x|A ) (B.12)
V (f (x)) − V (x) ≤ −α3 (|x|A ) (B.13)

If X = Rn , then we drop the restrictive phrase “in X.”

Remark (Discontinuous f and V ). In MPC, the value function for the


optimal control problem solved online is often employed as a Lyapunov
function. The reader should be aware that many similar but different
definitions of Lyapunov functions are in use in many different branches
of the science and engineering literature. To be of the most use in MPC
analysis, we do not assume here that f (·) or V (·) is continuous. We
assume only that f (·) is locally bounded; V (·) is also locally bounded
due to (B.12), and continuous on the set A (but not necessarily on a
neighborhood of A) due to (B.11)–(B.12).

Remark (Continuous (and positive definite) α3 ). One may wonder why


α3 (·) is assumed continuous in addition to positive definite in the def-
inition of the Lyapunov function, when much of the classical literature
leaves out continuity; see for example the autonomous case given in
Kalman and Bertram (1960). Again, most of this classical literature as-
sumes instead that f (·) is continuous, which we do not assume here.
See Exercise B.7 for an example from Lazar, Heemels, and Teel (2009)
with discontinuous f (·) for which removing continuity of α3 (·) in Def-
inition B.12 would give a Lyapunov function that fails to imply asymp-
totic stability.

For making connections to the wide body of existing stability litera-


ture, which mainly uses the classical definition of asymptotic stability,
and because the proof is instructive, we first state and prove the clas-
sical version of the Lyapunov stability theorem.

Theorem B.13 (Lyapunov function and GAS (classical definition)). Sup-


pose that X is positive invariant and the set A ⊆ X is closed and positive
invariant for x + = f (x), and f (·) is locally bounded. Suppose V (·)
is a Lyapunov function for x + = f (x) and set A. Then A is globally
asymptotically stable (classical definition).
Proof.
(a) Stability. Let ε > 0 be arbitrary and let δ := α2−1 (α1 (ε)). Suppose
|x|A < δ so that, by (B.12), V (x) ≤ α2 (δ) = α1 (ε). From (B.13),
(V (x(i)))i∈I≥0 , x(i) := φ(i; x), is a nonincreasing sequence so that,
for all i ∈ I≥0 , V (x(i)) ≤ V (x). From (B.11), |x(i)|A ≤ α1−1 (V (x)) ≤
α1−1 (α1 (ε)) = ε for all i ∈ I≥0 .

(b) Attractivity. Let x ∈ Rn be arbitrary. From (B.12) V (x) is finite, and


from (B.11) and (B.13), the sequence (V (x(i)))i∈I≥0 is nonincreasing
and bounded below by zero and therefore converges to V̄ ≥ 0 as i → ∞.
We next show that V̄ = 0. From (B.11) and (B.12) and the properties of
K∞ functions, we have that for all i ≥ 0,

α2−1 (V (x(i))) ≤ |x(i)|A ≤ α1−1 (V (x(i))) (B.14)

Assume for contradiction that V̄ > 0. Since α3 (·) is continuous and


positive definite and interval I := [α2−1 (V̄ ), α1−1 (V̄ )] is compact, the
following optimization has a positive solution

ρ := min_{|x|A ∈I} α3 (|x|A ) > 0

From repeated use of (B.13), we have that for all i ≥ 0


V (x(i)) ≤ V (x) − Σ_{j=0}^{i−1} α3 (|x(j)|A )

Since |x(i)|A converges to interval I where α3 (|x(i)|A ) is under-


bounded by ρ > 0, α3 (·) is continuous, and V (x) is finite, the in-
equality above implies that V (x(i)) → −∞ as i → ∞, which is a contra-
diction. Therefore V (x(i)) converges to V̄ = 0 and (B.14) implies x(i)
converges to A as i → ∞. ■

Next we establish the analogous Lyapunov stability theorem using


the stronger KL definition of GAS, Definition B.8. Before establishing
the Lyapunov stability theorem, it is helpful to present the following
lemma established by Jiang and Wang (2002, Lemma 2.8) that enables
us to assume when convenient that α3 (·) in (B.13) is a K∞ function
rather than just a continuous PD function.

Lemma B.14 (From PD to K∞ function (Jiang and Wang (2002))). As-


sume V (·) is a Lyapunov function for system x + = f (x) and set A, and
f (·) is locally bounded. Then there exists a smooth function5 ρ(·) ∈ K∞
such that W (·) := ρ ◦ V (·) is also a Lyapunov function for system
x + = f (x) and set A that satisfies for all x ∈ Rn

W (f (x)) − W (x) ≤ −α(|x|A )

with α(·) ∈ K∞ .
Note that Jiang and Wang (2002) prove this lemma under the as-
sumption that both f (·) and V (·) are continuous, but their proof re-
mains valid if both f (·) and V (·) are only locally bounded.
We next establish the Lyapunov stability theorem in which we add
the parenthetical (KL definition) purely for emphasis and to distinguish
this result from the previous classical result, but we discontinue this
emphasis after this theorem, and use exclusively the KL definition.
Theorem B.15 (Lyapunov function and global asymptotic stability (KL
definition)). Suppose that X is positive invariant and the set A ⊆ X is
closed and positive invariant for x + = f (x), and f (·) is locally bounded.
Suppose V (·) is a Lyapunov function for x + = f (x) and set A. Then A
is globally asymptotically stable (KL definition).
Proof. Due to Lemma B.14 we assume without loss of generality that
α3 ∈ K∞ . From (B.13) we have that

V (φ(i + 1; x)) ≤ V (φ(i; x)) − α3 (|φ(i; x)|A )    ∀x ∈ Rn , i ∈ I≥0

Using (B.12) we have that

α3 (|x|A ) ≥ α3 ◦ α2−1 (V (x)) ∀x ∈ Rn

Combining these we have that

V (φ(i + 1; x)) ≤ σ1 (V (φ(i; x))) ∀x ∈ Rn i ∈ I≥0

in which
σ1 (s) := s − α3 ◦ α2−1 (s)
We have that σ1 (·) is continuous on R≥0 , σ1 (0) = 0, and σ1 (s) < s for
s > 0. But σ1 (·) may not be increasing. We modify σ1 to achieve this
property in two steps. First define

σ2 (s) := max_{s ′ ∈[0,s]} σ1 (s ′ )    s ∈ R≥0

5A smooth function has derivatives of all orders.



in which the maximum exists for each s ∈ R≥0 because σ1 (·) is con-
tinuous. By its definition, σ2 (·) is nondecreasing, σ2 (0) = 0, and
0 ≤ σ2 (s) < s for s > 0, and we next show that σ2 (·) is continuous
on R≥0 . Assume that σ2 (·) is discontinuous at a point c ∈ R≥0 . Be-
cause it is a nondecreasing function, there is a positive jump in the
function σ2 (·) at c (Bartle and Sherbert, 2000, p. 150). Define 6

a1 := lim_{s↗c} σ2 (s)        a2 := lim_{s↘c} σ2 (s)

We have that σ1 (c) ≤ a1 < a2 or we violate the limit of σ2 from below.


Since σ1 (c) < a2 , σ1 (s) must achieve value a2 for some s < c or we
violate the limit from above. But σ1 (s) = a2 for s < c also violates the
limit from below, and we have a contradiction and σ2 (·) is continuous.
Finally, define

σ (s) := (1/2)(s + σ2 (s)) s ∈ R≥0

and we have that σ (·) is a continuous, strictly increasing, and un-


bounded function satisfying σ (0) = 0. Therefore, σ (·) ∈ K∞ , σ1 (s) <
σ (s) < s for s > 0 and therefore

V (φ(i + 1; x)) ≤ σ (V (φ(i; x))) ∀x ∈ Rn i ∈ I≥0 (B.15)

Repeated use of (B.15) and then (B.12) gives

V (φ(i; x)) ≤ σ i ◦ α2 (|x|A ) ∀x ∈ Rn i ∈ I≥0

in which σ i represents the composition of σ with itself i times. Using


(B.11) we have that

|φ(i; x)|A ≤ β(|x|A , i)    ∀x ∈ Rn , i ∈ I≥0

in which

β(s, i) := α1−1 ◦ σ i ◦ α2 (s) ∀s ∈ R≥0 i ∈ I≥0

For all s ≥ 0, the sequence wi := σ i (α2 (s)) is nonincreasing with i,


bounded below (by zero), and therefore converges to a, say, as i →
∞. Therefore, both wi → a and σ (wi ) → a as i → ∞. Since σ (·) is
continuous we also have that σ (wi ) → σ (a) so σ (a) = a, which implies
that a = 0, and we have shown that for all s ≥ 0, α1−1 ◦ σ i ◦ α2 (s) → 0 as
6 The limits from above and below exist because σ2 (·) is nondecreasing (Bartle and
Sherbert, 2000, p. 149). If the point c = 0, replace the limit from below by σ2 (0).

i → ∞. Since α1−1 (·) also is a K function, we also have that for all s ≥ 0,
α1−1 ◦ σ i ◦ α2 (s) is nonincreasing with i. We have from the properties
of K functions that for all i ≥ 0, α1−1 ◦ σ i ◦ α2 (s) is a K function,
and can therefore conclude that β(·) is a KL function and the proof is
complete. ■

Theorem B.15 provides merely a sufficient condition for global


asymptotic stability that might be thought to be conservative. Next
we establish a converse stability theorem that demonstrates necessity.
In this endeavor we require a useful preliminary result on KL functions
(Sontag, 1998b, Proposition 7)

Proposition B.16 (Improving convergence (Sontag (1998b))). Assume


that β(·) ∈ KL. Then there exist θ1 (·), θ2 (·) ∈ K∞ so that

β(s, t) ≤ θ1 (θ2 (s)e−t ) ∀s ≥ 0, ∀t ≥ 0 (B.16)

Theorem B.17 (Converse theorem for global asymptotic stability). Sup-


pose that the (closed, positive invariant) set A is globally asymptotically
stable for the system x + = f (x). Then there exists a Lyapunov function
for the system x + = f (x) and set A.

Proof. Since the set A is GAS we have that for each x ∈ Rn and i ∈ I≥0

|φ(i; x)|A ≤ β(|x|A , i)

in which β(·) ∈ KL. Using (B.16) then gives for each x ∈ Rn and
i ∈ I≥0

θ1−1 (|φ(i; x)|A ) ≤ θ2 (|x|A )e−i

in which θ1−1 (·) ∈ K∞ . Propose as Lyapunov function


V (x) = Σ_{i=0}^{∞} θ1−1 (|φ(i; x)|A )

Since φ(0; x) = x, we have that V (x) ≥ θ1−1 (|x|A ) and we choose


α1 (·) = θ1−1 (·) ∈ K∞ . Performing the sum gives


V (x) = Σ_{i=0}^{∞} θ1−1 (|φ(i; x)|A ) ≤ θ2 (|x|A ) Σ_{i=0}^{∞} e−i = (e/(e − 1)) θ2 (|x|A )

and we choose α2 (·) = (e/(e − 1))θ2 (·) ∈ K∞ . Finally, noting that


f (φ(i; x)) = φ(i + 1; x) for each x ∈ Rn , i ∈ I≥0 , we have that

V (f (x)) − V (x) = Σ_{i=0}^{∞} [θ1−1 (|f (φ(i; x))|A ) − θ1−1 (|φ(i; x)|A )]
                 = −θ1−1 (|φ(0; x)|A )
                 = −θ1−1 (|x|A )

and we choose α3 (·) = θ1−1 (·) ∈ K∞ , and the result is established. ■

The appropriate generalization of Theorem B.15 for the constrained


case is:
Theorem B.18 (Lyapunov function for asymptotic stability (con-
strained)). If there exists a Lyapunov function in X for the system x + =
f (x) and set A, then A is asymptotically stable in X for x + = f (x).
The proof of this result is similar to that of Theorem B.15 and is left
as an exercise.
Theorem B.19 (Lyapunov function for exponential stability). If there
exists V : X → R≥0 satisfying the following properties for all x ∈ X

a1 |x|A^σ ≤ V (x) ≤ a2 |x|A^σ
V (f (x)) − V (x) ≤ −a3 |x|A^σ

in which a1 , a2 , a3 , σ > 0, then A is exponentially stable in X for x + =


f (x).
Linear time-invariant systems. We review some facts involving the
discrete matrix Lyapunov equation and stability of the linear system

x + = Ax

in which x ∈ Rn . The discrete time system is asymptotically stable if


and only if the magnitudes of the eigenvalues of A are strictly less than
unity. Such an A matrix is called stable, convergent, or discrete time
Hurwitz.
In the following, A, S, Q ∈ Rn×n . The following matrix equation is
known as a discrete matrix Lyapunov equation,

A′ SA − S = −Q

The properties of solutions to this equation allow one to draw con-


clusions about the stability of A without computing its eigenvalues.
Sontag (1998a, p. 231) provides the following lemma

Lemma B.20 (Lyapunov function for linear systems). The following


statements are equivalent (Sontag, 1998a).
(a) A is stable.

(b) For each Q ∈ Rn×n , there is a unique solution S of the discrete matrix
Lyapunov equation
A′ SA − S = −Q

and if Q > 0 then S > 0.

(c) There is some S > 0 such that A′ SA − S < 0.

(d) There is some S > 0 such that V (x) = x ′ Sx is a Lyapunov function


for the system x + = Ax.

Exercise B.1 asks you to establish the equivalence of (a) and (b).
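For a concrete illustration of part (b) of Lemma B.20, the solution of the discrete matrix Lyapunov equation for a stable A can be computed from the convergent series S = Σ_{k≥0} (A′)^k Q A^k. The following Python sketch (the matrices are arbitrary illustrations) forms a truncation of this series and checks the residual and the positive definiteness of S:

import numpy as np

A = np.array([[0.9, 0.2],
              [0.0, 0.5]])          # stable: eigenvalues 0.9 and 0.5
Q = np.eye(2)                       # Q > 0

# S = sum_{k>=0} (A')^k Q A^k solves A'SA - S = -Q when A is stable.
S = np.zeros((2, 2))
term = Q.copy()
for _ in range(500):                # truncate the (geometrically convergent) series
    S += term
    term = A.T @ term @ A

print("residual |A'SA - S + Q| =", np.linalg.norm(A.T @ S @ A - S + Q))
print("eigenvalues of S =", np.linalg.eigvalsh(S))   # all positive, so S > 0

SciPy's scipy.linalg.solve_discrete_lyapunov(A.T, Q) computes the same S directly.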

B.3.2 Time-Varying, Constrained Systems

Following the discussion in Rawlings and Risbeck (2017), we consider


the nonempty sets X(i) ⊆ Rn indexed by i ∈ I≥0 . We define the time-
varying system
x + = f (x, i)

with f ( · , i) : X(i) → X(i + 1). We assume that f ( · , i) is locally


bounded for all i ∈ I≥0 . Note from the definition of f that the sets
X(i) satisfy positive invariance in the following sense: x ∈ X(i) for
any i ≥ 0 implies x(i + 1) := f (x, i) ∈ X(i + 1). We say that the set se-
quence (X(i))i≥0 is sequentially positive invariant to denote this form
of invariance.

Definition B.21 (Sequential positive invariance). A sequence of sets


(X(i))i≥0 is sequentially positive invariant for the system x + = f (x,
i) if for any i ≥ 0, x ∈ X(i) implies f (x, i) ∈ X(i + 1).

We again assume that A is closed and positive invariant for the


time-varying system, i.e., x ∈ A at any time i ≥ 0 implies f (x, i) ∈ A.
We also assume that A ⊆ X(i) for all i ≥ 0. We next define asymptotic
stability of A.

Definition B.22 (Asymptotic stability (time-varying, constrained)). Sup-


pose that the sequence (X(i))i≥0 is sequentially positive invariant and
the set A ⊆ X(i) for all i ≥ 0 is closed and positive invariant for
x + = f (x, i). The set A is asymptotically stable in X(i) at each time

i ≥ 0 for x + = f (x, i) if the following holds for all i ≥ i0 ≥ 0, and


x ∈ X(i0 )
|φ(i; x, i0 )|A ≤ β(|x|A , i − i0 )    (B.17)
in which β ∈ KL and φ(i; x, i0 ) is the solution to x + = f (x, i) at time
i ≥ i0 with initial condition x at time i0 ≥ 0.

This stability definition is somewhat restrictive because φ(i; x, i0 )


is bounded by a function depending on i − i0 rather than on i. For
example, to be more general we could define a time-dependent set of
KL functions, βj (·), j ≥ 0, and replace (B.17) with |φ(i; x, i0 )|A ≤
βi0 (|x|A , i) for all i ≥ i0 ≥ 0.
We define a time-varying Lyapunov function for this system as fol-
lows.

Definition B.23 (Lyapunov function: time-varying, constrained case).


Let the sequence (X(i))i≥0 be sequentially positive invariant, and the
set A ⊆ X(i) for all i ≥ 0 be closed and positive invariant. Let V (·,
i) : X(i) → R≥0 satisfy for all x ∈ X(i), i ∈ I≥0

α1 (|x|A ) ≤ V (x, i) ≤ α2 (|x|A )


V (f (x, i), i + 1) − V (x, i) ≤ −α3 (|x|A )

with α1 , α2 , α3 ∈ K∞ . Then V (·, ·) is a time-varying Lyapunov function


in the sequence (X(i))i≥0 for x + = f (x, i) and set A.

Note that f (x, i) ∈ X(i + 1) since x ∈ X(i) which verifies that


V (f (x, i), i + 1) is well defined for all x ∈ X(i), i ≥ 0. We then have the
following asymptotic stability result for the time-varying, constrained
case.

Theorem B.24 (Lyapunov theorem for asymptotic stability (time-vary-


ing, constrained)). Let the sequence (X(i))i≥0 be sequentially positive
invariant, and the set A ⊆ X(i) for all i ≥ 0 be closed and positive in-
variant, and V (·, ·) be a time-varying Lyapunov function in the sequence
(X(i))i≥0 for x + = f (x, i) and set A. Then A is asymptotically stable
in X(i) at each time i ≥ 0 for x + = f (x, i).

Proof. For x ∈ X(i0 ), we have that (φ(i; x, i0 ), i) ∈ X(i) for all i ≥ i0 .


From the first and second inequalities we have that for all i ≥ i0 and
x ∈ X(i0 )

V (φ(i + 1; x, i0 ), i + 1) ≤ V (φ(i; x, i0 ), i) − α3 (|φ(i; x, i0 )|A )
                           ≤ σ1 (V (φ(i; x, i0 ), i))

with σ1 (s) := s − α3 ◦ α2−1 (s). Note that σ1 (·) may not be K∞ because
it may not be increasing. But given this result we can find, as in the
proof of Theorem B.15, σ (·) ∈ K∞ satisfying σ1 (s) < σ (s) < s for all
s ∈ R>0 such that V (φ(i + 1; x, i0 ), i + 1) ≤ σ (V (φ(i; x, i0 ), i)). We
then have that

|φ(i; x, i0 )|A ≤ β(|x|A , i − i0 )    ∀x ∈ X(i0 ), i ≥ i0

in which β(s, i) := α1−1 ◦ σ i ◦ α2 (s) for s ∈ R≥0 , i ≥ 0 is a KL function,


and the result is established. ■

B.3.3 Upper bounding K functions

In using Lyapunov functions for stability analysis, we often have to


establish that the upper bound inequality holds on some closed set.
The following result proves useful in such situations.

Proposition B.25 (Global K function overbound). Let X ⊆ Rn be closed


and suppose that a function V : X → R≥0 is continuous at x0 ∈ X and
locally bounded on X, i.e., bounded on every compact subset of X. Then,
there exists a K function α such that

|V (x) − V (x0 )| ≤ α(|x − x0 |) for all x ∈ X

A proof is given in Rawlings and Risbeck (2015).

B.4 Robust Stability


We now turn to the task of obtaining stability conditions for discrete
time systems subject to disturbances. There are two separate questions
that should be addressed. The first is nominal robustness; is asymp-
totic stability of a set A for a (nominal) system x + = f (x) maintained
in the presence of arbitrarily small disturbances? The second question
is the determination of conditions for asymptotic stability of a set A
for a system perturbed by disturbances lying in a given compact set.

B.4.1 Nominal Robustness

Here we follow Teel (2004). The nominal system is x + = f (x). Con-


sider the perturbed system

x + = f (x + e) + w (B.18)

where e is the state error and w the additive disturbance. Let e :=


(e(0), e(1), . . .) and w := (w(0), w(1), . . .) denote the disturbance se-
quences with norms ∥e∥ := supi≥0 |e(i)| and ∥w∥ := supi≥0 |w(i)|. Let
Mδ := {(e, w) | ∥e∥ ≤ δ, ∥w∥ ≤ δ} and, for each x ∈ Rn , let Sδ denote
the set of solutions φ(·; x, e, w) of (B.18) with initial state x (at time
0) and perturbation sequences (e, w) ∈ Mδ . A closed, compact set A
is nominally robustly asymptotically stable for the (nominal) system
x + = f (x) if a small neighborhood of A is locally stable and attractive
for all sufficiently small perturbation sequences. We use the adjective
nominal to indicate that we are examining how a system x + = f (x) for
which A is known to be asymptotically stable behaves when subjected
to small disturbances. More precisely Teel (2004):

Definition B.26 (Nominal robust global asymptotic stability). The


closed, compact set A is said to be nominally robustly globally asymp-
totically stable (nominally RGAS) for the system x + = f (x) if there
exists a KL function β(·) and, for each ε > 0 and each compact set X,
there exists a δ > 0 such that, for each x ∈ X and each solution φ(·)
of the perturbed system lying in Sδ , |φ(i)|A ≤ β(|x|A , i) + ε for all
i ∈ I≥0 .

Thus, for each ε > 0, there exists a δ > 0 such that each solution
φ(·) of x + = f (x +e)+w starting in a δ neighborhood of A remains in
a β(δ, 0) + ε neighborhood of A, and each solution starting anywhere
in Rn converges to a ε neighborhood of A. These properties are a
necessary relaxation (because of the perturbations) of local stability
and global attractivity.

Remark. What we call “nominally robustly globally asymptotically sta-


ble” in the above definition is called “robustly globally asymptotically
stable” in Teel (2004); we use the term “nominal” to indicate that we
are concerned with the effect of perturbations e and w on the stabil-
ity properties of a “nominal” system x + = f (x) for which asymptotic
stability of a set A has been established (in the absence of perturba-
tions). We use the expression “A is globally asymptotically stable for
x + = f (x + e) + w” to refer to the case when asymptotic stability of a
set A has been established for the perturbed system x + = f (x +e)+w.

The following result, where we add the adjective “nominal”, is es-


tablished in (Teel, 2004, Theorem 2):

Theorem B.27 (Nominal robust global asymptotic stability and Lya-


punov function). Suppose set A is closed and compact and f (·) is locally

bounded. Then the set A is nominally robustly globally asymptotically


stable for the system x + = f (x) if and only if there exists a continuous
(in fact, smooth) Lyapunov function for x + = f (x) and set A.
The significance of this result is that while a nonrobust system, for
which A is globally asymptotically stable, has a Lyapunov function,
that function is not continuous. For the globally asymptotically sta-
ble example x + = f (x) discussed in Section 3.2 of Chapter 3, where
f (x) = (0, |x|) when x1 ≠ 0 and f (x) = (0, 0) otherwise, one Lya-
punov function V (·) is V (x) = 2 |x| if x1 ≠ 0 and V (x) = |x| if x1 = 0.
That V (·) is a Lyapunov function follows from the fact that it satisfies
V (x) ≥ |x|, V (x) ≤ 2 |x| and V (f (x)) − V (x) = − |x| for all x ∈ R2 .
It follows immediately from its definition that V (·) is not continuous;
but we can also deduce from Theorem B.27 that every Lyapunov func-
tion for this system is not continuous since, as shown in Section 3.2
of Chapter 3, global asymptotic stability for this system is not robust.
Theorem B.27 shows that existence of a continuous Lyapunov function
guarantees nominal robustness. Also, it follows from Theorem B.17
that there exists a smooth Lyapunov function for x + = f (x) if f (·) is
continuous and A is GAS for x + = f (x). Since f (·) is locally bounded
if it is continuous, it then follows from Theorem B.27 that A is nomi-
nally robust GAS for x + = f (x) if it is GAS and f (·) is continuous.
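These inequalities for the example V (·) are easily confirmed numerically. A minimal Python sketch (illustrative only) checks V (f (x)) − V (x) = − |x| at randomly sampled states:

import numpy as np

def f(x):
    # The example from Section 3.2 of Chapter 3.
    return np.array([0.0, np.hypot(x[0], x[1])]) if x[0] != 0 else np.zeros(2)

def V(x):
    n = np.hypot(x[0], x[1])
    return 2.0*n if x[0] != 0 else n

rng = np.random.default_rng(2)
for _ in range(10000):
    x = rng.normal(size=2)
    assert abs(V(f(x)) - V(x) + np.hypot(x[0], x[1])) < 1e-12

print("V(f(x)) - V(x) = -|x| verified at the sampled states")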

B.4.2 Robustness

We turn now to stability conditions for systems subject to bounded


disturbances (not vanishingly small) and described by

x + = f (x, w) (B.19)

where the disturbance w lies in the compact set W. This system may
equivalently be described by the difference inclusion

x + ∈ F (x) (B.20)

where the set F (x) := {f (x, w) | w ∈ W}. Let S(x) denote the set
of all solutions of (B.19) or (B.20) with initial state x. We require, in
the sequel, that the closed set A is positive invariant for (B.19) (or for
x + ∈ F (x)):
Definition B.28 (Positive invariance with disturbances). The closed set
A is positive invariant for x + = f (x, w), w ∈ W if x ∈ A implies
f (x, w) ∈ A for all w ∈ W; it is positive invariant for x + ∈ F (x) if
x ∈ A implies F (x) ⊆ A.

Clearly the two definitions are equivalent; A is positive invariant


for x + = f (x, w), w ∈ W, if and only if it is positive invariant for
x + ∈ F (x).

Remark. In the MPC literature, but not necessarily elsewhere, the term
robust positive invariant is often used in place of positive invariant to
emphasize that positive invariance is maintained despite the presence
of the disturbance w. However, since the uncertain system x + = f (x,
w), w ∈ W is specified (x + = f (x, w), w ∈ W or x + ∈ F (x)) in the
assertion that a closed set A is positive invariant, the word “robust”
appears to be unnecessary. In addition, in the systems literature, the
closed set A is said to be robust positive invariant for x + ∈ F (x) if it
satisfies conditions similar to those of Definition B.26 with x + ∈ F (x)
replacing x + = f (x); see Teel (2004), Definition 3.

In Definitions B.29–B.31, we use “positive invariant” to denote “pos-


itive invariant for x + = f (x, w), w ∈ W” or for x + ∈ F (x).

Definition B.29 (Local stability (disturbances)). The closed, positive


invariant set A is locally stable for x + = f (x, w), w ∈ W (or for
x + ∈ F (x)) if, for all ε > 0, there exists a δ > 0 such that, for each
x satisfying |x|A < δ, each solution φ(·) ∈ S(x) satisfies |φ(i)|A < ε
for all i ∈ I≥0 .

Definition B.30 (Global attraction (disturbances)). The closed, positive


invariant set A is globally attractive for the system x + = f (x, w), w ∈
W (or for x + ∈ F (x)) if, for each x ∈ Rn , each solution φ(·) ∈ S(x)
satisfies |φ(i)|A → 0 as i → ∞.

Definition B.31 (GAS (disturbances)). The closed, positive invariant set


A is globally asymptotically stable for x + = f (x, w), w ∈ W (or for
x + ∈ F (x)) if it is locally stable and globally attractive.

An alternative definition of global asymptotic stability of closed set


A for x + = f (x, w), w ∈ W, if A is compact, is the existence of a
KL function β(·) such that for each x ∈ Rn , each φ ∈ S(x) satisfies
|φ(i)|A ≤ β(|x|A , i) for all i ∈ I≥0 . To cope with disturbances we
require a modified definition of a Lyapunov function.

Definition B.32 (Lyapunov function (disturbances)). A function V :


Rn → R≥0 is said to be a Lyapunov function for the system x + = f (x,
w), w ∈ W (or for x + ∈ F (x)) and closed set A if there exist functions

αi ∈ K∞ , i = 1, 2, 3 such that for any x ∈ Rn ,

V (x) ≥ α1 (|x|A ) (B.21)


V (x) ≤ α2 (|x|A ) (B.22)
sup V (z) − V (x) ≤ −α3 (|x|A ) (B.23)
z∈F (x)

Remark. Without loss of generality, we can choose the function α3 (·)


in (B.23) to be a class K∞ function if f (·) is continuous (see Jiang and
Wang (2002), Lemma 2.8).

Inequality B.23 ensures V (f (x, w))−V (x) ≤ −α3 (|x|A ) for all w ∈
W. The existence of a Lyapunov function for the system x + ∈ F (x) and
closed set A is a sufficient condition for A to be globally asymptotically
stable for x + ∈ F (x) as shown in the next result.

Theorem B.33 (Lyapunov function for global asymptotic stability (dis-


turbances)). Suppose V (·) is a Lyapunov function for x + = f (x, w),
w ∈ W (or for x + ∈ F (x)) and closed set A with α3 (·) a K∞ function.
Then A is globally asymptotically stable for x + = f (x, w), w ∈ W (or
for x + ∈ F (x)).

Proof. (i) Local stability: Let ε > 0 be arbitrary and let δ := α2−1 (α1 (ε)).
Suppose |x|A < δ so that, by (B.22), V (x) ≤ α2 (δ) = α1 (ε). Let φ(·)
be any solution in S(x) so that φ(0) = x. From (B.23), (V (φ(i)))i∈I≥0
is a nonincreasing sequence so that, for all i ∈ I≥0 , V (φ(i)) ≤ V (x).
From (B.21), |φ(i)|A ≤ α1−1 (V (x)) ≤ α1−1 (α1 (ε)) = ε for all i ∈ I≥0 .
(ii) Global attractivity: Let x ∈ Rn be arbitrary. Let φ(·) be any solu-
tion in S(x) so that φ(0) = x. From Equations B.21 and B.23, since
φ(i + 1) ∈ F (φ(i)), the sequence (V (φ(i)))i∈I≥0 is nonincreasing and
bounded from below by zero. Hence both V (φ(i)) and V (φ(i+1)) con-
verge to V̄ ≥ 0 as i → ∞. But φ(i + 1) ∈ F (φ(i)) so that, from (B.23),
α3 (|φ(i)|A ) → 0 as i → ∞. Since |φ(i)|A = α3−1 (α3 (|φ(i)|A )) where
α3−1 (·) is a K∞ function, |φ(i)|A → 0 as i → ∞. ■

B.5 Control Lyapunov Functions


A control Lyapunov function is a useful generalization, due to Sontag
(1998a, pp.218–233), of a Lyapunov function; while a Lyapunov func-
tion is relevant for a system x + = f (x) and provides conditions for the
(asymptotic) stability of a set for this system, a control Lyapunov func-
tion is relevant for a control system x + = f (x, u) and provides condi-

tions for the existence of a controller u = κ(x) that ensures (asymp-


totic) stability of a set for the controlled system x + = f (x, κ(x)). Con-
sider the control system
x + = f (x, u)

where the control u is subject to the constraint

u∈U

Our standing assumptions in this section are that f (·) is continuous


and U is compact.

Definition B.34 (Global control Lyapunov function (CLF)). A function


V : Rn → R≥0 is a global control Lyapunov function for the system
x + = f (x, u), u ∈ U, and closed set A if there exist K∞ functions
α1 (·), α2 (·), α3 (·) satisfying for all x ∈ Rn :

α1 (|x|A ) ≤ V (x) ≤ α2 (|x|A )


inf_{u∈U} V (f (x, u)) − V (x) ≤ −α3 (|x|A )

Definition B.35 (Global stabilizability). Let set A be compact. The set


A is globally stabilizable for the system x + = f (x, u) if there exists a
state-feedback function κ : Rn → U such that A is globally asymptoti-
cally stable for x + = f (x, κ(x)).

Remark. Given a global control Lyapunov function V (·), one can


choose a control law κ : Rn → U satisfying

V (f (x, κ(x))) ≤ V (x) − α3 (|x|A )/2

for all x ∈ Rn (see Teel (2004)). Since U is compact, κ(·) is locally
bounded and, hence, so is x ↦ f (x, κ(x)). Thus we may use Theorem
B.13 to deduce that A is globally asymptotically stable for x + = f (x,
κ(x)). If V (·) is continuous, one can also establish nominal robustness
properties.

In a similar fashion one can extend the concept of control Lyapunov


functions to the case when the system is subject to disturbances. Con-
sider the system
x + = f (x, u, w)

where the control u is constrained to lie in U and the disturbance takes


values in the set W. We assume that f (·) is continuous and that U and
W are compact. The system may be equivalently defined by

x + ∈ F (x, u)

where the set-valued function F (·) is defined by

F (x, u) := {f (x, u, w) | w ∈ W}

We can now make the obvious generalizations of the definitions in Sec-


tion B.4.2.

Definition B.36 (Positive invariance (disturbance and control)). The


closed set A is positive invariant for x + = f (x, u, w), w ∈ W (or for
x + ∈ F (x, u)) if for all x ∈ A there exists a u ∈ U such that f (x, u,
w) ∈ A for all w ∈ W (or F (x, u) ⊆ A).

Definition B.37 (CLF (disturbance and control)). A function V : Rn →


R≥0 is said to be a control Lyapunov function for the system x + = f (x,
u, w), u ∈ U, w ∈ W (or x + ∈ F (x, u), u ∈ U) and set A if there exist
functions αi ∈ K∞ , i = 1, 2, 3 such that for any x ∈ Rn ,

α1 (|x|A ) ≤ V (x) ≤ α2 (|x|A )


inf_{u∈U} sup_{z∈F (x,u)} V (z) − V (x) ≤ −α3 (|x|A )    (B.24)

Remark (CLF implies control law). Given a global control Lyapunov


function V (·), one can choose a control law κ : Rn → U satisfying

sup_{z∈F (x,κ(x))} V (z) ≤ V (x) − α3 (|x|A )/2

for all x ∈ Rn . Since U is compact, κ(·) is locally bounded and, hence,


so is x ↦ f (x, κ(x)). Thus we may use Theorem B.33 to deduce that
A is globally asymptotically stable for x + = f (x, κ(x), w), w ∈ W (for
x + ∈ F (x, κ(x))).

These results can be further extended to deal with the constrained


case. First, we generalize the definitions of positive invariance of a
set.

Definition B.38 (Control invariance (constrained)). The closed set A is


control invariant for x + = f (x, u), u ∈ U if, for all x ∈ A, there exists
a u ∈ U such that f (x, u) ∈ A.

Suppose that the state x is required to lie in the closed set X ⊂ Rn .


In order to show that it is possible to ensure a decrease of a Lyapunov
function, as in (B.24), in the presence of the state constraint x ∈ X, we
assume that there exists a control invariant set X ⊆ X for x + = f (x,
u, w), u ∈ U, w ∈ W. This enables us to obtain a control law that
keeps the state in X and, hence, in X, and, under suitable conditions,
to satisfy a variant of (B.24).

Definition B.39 (CLF (constrained)). Suppose the set X and closed set
A, A ⊂ X, are control invariant for x + = f (x, u), u ∈ U. A function
V : X → R≥0 is said to be a control Lyapunov function in X for the
system x + = f (x, u), u ∈ U, and closed set A in X if there exist
functions αi ∈ K∞ , i = 1, 2, 3, defined on X, such that for any x ∈ X,

α1 (|x|A ) ≤ V (x) ≤ α2 (|x|A )


inf_{u∈U} {V (f (x, u)) | f (x, u) ∈ X} − V (x) ≤ −α3 (|x|A )

Remark. Again, if V (·) is a control Lyapunov function in X for x + =


f (x, u), u ∈ U and closed set A in X, one can choose a control law
κ : Rn → U satisfying

V (f (x, κ(x))) − V (x) ≤ −α3 (|x|A )/2

for all x ∈ X. Since U is compact, κ(·) is locally bounded and, hence,


so is x ↦ f (x, κ(x)). Thus, when α3 (·) is a K∞ function, we may use
Theorem B.18 to deduce that A is asymptotically stable for x + = f (x,
κ(x)), u ∈ U in X; also φ(i; x) ∈ X ⊂ X for all x ∈ X, all i ∈ I≥0 .

Finally we consider the constrained case in the presence of distur-


bances. First we define control invariance in the presence of distur-
bances.

Definition B.40 (Control invariance (disturbances, constrained)). The


closed set A is control invariant for x + = f (x, u, w), u ∈ U, w ∈ W
if, for all x ∈ A, there exists a u ∈ U such that f (x, u, w) ∈ A for all
w ∈ W (or F (x, u) ⊆ A where F (x, u) := {f (x, u, w) | w ∈ W}).

Next, we define what we mean by a control Lyapunov function in


this context.

Definition B.41 (CLF (disturbances, constrained)). Suppose the set X


and closed set A, A ⊂ X, are control invariant for x + = f (x, u, w),
u ∈ U, w ∈ W. A function V : X → R≥0 is said to be a control Lyapunov

function in X for the system x + = f (x, u, w), u ∈ U, w ∈ W and set


A if there exist functions αi ∈ K∞ , i = 1, 2, 3, defined on X, such that
for any x ∈ X,

α1 (|x|A ) ≤ V (x) ≤ α2 (|x|A )


inf_{u∈U} sup_{z∈F (x,u)∩X} V (z) − V (x) ≤ −α3 (|x|A )

Suppose now that the state x is required to lie in the closed set
X ⊂ Rn . Again, in order to show that there exists a condition similar
to (B.24), we assume that there exists a control invariant set X ⊆ X for
x + = f (x, u, w), u ∈ U, w ∈ W. This enables us to obtain a control
law that keeps the state in X and, hence, in X, and, under suitable
conditions, to satisfy a variant of (B.24).

Remark. If V (·) is a control Lyapunov function in X for x + = f (x, u, w),
u ∈ U, w ∈ W and set A in X, one can choose a control law κ : X → U
satisfying

sup_{z∈F (x,κ(x))} V (z) − V (x) ≤ −α3 (|x|A )/2

for all x ∈ X. Since U is compact, κ(·) is locally bounded and, hence,


so is x ↦ f (x, κ(x)). Thus, when α3 (·) is a K∞ function, we may
use Theorem B.18 to deduce that A is asymptotically stable in X for
x + = f (x, κ(x), w), w ∈ W (or, equivalently, for x + ∈ F (x, κ(x))); also
φ(i) ∈ X ⊂ X for all x ∈ X, all i ∈ I≥0 , all φ ∈ S(x).

B.6 Input-to-State Stability


We consider, as in the previous section, the system

x + = f (x, w)

where the disturbance w takes values in Rp . In input-to-state stability


(Sontag and Wang, 1995; Jiang and Wang, 2001) we seek a bound on
the state in terms of a uniform bound on the disturbance sequence
w := (w(0), w(1), . . .). Let ∥·∥ denote the usual ℓ∞ norm for sequences,
i.e., ∥w∥ := supk≥0 |w(k)|.

Definition B.42 (Input-to-state stable (ISS)). The system x + = f (x, w)


is (globally) input-to-state stable (ISS) if there exists a KL function β(·)
and a K function σ (·) such that, for each x ∈ Rn , and each disturbance
sequence w = (w(0), w(1), . . .) in ℓ∞

|φ(i; x, wi )| ≤ β(|x| , i) + σ (∥wi ∥)



for all i ∈ I≥0 , where φ(i; x, wi ) is the solution, at time i, if the initial
state is x at time 0 and the input sequence is wi := (w(0), w(1), . . . , w(i − 1)).

We note that this definition implies the origin is globally asymptot-


ically stable if the input sequence is identically zero. Also, the norm
of the state is asymptotically bounded by σ (∥w∥) where w := (w(0),
w(1), . . .). As before, we seek a Lyapunov function that ensures input-
to-state stability.
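As a simple illustration of Definition B.42, the scalar system x + = ax + w with |a| < 1 is ISS with β(s, i) = a^i s and σ (s) = s/(1 − a). The following Python sketch (the values of a, the initial state, and the disturbance sequence are arbitrary illustrations) checks this bound along a sampled trajectory:

import numpy as np

a = 0.7                                  # |a| < 1, so x+ = a x + w is ISS
rng = np.random.default_rng(1)

x0 = 5.0
w = rng.uniform(-1.0, 1.0, size=200)     # bounded disturbance sequence
w_norm = np.max(np.abs(w))               # sup norm of the disturbance sequence

x = x0
for i in range(len(w)):
    # check |x(i)| <= beta(|x0|, i) + sigma(||w||)
    assert abs(x) <= a**i * abs(x0) + w_norm/(1.0 - a) + 1e-12
    x = a*x + w[i]

print("ISS bound satisfied along the sampled trajectory")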

Definition B.43 (ISS-Lyapunov function). A function V : Rn → R≥0 is


an ISS-Lyapunov function for system x + = f (x, w) if there exist K∞
functions α1 (·), α2 (·), α3 (·) and a K function σ (·) such that for all
x ∈ Rn , w ∈ Rp

α1 (|x|) ≤ V (x) ≤ α2 (|x|)


V (f (x, w)) − V (x) ≤ −α3 (|x|) + σ (|w|)

The following result appears in Jiang and Wang (2001, Lemma 3.5)

Lemma B.44 (ISS-Lyapunov function implies ISS). Suppose f (·) is con-


tinuous and that there exists a continuous ISS-Lyapunov function for
x + = f (x, w). Then the system x + = f (x, w) is ISS.

The converse, i.e., input-to-state stability implies the existence of


a smooth ISS-Lyapunov function for x + = f (x, w) is also proved in
Jiang and Wang (2002, Theorem 1). We now consider the case when the
state satisfies the constraint x ∈ X where X is a closed subset of Rn .
Accordingly, we assume that the disturbance w satisfies w ∈ W where
W is a compact set containing the origin and that X ⊂ X is a closed
robust positive invariant set for x + = f (x, w), w ∈ W or, equivalently,
for x + ∈ F (x).

Definition B.45 (ISS (constrained)). Suppose that W is a compact set


containing the origin and that X ⊂ X is a closed robust positive invari-
ant set for x + = f (x, w), w ∈ W. The system x + = f (x, w), w ∈ W is
ISS in X if there exists a class KL function β(·) and a class K function
σ (·) such that, for all x ∈ X, all w ∈ W where W is the set of infinite
sequences w satisfying w(i) ∈ W for all i ∈ I≥0

|φ(i; x, wi )| ≤ β(|x| , i) + σ (∥wi ∥)

Definition B.46 (ISS-Lyapunov function (constrained)). A function V :


X → R≥0 is an ISS-Lyapunov function in X for system x + = f (x, w)

if there exist K∞ functions α1 (·), α2 (·), α3 (·) and a K function σ (·)


such that for all x ∈ X, all w ∈ W

α1 (|x|) ≤ V (x) ≤ α2 (|x|)


V (f (x, w)) − V (x) ≤ −α3 (|x|) + σ (|w|)

The following result is a minor generalization of Lemma 3.5 in Jiang


and Wang (2001).

Lemma B.47 (ISS-Lyapunov function implies ISS (constrained)). Suppose


that W is a compact set containing the origin and that X ⊂ X is a closed
robust positive invariant set for x + = f (x, w), w ∈ W. If f (·) is contin-
uous and there exists a continuous ISS-Lyapunov function in X for the
system x + = f (x, w), w ∈ W, then the system x + = f (x, w), w ∈ W is
ISS in X.

B.7 Output-to-State Stability and Detectability


We present some definitions and results that are discrete time versions
of results due to Sontag and Wang (1997) and Krichman, Sontag, and
Wang (2001). The output-to-state (OSS) property corresponds, infor-
mally, to the statement that “no matter what the initial state is, if the
observed outputs are small, then the state must eventually be small”. It
is therefore a natural candidate for the concept of nonlinear (zero-state)
detectability. We consider first the autonomous system

x + = f (x) y = h(x) (B.25)

where f (·) : X → X is locally Lipschitz continuous and h(·) is contin-


uously differentiable where X = Rn for some n. We assume x = 0
is an equilibrium state, i.e., f (0) = 0. We also assume h(0) = 0. We
use φ(k; x0 ) to denote the solution of (B.25) with initial state x0 , and
y(k; x0 ) to denote h(φ(k; x0 )). The function yx0 (·) is defined by

yx0 (k) := y(k; x0 )

We use |·| and ∥·∥ to denote, respectively the Euclidean norm of a


vector and the sup norm of a sequence; ∥·∥0:k denotes the max norm
of a sequence restricted to the interval [0, k]. For conciseness, u, y
 
denote, respectively, the sequences (u(j)) and (y(j)).

Definition B.48 (Output-to-state stable (OSS)). The system (B.25) is


output-to-state stable (OSS) if there exist functions β(·) ∈ KL and
γ(·) ∈ K such that for all x0 ∈ Rn and all k ≥ 0

|x(k; x0 )| ≤ max{β(|x0 | , k), γ(∥y∥0:k )}

Definition B.49 (OSS-Lyapunov function). An OSS-Lyapunov function


for system (B.25) is any function V (·) with the following properties
(a) There exist K∞ functions α1 (·) and α2 (·) such that

α1 (|x|) ≤ V (x) ≤ α2 (|x|)

for all x in Rn .

(b) There exist K∞ functions α(·) and σ (·) such that for all x ∈ Rn
either
V (x + ) ≤ V (x) − α(|x|) + σ (|y|)
or
V (x + ) ≤ ρV (x) + σ (|y|)    (B.26)
with x + = f (x), y = h(x), and ρ ∈ (0, 1).

Inequality (B.26) corresponds to an exponential-decay OSS-


Lyapunov function.

Theorem B.50 (OSS and OSS-Lyapunov function). The following prop-


erties are equivalent for system (B.25):
(a) The system is OSS.

(b) The system admits an OSS-Lyapunov function.

(c) The system admits an exponential-decay OSS-Lyapunov function.

B.8 Input/Output-to-State Stability


Consider now a system with both inputs and outputs

x + = f (x, u) y = h(x) (B.27)

Input/output-to-state stability corresponds roughly to the statement


that, no matter what the initial state is, if the input and the output con-
verge to zero, so does the state. We assume f (·) and h(·) are contin-
uous. We also assume f (0, 0) = 0 and h(0) = 0. Let x(·, x0 , u) denote
the solution of (B.27) which results from initial state x0 and control

u = (u(j))j≥0 and let yx0 ,u (k) := y(k; x0 , u) denote h(x(k; x0 , u)).

Definition B.51 (Input/output-to-state stable (IOSS)). The system (B.27)


is input/output-to-state stable (IOSS) if there exist functions β(·) ∈ KL
and γ1 (·), γ2 (·) ∈ K such that
 
|x(k; x0 , u)| ≤ max{β(|x0 | , k), γ1 (∥u∥0:k−1 ), γ2 (∥y∥0:k )}

for every initial state x0 ∈ Rn , every control sequence u = (u(j)), and
all k ≥ 0.

Definition B.52 (IOSS-Lyapunov function). An IOSS-Lyapunov function


for system (B.27) is any function V (·) with the following properties:
(a) There exist K∞ functions α1 (·) and α2 (·) such that

α1 (|x|) ≤ V (x) ≤ α2 (|x|)

for all x ∈ Rn .

(b) There exist K∞ functions α(·), σ1 (·), and σ2 (·) such that for every
x and u either

V (x + ) ≤ V (x) − α(|x|) + σ1 (|u|) + σ2 (|y|)

or
V (x + ) ≤ ρV (x) + σ1 (|u|) + σ2 (|y|)
with x + = f (x, u), y = h(x), and ρ ∈ (0, 1).

The following result proves useful when establishing that MPC em-
ploying cost functions based on the inputs and outputs rather than
inputs and states is stabilizing for IOSS systems. Consider the system
x + = f (x, u), y = h(x) with stage cost ℓ(y, u) and constraints (x,
u) ∈ Z. The stage cost satisfies ℓ(0, 0) = 0 and ℓ(y, u) ≥ α(|(y, u)|)
for all (y, u) ∈ Rp × Rm with α a K∞ function. Let X := {x |
∃u with (x, u) ∈ Z}.

Theorem B.53 (Modified IOSS-Lyapunov function). Assume that there


exists an IOSS-Lyapunov function V : X → R≥0 for the constrained system
x + = f (x, u) such that the following holds for all (x, u) ∈ Z for which
f (x, u) ∈ X

α1 (|x|) ≤ V (x) ≤ α2 (|x|)


V (f (x, u)) − V (x) ≤ −α3 (|x|) + σ (ℓ(y, u))

with α1 , α2 , α3 ∈ K∞ and σ ∈ K.

For any α4 ∈ K∞ , there exists another IOSS-Lyapunov function Λ :


X → R≥0 for the constrained system x + = f (x, u) such that the following
holds for all (x, u) ∈ Z for which f (x, u) ∈ X
α1 (|x|) ≤ Λ(x) ≤ α2 (|x|)
Λ(f (x, u)) − Λ(x) ≤ −ρ(|x|) + α4 (ℓ(y, u))
with α1 , α2 ∈ K∞ and continuous function ρ ∈ PD. Note that Λ = γ ◦ V
for some γ ∈ K.
Conjecture B.54 (IOSS and IOSS-Lyapunov function). The following
properties are equivalent for system (B.27):
(a) The system is IOSS.

(b) The system admits a smooth IOSS-Lyapunov function.

(c) The system admits an exponential-decay IOSS-Lyapunov function.


As discussed in the Notes section of Chapter 2, Grimm, Messina,
Tuna, and Teel (2005) use a storage function like Λ(·) in Theorem B.53
to treat a semidefinite stage cost. Cai and Teel (2008) provide a dis-
crete time converse theorem for IOSS that holds for all Rn . Allan and
Rawlings (2018) provide the converse theorem on closed positive in-
variant sets (Theorem 36), and also provide a lemma for changing the
supply rate function (Theorem 38).

B.9 Incremental-Input/Output-to-State Stability


Definition B.55 (Incremental input/output-to-state stable). The system
(B.27) is incrementally input/output-to-state stable (i-IOSS) if there ex-
ists some β(·) ∈ KL and γ1 (·), γ2 (·) ∈ K such that, for every two

initial states z1 and z2 and any two control sequences u1 = (u1 (j))
and u2 = (u2 (j))

|x(k; z1 , u1 ) − x(k; z2 , u2 )| ≤
    max{β(|z1 − z2 | , k), γ1 (∥u1 − u2 ∥0:k−1 ), γ2 (∥yz1 ,u1 − yz2 ,u2 ∥0:k )}

B.10 Observability
Definition B.56 (Observability). The system (B.27) is (uniformly) observ-
able if there exists a positive integer N and an α(·) ∈ K such that
Σ_{j=0}^{k−1} |h(x(j; x, u)) − h(x(j; z, u))| ≥ α(|x − z|)    (B.28)

for all x, z, all k ≥ N and all control sequences u; here x(j; z, u) =


φ(j; z, u), the solution of (B.27) when the initial state is z at time 0 and
the control sequence is u.

When the system is linear, i.e., f (x, u) = Ax + Bu and h(x) = Cx,


this assumption is equivalent to assuming the observability Gramian
Σ_{j=0}^{n−1} (Aj )′ C ′ CAj is positive definite. Consider the system described
by
z+ = f (z, u) + w y + v = h(z) (B.29)

with output yw = y + v. Let z(k; z, u, w) denote the solution, at time


k of (B.29) if the state at time 0 is z, the control sequence is u and the
disturbance sequence is w. We assume, in the sequel, that

Assumption B.57 (Lipschitz continuity of model).


(a) The function f (·) is globally Lipschitz continuous in Rn × U with
Lipschitz constant c.

(b) The function h(·) is globally Lipschitz continuous in Rn with Lips-


chitz constant c.

Lemma B.58 (Lipschitz continuity and state difference bound). Suppose


Assumption B.57 is satisfied (with Lipschitz constant c). Then,

|x(k; x, u) − z(k; z, u, w)| ≤ c^k |x − z| + Σ_{i=0}^{k−1} c^{k−i−1} |w(i)|

Proof. Let δ(k) := |x(k; x, u) − z(k; z, u, w)|. Then

δ(k + 1) = |f (x(k; x, u), u(k)) − f (z(k; z, u, w), u(k)) − w(k)|
         ≤ c δ(k) + |w(k)|

Iterating this inequality yields the desired result. ■

Theorem B.59 (Observability and convergence of state). Suppose (B.27)


is (uniformly) observable and that Assumption B.57 is satisfied. Then,
w(k) → 0 and v(k) → 0 as k → ∞ imply |x(k; x, u) − z(k; z, u, w)| → 0
as k → ∞.

Proof. Let x(k) and z(k) denote x(k; x, u) and z(k; z, u, w), respec-
tively, in the sequel. Since (B.27) is observable, there exists an integer

N satisfying (B.28). Consider the sum


S(k) = Σ_{j=k}^{k+N} |v(j)| = Σ_{j=k}^{k+N} |h(x(j; x, u)) − h(z(j; z, u, w))|

     ≥ Σ_{j=k}^{k+N} |h(x(j; x(k), u)) − h(x(j; z(k), u))|

       − Σ_{j=k}^{k+N} |h(x(j; z(k), u)) − h(z(j; z(k), u, w))|    (B.30)

where we have used the fact that |a + b| ≥ |a| − |b|. By the assumption
of observability
Σ_{j=k}^{k+N} |h(x(j; x(k), u)) − h(x(j; z(k), u))| ≥ α(|x(k) − z(k)|)

for all k. From Lemma B.58 and the Lipschitz assumption on h(·)

|h(x(j; z(k), u)) − h(z(j; z(k), u, w))| ≤
    c |x(j; z(k), u) − z(j; z(k), u, w)| ≤ c Σ_{i=k}^{j−1} c^{j−1−i} |w(i)|

for all j in {k + 1, k + 2, . . . , k + N}. Hence there exists a d ∈ (0, ∞) such
that the last term in (B.30) satisfies

Σ_{j=k}^{k+N} |h(x(j; z(k), u)) − h(z(j; z(k), u, w))| ≤ d ∥w∥k:k+N

Hence, (B.30) becomes

α(|x(k) − z(k)|) ≤ (N + 1) ∥v∥k:k+N + d ∥w∥k:k+N

Since, by assumption, w(k) → 0 and v(k) → 0 as k → ∞, and α(·) ∈ K,


it follows that |x(k) − z(k)| → 0 as k → ∞. ■
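
For the linear case mentioned after Definition B.56, uniform observability reduces to
positive definiteness of the observability Gramian. The following Python sketch is an
illustration added here (it is not part of the original text); it uses NumPy, and the
matrices A and C are arbitrary illustrative data.

import numpy as np

def observability_gramian(A, C):
    # Return W = sum_{j=0}^{n-1} (A^j)' C' C A^j for x+ = A x, y = C x.
    n = A.shape[0]
    W = np.zeros((n, n))
    Aj = np.eye(n)
    for _ in range(n):
        W += Aj.T @ C.T @ C @ Aj
        Aj = A @ Aj
    return W

# Illustrative data (not from the text): a two-state system with scalar output.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
W = observability_gramian(A, C)
# (A, C) is observable iff W is positive definite, i.e., its smallest
# eigenvalue is strictly positive.
print(np.linalg.eigvalsh(W).min() > 0)

Roughly speaking, a strictly positive smallest eigenvalue of W guarantees that (B.28)
holds for the linear system with N = n and a suitable α(·) ∈ K.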

B.11 Exercises

Exercise B.1: Lyapunov equation and linear systems


Establish the equivalence of (a) and (b) in Lemma B.20.

Exercise B.2: Lyapunov function for exponential stability


Let V : Rn → R≥0 be a Lyapunov function for the system x + = f (x) with the following
properties. For all x ∈ Rn

a1 |x|^σ ≤ V (x) ≤ a2 |x|^σ

V (f (x)) − V (x) ≤ −a3 |x|^σ

in which a1 , a2 , a3 , σ > 0. Show that the origin of the system x + = f (x) is globally
exponentially stable.

Exercise B.3: A converse theorem for exponential stability


(a) Assume that the origin is globally exponentially stable (GES) for the system

x + = f (x)

in which f (·) is continuous. Show that there exists a continuous Lyapunov


function V (·) for the system satisfying for all x ∈ Rn

a1 |x|^σ ≤ V (x) ≤ a2 |x|^σ

V (f (x)) − V (x) ≤ −a3 |x|^σ

in which a1 , a2 , a3 , σ > 0.
Hint: Consider summing |φ(i; x)|^σ over i as a candidate Lyapunov
function V (x).

(b) Establish that in the Lyapunov function defined above, any σ > 0 is valid, and
also that the constant a3 can be chosen as large as one wishes.

Exercise B.4: Revisit Lemma 1.3 in Chapter 1


Establish Lemma 1.3 in Chapter 1 using the Lyapunov function tools established in
this appendix. Strengthen the conclusion and establish that the closed-loop system is
globally exponentially stable.

Exercise B.5: Continuity of Lyapunov function for asymptotic stability


Let X be a compact subset of Rn containing the origin in its interior that is positive
invariant for the system x + = f (x). If f (·) is continuous on X and the origin is
asymptotically stable with a region of attraction X, show that the Lyapunov function
suggested in Theorem B.17 is continuous on X.

Exercise B.6: A Lipschitz continuous converse theorem for exponential sta-


bility
Consider the system x + = f (x), f (0) = 0, with function f : D → Rn Lipschitz contin-
uous on compact set D ⊂ Rn containing the origin in its interior. Choose R > 0 such
that BR ⊆ D. Assume that there exist scalars c > 0 and λ ∈ (0, 1) such that

|φ(k; x)| ≤ c |x| λ^k    for all |x| ≤ r , k ≥ 0

with r := R/c.

Show that there exists a Lipschitz continuous Lyapunov function V (·) satisfying for
all x ∈ Br

a1 |x|^2 ≤ V (x) ≤ a2 |x|^2

V (f (x)) − V (x) ≤ −a3 |x|^2

with a1 , a2 , a3 > 0.
Hint: Use the proposed Lyapunov function of Exercise B.3 with σ = 2. See also
(Khalil, 2002, Exercise 4.68).

Exercise B.7: Lyapunov function requirements: continuity of α3


Consider the following scalar system x + = f (x) with piecewise affine and discontinu-
ous f (·) (Lazar et al., 2009)

f (x) = 0 for x ∈ (−∞, 1],    f (x) = (1/2)(x + 1) for x ∈ (1, ∞)

Note that the origin is a steady state.


(a) Consider V (x) = |x| as a candidate Lyapunov function. Show that this V satis-
fies (B.11)–(B.13) of Definition B.12, in which α3 (x) is positive definite but not
continuous.

(b) Show by direct calculation that the origin is not globally asymptotically stable.
Show that for initial conditions x0 ∈ (1, ∞), x(k; x0 ) → 1 as k → ∞.

The conclusion here is that one cannot leave out continuity of α3 in the definition of a
Lyapunov function when allowing discontinuous system dynamics.

Exercise B.8: Difference between classical and KL stability definitions (Teel)


Consider the discontinuous nonlinear scalar example x + = f (x) with
1

 x

 |x| ∈ [0, 1]
2
f (x) = 2x
 |x| ∈ (1, 2)

 2 − |x|


0 |x| ∈ [2, ∞)

Is this system GAS under the classical definition? Is this system GAS under the KL
definition? Discuss why or why not.

Exercise B.9: Combining K functions


Establish (B.5) and (B.7) starting from (B.3) and (B.4) and then using (B.1).

Exercise B.10
Derive KL bounds (B.6) and (B.8) from (B.5) and (B.7), respectively.
Bibliography

D. A. Allan and J. B. Rawlings. An input/output-to-state stability converse the-


orem for closed positive invariant sets. Technical Report 2018–01, TWCCC
Technical Report, December 2018.

D. A. Allan, C. N. Bates, M. J. Risbeck, and J. B. Rawlings. On the inherent


robustness of optimal and suboptimal nonlinear MPC. Sys. Cont. Let., 106:
68–78, August 2017.

R. G. Bartle and D. R. Sherbert. Introduction to Real Analysis. John Wiley &


Sons, Inc., New York, third edition, 2000.

C. Cai and A. R. Teel. Input–output-to-state stability for discrete-time systems.


Automatica, 44(2):326–336, 2008.

G. Grimm, M. J. Messina, S. E. Tuna, and A. R. Teel. Model predictive control:


For want of a local control Lyapunov function, all is not lost. IEEE Trans.
Auto. Cont., 50(5):546–558, 2005.

Z.-P. Jiang and Y. Wang. Input-to-state stability for discrete-time nonlinear


systems. Automatica, 37:857–869, 2001.

Z.-P. Jiang and Y. Wang. A converse Lyapunov theorem for discrete-time sys-
tems with disturbances. Sys. Cont. Let., 45:49–58, 2002.

R. E. Kalman and J. E. Bertram. Control system analysis and design via the
“Second method” of Lyapunov, Part II: Discrete–time systems. ASME J. Basic
Engr., pages 394–400, June 1960.

C. M. Kellett and A. R. Teel. Discrete-time asymptotic controllability implies


smooth control-Lyapunov function. Sys. Cont. Let., 52:349–359, 2004a.

C. M. Kellett and A. R. Teel. Smooth Lyapunov functions and robustness of


stability for difference inclusions. Sys. Cont. Let., 52:395–405, 2004b.

H. K. Khalil. Nonlinear Systems. Prentice-Hall, Upper Saddle River, NJ, third


edition, 2002.

M. Krichman, E. D. Sontag, and Y. Wang. Input-output-to-state stability. SIAM


J. Cont. Opt., 39(6):1874–1928, 2001.

J. P. LaSalle. The stability and control of discrete processes, volume 62 of Applied


Mathematical Sciences. Springer-Verlag, 1986.


M. Lazar, W. P. M. H. Heemels, and A. R. Teel. Lyapunov functions, stability and


input-to-state stability subtleties for discrete-time discontinuous systems.
IEEE Trans. Auto. Cont., 54(10):2421–2425, 2009.

J. B. Rawlings and L. Ji. Optimization-based state estimation: Current status


and some new results. J. Proc. Cont., 22:1439–1444, 2012.

J. B. Rawlings and M. J. Risbeck. On the equivalence between statements with


epsilon-delta and K-functions. Technical Report 2015–01, TWCCC Technical
Report, December 2015.

J. B. Rawlings and M. J. Risbeck. Model predictive control with discrete actua-


tors: Theory and application. Automatica, 78:258–265, 2017.

E. D. Sontag. Mathematical Control Theory. Springer-Verlag, New York, second


edition, 1998a.

E. D. Sontag. Comments on integral variants of ISS. Sys. Cont. Let., 34:93–100,


1998b.

E. D. Sontag and Y. Wang. On the characterization of the input to state stability


property. Sys. Cont. Let., 24:351–359, 1995.

E. D. Sontag and Y. Wang. Output-to-state stability and detectability of nonlin-


ear systems. Sys. Cont. Let., 29:279–290, 1997.

A. R. Teel. Discrete time receding horizon control: is the stability robust. In


Marcia S. de Queiroz, Michael Malisoff, and Peter Wolenski, editors, Optimal
control, stabilization and nonsmooth analysis, volume 301 of Lecture notes
in control and information sciences, pages 3–28. Springer, 2004.
C
Optimization


C.1 Dynamic Programming


The name dynamic programming dates from the 1950s when it was
coined by Richard Bellman for a technique for solving dynamic opti-
mization problems, i.e., optimization problems associated with deter-
ministic or stochastic systems whose behavior is governed by differ-
ential or difference equations. Here we review some of the basic ideas
behind dynamic programming (DP); see Bellman (1957) and Bertsekas,
Nedic, and Ozdaglar (2001).
To introduce the topic in its simplest form, consider the simple
routing problem illustrated in Figure C.1. To maintain connection with
optimal control, each node in the graph can be regarded as a point (x,
t) in a subset S of X × T where both the state space X = {a, b, c, . . . , g}
and the set of times T = {0, 1, 2, 3} are discrete. The set of permissible
control actions is U = {U, D}, i.e., to go “up” or “down.” The control
problem is to choose the lowest cost path from event (d, 0) (state d
at t = 0) to any of the states at t = 3; the cost of going from one
event to the next is indicated on the graph. This problem is equivalent
to choosing an open-loop control, i.e., a sequence (u(0), u(1), u(2))
of admissible control actions. There are 2^N controls where N is the
number of stages, 3 in this example. The cost of each control can, in
this simple example, be evaluated and is given in Table C.1.
There are two different open-loop optimal controls, namely
(U, D, U ) and (D, D, D), each incurring a cost of 16. The corresponding

control    UUU   UUD   UDU   UDD   DUU   DUD   DDU   DDD
cost        20    24    16    24    24    32    20    16

Table C.1: Control Cost.


[Figure C.1: Routing problem (nodes a–g at times 0–3; edge costs shown on the graph).]

state trajectories are (d, e, d, e) and (d, c, b, a).


In discrete problems of this kind, DP replaces the N-stage problem
by M single stage problems, where M is the total number of nodes, i.e.,
the number of elements in S ⊂ X × T . The first set of optimization
problems deals with the states b, d, f at time N − 1 = 2. The optimal
decision at event (f , 2), i.e., state f at time 2, is the control U and
gives rise to a cost of 4. The optimal cost and control for node (f ,
2) are recorded; see Table C.2. The procedure is then repeated for
states d and b at time t = 2 (nodes (d, 2) and (b, 2)) and recorded as
shown in Table C.2. Attention is next focused on the states e and c at
t = 1 (nodes (e, 1) and (c, 1)). The lowest cost that can be achieved at
node (e, 1) if control U is chosen, is 16 + 4, the sum of the path cost
16 associated with the control U , and the optimal cost 4 associated
with the node (f , 2) that results from using control U at node (e, 1).
Similarly the lowest possible cost, if control D is chosen, is 8+8. Hence

t                 0      1     1     2     2     2
state             d      e     c     f     d     b
control         U or D   D     D     U     U     D
optimal cost     16     16     8     4     8     4

Table C.2: Optimal Cost and Control.

the optimal control and cost for node (e, 1) are, respectively, D and 16.
The procedure is repeated for the remaining state d at t = 1 (node (d,
1)). A similar calculation for the state d at t = 0 (node (d, 0)), where
the optimal control is U or D, completes this backward recursion; this
backward recursion provides the optimal cost and control for each (x,
t), as recorded in Table C.2. The procedure therefore yields an optimal
feedback control that is a function of (x, t) ∈ S. To obtain the optimal
open-loop control for the initial node (d, 0), the feedback law is obeyed,
leading to control U or D at t = 0; if U is chosen, the resultant state at
t = 1 is e. From Table C.2, the optimal control at (e, 1) is D, so that the
successor node is (d, 2). The optimal control at node (d, 2) is U . Thus
the optimal open-loop control sequence (U , D, U ) is re-obtained. On
the other hand, if the decision at (d, 0) is chosen to be D, the optimal
sequence (D, D, D) is obtained. This simple example illustrates the
main features of DP that we will now examine in the context of discrete
time optimal control.
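
The backward recursion just described is easy to mechanize. The short Python script
below is a sketch added for illustration (it is not part of the original text); the edge
costs are reconstructed from Tables C.1 and C.2, and the dictionary encoding of the
graph is only one of many possible choices.

# Backward dynamic programming for the routing problem of Figure C.1.
# Node (x, t) denotes state x at time t; each entry maps a control to
# (edge cost, successor state).
edges = {
    ("d", 0): {"U": (0, "e"), "D": (8, "c")},
    ("e", 1): {"U": (16, "f"), "D": (8, "d")},
    ("c", 1): {"U": (8, "d"), "D": (4, "b")},
    ("f", 2): {"U": (4, "g"), "D": (8, "e")},
    ("d", 2): {"U": (8, "e"), "D": (16, "c")},
    ("b", 2): {"U": (8, "c"), "D": (4, "a")},
}

N = 3
cost = {}      # optimal cost-to-go for each node (x, t)
policy = {}    # optimizing control(s) for each node

for t in reversed(range(N)):
    for (x, tt), moves in edges.items():
        if tt != t:
            continue
        # terminal cost is zero, so the cost-to-go at t = N is zero
        totals = {u: c + cost.get((xn, t + 1), 0) for u, (c, xn) in moves.items()}
        best = min(totals.values())
        cost[(x, t)] = best
        policy[(x, t)] = [u for u, v in totals.items() if v == best]

print(cost[("d", 0)], policy[("d", 0)])   # expected: 16 ['U', 'D']

Running the script reproduces the entries of Table C.2; in particular, it reports the
optimal cost 16 and both optimizing controls U and D at node (d, 0).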

C.1.1 Optimal Control Problem

The discrete time system we consider is described by

x + = f (x, u) (C.1)

where f (·) is continuous. The system is subject to the mixed state-


control constraint
(x, u) ∈ Z

where Z is a closed subset of Rn × Rm and Pu (Z) is compact where Pu


is the projection operator (x, u) , u. Often Z = X × U in which case
the constraint (x, u) ∈ Z becomes x ∈ X and u ∈ U and Pu (Z) = U
so that U is compact. In addition there is a constraint on the terminal
state x(N):
x(N) ∈ Xf

where Xf is closed. In this section we find it easier to express the


value function and the optimal control in terms of the current state
and current time i rather than using time-to-go k. Hence we replace
time-to-go k by time i where k = N − i, replace Vk0 (x) (the optimal cost
at state x when the time-to-go is k) by V 0 (x, i) (the optimal cost at state
x, time i) and replace Xk by X(i) where X(i) is the domain of V 0 (·, i).
The cost associated with an initial state x at time 0 and a control
sequence u := (u(0), u(1), . . . , u(N − 1)) is

V (x, 0, u) = Vf (x(N)) + Σ_{i=0}^{N−1} ℓ(x(i), u(i))    (C.2)

where ℓ(·) and Vf (·) are continuous and, for each i, x(i) = φ(i; (x,
0), u) is the solution at time i of (C.1) if the initial state is x at time 0
and the control sequence is u. The optimal control problem P(x, 0) is
defined by
V 0 (x, 0) = min_u V (x, 0, u)    (C.3)

subject to the constraints (x(i), u(i)) ∈ Z, i = 0, 1, . . . , N − 1 and


x(N) ∈ Xf . Equation (C.3) may be rewritten in the form

V 0 (x, 0) = min_u {V (x, 0, u) | u ∈ U(x, 0)}    (C.4)

where u := (u(0), u(1), . . . , u(N − 1)),

U(x, 0) := {u ∈ RNm | (x(i), u(i)) ∈ Z, i = 0, 1, . . . , N−1; x(N) ∈ Xf }

and x(i) := φ(i; (x, 0), u). Thus U(x, 0) is the set of admissible control
sequences1 if the initial state is x at time 0. It follows from the continu-
ity of f (·) that for all i ∈ {0, 1, . . . , N − 1} and all x ∈ Rn , u ↦ φ(i; (x,
0), u) is continuous, u ↦ V (x, 0, u) is continuous, and U(x, 0) is com-
pact. Hence the minimum in (C.4) exists at all x ∈ {x ∈ Rn | U(x,
0) ≠ ∅}.
DP embeds problem P(x, 0) for a given state x in a whole family of
problems P (x, i) where, for each (x, i), problem P(x, i) is defined by

V 0 (x, i) = min_{ui} {V (x, i, ui ) | ui ∈ U(x, i)}

where
ui := (u(i), u(i + 1), . . . , u(N − 1))
1 An admissible control sequence satisfies all constraints.

V (x, i, ui ) := Vf (x(N)) + Σ_{j=i}^{N−1} ℓ(x(j), u(j))    (C.5)

and

U(x, i) := {ui ∈ R(N−i)m | (x(j), u(j)) ∈ Z, j = i, i + 1, . . . , N − 1


x(N) ∈ Xf } (C.6)

In (C.5) and (C.6), x(j) = φ(j; (x, i), ui ), the solution at time j of (C.1)
if the initial state is x at time i and the control sequence is ui . For each
i, X(i) denotes the domain of V 0 (·, i) and U(·, i) so that

X(i) = {x ∈ Rn | U(x, i) ≠ ∅}. (C.7)

C.1.2 Dynamic Programming

One way to approach DP for discrete time control problems is the sim-
ple observation that for all (x, i)

V 0 (x, i) = min_{ui} {V (x, i, ui ) | ui ∈ U(x, i)}
           = min_u {ℓ(x, u) + min_{ui+1} V (f (x, u), i + 1, ui+1 ) | {u, ui+1 } ∈ U(x, i)}    (C.8)
 
where ui = (u, u(i + 1), . . . , u(N − 1)) = (u, ui+1 ). We now make use
of the fact that {u, ui+1 } ∈ U(x, i) if and only if (x, u) ∈ Z, f (x,
u) ∈ X(i + 1), and ui+1 ∈ U(f (x, u), i + 1) since f (x, u) = x(i + 1).
Hence we may rewrite (C.8) as

V 0 (x, i) = min_u {ℓ(x, u) + V 0 (f (x, u), i + 1) | (x, u) ∈ Z, f (x, u) ∈ X(i + 1)}    (C.9)

for all x ∈ X(i) where

X(i) = {x ∈ Rn | ∃u such that (x, u) ∈ Z and f (x, u) ∈ X(i + 1)}


(C.10)
Equations (C.9) and (C.10), together with the boundary condition

V 0 (x, N) = Vf (x) ∀x ∈ X(N), X(N) = Xf

constitute the DP recursion for constrained discrete time optimal con-


trol problems. If there are no state constraints, i.e., if Z = Rn × U where

U ⊂ Rm is compact, then X(i) = Rn for all i ∈ {0, 1, . . . , N} and the DP


equations revert to the familiar DP recursion:

V 0 (x, i) = min_u {ℓ(x, u) + V 0 (f (x, u), i + 1)}    ∀x ∈ Rn

with boundary condition

V 0 (x, N) = Vf (x)    ∀x ∈ Rn
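
For continuous state and input spaces this recursion cannot usually be carried out
exactly. The Python sketch below is added for illustration (it is not part of the original
text); it approximates the backward pass by gridding the state and input and
interpolating V 0 (·, i + 1) by nearest neighbor, and the scalar example data are purely
illustrative.

import numpy as np

# Backward DP on a gridded scalar example: x+ = x + u, l(x, u) = x^2 + u^2,
# Vf(x) = 0, horizon N = 5, constraints |x| <= 2 and |u| <= 1.
xgrid = np.linspace(-2.0, 2.0, 81)
ugrid = np.linspace(-1.0, 1.0, 21)
N = 5
f = lambda x, u: x + u
stage = lambda x, u: x**2 + u**2

V = np.zeros((N + 1, xgrid.size))          # V[N] = Vf = 0 on the grid
for i in reversed(range(N)):
    for k, x in enumerate(xgrid):
        best = np.inf
        for u in ugrid:
            xplus = f(x, u)
            if abs(xplus) > 2.0:           # enforce f(x, u) in X(i + 1)
                continue
            # nearest-neighbor interpolation of V(., i + 1)
            Vplus = V[i + 1, np.argmin(np.abs(xgrid - xplus))]
            best = min(best, stage(x, u) + Vplus)
        V[i, k] = best

print(V[0, np.argmin(np.abs(xgrid - 1.0))])   # approximate V0(1, 0)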

We now prove some basic facts; the first is the well known principle
of optimality.

Lemma C.1 (Principle of optimality). Let x ∈ X(0) be arbitrary, let


u := (u(0), u(1), . . . , u(N − 1)) ∈ U(x, 0) denote the solution of P(x, 0)
and let (x, x(1), x(2), . . . , x(N)) denote the corresponding optimal state
trajectory so that for each i, x(i) = φ(i; (x, 0), u). Then, for any i ∈ {0,
1, . . . , N − 1}, the control sequence ui := (u(i), u(i + 1), . . . , u(N − 1)) is
optimal for P(x(i), i) (any portion of an optimal trajectory is optimal).

Proof. Since u ∈ U(x, 0), the control sequence ui ∈ U(x(i), i). If ui =


(u(i), u(i + 1), . . . , u(N − 1)) is not optimal for P(x(i), i), there exists
a control sequence u′ = (u′ (i), u′ (i + 1), . . . , u′ (N − 1)) ∈ U(x(i), i)
such that V (x(i), i, u′ ) < V (x(i), i, ui ). Consider now the control se-
quence ũ := (u(0), u(1), . . . , u(i − 1), u′ (i), u′ (i + 1), . . . , u′ (N − 1)).
It follows that ũ ∈ U(x, 0) and V (x, 0, ũ) < V (x, 0, u) = V 0 (x, 0), a
contradiction. Hence ui is optimal for P(x(i), i). ■

The most important feature of DP is the fact that the DP recur-


sion yields the optimal value V 0 (x, i) and the optimal control κ(x, i) =
arg minu {ℓ(x, u) + V 0 (f (x, u), i + 1) | (x, u) ∈ Z, f (x, u) ∈ X(i + 1)}
for each (x, i) ∈ X(i) × {0, 1, . . . , N − 1}.

Theorem C.2 (Optimal value function and control law from DP). Sup-
pose that the function Ψ : Rn × {0, 1, . . . , N} → R, satisfies, for all i ∈ {1,
2, . . . , N − 1}, all x ∈ X(i), the DP recursion

Ψ (x, i) = min{ℓ(x, u) + Ψ (f (x, u), i + 1) | (x, u) ∈ Z, f (x, u) ∈ X(i + 1)}


X(i) = {x ∈ Rn | ∃u ∈ Rm such that (x, u) ∈ Z, f (x, u) ∈ X(i + 1)}

with boundary conditions

Ψ (x, N) = Vf (x) ∀x ∈ Xf , X(N) = Xf

Then Ψ (x, i) = V 0 (x, i) for all (x, i) ∈ X(i) × {0, 1, 2, . . . , N}; the DP
recursion yields the optimal value function and the optimal control law.

Proof. Let (x, i) ∈ X(i) × {0, 1, . . . , N} be arbitrary. Let u =


(u(i), u(i + 1), . . . , u(N − 1)) be an arbitrary control sequence in U(x,
i) and let x = (x, x(i + 1), . . . , x(N)) denote the corresponding tra-
jectory starting at (x, i) so that for each j ∈ {i, i + 1, . . . , N},
x(j) = φ(j; x, i, u). For each j ∈ {i, i + 1, . . . , N − 1}, let uj :=
(u(j), u(j + 1), . . . , u(N − 1)); clearly uj ∈ U(x(j), j). The cost due
to initial event (x(j), j) and control sequence uj is Φ(x(j), j) defined
by
Φ(x(j), j) := V (x(j), j, uj )
Showing that Ψ (x, i) ≤ Φ(x, i) proves that Ψ (x, i) = V 0 (x, i) since u is
an arbitrary sequence in U(x, i); because (x, i) ∈ X(i) × {0, 1, . . . , N}
is arbitrary, the fact that Ψ (x, i) = V 0 (x, i) proves that DP yields the
optimal value function.
To prove that Ψ (x, i) ≤ Φ(x, i), we compare Ψ (x(j), j) and Φ(x(j),
j) for each j ∈ {i, i + 1, . . . , N}, i.e., we compare the costs yielded by
the DP recursion and by the arbitrary control u along the corresponding
trajectory x. By definition, Ψ (x(j), j) satisfies for each j

Ψ (x(j), j) = min_u {ℓ(x(j), u) + Ψ (f (x(j), u), j + 1) |
                        (x(j), u) ∈ Z, f (x(j), u) ∈ X(j + 1)}    (C.11)

To obtain Φ(x(j), j) for each j we solve the following recursive equa-


tion

Φ(x(j), j) = ℓ(x(j), u(j)) + Φ(f (x(j), u(j)), j + 1) (C.12)

The boundary conditions are

Ψ (x(N), N) = Φ(x(N), N) = Vf (x(N)) (C.13)

Since u(j) satisfies (x(j), u(j)) ∈ Z and f (x(j), u(j)) ∈ X(j + 1) but
is not necessarily a minimizer in (C.11), we deduce that

Ψ (x(j), j) ≤ ℓ(x(j), u(j)) + Ψ (f (x(j), u(j)), j + 1) (C.14)

For each j, let E(j) be defined by

E(j) := Ψ (x(j), j) − Φ(x(j), j)

Subtracting (C.12) from (C.14) and replacing f (x(j), u(j)) by x(j + 1)


yields
E(j) ≤ E(j + 1)    ∀j ∈ {i, i + 1, . . . , N − 1}

Since E(N) = 0 by virtue of (C.13), we deduce that E(j) ≤ 0 for all


j ∈ {i, i + 1, . . . , N}; in particular, E(i) ≤ 0 so that

Ψ (x, i) ≤ Φ(x, i) = V (x, i, u)

for all u ∈ U(x, i). Hence Ψ (x, i) = V 0 (x, i) for all (x, i) ∈ X(i) × {0,
1, . . . , N}. ■

Example C.3: DP applied to linear quadratic regulator


A much used example is the familiar linear quadratic regulator prob-
lem. The system is defined by

x + = Ax + Bu

There are no constraints. The cost function is defined by (C.2) where

ℓ(x, u) := (1/2)x ′ Qx + (1/2)u′ Ru

and Vf (x) = 0 for all x; the horizon length is N. We assume that Q


is symmetric and positive semidefinite and that R is symmetric and
positive definite. The DP recursion is

V 0 (x, i) = min_u {ℓ(x, u) + V 0 (Ax + Bu, i + 1)}    ∀x ∈ Rn

with terminal condition

V 0 (x, N) = 0 ∀x ∈ Rn

Assume that V 0 (·, i + 1) is quadratic and positive semidefinite and,


therefore, has the form

V 0 (x, i + 1) = (1/2)x ′ P (i + 1)x

where P (i + 1) is symmetric and positive semidefinite. Then

V 0 (x, i) = (1/2) min_u {x ′ Qx + u′ Ru + (Ax + Bu)′ P (i + 1)(Ax + Bu)}

The right-hand side of the last equation is a positive definite function


of u for all x, so that it has a unique minimizer given by

κ(x, i) = K(i)x    K(i) := −(B ′ P (i + 1)B + R)−1 B ′ P (i + 1)A

Substituting u = K(i)x in the expression for V 0 (x, i) yields

V 0 (x, i) = (1/2)x ′ P (i)x



where P (i) is given by:

P (i) = Q + K(i)′ RK(i) + (A + BK(i))′ P (i + 1)(A + BK(i))

Hence V 0 (·, i) is quadratic and positive semidefinite if V 0 (·, i + 1) is.


But V 0 (·, N), defined by

V 0 (x, N) := (1/2)x ′ P (N)x = 0 P (N) := 0

is symmetric and positive semidefinite. By induction V 0 (·, i) is quad-


ratic and positive semidefinite (and P (i) is symmetric and positive
semidefinite) for all i ∈ {0, 1, . . . , N}. Substituting K(i) = −(B ′ P (i +
1)B + R)−1 B ′ P (i + 1)A in the expression for P (i) yields the more famil-
iar matrix Riccati equation

P (i) = Q + A′ P (i + 1)A − A′ P (i + 1)B(B ′ P (i + 1)B + R)−1 B ′ P (i + 1)A
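
The recursion of Example C.3 is straightforward to implement. The Python sketch
below is added for illustration (it is not part of the original text); it computes P (i) and
K(i) backward from P (N) = 0 using NumPy, with illustrative A, B, Q, R data.

import numpy as np

def lqr_backward(A, B, Q, R, N):
    # Backward Riccati recursion of Example C.3 with Vf = 0, i.e., P(N) = 0.
    n = A.shape[0]
    P = np.zeros((n, n))                      # P(N) = 0
    gains = []
    for _ in range(N):
        S = B.T @ P @ B + R
        K = -np.linalg.solve(S, B.T @ P @ A)  # K(i) = -(B'P B + R)^{-1} B'P A
        P = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(S, B.T @ P @ A)
        gains.append(K)
    gains.reverse()                           # gains[i] is K(i), i = 0, ..., N-1
    return P, gains

# Illustrative data (not from the text).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
P0, K = lqr_backward(A, B, Q, R, N=10)
print(P0)       # P(0); V0(x, 0) = (1/2) x' P(0) x
print(K[0])     # K(0); the optimal control at time 0 is u = K(0) x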

C.2 Optimality Conditions


In this section we obtain optimality conditions for problems of the form

f 0 = inf_u {f (u) | u ∈ U }

In these problems, u ∈ Rm is the decision variable, f (u) the cost to be
minimized by appropriate choice of u, and U ⊂ Rm the constraint set.
The value of the problem is f 0 . Some readers may wish to read only
Section C.2.2, which deals with convex optimization problems and Sec-
tion C.2.3 which deals with convex optimization problems in which the
constraint set U is polyhedral. These sections require some knowledge
of tangent and normal cones discussed in Section C.2.1; Proposition C.7
in particular derives the normal cone for the case when U is convex.

C.2.1 Tangent and Normal Cones

In determining conditions of optimality, it is often convenient to em-


ploy approximations to the cost function f (·) and the constraint set U .
Thus the cost function f (·) may be approximated, in the neighborhood
of a point ū, by the first order expansion f (ū) + ⟨∇f (ū), (u − ū)⟩ or
by the second order expansion f (ū) + ⟨∇f (ū), (u − ū)⟩ + (1/2)((u −
ū)′ ∇2 f (ū)(u − ū)) if the necessary derivatives exist. Thus we see that

[Figure C.2: Approximation of the set U .]

[Figure C.3: Tangent cones.]

in the unconstrained case, a necessary condition for the optimality of


ū is ∇f (ū) = 0. To obtain necessary conditions of optimality for con-
strained optimization problems, we need to approximate the constraint
set as well; this is more difficult. An example of U and its approxima-
tion is shown in Figure C.2; here the set U = {u ∈ R2 | g(u) = 0}
where g : R2 → R is approximated in the neighborhood of a point ū
satisfying g(ū) = 0 by the set ū ⊕ TU (ū) where2 the tangent cone
TU (ū) := {h ∈ R2 | ⟨∇g(ū), h⟩ = 0}. In general, a set U is approx-
2 If A and B are two subsets of Rn , say, then A ⊕ B := {a + b | a ∈ A, b ∈ B} and
a ⊕ B := {a + b | b ∈ B}.

[Figure C.4: Normal at u.]

imated, near a point ū, by ū ⊕ TU (ū) where its tangent cone TU (ū) is
defined below. Following Rockafellar and Wets (1998), we use uν →U v
to denote that the sequence {uν | ν ∈ I≥0 } converges to v as ν → ∞
while satisfying uν ∈ U for all ν ∈ I≥0 .

Definition C.4 (Tangent vector). A vector h ∈ Rm is tangent to the set


U at ū if there exist sequences uν →U ū and λν ↘ 0 such that

[uν − ū]/λν → h

TU (u) is the set of all tangent vectors.

Equivalently, a vector h ∈ Rm is tangent to the set U at ū if there


exist sequences hν → h and λν ↘ 0 such that ū + λν hν ∈ U for all
ν ∈ I≥0 . This equivalence can be seen by identifying uν with ū+λν hν .

Proposition C.5 (Tangent vectors are closed cone). The set TU (u) of all
tangent vectors to U at any point u ∈ U is a closed cone.

See Rockafellar and Wets (1998), Proposition 6.2. That TU (ū) is a


cone may be seen from its definition; if h is a tangent, so is αh for any
α ≥ 0. Two examples of a tangent cone are illustrated in Figure C.3.
Associated with each tangent cone TU (u) is a normal cone N̂(u)
defined as follows Rockafellar and Wets (1998):

Definition C.6 (Regular normal). A vector g ∈ Rm is a regular normal


to a set U ⊂ Rm at ū ∈ U if

⟨g, u − ū⟩ ≤ o(|u − ū|) ∀u ∈ U (C.15)

where o(·) has the property that o(|u − ū|)/|u − ū| → 0 as u →U ū with
u ≠ ū; N̂U (ū) is the set of all regular normal vectors.

Some examples of normal cones are illustrated in Figure C.3; here


the set N̂U (u) = {λg | λ ≥ 0} is a cone generated by a single vector g,
say, while N̂U (v) = {λ1 g1 + λ2 g2 | λ1 ≥ 0, λ2 ≥ 0} is a cone generated
by two vectors g1 and g2 , say. The term o(|u − ū|) may be replaced by
0 if U is convex as shown in Proposition C.7(b) below but is needed in
general since U may not be locally convex at ū as illustrated in Figure
C.4.
The tangent cone TU (ū) and the normal cone N̂U (ū) at a point ū ∈
U are related as follows.

Proposition C.7 (Relation of normal and tangent cones).


(a) At any point ū ∈ U ⊂ Rm ,

N̂U (ū) = TU (ū)∗ := {g | ⟨g, h⟩ ≤ 0 ∀h ∈ TU (ū)}

where, for any cone V , V ∗ := {g | ⟨g, h⟩ ≤ 0 ∀h ∈ V } denotes the polar


cone of V .

(b) If U is convex, then, at any point ū ∈ U

N̂U (ū) = {g | ⟨g, u − ū⟩ ≤ 0 ∀u ∈ U } (C.16)

Proof.
(a) To prove N̂U (ū) ⊂ TU (ū)∗ , we take an arbitrary point g in N̂U (ū)
and show that ⟨g, h⟩ ≤ 0 for all h ∈ TU (ū), implying that g ∈ TU∗ (ū).
For, if h is tangent to U at ū, there exist, by definition, sequences
uν →U ū and λν ↘ 0 such that

hν := (uν − ū)/λν → h

Since g ∈ N̂U (ū), it follows from (C.15) that ⟨g, λν hν ⟩ ≤ o(|uν − ū|) =
o(λν |hν |); dividing by λν and taking the limit as ν → ∞ yields ⟨g, h⟩ ≤ 0,
so that g ∈ TU∗ (ū).
Hence N̂U (ū) ⊂ TU (ū)∗ . The proof of this result, and the more subtle
proof of the converse, that TU (ū)∗ ⊂ N̂U (ū), are given in Rockafellar
and Wets (1998), Proposition 6.5.

(b) This part of the proposition is proved in (Rockafellar and Wets,


1998, Theorem 6.9). ■

Remark. A consequence of (C.16) is that for each g ∈ N̂U (ū), the half-
space Hg := {u | ⟨g, u − ū⟩ ≤ 0} supports the convex set U at ū, i.e.,
U ⊂ Hg and ū lies on the boundary of the half-space Hg .

We wish to derive optimality conditions for problems of the form


P : inf u {f (u) | u ∈ U}. The value of the problem is defined to be

f 0 := inf_u {f (u) | u ∈ U}

There may not exist a u ∈ U such that f (u) = f 0 . If, however, f (·) is
continuous and U is compact, there exists a minimizing u in U , i.e.,

f 0 = inf_u {f (u) | u ∈ U } = min_u {f (u) | u ∈ U }

The minimizing u, if it exists, may not be unique so

u0 := arg min_u {f (u) | u ∈ U }

may be a set. We say u is feasible if u ∈ U . A point u is globally optimal


for problem P if u is feasible and f (v) ≥ f (u) for all v ∈ U . A point u
is locally optimal for problem P if u is feasible and there exists a ε > 0
such that f (v) ≥ f (u) for all v in (u ⊕ εB) ∩ U where B is the closed
unit ball {u | |u| ≤ 1}.

C.2.2 Convex Optimization Problems

The optimization problem P is convex if the function f : Rm → R and


the set U ⊂ Rm are convex. In convex optimization problems, U often
takes the form {u | gj (u) ≤ 0, j ∈ J} where J := {1, 2, . . . , J} and
each function gj (·) is convex. A useful feature of convex optimization
problems is the following result:

Proposition C.8 (Global optimality for convex problems). Suppose the


function f (·) is convex and differentiable and the set U is convex. Any
locally optimal point of the convex optimization problem inf u {f (u) |
u ∈ U } is globally optimal.

Proof. Suppose u is locally optimal so that there exists an ε > 0 such


that f (v) ≥ f (u) for all v ∈ (u ⊕ εB) ∩ U . If, contrary to what we
wish to prove, u is not globally optimal, there exists a w ∈ U such
that f (w) < f (u). For any λ ∈ [0, 1], the point wλ := λw + (1 − λ)u
lies in [u, w] (the line joining u and w). Then wλ ∈ U (because U is
convex) and f (wλ ) ≤ λf (w) + (1 − λ)f (u) < f (u) for all λ ∈ (0, 1]
(because f (·) is convex and f (w) < f (u)). We can choose λ > 0 so
that wλ ∈ (u ⊕ εB) ∩ U and f (wλ ) < f (u). This contradicts the local
optimality of u. Hence u is globally optimal. ■

On the assumption that f (·) is differentiable, we can obtain a simple


necessary and sufficient condition for the (global) optimality of a point
u.

Proposition C.9 (Optimality conditions—normal cone). Suppose the


function f (·) is convex and differentiable and the set U is convex. The
point u is optimal for problem P if and only if u ∈ U and

df (u; v − u) = ⟨∇f (u), v − u⟩ ≥ 0 ∀v ∈ U (C.17)

or, equivalently
−∇f (u) ∈ N̂U (u) (C.18)

Proof. Because f (·) is convex, it follows from Theorem 7 in Appendix


A1 that
f (v) ≥ f (u) + ⟨∇f (u), v − u⟩ (C.19)

for all u, v in U. To prove sufficiency, suppose u ∈ U and that the


condition in (C.17) is satisfied. It then follows from (C.19) that f (v) ≥
f (u) for all v ∈ U so that u is globally optimal. To prove necessity,
suppose that u is globally optimal but that, contrary to what we wish
to prove, the condition on the right-hand side of (C.17) is not satisfied
so that there exists a v ∈ U such that

df (u; h) = ⟨∇f (u), v − u⟩ = −δ < 0

where h := v − u. For all λ ∈ [0, 1], let vλ := λv + (1 − λ)u = u + λh;


because U is convex, each vλ lies in U . Since

df (u; h) = lim_{λ↘0} [f (u + λh) − f (u)]/λ = lim_{λ↘0} [f (vλ ) − f (u)]/λ = −δ

there exists a λ ∈ (0, 1] such that f (vλ ) − f (u) ≤ −λδ/2 < 0 which
contradicts the optimality of u. Hence the condition in (C.17) must be
satisfied. That (C.17) is equivalent to (C.18) follows from Proposition
C.7(b). ■

Remark. The condition (C.17) implies that the linear approximation


fˆ(v) := f (u) + ⟨∇f (u), v − u⟩ to f (v) achieves its minimum over U
at u.

It is an interesting fact that U in Proposition C.9 may be replaced by


its approximation u ⊕ TU (u) at u yielding

Proposition C.10 (Optimality conditions—tangent cone). Suppose the


function f (·) is convex and differentiable and the set U is convex. The
point u is optimal for problem P if and only if u ∈ U and

df (u; h) = ⟨∇f (u), h⟩ ≥ 0    ∀h ∈ TU (u)

or, equivalently
−∇f (u) ∈ N̂U (u) = TU∗ (u).
Proof. It follows from Proposition C.9 that u is optimal for problem P
if and only if u ∈ U and −∇f (u) ∈ N̂U (u). But, by Proposition C.7,
N̂U (u) = {g | ⟨g, h⟩ ≤ 0 ∀h ∈ TU (u)} so that −∇f (u) ∈ N̂U (u) is
equivalent to ⟨∇f (u), h⟩ ≥ 0 for all h ∈ TU (u). ■

C.2.3 Convex Problems: Polyhedral Constraint Set

The definitions of tangent and normal cones given above may appear
complex but this complexity is necessary for proper treatment of the
general case when U is not necessarily convex. When U is polyhedral,
i.e., when U is defined by a set of linear inequalities

U := {u ∈ Rm | Au ≤ b}

where A ∈ Rp×m and b ∈ Rp , I := {1, 2, . . . , p}, then the normal and


tangent cones are relatively simple. We first note that U is equivalently
defined by
U := {u ∈ Rm | ⟨ai , u⟩ ≤ bi , i ∈ I}
where ai is the ith row of A and bi is the ith element of b. For each
u ∈ U , let
I 0 (u) := {i ∈ I | ⟨ai , u⟩ = bi }
denote the index set of constraints active at u. Clearly I 0 (u) = ∅ if u
lies in the interior of U . An example of a polyhedral constraint set is
shown in Figure C.5. The next result shows that in this case, the tangent
cone is the set of h in Rm that satisfy ⟨ai , h⟩ ≤ 0 for all i in I 0 (u) and
the normal cone is the cone generated by the vectors ai , i ∈ I 0 (u); each
normal h in the normal cone may be expressed as Σ_{i∈I 0 (u)} µi ai where
each µi ≥ 0.
Proposition C.11 (Representation of tangent and normal cones). Let
U := {u ∈ Rm | ⟨ai , u⟩ ≤ bi , i ∈ I}. Then, for any u ∈ U :

TU (u) = {h | ⟨ai , h⟩ ≤ 0, i ∈ I 0 (u)}


N̂U (u) = TU∗ (u) = cone{ai | i ∈ I 0 (u)}

Proof. (i) Suppose h is any vector in {h | ⟨ai , h⟩ ≤ 0, i ∈ I 0 (u)}. Let


the sequences uν and λν satisfy uν = u + λν h and λν ↘ 0 with λ0 ,
the first element in the sequence λν , satisfying u + λ0 h ∈ U . It follows
that [uν − u]/λν ≡ h so that from Definition C.4, h is tangent to U
at u. Hence {h | ⟨ai , h⟩ ≤ 0, i ∈ I 0 (u)} ⊂ TU (u). (ii) Conversely,
if h ∈ TU (u), then there exist sequences λν ↘ 0 and hν → h such
that ⟨ai , u + λν hν ⟩ ≤ bi for all i ∈ I, all ν ∈ I≥0 . Since ⟨ai , u⟩ = bi
for all i ∈ I 0 (u), it follows that ⟨ai , hν ⟩ ≤ 0 for all i ∈ I 0 (u), all
ν ∈ I≥0 ; taking the limit yields ⟨ai , h⟩ ≤ 0 for all i ∈ I 0 (u) so that
h ∈ {h | ⟨ai , h⟩ ≤ 0, i ∈ I 0 (u)} which proves TU (u) ⊂ {h | ⟨ai , h⟩ ≤ 0,
i ∈ I 0 (u)}. We conclude from (i) and (ii) that TU (u) = {h | ⟨ai , h⟩ ≤ 0,
i ∈ I 0 (u)}. That N̂U (u) = TU∗ (u) = cone{ai | i ∈ I 0 (u)} then follows
from Proposition C.7 above and Proposition 9 in Appendix A1. ■

The next result follows from Propositions C.9 and C.11.

Proposition C.12 (Optimality conditions—linear inequalities). Suppose


the function f (·) is convex and differentiable and U is the convex set
{u | Au ≤ b}. Then u is optimal for P : minu {f (u) | u ∈ U} if and only
if u ∈ U and

−∇f (u) ∈ N̂U (u) = cone{ai | i ∈ I 0 (u)}

Corollary C.13 (Optimality conditions—linear inequalities). Suppose


the function f (·) is convex and differentiable and U = {u | Au ≤ b}.
Then u is optimal for P : minu {f (u) | u ∈ U } if and only if Au ≤ b and
there exist multipliers µi ≥ 0, i ∈ I 0 (u) satisfying
∇f (u) + Σ_{i∈I 0 (u)} µi ∇gi (u) = 0    (C.20)

where, for each i, gi (u) := ⟨ai , u⟩−bi so that gi (u) ≤ 0 is the constraint
⟨ai , u⟩ ≤ bi and ∇gi (u) = ai .

Proof. Since any point g ∈ cone{ai | i ∈ I 0 (u)} may be expressed as


g = Σ_{i∈I 0 (u)} µi ai where, for each i, µi ≥ 0, the condition −∇f (u) ∈
cone{ai | i ∈ I 0 (u)} is equivalent to the existence of multipliers µi ≥ 0,
i ∈ I 0 (u) satisfying (C.20). ■
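
Corollary C.13 gives a finite-dimensional test: −∇f (u) must lie in the cone generated
by the active constraint normals. The Python sketch below is added for illustration (it
is not part of the original text); it checks condition (C.20) numerically by solving a
nonnegative least-squares problem with SciPy, on illustrative data.

import numpy as np
from scipy.optimize import nnls

def satisfies_c20(grad_f, active_rows, tol=1e-8):
    # Check -grad_f in cone{a_i : i active}, i.e., existence of mu >= 0
    # with sum_i mu_i a_i = -grad_f (condition (C.20)).
    if active_rows.size == 0:
        return np.linalg.norm(grad_f) <= tol
    M = active_rows.T                            # columns are the normals a_i
    mu, residual = nnls(M, -grad_f)
    return residual <= tol

# Illustrative example (not from the text): minimize f(u) = |u - c|^2 / 2
# over U = {u | u_1 <= 0, u_2 <= 0}, with c = (1, 2); then u = (0, 0) is optimal.
c = np.array([1.0, 2.0])
u = np.zeros(2)
grad_f = u - c                                   # gradient of f at u
A = np.array([[1.0, 0.0], [0.0, 1.0]])           # rows a_i of the constraints Au <= 0
active = A[np.isclose(A @ u, 0.0)]               # active constraints I0(u)
print(satisfies_c20(grad_f, active))             # True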

The above results are easily extended if U is defined by linear equal-


ity and inequality constraints, i.e., if

U := {u | ⟨ai , u⟩ ≤ bi , i ∈ I, ⟨ci , u⟩ = di , i ∈ E}

[Figure C.5: Condition of optimality.]

In this case, at any point u ∈ U , the tangent cone is

TU (u) = {h | ⟨ai , h⟩ ≤ 0, i ∈ I 0 (u), ⟨ci , h⟩ = 0, i ∈ E}

and the normal cone is


N̂U (u) = {Σ_{i∈I 0 (u)} λi ai + Σ_{i∈E} µi ci | λi ≥ 0 ∀i ∈ I 0 (u), µi ∈ R ∀i ∈ E}

With U defined this way, u is optimal for minu {f (u) | u ∈ U } where


f (·) is convex and differentiable if and only if

−∇f (u) ∈ N̂U (u)

For each i ∈ I let gi (u) := ⟨ai , u⟩ − bi and for each i ∈ E, let hi (u) :=
⟨ci , u⟩ − di so that ∇gi (u) = ai and ∇hi (u) = ci . It follows from the
characterization of N̂U (u) that u is optimal for minu {f (u) | u ∈ U } if
and only if there exist multipliers λi ≥ 0, i ∈ I 0 (u) and µi ∈ R, i ∈ E
such that
∇f (u) + Σ_{i∈I 0 (u)} λi ∇gi (u) + Σ_{i∈E} µi ∇hi (u) = 0    (C.21)

C.2.4 Nonconvex Problems

We first obtain a necessary condition of optimality for the problem


min{f (u) | u ∈ U} where f (·) is differentiable but not necessarily

convex and U ⊂ Rm is not necessarily convex; this result generalizes


the necessary condition of optimality in Proposition C.9.

Proposition C.14 (Necessary condition for nonconvex problem). A nec-


essary condition for u to be locally optimal for the problem of minimizing
a differentiable function f (·) over the set U is

df (u; h) = ⟨∇f (u), h⟩ ≥ 0, ∀h ∈ TU (u)

which is equivalent to the condition

−∇f (u) ∈ N̂U (u)

Proof. Suppose, contrary to what we wish to prove, that there exists


a h ∈ TU (u) and a δ > 0 such that ⟨∇f (u), h⟩ = −δ < 0. Because
h ∈ TU (u), there exist sequences hν → h and λν ↘ 0 such that uν :=
u + λν hν converges to u and satisfies uν ∈ U for all ν ∈ I≥0 . Then

f (uν ) − f (u) = ⟨∇f (u), λν hν ⟩ + o(λν |hν |)

Hence
[f (uν ) − f (u)]/λν = ⟨∇f (u), hν ⟩ + o(λν )/λν
where we make use of the fact that |hν | is bounded for ν sufficiently
large. It follows that

[f (uν ) − f (u)]/λν → ⟨∇f (u), h⟩ = −δ

so that there exists a finite integer j such that f (uj )−f (u) ≤ −λj δ/2 <
0 which contradicts the local optimality of u. Hence ⟨∇f (u), h⟩ ≥ 0
for all h ∈ TU (u). That −∇f (u) ∈ N̂U (u) follows from Proposition
C.7. ■

A more concise proof, following Rockafellar and Wets (1998), proceeds
as follows. Since f (v) − f (u) = ⟨∇f (u), v − u⟩ + o(|v − u|) it follows
that ⟨−∇f (u), v − u⟩ = o(|v − u|) − (f (v) − f (u)). Because u is lo-
cally optimal, f (v) − f (u) ≥ 0 for all v in the neighborhood of u so
that ⟨−∇f (u), v − u⟩ ≤ o(|v − u|) which, by (C.15), is the definition of
a normal vector. Hence −∇f (u) ∈ N̂U (u).

C.2.5 Tangent and Normal Cones

The material in this section is not required for Chapters 1-7; it is pre-
sented merely to show that alternative definitions of tangent and nor-
mal cones are useful in more complex situations than those considered

[Figure C.6: Tangent and normal cones. (a) Normal cones. (b) Tangent cones.]

above. Thus, the normal and tangent cones defined in C.2.1 have some
limitations when U is not convex or, at least, not similar to the con-
straint set illustrated in Figure C.4. Figure C.6 illustrates the type of
difficulty that may occur. Here the tangent cone TU (u) is not con-
vex, as shown in Figure C.6(b), so that the associated normal cone

N̂U (u) = TU (u)∗ = {0}. Hence the necessary condition of optimal-


ity of u for the problem of minimizing a differentiable function f (·)
over U is ∇f (u) = 0; the only way a differentiable function f (·) can
achieve a minimum over U at u is for the condition ∇f (u) = 0 to be
satisfied. Alternative definitions of normality and tangency are some-
times necessary. In Rockafellar and Wets (1998), a vector g ∈ N̂U (u)
is normal in the regular sense; a normal in the general sense is then
defined by:

Definition C.15 (General normal). A vector g is normal to U at u in


the general sense if there exist sequences uν →U u and g ν → g where
g ν ∈ N̂U (uν ) for all ν; NU (u) is the set of all general normal vectors.

The cone NU (u) of general normal vectors is illustrated in Figure


C.6(a); here the cone NU (u) is the union of two distinct cones each
having form {αg | α ≥ 0}. Also shown in Figure C.6(a) are single
elements of two sequences g ν in N̂U (uν ) converging to NU (u). Counter
intuitively, the general normal vectors in this case point into the interior
of U . Associated with NU (u) is the set T̂U (u) of regular tangents to U
at u defined, when U is locally closed,3 in (Rockafellar and Wets, 1998,
Theorem 6.26) by:

Definition C.16 (General tangent). Suppose U is locally closed at u. A


vector h is tangent to U at u in the regular sense if, for all sequences
uν →U u, there exists a sequence hν → h that satisfies hν ∈ TU (uν ) for
all ν; T̂U (u) is the set of all regular tangent vectors to U at u.

Alternatively, a vector h is tangent to U at u in the regular sense if,


for all sequences uν-----→
--- u and λν ↘ 0, there exists a sequence hν → h
U
satisfying uν + λν hν ∈ U for all ν ∈ I≥0 . The cone of regular tangent
vectors for the example immediately above is shown in Figure C.6(b).
The following result is proved in Rockafellar and Wets (1998), Theorem
6.26:

Proposition C.17 (Set of regular tangents is closed convex cone). At any


u ∈ U , the set T̂U (u) of regular tangents to U at u is a closed convex
cone with T̂U (u) ⊂ TU (u). Moreover, if U is locally closed at u, then
T̂U (u) = NU (u)∗ .
3A set U is locally closed at a point u if there exists a closed neighborhood N of u
such that U ∩ N is closed; U is locally closed if it is locally closed at all u.

[Figure C.7: Condition of optimality.]

Figure C.7 illustrates some of these results. In Figure C.7, the con-
stant cost contour {v | f (v) = f (u)} of a nondifferentiable cost func-
tion f (·) is shown together with a sublevel set D passing through the
point u: f (v) ≤ f (u) for all v ∈ D. For this example, df (u; h) =
max{⟨g1 , h⟩, ⟨g2 , h⟩} where g1 and g2 are normals to the level set of
f (·) at u so that df (u; h) ≥ 0 for all h ∈ T̂U (u), a necessary condi-
tion of optimality; on the other hand, there exist h ∈ TU (u) such that
df (u; h) < 0. The situation is simpler if the constraint set U is regular
at u.

Definition C.18 (Regular set). A set U is regular at a point u ∈ U in the


sense of Clarke if it is locally closed at u and if NU (u) = N̂U (u) (all
normal vectors at u are regular).

The following consequences of Clarke regularity are established in


Rockafellar and Wets (1998), Corollary 6.29:

Proposition C.19 (Conditions for regular set). Suppose U is locally


closed at u ∈ U . Then U is regular at u is equivalent to each of the
following.
(a) NU (u) = N̂U (u) (all normal vectors at u are regular).

(b) TU (u) = T̂U (u) (all tangent vectors at u are regular).

(c) NU (u) = TU (u)∗ .

(d) TU (u) = NU (u)∗ .



(e) ⟨g, h⟩ ≤ 0 for all h ∈ TU (u), all g ∈ NU (u).

It is shown in Rockafellar and Wets (1998) that if U is regular at u


and a constraint qualification is satisfied, then a necessary condition
of optimality, similar to (C.21), may be obtained. To obtain this result,
we pursue a slightly different route in Sections C.2.6 and C.2.7.

C.2.6 Constraint Set Defined by Inequalities

We now consider the case when the set U is specified by a set of differ-
entiable inequalities:

U := {u | gi (u) ≤ 0 ∀i ∈ I} (C.22)

where, for each i ∈ I, the function gi : Rm → R is differentiable. For


each u ∈ U
I 0 (u) := {i ∈ I | gi (u) = 0}

is the index set of active constraints. For each u ∈ U, FU (u) denotes
the set of feasible variations for the linearized set of inequalities; it is
defined by

FU (u) := {h | ⟨∇gi (u), h⟩ ≤ 0 ∀i ∈ I 0 (u)} (C.23)

The set FU (u) is a closed, convex cone and is called a cone of first order
feasible variations in Bertsekas (1999) because h is a descent direction
for gi (u) for all i ∈ I 0 (u), i.e., gi (u + λh) ≤ 0 for all λ sufficiently
small. When U is polyhedral, the case discussed in C.2.3, gi (u) = ⟨ai ,
u⟩ − bi and ∇gi (u) = ai so that FU (u) = {h | ⟨ai , h⟩ ≤ 0 ∀i ∈ I 0 (u)}
which was shown in Proposition C.11 to be the tangent cone TU (u).
An important question is whether FU (u) is the tangent cone TU (u) for
a wider class of problems because, if FU (u) = TU (u), a condition of
optimality of the form in (C.20) may be obtained. In the example in
Figure C.8, FU (u) is the horizontal axis {h ∈ R2 | h2 = 0} whereas
TU (u) is the half-line {h ∈ R2 | h1 ≥ 0, h2 = 0} so that in this case,
FU (u) ≠ TU (u). While FU (u) is always convex, being the intersection
of a set of half-spaces, the tangent cone TU (u) is not necessarily convex
as Figure C.6(b) shows. The set U is said to be quasiregular at u ∈ U
if FU (u) = TU (u), in which case u is said to be a quasiregular point
(Bertsekas, 1999). The next result, due to Bertsekas (1999), shows that
FU (u) = TU (u), i.e., U is quasiregular at u, when a certain constraint
qualification is satisfied.

[Figure C.8: FU (u) ≠ TU (u).]

Proposition C.20 (Quasiregular set). Suppose U := {u | gi (u) ≤ 0 ∀i ∈


I} where, for each i ∈ I, the function gi : Rm → R is differentiable.
Suppose also that u ∈ U and that there exists a vector h̄ ∈ FU (u) such
that
⟨∇gi (u), h̄⟩ < 0, ∀ i ∈ I 0 (u) (C.24)
Then
TU (u) = FU (u)
i.e., U is quasiregular at u.

Equation (C.24) is the constraint qualification; it can be seen that it


precludes the situation shown in Figure C.8.

Proof. It follows from the definition (C.23) of FU (u) and the constraint
qualification (C.24) that:

⟨∇gi (u), h + α(h̄ − h)⟩ < 0, ∀h ∈ FU (u), α ∈ (0, 1], i ∈ I 0 (u)

Hence, for all h ∈ FU (u), all α ∈ (0, 1], there exists a vector hα :=
h + α(h̄ − h), in FU (u) satisfying ⟨∇gi (u), hα ⟩ < 0 for all i ∈ I 0 (u).
Assuming for the moment that hα ∈ TU (u) for all α ∈ (0, 1], it follows,
since hα → h as α → 0 and TU (u) is closed, that h ∈ TU (u), thus
proving FU (u) ⊂ TU (u). It remains to show that hα is tangent to U
at u. Consider the sequences hν and λν ↘ 0 where hν := hα for all
ν ∈ I≥0 . There exists a δ > 0 such that ⟨∇gi (u), hα ⟩ ≤ −δ for all
i ∈ I 0 (u) and gi (u) ≤ −δ for all i ∈ I \ I 0 (u). Since

gi (u + λν hν ) = gi (u) + λν ⟨∇gi (u), hα ⟩ + o(λν ) ≤ −λν δ + o(λν )



for all i ∈ I 0 (u), it follows that there exists a finite integer N such that
gi (u + λν hν ) ≤ 0 for all i ∈ I, all ν ≥ N. Since the sequences {hν }
and {λν } for all ν ≥ N satisfy hν → hα , λν ↘ 0 and u + λν hν ∈ U for
all ν ≥ N, it follows that hα ∈ TU (u), thus completing the proof that
FU (u) ⊂ TU (u).
Suppose now that h ∈ TU (u). There exist sequences hν → h and
λν ↘ 0 such that u + λν hν ∈ U so that gj (u + λν hν ) ≤ 0 for all j ∈ I,
all ν ∈ I≥0 .

Since gj (u + λν hν ) = gj (u) + ⟨∇gj (u), λν hν ⟩ + o(λν |hν |) ≤ 0 and
gj (u) = 0 for all j ∈ I 0 (u), it follows
that ⟨∇gj (u), λν hν ⟩ + o(λν ) ≤ 0 for all j ∈ I 0 (u), all ν ∈ I≥0 . Hence
⟨∇gj (u), hν ⟩ + o(λν )/λν ≤ 0 for all j ∈ I 0 (u), all ν ∈ I≥0 . Taking the
limit yields ⟨∇gj (u), h⟩ ≤ 0 for all j ∈ I 0 (u) so that h ∈ FU (u) which
proves TU (u) ⊂ FU (u). Hence TU (u) = FU (u). ■

The existence of a h̄ satisfying (C.24) is, as we have noted above, a


constraint qualification. If u is locally optimal for the inequality con-
strained optimization problem of minimizing a differentiable function
f (·) over the set U defined in (C.22) and, if (C.24) is satisfied thereby
ensuring that TU (u) = FU (u), then a condition of optimality of the
form (C.20) may be easily obtained as shown in the next result.

Proposition C.21 (Optimality conditions nonconvex problem). Suppose


u is locally optimal for the problem of minimizing a differentiable func-
tion f (·) over the set U defined in (C.22) and that TU (u) = FU (u).
Then
−∇f (u) ∈ cone{∇gi (u) | i ∈ I 0 (u)}
and there exist multipliers µi ≥ 0, i ∈ I 0 (u) satisfying
X
∇f (u) + µi ∇gi (u) = 0 (C.25)
i∈I 0 (u)

Proof. It follows from Proposition C.14 that −∇f (u) ∈ N̂U (u) and
from Proposition C.7 that N̂U (u) = TU∗ (u). But, by hypothesis,
TU (u) = FU (u) so that N̂U (u) = FU∗ (u), the polar cone of FU (u).
It follows from (C.23) and the definition of a polar cone, given in Ap-
pendix A1, that

FU∗ (u) = cone{∇gi (u) | i ∈ I 0 (u)}

Hence
−∇f (u) ∈ cone{∇gi (u) | i ∈ I 0 (u)}
The existence of multipliers µi satisfying (C.25) follows from the defi-
nition of a cone generated by {∇gi (u) | i ∈ I 0 (u)}. ■

C.2.7 Constraint Set Defined by Equalities and Inequalities

Finally, we consider the case when the set U is specified by a set of


differentiable equalities and inequalities:

U := {u | gi (u) ≤ 0 ∀i ∈ I, hi (u) = 0 ∀i ∈ E}

where, for each i ∈ I, the function gi : Rm → R is differentiable and for


each i ∈ E, the function hi : Rm → R is differentiable. For each u ∈ U

I 0 (u) := {i ∈ I | gi (u) = 0}

the index set of active inequality constraints is defined as before. We


wish to obtain necessary conditions for the problem of minimizing a
differentiable function f (·) over the set U . The presence of equality
constraints makes this objective more difficult than for the case when
U is defined merely by differentiable inequalities. The result we wish
to prove is a natural extension of Proposition C.21 in which the equality
constraints are included in the set of active constraints:

Proposition C.22 (Fritz-John necessary conditions). Suppose u is a local


minimizer for the problem of minimizing f (u) subject to the constraint
u ∈ U where U is defined in (C.22). Then there exist multipliers µ0 ,
µi , i ∈ I and λi , i ∈ E, not all zero, such that
µ0 ∇f (u) + Σ_{i∈I} µi ∇gi (u) + Σ_{j∈E} λj ∇hj (u) = 0    (C.26)

and
µi gi (u) = 0 ∀i ∈ I
where µ0 ≥ 0 and µi ≥ 0 for all i ∈ I 0 .

The condition µi gi (u) = 0 for all i ∈ I is known as the complemen-


tarity condition and implies µi = 0 for all i ∈ I such that gi (u) < 0.
If µ0 > 0, then (C.26) may be normalized by dividing each term by µ0
yielding the more familiar expression
∇f (u) + Σ_{i∈I} µi ∇gi (u) + Σ_{j∈E} λj ∇hj (u) = 0

We return to this point later. Perhaps the simplest method for proving
Proposition C.22 is the penalty approach adopted by Bertsekas (1999),
Proposition 3.3.5. We merely give an outline of the proof. The con-
strained problem of minimizing f (v) over U is approximated, for each

k ∈ I≥0 by a penalized problem defined below; as k increases the pe-


nalized problem becomes a closer approximation to the constrained
problem. For each i ∈ I, we define

gi+ (v) := max{gi (v), 0}

For each k, the penalized problem Pk is then defined as the problem of


minimizing F k (v) defined by
F k (v) := f (v) + (k/2) Σ_{i∈I} (gi+ (v))2 + (k/2) Σ_{j∈E} (hj (v))2 + (1/2)|v − u|2

subject to the constraint v ∈ S, where

S := {v | |v − u| ≤ ε}

and ε > 0 is such that f (u) ≤ f (v) for all v in S ∩ U . Let v k denote
the solution of Pk . Bertsekas shows that v k → u as k → ∞ so that for
all k sufficiently large, v k lies in the interior of S and is, therefore, the
unconstrained minimizer of F k (v). Hence for each k sufficiently large,
v k satisfies ∇F k (v k ) = 0, or
∇f (v k ) + Σ_{i∈I} µ̄ik ∇gi (v k ) + Σ_{j∈E} λ̄jk ∇hj (v k ) = 0    (C.27)

where
µ̄ik := kgi+ (v k ),    λ̄jk := khj (v k )
Let µ̄ k denote the vector with elements µ̄ik , i ∈ I, and λ̄k the vector with
elements λ̄jk , j ∈ E. Dividing (C.27) by δk defined by

δk := [1 + |µ̄ k |2 + |λ̄k |2 ]1/2

yields
µ0k ∇f (v k ) + Σ_{i∈I} µik ∇gi (v k ) + Σ_{j∈E} λjk ∇hj (v k ) = 0
where
µ0k := 1/δk ,    µik := µ̄ik /δk ,    λjk := λ̄jk /δk
and, with µ k and λk denoting the vectors with elements µik and λjk ,
(µ0k )2 + |µ k |2 + |λk |2 = 1
 
Because of the last equation, the sequence (µ0k , µ k , λk ) lies in a compact
set, and therefore has a subsequence, indexed by K ⊂ I≥0 , converging
to some limit (µ0 , µ, λ) where µ and λ are vectors whose elements are,

respectively, µi , i ∈ I and λj , j ∈ E. Because v k → u as k ∈ K tends


to infinity, it follows from (C.27) that
µ0 ∇f (u) + Σ_{i∈I} µi ∇gi (u) + Σ_{j∈E} λj ∇hj (u) = 0

To prove the complementarity condition, suppose, contrary to what we


wish to prove, that there exists an i ∈ I such that gi (u) < 0 but µi > 0.
Since µik → µi > 0 and gi (v k ) → gi (u) as k → ∞, k ∈ K, it follows that
µi µik > 0 for all k ∈ K sufficiently large. But µik = µ̄ik /δk = kgi+ (v k )/δk
so that µi µik > 0 implies µi gi+ (v k ) > 0 which in turn implies gi+ (v k ) =
gi (v k ) > 0 for all k ∈ K sufficiently large. This contradicts the fact that
gi (v k ) → gi (u) < 0 as k → ∞, k ∈ K. Hence we must have gi (u) = 0
for all i ∈ I such that µi > 0.
The Fritz-John condition in Proposition C.22 is known as the Karush-
Kuhn-Tucker (KKT) condition if µ0 > 0; if this is the case, µ0 may be
normalized to µ0 = 1. A constraint qualification is required for the
Karush-Kuhn-Tucker condition to be a necessary condition of optimal-
ity for the optimization problem considered in this section. A sim-
ple constraint qualification is linear independence of {∇gi (u), i ∈
I 0 (u), ∇hj (u), j ∈ E} at a local minimizer u. For, if u is a lo-
cal minimizer and µ0 = 0, then the Fritz-John condition implies that
Σ_{i∈I 0 (u)} µi ∇gi (u) + Σ_{j∈E} λj ∇hj (u) = 0, which contradicts the linear
independence of {∇gi (u), i ∈ I 0 (u), ∇hj (u), j ∈ E} since not all the
multipliers are zero. Another constraint qualification, used in Propo-
sitions C.20 and C.21 for an optimization problem in which the con-
straint set is U := {u | gi (u) ≤ 0, i ∈ I}, is the existence of a vector
h̄(u) ∈ FU (u) such that ⟨∇gi (u), h̄⟩ < 0 for all i ∈ I 0 (u); this condi-
tion ensures µ0 = 1 in C.25. Many other constraint qualifications exist;
see, for example, Bertsekas (1999), Chapter 3.
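The penalty construction used in the proof outline is easy to experiment with numerically. The following sketch (ours, for illustration only; the two-variable problem and all of its data are invented) minimizes F k (v) for increasing k on the problem of minimizing f (u) = u1² + u2² subject to h(u) = u1 + u2 − 1 = 0 and g(u) = −u1 ≤ 0, and recovers approximate multipliers from µ̄ k = k g+ (v k ) and λ̄ k = k h(v k ).

```python
import numpy as np
from scipy.optimize import minimize

# Invented example: f(u) = u1^2 + u2^2, h(u) = u1 + u2 - 1 = 0, g(u) = -u1 <= 0.
# The minimizer is u* = (1/2, 1/2) with KKT multipliers mu = 0, lambda = -1.
u_star = np.array([0.5, 0.5])
f = lambda v: v @ v
g = lambda v: -v[0]                      # inequality constraint g(v) <= 0
h = lambda v: v[0] + v[1] - 1.0          # equality constraint h(v) = 0

def Fk(v, k):
    gp = max(g(v), 0.0)                  # g^+(v)
    return f(v) + 0.5*k*gp**2 + 0.5*k*h(v)**2 + 0.5*np.sum((v - u_star)**2)

v = np.zeros(2)
for k in [1e1, 1e2, 1e3, 1e4]:
    v = minimize(lambda w: Fk(w, k), v, method="BFGS").x   # unconstrained minimizer of F^k
    mu_bar, lam_bar = k*max(g(v), 0.0), k*h(v)             # penalty multipliers
    print(k, v, mu_bar, lam_bar)
# mu_bar -> 0 and lam_bar -> -1, i.e., the KKT multipliers of this problem with mu0 = 1.
```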

C.3 Set-Valued Functions and Continuity of Value Function
A set-valued function U(·) is one for which, for each value of x, U (x)
is a set; these functions are encountered in parametric programming.
For example, in the problem P(x) : inf u {f (x, u) | u ∈ U (x)} (which
has the same form as an optimal control problem in which x is the
state and u is a control sequence), the constraint set U is a set-valued
function of the state. The solution to the problem P(x) (the value of u
that achieves the minimum) can also be set-valued. It is important to

Figure C.9: Graph of set-valued function U(·).

know how smoothly these set-valued functions vary with the parameter
x. In particular, we are interested in the continuity properties of the
value function x ↦ f 0 (x) = inf u {f (x, u) | u ∈ U (x)} since, in optimal
control problems we employ the value function as a Lyapunov function
and robustness depends, as we have discussed earlier, on the continuity
of the Lyapunov function. Continuity of the value function depends,
in turn, on continuity of the set-valued constraint set U (·). We use the
notation U : Rn Ž Rm to denote the fact that U (·) maps points in Rn
into subsets of Rm .
The graph of a set-valued function is often a useful tool. The graph
of U : Rn Ž Rm is defined to be the set Z := {(x, u) ∈ Rn × Rm | u ∈
U(x)}; the domain of the set-valued function U is the set X := {x ∈
Rn | U(x) ≠ ∅} = {x ∈ Rn | ∃u ∈ Rm such that (x, u) ∈ Z}; clearly
X ⊂ Rn . Also X is the projection of the set Z ⊂ Rn × Rm onto Rn , i.e.,
(x, u) ∈ Z implies x ∈ X. An example is shown in Figure C.9. In this
example, U(x) varies continuously with x. Examples in which U (·)
is discontinuous are shown in Figure C.10. In Figure C.10(a), the set
U(x) varies continuously if x increases from its initial value of x1 , but
jumps to a much larger set if x decreases an infinitesimal amount (from
its initial value of x1 ); this is an example of a set-valued function that
is inner semicontinuous at x1 . In Figure C.10(b), the set U (x) varies
continuously if x decreases from its initial value of x1 , but jumps to
a much smaller set if x increases an infinitesimal amount (from its
initial value of x1 ); this is an example of a set-valued function that is

Figure C.10: Graphs of discontinuous set-valued functions: (a) inner semicontinuous set-valued function; (b) outer semicontinuous set-valued function.

outer semicontinuous at x1 . The set-valued function is continuous at


x2 where it is both outer and inner semicontinuous.
We can now give precise definitions of inner and outer semiconti-
nuity.

C.3.1 Outer and Inner Semicontinuity

The concepts of inner and outer semicontinuity were introduced by


Rockafellar and Wets (1998, p. 144) to replace earlier definitions of
lower and upper semicontinuity of set-valued functions. This section is
based on the useful summary provided by Polak (1997, pp. 676-682).

Definition C.23 (Outer semicontinuous function). A set-valued func-


tion U : Rn Ž Rm is said to be outer semicontinuous (osc) at x if U (x)

Figure C.11: Outer and inner semicontinuity of U(·): (a) outer semicontinuity; (b) inner semicontinuity.

is closed and if, for every compact set S such that U (x) ∩ S = ∅, there
exists a δ > 0 such that U(x ′ ) ∩ S = ∅ for all x ′ ∈ x ⊕ δB.4 The
set-valued function U : Rn Ž Rm is outer semicontinuous if it is outer
semicontinuous at each x ∈ Rn .

Definition C.24 (Inner semicontinuous function). A set-valued function


U : Rn Ž Rm is said to be inner semicontinuous (isc) at x if, for every
open set S such that U(x) ∩ S ≠ ∅, there exists a δ > 0 such that
U(x ′ )∩S ≠ ∅ for all x ′ ∈ x⊕δB. The set-valued function U : Rn Ž Rm
is inner semicontinuous if it is inner semicontinuous at each x ∈ Rn .

These definitions are illustrated in Figure C.11. Roughly speaking,


a set-valued function that is outer semicontinuous at x cannot explode
as x changes to x ′ arbitrarily close to x; similarly, a set-valued function
that is inner semicontinuous at x cannot collapse as x changes to x ′
arbitrarily close to x.

Definition C.25 (Continuous function). A set-valued function is contin-


uous (at x) if it is both outer and inner continuous (at x).

If we return to Figure C.10(a) we see that S1 ∩ U (x1 ) = ∅ but S1 ∩


U(x) ≠ ∅ for x infinitesimally less than x1 so that U (·) is not outer
semicontinuous at x1 . For all S2 such that S2 ∩ U (x1 ) ≠ ∅, however,
S2 ∩ U(x) ≠ ∅ for all x in a sufficiently small neighborhood of x1 so
that U(·) is inner semicontinuous at x1 . If we turn to Figure C.10(b)
we see that S1 ∩ U(x1 ) ≠ ∅ but S1 ∩ U(x) = ∅ for x infinitesimally
greater than x1 so that in this case U(·) is not inner semicontinuous at
x1 . For all S3 such that S3 ∩ U(x1 ) = ∅, however, S3 ∩ U (x) = ∅ for
4 Recall that B := {x | |x| ≤ 1} is the closed unit ball in Rn .

all x in a sufficiently small neighborhood of x1 so that U (·) is outer


semicontinuous at x1 .
The definitions of outer and inner semicontinuity may be inter-
preted in terms of infinite sequences (Rockafellar and Wets, 1998, p.
152), (Polak, 1997, pp. 677-678).
Theorem C.26 (Equivalent conditions for outer and inner semicontinu-
ity).
(a) A set-valued function U : Rn Ž Rm is outer semicontinuous at x if
and only if for every infinite sequence (xi ) converging to x, any accu-
mulation point5 u of any sequence (ui ), satisfying ui ∈ U (xi ) for all i,
lies in U(x) (u ∈ U(x)).

(b) A set-valued function U : Rn Ž Rm is inner semicontinuous at x


if and only if for every u ∈ U(x) and for every infinite sequence (xi )
converging to x, there exists an infinite sequence (ui ), satisfying ui ∈
U(xi ) for all i, that converges to u.
Proofs of these results may be found in Rockafellar and Wets (1998);
Polak (1997). Another result that we employ is:
Proposition C.27 (Outer semicontinuity and closed graph). A set-valued
function U : Rn Ž Rm is outer semicontinuous in its domain if and only
if its graph Z is closed in Rn × Rm .
Proof. Since (x, u) ∈ Z is equivalent to u ∈ U (x), this result is a direct
consequence of Theorem C.26. ■

In the above discussion we have assumed, as in Polak (1997), that


U(x) is defined everywhere in Rn ; in constrained parametric optimiza-
tion problems, however, U(x) is defined on X, a closed subset of Rn ;
see Figure C.9. Only minor modifications of the above definitions are
then required. In definitions C.23 and C.24 we replace the closed set
δB by δB ∩ X and in Theorem C.26 we replace “every infinite sequence
(in Rn )” by “every infinite sequence in X.” In effect, we are replacing
the topology of Rn by its topology relative to X.

C.3.2 Continuity of the Value Function

Our main reason for introducing set-valued functions is to provide us


with tools for analyzing the continuity properties of the value func-
tion and optimal control law in constrained optimal control problems.
5 Recall, u is the limit of (ui ) if ui → u as i → ∞; u is an accumulation point of (ui ) if it is the limit of a subsequence of (ui ).

These problems have the form

V 0 (x) = min{V (x, u) | u ∈ U(x)}        (C.28)
u0 (x) = arg min{V (x, u) | u ∈ U(x)}        (C.29)

where U : Rn Ž Rm is a set-valued function and V : Rn × Rm → R is


continuous; in optimal control problems arising from MPC, u should
be replaced by u = (u(0), u(1), . . . , u(N − 1)) and m by Nm. We are
interested in the continuity properties of the value function V 0 : Rn →
R and the control law u0 : Rn → Rm ; the latter may be set-valued (if
the minimizer in (C.28) is not unique).
The following max problem has been extensively studied in the lit-
erature

φ0 (x) = max{φ(x, u) | u ∈ U (x)}


µ 0 (x) = arg max{φ(x, u) | u ∈ U (x)}

If we define φ(·) by φ(x, u) := −V (x, u), we see that φ0 (x) = −V 0 (x)


and µ 0 (x) = u0 (x) so that we can obtain the continuity properties of
V 0 (·) and u0 (·) from those of φ0 (·) and µ 0 (·) respectively. Using this
transcription and Corollary 5.4.2 and Theorem 5.4.3 in Polak (1997) we
obtain the following result:

Theorem C.28 (Minimum theorem). Suppose that V : Rn × Rm → R


is continuous, that U : Rn Ž Rm is continuous, compact-valued and
satisfies U(x) ⊂ U for all x ∈ X where U is compact. Then V 0 (·) is
continuous and u0 (·) is outer semicontinuous. If, in addition, u0 (x) =
{µ 0 (x)} (there is a unique minimizer µ 0 (x)), then µ 0 (·) is continuous.

It is unfortunately the case, however, that due to state constraints,


U(·) is often not continuous in constrained optimal control problems.
If U(·) is constant, which is the case in optimal control problems if state
or mixed control-state constraints are absent, then, from the above
results, the value function V 0 (·) is continuous. Indeed, under slightly
stronger assumptions, the value function is Lipschitz continuous.
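The role that continuity of U(·) plays can be seen in a small grid-based sketch (ours, for illustration; the cost and the two constraint maps are invented). With the constant set U(x) = [−1, 1] the computed value function varies continuously with x, whereas with the map U(x) = {u ∈ [−1, 1] | xu ≥ 0}, which is outer but not inner semicontinuous at x = 0, the value function jumps at the origin.

```python
import numpy as np

u_grid = np.linspace(-1.0, 1.0, 4001)
V = lambda x, u: (u - 1.0)**2

def V0(x, membership):
    feasible = membership(x, u_grid)                 # boolean mask for U(x)
    return np.min(V(x, u_grid[feasible]))

U_const = lambda x, u: np.ones_like(u, dtype=bool)   # U(x) = [-1, 1] for all x
U_disc  = lambda x, u: x * u >= 0.0                  # U(x) = {u in [-1, 1] | x u >= 0}

for x in [-0.1, -1e-6, 0.0, 1e-6, 0.1]:
    print(f"{x:+.1e}   {V0(x, U_const):.4f}   {V0(x, U_disc):.4f}")
# Second column is identically 0 (continuous); third column equals 1 for x < 0
# and 0 for x >= 0, so V0 is discontinuous where U(.) fails to be continuous.
```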

Lipschitz continuity of the value function. If we assume that V (·)


is Lipschitz continuous and that U(x) ≡ U , we can establish Lipschitz
continuity of V 0 (·). Interestingly the result does not require, nor does
it imply, Lipschitz continuity of the minimizer u0 (·).

Theorem C.29 (Lipschitz continuity of the value function, constant U ).


Suppose that V : Rn × Rm → R is Lipschitz continuous on bounded sets6
and that U(x) ≡ U where U is a compact subset of Rm . Then V 0 (·) is
Lipschitz continuous on bounded sets.

Proof. Let S be an arbitrary bounded set in X, the domain of the value


function V 0 (·), and let R := S × U; R is a bounded subset of Z. Since R
is bounded, there exists a Lipschitz constant LS such that

V (x ′ , u) − V (x ′′ , u) ≤ LS |x ′ − x ′′ |

for all x ′ , x ′′ ∈ S, all u ∈ U . Hence,

V 0 (x ′ ) − V 0 (x ′′ ) ≤ V (x ′ , u′′ ) − V (x ′′ , u′′ ) ≤ LS |x ′ − x ′′ |

for all x ′ , x ′′ ∈ S, any u′′ ∈ u0 (x ′′ ). Interchanging x ′ and x ′′ in the


above derivation yields

V 0 (x ′′ ) − V 0 (x ′ ) ≤ V (x ′′ , u′ ) − V (x ′ , u′ ) ≤ LS |x ′′ − x ′ |

for all x ′ , x ′′ ∈ S, any u′ ∈ u0 (x ′ ). Hence V 0 (·) is Lipschitz continuous


on bounded sets. ■

We now specialize to the case where U(x) = {u ∈ Rm | (x, u) ∈ Z}


where Z is a polyhedron in Rn × Rm ; for each x, U (x) is a polytope.
This type of constraint arises in constrained optimal control problems
when the system is linear and the state and control constraints are
polyhedral. What we show in the sequel is that, in this special case,
U(·) is continuous and so, therefore, is V 0 (·). An alternative proof,
which many readers may prefer, is given in Chapter 7 where we exploit
the fact that if V (·) is strictly convex and quadratic and Z polyhedral,
then V 0 (·) is piecewise quadratic and continuous. Our first concern is
to obtain a bound on d(u, U(x ′ )), the distance of any u ∈ U (x) from
the constraint set U(x ′ ).
A bound on d(u, U(x ′ )), u ∈ U(x). The bound we require is given
by a special case of a theorem due to Clarke, Ledyaev, Stern, and Wolen-
ski (1998, Theorem 3.1, page 126). To motivate this result, consider a
differentiable convex function f : R → R so that f (u) ≥ f (v)+⟨∇f (v),
u − v⟩ for any two points u and v in R. Suppose also that there
exists a nonempty interval U = [a, b] ⊂ R such that f (u) ≤ 0 for
6 A function V (·) is Lipschitz continuous on bounded sets if, for any bounded set S,
there exists a constant LS ∈ [0, ∞) such that |V (z′ )−V (z)| ≤ LS |z−z′ | for all z, z′ ∈ S.

Figure C.12: Subgradient of f (·).

all u ∈ U and that there exists a δ > 0 such that ∇f (u) > δ for all
u ∈ R. Let u > b and let v = b be the closest point in U to u. Then
f (u) ≥ f (v) + ⟨∇f (v), u − v⟩ ≥ δ|v − u| so that d(u, U ) ≤ f (u)/δ.
The theorem of Clarke et al. (1998) extends this result to the case when
f (·) is not necessarily differentiable but requires the concept of a sub-
gradient of a convex function.

Definition C.30 (Subgradient of convex function). Suppose f : Rm → R


is convex. Then the subgradient ∂f (u) of f (·) at u is defined by

∂f (u) := {g | f (v) ≥ f (u) + ⟨g, v − u⟩ ∀v ∈ Rm }

Figure C.12 illustrates a subgradient. In the figure, g is one element


of the subgradient because f (v) ≥ f (u) + ⟨g, v − u⟩ for all v; g is
the slope of the line passing through the point (u, f (u)). To obtain a
bound on d(u, U(x)) we require the following result which is a special
case of the much more general result of the theorem of Clarke et al.:

Theorem C.31 (Clarke et al. (1998)). Take a nonnegative valued, convex


function ψ : Rn × Rm → R. Let U(x) := {u ∈ Rm | ψ(x, u) = 0} and
X := {x ∈ Rn | U(x) ≠ ∅}. Assume there exists a δ > 0 such that

u ∈ Rm , x ∈ X, ψ(x, u) > 0 and g ∈ ∂u ψ(x, u) =⇒ |g| > δ

where ∂u ψ(x, u) denotes the convex subgradient of ψ with respect to



the variable u. Then, for each x ∈ X, d(u, U (x)) ≤ ψ(x, u)/δ for all
u ∈ Rm .

The proof of this result is given in the reference cited above. We


next use this result to bound the distance of u from U (x) where, for
each x, U(x) is polyhedral.

Corollary C.32 (A bound on d(u, U(x ′ )) for u ∈ U (x)). 7 Suppose


Z is a polyhedron in Rn × Rm and let X denote its projection on Rn
(X = {x | ∃u ∈ Rm such that (x, u) ∈ Z}). Let U(x) := {u | (x,
u) ∈ Z}. Then there exists a K > 0 such that for all x, x ′ ∈ X, d(u,
U(x ′ )) ≤ K|x ′ − x| for all u ∈ U(x) (or, for all x, x ′ ∈ X, all u ∈ U (x),
there exists a u′ ∈ U(x ′ ) such that |u′ − u| ≤ K|x ′ − x|).

Proof. The polyhedron Z admits the representation Z = {(x, u) | ⟨mj ,


u⟩ − ⟨nj , x⟩ − p j ≤ 0, j ∈ J} for some mj ∈ Rm , nj ∈ Rn and p j ∈ R,
j ∈ J := {1, . . . , J}. Define D to be the collection of all index sets I ⊆ J
such that Σ_{j∈I} λj mj ≠ 0, ∀λ ∈ ΛI , in which, for a particular index
set I, ΛI is defined to be ΛI := {λ | λj ≥ 0, Σ_{j∈I} λj = 1}. Because D
is a finite set, there exists a δ > 0 such that for all I ∈ D, all λ ∈ ΛI ,
|Σ_{j∈I} λj mj | > δ. Let ψ(·) be defined by ψ(x, u) := max{⟨mj , u⟩−⟨nj ,
x⟩ − p j , 0 | j ∈ J} so that (x, u) ∈ Z (or u ∈ U(x)) if and only if
ψ(x, u) = 0. We now claim that, for every (x, u) ∈ X × Rm such
that ψ(x, u) > 0 and every g ∈ ∂u ψ(x, u), the subgradient of ψ with
respect to u at (x, u), we have |g| > δ. Assuming for the moment that
the claim is true, the proof of the Corollary may be completed with the
aid of Theorem C.31. Assume, as stated in the Corollary, that x, x ′ ∈ X
and u ∈ U(x); the theorem asserts

d(u, U(x ′ )) ≤ (1/δ)ψ(x ′ , u), ∀x ′ ∈ X

But ψ(x, u) = 0 (since u ∈ U(x)) so that

d(u, U(x ′ )) ≤ (1/δ)[ψ(x ′ , u) − ψ(x, u)] ≤ (c/δ)|x ′ − x|

where c is the Lipschitz constant for x ↦ ψ(x, u) (ψ(·) is piecewise


affine and continuous). This proves the Corollary with K = c/δ.
It remains to confirm the claim. Take any (x, u) ∈ X × Rm such
that ψ(x, u) > 0. Then max{⟨mj , u⟩ − ⟨nj , x⟩ − p j , 0 | j ∈ J} > 0. Let

7 The authors wish to thank Richard Vinter and Francis Clarke for providing this
result.

I 0 (x, u) denote the active constraint set (the set of those constraints
at which the maximum is achieved). Then

⟨mj , u⟩ − ⟨nj , x⟩ − p j > 0, ∀j ∈ I 0 (x, u)

Since x ∈ X, there exists a ū ∈ U(x) so that

⟨mj , ū⟩ − ⟨nj , x⟩ − p j ≤ 0, ∀j ∈ I 0 (x, u)

Subtracting these two inequalities yields

⟨mj , u − ū⟩ > 0, ∀j ∈ I 0 (x, u)


But then, for all λ ∈ ΛI 0 (x,u) , it follows that ⟨Σ_{j∈I 0 (x,u)} λj mj , u − ū⟩ >
0, so that

Σ_{j∈I 0 (x,u)} λj mj ≠ 0

It follows that I 0 (x, u) ∈ D. Hence


|Σ_{j∈I 0 (x,u)} λj mj | > δ,    ∀λ ∈ ΛI 0 (x,u)

Now take any g ∈ ∂u ψ(x, u) = co{mj | j ∈ I 0 (x, u)} (co denotes “convex hull”). There exists a λ ∈ ΛI 0 (x,u) such that g = Σ_{j∈I 0 (x,u)} λj mj .
But then |g| > δ by the inequality above. This proves the claim and,
hence, completes the proof of the Corollary. ■
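A quick numerical sanity check of the corollary (ours; the polyhedron is invented for illustration) uses Z = {(x, u) | 0 ≤ x ≤ 1, −1 ≤ u ≤ x}, for which X = [0, 1] and U(x) = [−1, x]; random sampling confirms the bound d(u, U(x ′ )) ≤ K|x ′ − x|, here with K = 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def dist_to_U(u, xp):
    # distance of u from the interval U(x') = [-1, x']
    return max(-1.0 - u, u - xp, 0.0)

worst = 0.0
for _ in range(10000):
    x, xp = rng.uniform(0.0, 1.0, size=2)   # x, x' in X = [0, 1]
    u = rng.uniform(-1.0, x)                # u in U(x) = [-1, x]
    if abs(xp - x) > 1e-12:
        worst = max(worst, dist_to_U(u, xp) / abs(xp - x))
print(worst)    # stays <= 1, consistent with d(u, U(x')) <= K|x' - x| with K = 1
```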

Continuity of the value function when U(x) = {u | (x, u) ∈ Z}.


In this section we investigate continuity of the value function for the
constrained linear quadratic optimal control problem P(x); in fact we
establish continuity of the value function for the more general prob-
lem where the cost is continuous rather than quadratic. We showed in
Chapter 2 that the optimal control problem of interest takes the form

V 0 (x) = min_u {V (x, u) | (x, u) ∈ Z}

where Z is a polyhedron in Rn × U where U ⊂ Rm is a polytope and,


hence, is compact and convex; in MPC problems we replace the control
u by the sequence of controls u and m by Nm. Let u0 : Rn Ž Rm be
defined by
u0 (x) := arg min_u {V (x, u) | (x, u) ∈ Z}

and let X be defined by

X := {x | ∃u such that (x, u) ∈ Z}



so that X is the projection of Z ⊂ Rn × Rm onto Rn . Let the set-valued


function U : Rn Ž Rm be defined by

U(x) := {u ∈ Rm | (x, u) ∈ Z}

The domain of V 0 (·) and of U(·) is X. The optimization problem may


be expressed as V 0 (x) = minu {V (x, u) | u ∈ U (x)}. Our first task is
to establish the continuity of U : Rn Ž Rm .

Theorem C.33 (Continuity of U(·)). Suppose Z is a polyhedron in Rn ×U


where U ⊂ Rm is a polytope. Then the set-valued function U : Rn Ž Rm
defined above is continuous in X.

Proof. By Proposition C.27, the set-valued map U (·) is outer semicon-


tinuous in X because its graph, Z, is closed. We establish inner semi-
continuity using Corollary C.32 above. Let x, x ′ be arbitrary points in
X and U(x) and U(x ′ ) the associated control constraint sets. Let S be
any open set such that U(x) ∩ S ≠ ∅ and let u be an arbitrary point in
U(x) ∩ S. Because S is open, there exists an ε > 0 such that u ⊕ εB ⊂ S.
Let ε′ := ε/K where K is the constant in Corollary C.32. From Corollary C.32,
there exists a u′ ∈ U(x ′ ) such that |u′ − u| ≤ K|x ′ − x| which im-
plies |u′ − u| ≤ ε (u′ ∈ u ⊕ εB) for all x ′ ∈ X such that |x ′ − x| ≤ ε′
(x ′ ∈ (x ⊕ ε′ B) ∩ X). This implies u′ ∈ U(x ′ ) ∩ S for all x ′ ∈ X such
that |x ′ − x| ≤ ε′ (x ′ ∈ (x ⊕ ε′ B) ∩ X). Hence U(x ′ ) ∩ S ≠ ∅ for all
x ′ ∈ (x ⊕ ε′ B) ∩ X, so that U(·) is inner semicontinuous in X. Since
U(·) is both outer and inner semicontinuous in X, it is continuous in
X. ■

We can now establish continuity of the value function.

Theorem C.34 (Continuity of the value function). Suppose that V : Rn ×


Rm → R is continuous and that Z is a polyhedron in Rn ×U where U ⊂ Rm
is a polytope. Then V 0 : Rn → R is continuous and u0 : Rn Ž Rm is
outer semicontinuous in X. Moreover, if u0 (x) is unique (not set-valued)
at each x ∈ X, then u0 : Rn → Rm is continuous in X.

Proof. Since the real-valued function V (·) is continuous (by assump-


tion) and since the set-valued function U (·) is continuous in X (by
Theorem C.33), it follows from Theorem C.28 that V 0 : Rn → R is
continuous and u0 : Rn Ž Rm is outer semicontinuous in X; it also
follows that if u0 (x) is unique (not set-valued) at each x ∈ X, then
u0 : Rn → Rm is continuous in X. ■

Lipschitz continuity when U(x) = {u | (x, u) ∈ Z}. Here we estab-


lish that V 0 (·) is Lipschitz continuous if V (·) is Lipschitz continuous
and U(x) := {u ∈ Rm | (x, u) ∈ Z}; this result is more general than
Theorem C.29 where it is assumed that U is constant.
Theorem C.35 (Lipschitz continuity of the value function—U (x)). Sup-
pose that V : Rn ×Rm → R is continuous, that Z is a polyhedron in Rn ×U
where U ⊂ Rm is a polytope. Suppose, in addition, that V : Rn × Rm → R
is Lipschitz continuous on bounded sets.8 Then V 0 (·) is Lipschitz contin-
uous on bounded sets.

Proof. Let S be an arbitrary bounded set in X, the domain of the value


function V 0 (·), and let R := S × U; R is a bounded subset of Z. Let x, x ′
be two arbitrary points in S. Then

V 0 (x) = V (x, κ(x))


V 0 (x ′ ) = V (x ′ , κ(x ′ ))

where V (·) is the cost function, assumed to be Lipschitz continuous


on bounded sets, and κ(·), the optimal control law, satisfies κ(x) ∈
U(x) ⊂ U and κ(x ′ ) ∈ U(x ′ ) ⊂ U. It follows from Corollary C.32
that there exists a K > 0 such that for all x, x ′ ∈ X, there exists a
u′ ∈ U(x ′ ) ⊂ U such that |u′ − κ(x)| ≤ K|x ′ − x|. Since κ(x) is
optimal for the problem P(x), and since (x, κ(x)) and (x ′ , u′ ) both lie
in R = S × U, there exists a constant LR such that

V 0 (x ′ ) − V 0 (x) ≤ V (x ′ , u′ ) − V (x, κ(x))


≤ LR (|(x ′ , u′ ) − (x, κ(x))|)
≤ LR |x ′ − x| + LR K|x ′ − x|
≤ MS |x ′ − x|, MS := LR (1 + K)

Reversing the role of x and x ′ we obtain the existence of a u ∈ U(x)


such that |u − κ(x ′ )| ≤ K|x − x ′ |; it follows from the optimality of
κ(x ′ ) that

V 0 (x) − V 0 (x ′ ) ≤ V (x, u) − V (x ′ , κ(x ′ ))


≤ MS |x − x ′ |

where, now, u ∈ U(x) and κ(x ′ ) ∈ U(x ′ ). Hence |V 0 (x ′ ) − V 0 (x)| ≤


MS |x − x ′ | for all x, x ′ in S. Since S is an arbitrary bounded set in X,
V 0 (·) is Lipschitz continuous on bounded sets. ■
8 A function V (·) is Lipschitz continuous on bounded sets if, for any bounded set S,
there exists a constant LS ∈ [0, ∞) such that |V (z′ )−V (z)| ≤ LS |z−z′ | for all z, z′ ∈ S.

C.4 Exercises

Exercise C.1: Nested optimization and switching order of optimization


Consider the optimization problem in two variables
min_{(x,y)∈Z} V (x, y)
in which x ∈ Rn , y ∈ Rm , and V : Rn × Rm → R. Assume this problem has a solution.
This assumption is satisfied, for example, if V is continuous and Z is compact, but, in
general, we do not require either of these conditions.
Define the following four sets
X(y) = {x | (x, y) ∈ Z} Y(x) = {y | (x, y) ∈ Z}
B = {y | X(y) ≠ ∅} A = {x | Y(x) ≠ ∅}
Note that A and B are the projections of Z onto Rn and Rm , respectively. Projection
is defined in Section C.3. Show the solutions of the following two nested optimization
problems exist and are equal to the solution of the original problem
min_{x∈A} ( min_{y∈Y(x)} V (x, y) )
min_{y∈B} ( min_{x∈X(y)} V (x, y) )
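A numerical check of this identity on a randomly generated finite feasible set (a sketch only; it is neither a proof nor the exercise solution) can be done as follows.

```python
import numpy as np

rng = np.random.default_rng(1)
V = rng.normal(size=(6, 8))                # V[i, j] = V(x_i, y_j) on a finite grid
Z = rng.random(size=(6, 8)) < 0.5          # random feasible pairs (x_i, y_j)
Vz = np.where(Z, V, np.inf)                # +inf marks infeasible pairs

joint     = Vz.min()                                  # min over (x, y) in Z
nested_xy = min(Vz[i, :].min() for i in range(6))     # min_x ( min_{y in Y(x)} )
nested_yx = min(Vz[:, j].min() for j in range(8))     # min_y ( min_{x in X(y)} )
print(joint, nested_xy, nested_yx)         # all three values agree
```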

Exercise C.2: DP nesting


Prove the assertion made in Section C.1.2 that ui = {u, ui+1 } ∈ U(x, i) if and only if
(x, u) ∈ Z, f (x, u) ∈ X(i + 1), and ui+1 ∈ U(f (x, u), i + 1).

Exercise C.3: Recursive feasibility


Prove the assertion in the proof of Theorem C.2 that (x(j), u(j)) ∈ Z and that f (x(j),
u(j)) ∈ X(j + 1).

Exercise C.4: Basic minmax result


Consider the following two minmax optimization problems in two variables
inf_{x∈X} sup_{y∈Y} V (x, y)        sup_{y∈Y} inf_{x∈X} V (x, y)
in which x ∈ X ⊆ Rn , y ∈ Y ⊆ Rm , and V : X × Y → R.
(a) Show that the values are ordered as follows

inf_{x∈X} sup_{y∈Y} V (x, y) ≥ sup_{y∈Y} inf_{x∈X} V (x, y)

or, if the solutions to the problems exist,

min_{x∈X} max_{y∈Y} V (x, y) ≥ max_{y∈Y} min_{x∈X} V (x, y)
A handy mnemonic for this result is that the player who goes first (inner problem)
has the advantage.9
9 Note that different conventions are in use. Boyd and Vandenberghe (2004, p. 240)

say that the player who “goes” second has the advantage, meaning that the inner prob-
lem is optimized after the outer problem has selected a value for its variable. We say
that since the inner optimization is solved first, this player “goes” first.

(b) Use your results to order these three problems

sup_{x∈X} inf_{y∈Y} sup_{z∈Z} V (x, y, z)    inf_{y∈Y} sup_{z∈Z} sup_{x∈X} V (x, y, z)    sup_{z∈Z} sup_{x∈X} inf_{y∈Y} V (x, y, z)
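The ordering in part (a) is easy to observe numerically on a finite problem (a sketch only, not part of the exercise solution), e.g., with V (xi , yj ) given by the entries of a random matrix and pure strategies only.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(5, 7))                  # V(x_i, y_j) = A[i, j]

min_max = A.max(axis=1).min()                # min over x of (max over y)
max_min = A.min(axis=0).max()                # max over y of (min over x)
print(min_max, max_min, min_max >= max_min)  # the inequality always holds
```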

Exercise C.5: Lagrange multipliers and minmax


Consider the constrained optimization problem

min_{x∈Rn} V (x) subject to g(x) = 0        (C.30)

in which V : Rn → R and g : Rn → Rm . Introduce the Lagrange multiplier λ ∈ Rm


and Lagrangian function L(x, λ) = V (x) − λ′ g(x) and consider the following minmax
problem
min_{x∈Rn} max_{λ∈Rm} L(x, λ)
Show that if (x0 , λ0 ) is a solution to this problem with finite L(x0 , λ0 ), then x0 is also
a solution to the original constrained optimization (C.30).

Exercise C.6: Dual problems and duality gap


Consider again the constrained optimization problem of Exercise C.5

min V (x) subject to g(x) = 0


x∈Rn

and its equivalent minmax formulation

min_{x∈Rn} max_{λ∈Rm} L(x, λ)

Switching the order of optimization gives the maxmin version of this problem

max_{λ∈Rm} min_{x∈Rn} L(x, λ)

Next define a new (dual) objective function q : Rm → R as the inner optimization

q(λ) = min_{x∈Rn} L(x, λ)

Then the maxmin problem can be stated as

max_{λ∈Rm} q(λ)        (C.31)

Problem (C.31) is known as the dual of the original problem (C.30), and the original
problem (C.30) is then denoted as the primal problem in this context (Nocedal and
Wright, 2006, p. 343–345), (Boyd and Vandenberghe, 2004, p. 223).

(a) Show that the solution to the dual problem is a lower bound for the solution to
the primal problem

max_{λ∈Rm} q(λ) ≤ min_{x∈Rn} V (x) subject to g(x) = 0        (C.32)

This property is known as weak duality (Nocedal and Wright, 2006, p. 345),
(Boyd and Vandenberghe, 2004, p. 225).

(b) The difference between the dual and the primal solutions is known as the duality
gap. Strong duality is defined as the property that equality is achieved in (C.32)
and the duality gap is zero (Boyd and Vandenberghe, 2004, p. 225).

max_{λ∈Rm} q(λ) = min_{x∈Rn} V (x) subject to g(x) = 0        (C.33)

Show that strong duality is equivalent to the existence of λ0 such that

min_{x∈Rn} [V (x) − λ′0 g(x)] = min_{x∈Rn} V (x) subject to g(x) = 0        (C.34)

Characterize the set of all λ0 that satisfy this equation.
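For orientation (not the exercise solution), the following sketch evaluates both sides of (C.32) for an invented equality-constrained quadratic program, min (1/2)|x|² subject to a′ x = b, for which the duality gap turns out to be zero.

```python
import numpy as np

# Invented data: min 0.5*|x|^2 subject to a'x = b.
a, b = np.array([1.0, 2.0, -1.0]), 3.0

# primal: the minimum-norm point on the hyperplane a'x = b
x_star = a * b / (a @ a)
primal = 0.5 * x_star @ x_star

# dual: q(lam) = min_x 0.5*|x|^2 - lam*(a'x - b) = -0.5*lam^2*|a|^2 + lam*b,
# maximized at lam0 = b/|a|^2
lam0 = b / (a @ a)
dual = -0.5 * lam0**2 * (a @ a) + lam0 * b

print(primal, dual)    # equal values: zero duality gap for this problem
```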

Exercise C.7: Example with duality gap


Consider the following function and sets (Peressini, Sullivan, and Uhl, Jr., 1988, p. 34)

V (x, y) = (y − x²)(y − 2x²)        X = [−1, 1]        Y = [−1, 1]

Make a contour plot of V (·) on X × Y and answer the following question. Which of the
following two minmax problems has a nonzero duality gap?

min_{y∈Y} max_{x∈X} V (x, y)

min_{x∈X} max_{y∈Y} V (x, y)

Notice that the two problems are different because the first one minimizes over y and
maximizes over x, and the second one does the reverse.

Exercise C.8: The Heaviside function and inner and outer semicontinuity
Consider the (set-valued) function

H(x) = 0 for x < 0,    H(x) = 1 for x > 0

and you are charged with deciding how to define H(0).


(a) Characterize the choices of set H(0) that make H outer semicontinuous. Justify
your answer.

(b) Characterize the choices of set H(0) that make H inner semicontinuous. Justify
your answer.

(c) Can you define H(0) so that H is both outer and inner semicontinuous? Explain
why or why not.
Bibliography

R. E. Bellman. Dynamic Programming. Princeton University Press, Princeton,


New Jersey, 1957.

D. P. Bertsekas. Nonlinear Programming. Athena Scientific, Belmont, MA, sec-


ond edition, 1999.

D. P. Bertsekas, A. Nedic, and A. E. Ozdaglar. Dynamic Programming and


Optimal Control. Athena Scientific, Belmont, MA 02478-0003, USA, 2001.

S. P. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University


Press, 2004.

F. Clarke, Y. S. Ledyaev, R. J. Stern, and P. R. Wolenski. Nonsmooth analysis


and control theory. Springer-Verlag, New York, 1998.

J. Nocedal and S. J. Wright. Numerical Optimization. Springer, New York,


second edition, 2006.

A. L. Peressini, F. E. Sullivan, and J. J. Uhl, Jr. The Mathematics of Nonlinear


Programming. Springer-Verlag, New York, 1988.

E. Polak. Optimization: Algorithms and Consistent Approximations. Springer


Verlag, New York, 1997. ISBN 0-387-94971-2.

R. T. Rockafellar and R. J.-B. Wets. Variational Analysis. Springer-Verlag, 1998.
