
Introduction to Optimal Control

Thierry Miquel

To cite this version:


Thierry Miquel. Introduction to Optimal Control. Master. Introduction to optimal control, ENAC, France. 2022, pp.188. hal-02987731v2

HAL Id: hal-02987731


https://cel.hal.science/hal-02987731v2
Submitted on 17 Oct 2022

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Distributed under a Creative Commons Attribution - NonCommercial 4.0 International License


Introduction to Optimal Control
Lecture notes - Draft

Thierry Miquel

[email protected]

October 12, 2022


Introduction
The application of optimal control theory to the practical design of multivariable control systems started in the 1960s1: in 1957 R. Bellman applied dynamic programming to the optimal control of discrete-time systems. His procedure resulted in nonlinear feedback schemes. By 1958, L.S. Pontryagin had developed the maximum principle, relying on the calculus of variations developed by L. Euler (1707-1783). He solved the minimum-time problem, deriving in 1962 an on/off relay control law as an optimal control. In 1960 three major papers were published by R. Kalman and coworkers, working in the U.S. One of these publicized the vital work of Lyapunov (1857-1918) in the time-domain control of nonlinear systems. The next discussed the optimal control of systems, providing the design equations for the Linear Quadratic Regulator (LQR). The third paper provided the design equations for the discrete Kalman filter. The continuous Kalman filter was developed by Kalman and Bucy in 1961.

In control theory, Kalman introduced linear algebra and matrices, so that systems with multiple inputs and outputs could easily be treated. He also formalized the notion of optimality in control theory by minimizing a very general quadratic generalized energy function. In the period of a year, the major limitations of classical control theory were overcome, important new theoretical tools were introduced, and a new era in control theory had begun; we call it the era of modern control. In the period since 1980 the theory has been further refined under the name of H2 theory, which is out of the scope of this survey.

This lecture focuses on LQ (linear quadratic) theory and is a compilation of a number of results in the context of control system design. It has been written thanks to the references listed in the bibliographical section. It starts with a reminder of the main results in the optimization of nonlinear systems, which will be used as background for this lecture. Then the linear quadratic regulator (LQR) for finite final time and for infinite final time is presented, where the solution to the LQ problem is discussed. The robustness properties of the linear quadratic regulator (LQR) are then presented, including the asymptotic properties and the guaranteed gain and phase margins associated with the LQ solution. The next section presents some design methods with a special emphasis on the symmetric root locus. We conclude with a short section dedicated to the Linear Quadratic Tracker (LQT), where the usefulness of augmenting the plant with integrators is presented.

1 https://lewisgroup.uta.edu/history.htm
Bibliography
[1] Alazard D., Apkarian P. and Cumer C., Robustesse et commande optimale, Cépaduès (2000)

[2] Anderson B. and Moore J., Optimal Control: Linear Quadratic Methods, Prentice Hall (1990)

[3] Friedland B., Control System Design: An Introduction to State-Space Methods, Dover Books on Electrical Engineering (2012)

[4] Hespanha J. P., Linear Systems Theory, Princeton University Press (2009)

[5] Hull D. G., Optimal Control Theory for Applications, Springer (2003)

[6] Burl J. B., Linear Optimal Control, Prentice Hall (1998)

[7] Lewis F., Vrabie D. and Syrmos V., Optimal Control, John Wiley & Sons (2012)

[8] Li P. Y., Linear Quadratic Optimal Control, University of Minnesota (2012)

[9] Sinha A., Linear Systems: Optimal and Robust Control, CRC Press (2007)

[10] Skogestad S. and Postlethwaite I., Multivariable Feedback Control: Analysis and Design, John Wiley (2005)
Table of contents

1 Overview of Pontryagin's Minimum Principle 11


1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2 Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4 Lagrange multipliers . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.6 Euler-Lagrange equation . . . . . . . . . . . . . . . . . . . . . . . 17
1.7 Fundamentals of optimal control theory . . . . . . . . . . . . . . 20
1.7.1 Problem to be solved . . . . . . . . . . . . . . . . . . . . . 20
1.7.2 Bolza, Mayer and Lagrange problems . . . . . . . . . . . . 20
1.7.3 First order necessary conditions . . . . . . . . . . . . . . . 21
1.8 Example: brachistochrone problem . . . . . . . . . . . . . . . . . 23
1.8.1 Problem overview . . . . . . . . . . . . . . . . . . . . . . . 23
1.8.2 System dynamics . . . . . . . . . . . . . . . . . . . . . . . 23
1.8.3 Euler-Lagrange approach . . . . . . . . . . . . . . . . . . 25
1.8.4 Hamiltonian approach . . . . . . . . . . . . . . . . . . . . 26
1.9 Hamilton-Jacobi-Bellman (HJB) equation . . . . . . . . . . . . . 28
1.9.1 Finite horizon control . . . . . . . . . . . . . . . . . . . . 28
1.9.2 Principle of optimality, dynamic programming . . . . . . . 29
1.9.3 Innite horizon control . . . . . . . . . . . . . . . . . . . . 29
1.9.4 Application of HJB equation to linear time invariant systems 30
1.10 Pontryagin's principle . . . . . . . . . . . . . . . . . . . . . . . . 31
1.11 Hamiltonian over time . . . . . . . . . . . . . . . . . . . . . . . . 33
1.11.1 General result . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.11.2 Autonomous system without constraint on input . . . . . 33
1.11.3 Free final time . . . . . . . . . . . . . . . . . . . . . . . . 34
1.12 Bang-bang control . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.12.1 Pontryagin's principle application . . . . . . . . . . . . . . 35
1.12.2 Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.12.3 Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
1.13 Singular arc - Legendre-Clebsch condition . . . . . . . . . . . . . 42

2 Finite Horizon Linear Quadratic Regulator 43


2.1 Problem to be solved . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.2 Positive definite and positive semi-definite matrix . . . . . . . . . 44
2.3 Hamiltonian matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 45

2.4 Optimal control . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46


2.4.1 State vector expression . . . . . . . . . . . . . . . . . . . . 46
2.4.2 Lagrange multipliers for imposed final state . . . . . . . . 47
2.4.3 Lagrange multipliers for weighted final state . . . . . . . . 47
2.4.4 Limit values when final state weighting matrix increases . 48
2.4.5 Closed-loop block diagram . . . . . . . . . . . . . . . . . . 49
2.5 Riccati differential equation . . . . . . . . . . . . . . . . . . . . . 50
2.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.6.1 Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.6.2 Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.7 Second order necessary condition for optimality . . . . . . . . . . 53
2.8 Minimum cost achieved . . . . . . . . . . . . . . . . . . . . . . . 53
2.9 Application to minimum energy control problem . . . . . . . . . 54
2.9.1 Moving a linear system close to a final state with minimum
energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.9.2 Moving a linear system exactly to a final state with
minimum energy . . . . . . . . . . . . . . . . . . . . . . . 56
2.9.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.10 Finite horizon LQ regulator with cross-term in the performance
index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.11 Extension to nonlinear system affine in control . . . . . . . . . . 61

3 Infinite Horizon Linear Quadratic Regulator (LQR) 63


3.1 Problem to be solved . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.2 Stabilizability and detectability . . . . . . . . . . . . . . . . . . . 64
3.3 Algebraic Riccati equation . . . . . . . . . . . . . . . . . . . . . . 64
3.4 Extension to nonlinear system affine in control . . . . . . . . . . 66
3.5 Solving the algebraic Riccati equation . . . . . . . . . . . . . . . 68
3.5.1 Hamiltonian matrix based solution . . . . . . . . . . . . . 68
3.5.2 Proof of the results on the Hamiltonian matrix . . . . . . 69
3.5.3 Solving general algebraic Riccati and Lyapunov equations 70
3.6 Application to the optimal control of any scalar LTI plant . . . . 72
3.7 Hamiltonian matrix properties . . . . . . . . . . . . . . . . . . . . 74
3.8 Discrete time LQ regulator . . . . . . . . . . . . . . . . . . . . . 77
3.8.1 Finite horizon LQ regulator . . . . . . . . . . . . . . . . . 77
3.8.2 Finite horizon LQ regulator with zero terminal state . . . 78
3.8.3 Infinite horizon LQ regulator . . . . . . . . . . . . . . . . 79
3.9 Robustness property . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.9.1 Hsu-Chen theorem . . . . . . . . . . . . . . . . . . . . . . 80
3.9.2 Generalized (MIMO) Nyquist stability criterion . . . . . . 83
3.9.3 Kalman equality . . . . . . . . . . . . . . . . . . . . . . . 84
3.9.4 Robustness of Linear Quadratic Regulator . . . . . . . . . 86

4 Design methods 91
4.1 Symmetric Root Locus . . . . . . . . . . . . . . . . . . . . . . . . 91
4.1.1 Characteristic polynomials . . . . . . . . . . . . . . . . . 91
4.1.2 Root Locus reminder . . . . . . . . . . . . . . . . . . . . . 92

4.1.3 Chang-Letov design procedure . . . . . . . . . . . . . . . 94


4.1.4 Proof of the symmetric root locus result . . . . . . . . . . 96
4.2 Asymptotic properties of LQR applied to SISO plants . . . . . . 97
4.2.1 Closed-loop poles location . . . . . . . . . . . . . . . . . . 97
4.2.2 Shape of the magnitude of the loop gain . . . . . . . . . . 98
4.2.3 Weighting matrices selection . . . . . . . . . . . . . . . . . 100
4.2.4 Poles assignment in optimal regulator using root locus . . 101
4.3 Poles shifting in optimal regulator . . . . . . . . . . . . . . . . . 102
4.3.1 Mirror property . . . . . . . . . . . . . . . . . . . . . . . . 102
4.3.2 Reduced-order model . . . . . . . . . . . . . . . . . . . . . 105
4.3.3 Shifting one real eigenvalue . . . . . . . . . . . . . . . . . 105
4.3.4 Shifting a pair of complex conjugate eigenvalues . . . . . . 106
4.3.5 Sequential pole shifting via reduced-order models . . . . . 107
4.4 Frequency domain approach . . . . . . . . . . . . . . . . . . . . . 109
4.4.1 Non optimal pole assignment . . . . . . . . . . . . . . . . 109
4.4.2 Assignment of weighting matrices Q and R . . . . . . . . 110
4.5 Poles assignment in optimal regulator through matrix inequalities 113
4.6 Model matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.6.1 cross-term in the performance index . . . . . . . . . . . . 115
4.6.2 Implicit reference model . . . . . . . . . . . . . . . . . . . 116
4.7 Optimal output feedback . . . . . . . . . . . . . . . . . . . . . . . 117
4.7.1 Reformulation of the state feedback optimal control problem 117
4.7.2 Output feedback optimal control problem . . . . . . . . . 120
4.7.3 Solution of the output feedback optimal control problem . 121
4.7.4 Poles placement in a specified region . . . . . . . . . . . . 122
4.8 Frequency shaped LQ control . . . . . . . . . . . . . . . . . . . . 124
4.9 Optimal transient stabilization . . . . . . . . . . . . . . . . . . . 126

5 Linear Quadratic Tracker (LQT) 129


5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.2 Control with feedforward . . . . . . . . . . . . . . . . . . . . . . . 129
5.3 Finite horizon Linear Quadratic Tracker . . . . . . . . . . . . . . 130
5.4 Infinite horizon Linear Quadratic Tracker . . . . . . . . . . . . . 133
5.4.1 General result . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.4.2 Asymptotically stable linear reference model . . . . . . . . 133
5.4.3 Constant reference tracking . . . . . . . . . . . . . . . . . 134
5.5 Plant augmented with integrator . . . . . . . . . . . . . . . . . . 135
5.5.1 Integral augmentation . . . . . . . . . . . . . . . . . . . . 135
5.5.2 Proof of the cancellation of the steady-state error through
integral augmentation . . . . . . . . . . . . . . . . . . . . 137
5.6 Tracking with prefilter . . . . . . . . . . . . . . . . . . . . . . . . 139
5.6.1 Tracking without integral augmentation . . . . . . . . . . 139
5.6.2 Tracking with integral augmentation . . . . . . . . . . . . 142

6 Linear Quadratic Gaussian (LQG) regulator 145


6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
6.2 Luenberger observer . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.3 White noise through Linear Time Invariant (LTI) system . . . . . 147
6.3.1 Assumptions and definitions . . . . . . . . . . . . . . . . . 147
6.3.2 Mean and covariance matrix of the state vector . . . . . . 148
6.3.3 Autocorrelation function of the stationary output vector . 149
6.3.4 Proof of the expression of the autocorrelation function . . 153
6.4 Kalman-Bucy filter . . . . . . . . . . . . . . . . . . . . . . . . . . 154
6.4.1 Linear Quadratic Estimator . . . . . . . . . . . . . . . . . 154
6.4.2 Sketch of the proof . . . . . . . . . . . . . . . . . . . . . . 156
6.5 Duality principle . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
6.6 Separation principle . . . . . . . . . . . . . . . . . . . . . . . . . 158
6.7 Controller transfer function . . . . . . . . . . . . . . . . . . . . . 160
6.8 Loop Transfer Recovery . . . . . . . . . . . . . . . . . . . . . . . 162
6.8.1 Lack of guaranteed robustness of LQG design . . . . . . . 162
6.8.2 Doyle's seminal example . . . . . . . . . . . . . . . . . . . 162
6.8.3 Closed-loop eigenvalues and eigenvectors . . . . . . . . . . 164
6.8.4 Asymptotic behavior of Riccati equation . . . . . . . . . . 165
6.8.5 Loop Transfer Recovery (LTR) design . . . . . . . . . . . 166
6.9 Proof of the Loop Transfer Recovery condition . . . . . . . . . . 169
6.9.1 Loop transfer function with observer . . . . . . . . . . . . 169
6.9.2 Loop Transfer Recovery (LTR) condition . . . . . . . . . . 171
6.9.3 Setting the Loop Transfer Recovery design parameter . . 173
6.10 Robust control design . . . . . . . . . . . . . . . . . . . . . . . . 174
6.11 Sensor data fusion . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6.11.1 Complementary filter . . . . . . . . . . . . . . . . . . . . . 176
6.11.2 Kalman filter . . . . . . . . . . . . . . . . . . . . . . . . . 178
6.11.3 Relation between complementary filter and Kalman filter 179
6.12 Euler angles estimation . . . . . . . . . . . . . . . . . . . . . . . . 181
6.12.1 One dimensional attitude estimation . . . . . . . . . . . . 181
6.12.2 One dimensional complementary filter . . . . . . . . . . . 182
6.12.3 Direct Cosine Matrix (DCM) and kinematic relations . . . 184
6.12.4 Roll and pitch angles estimation from accelerometer
measurements . . . . . . . . . . . . . . . . . . . . . . . . . 185
6.12.5 Yaw angle estimation from magnetometer measurements . 187
6.12.6 Angular velocity from gyroscope measurements . . . . . . 187
6.12.7 Attitude and Heading Reference System (AHRS) based
on complementary filter . . . . . . . . . . . . . . . . . . . 187
Chapter 1

Overview of Pontryagin's
Minimum Principle

1.1 Introduction
Pontryagin's Minimum (or Maximum) Principle was formulated in 1956 by the Russian mathematician Lev Pontryagin (1908 - 1988) and his students1. Its initial application was dedicated to the maximization of the terminal speed of a rocket. The result was derived using ideas from the classical calculus of variations.
This chapter is devoted to the main results of optimal control theory which lead to conditions for optimality.

1.2 Variation
Optimization can be accomplished by using a generalization of the differential called the variation.
Let's consider the real scalar cost function J(x) of a vector x ∈ Rn. The cost function J(x) has a local minimum at x∗ if and only if, for all δx sufficiently small:

$$ J(x^* + \delta x) \ge J(x^*) $$ (1.1)

An equivalent statement is that:

$$ \Delta J(x^*, \delta x) = J(x^* + \delta x) - J(x^*) \ge 0 $$ (1.2)

The term ∆J(x∗, δx) is called the increment of J(x). The optimality condition can be found by expanding J(x∗ + δx) in a Taylor series around the extremum point x∗. When J(x) is a scalar function of multiple variables, the expansion of J(x) in the Taylor series involves the gradient and the Hessian of the cost function J(x):

− Assuming that J(x) is a differentiable function, the term dJ(x∗)/dx is the gradient of J(x) at x∗ ∈ Rn, which is the vector of Rn defined by:

$$ \frac{dJ(x^*)}{dx} = \nabla J(x^*) = \begin{bmatrix} \frac{\partial J(x)}{\partial x_1} \\ \vdots \\ \frac{\partial J(x)}{\partial x_n} \end{bmatrix}_{x=x^*} $$ (1.3)

− Assuming that J(x) is a twice differentiable function, the term d2J(x∗)/dx2 is the Hessian of J(x) at x∗ ∈ Rn, which is the symmetric n × n matrix defined by:

$$ \frac{d^2 J(x^*)}{dx^2} = \nabla^2 J(x^*) = \left[\frac{\partial^2 J(x)}{\partial x_i \partial x_j}\right]_{1 \le i,j \le n} = \begin{bmatrix} \frac{\partial^2 J(x)}{\partial x_1 \partial x_1} & \cdots & \frac{\partial^2 J(x)}{\partial x_1 \partial x_n} \\ \vdots & & \vdots \\ \frac{\partial^2 J(x)}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 J(x)}{\partial x_n \partial x_n} \end{bmatrix}_{x=x^*} $$ (1.4)

Expanding J(x∗ + δx) in a Taylor series around the point x∗ leads to the following expression, where HOT stands for Higher-Order Terms:

$$ J(x^* + \delta x) = J(x^*) + \delta x^T \nabla J(x^*) + \frac{1}{2}\,\delta x^T \nabla^2 J(x^*)\,\delta x + HOT $$ (1.5)

Thus:

$$ \Delta J(x^*, \delta x) = J(x^* + \delta x) - J(x^*) = \delta x^T \nabla J(x^*) + \frac{1}{2}\,\delta x^T \nabla^2 J(x^*)\,\delta x + HOT $$ (1.6)

When dealing with a functional (a real scalar function of functions), δx is called the variation of x, and the term in the increment ∆J(x∗, δx) which is linear in δx is called the variation of J and is denoted δJ(x∗). The variation of J(x) is a generalization of the differential and can be applied to the optimization of a functional. Equation (1.6) can be used to develop necessary conditions for optimality. Indeed, as δx approaches zero, the quadratic terms in δx as well as HOT become arbitrarily small compared to δx. As a consequence, a necessary condition for x∗ to be a local extremum of the cost function J is that the first variation of J (its gradient) at x∗ is zero:

$$ \delta J(x^*) = \nabla J(x^*) = 0 $$ (1.7)

A critical (or stationary) point x∗ is a point where δJ(x∗) = ∇J(x∗) = 0. Furthermore, the sign of the Hessian provides a sufficient condition for a local extremum. Let's write the Hessian ∇2J(x∗) at the critical point x∗ as follows:

$$ \nabla^2 J(x^*) = \begin{bmatrix} h_{11} & \cdots & h_{1n} \\ \vdots & & \vdots \\ h_{n1} & \cdots & h_{nn} \end{bmatrix} $$ (1.8)

− The sufficient condition for the critical point x∗ to be a local minimum is that the Hessian is positive definite, that is, that all the leading principal minor determinants are positive:

$$ \forall\ 1 \le k \le n,\ H_k > 0 \Leftrightarrow \begin{cases} H_1 = h_{11} > 0 \\ H_2 = \begin{vmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{vmatrix} > 0 \\ H_3 = \begin{vmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{vmatrix} > 0 \\ \text{and so on...} \end{cases} $$ (1.9)

− The sufficient condition for the critical point x∗ to be a local maximum is that the Hessian is negative definite, or equivalently that the opposite of the Hessian is positive definite:

$$ \forall\ 1 \le k \le n,\ (-1)^k H_k > 0 \Leftrightarrow \begin{cases} H_1 = h_{11} < 0 \\ H_2 = \begin{vmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{vmatrix} > 0 \\ H_3 = \begin{vmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{vmatrix} < 0 \\ \text{and so on...} \end{cases} $$ (1.10)

− If the Hessian has both positive and negative eigenvalues, then the critical point x∗ is a saddle point for the cost function J(x).

It should be emphasized that if the Hessian is positive semi-definite or negative semi-definite, or has null eigenvalues at a critical point x∗, then it cannot be concluded whether the critical point is a minimizer, a maximizer or a saddle point of the cost function J(x): the test is inconclusive.

1.3 Example
Find the local maxima/minima for the following cost function:

J(x) = 5 − (x1 − 2)2 − 2(x2 − 1)2 (1.11)

First let's compute the first variation of J, or equivalently its gradient:

$$ \frac{dJ(x)}{dx} = \nabla J(x) = \begin{bmatrix} \frac{\partial J(x)}{\partial x_1} \\ \frac{\partial J(x)}{\partial x_2} \end{bmatrix} = \begin{bmatrix} -2(x_1 - 2) \\ -4(x_2 - 1) \end{bmatrix} $$ (1.12)
dx2

A necessary condition for x∗ to be a local extremum is that the first variation of J at x∗ is zero for all δx:

$$ \delta J(x^*) = \nabla J(x^*) = 0 $$ (1.13)

As a consequence, the following point is a critical point:

$$ x^* = \begin{bmatrix} 2 \\ 1 \end{bmatrix} $$ (1.14)

Now, we compute the Hessian to conclude on the nature of this critical point:

$$ \nabla^2 J(x^*) = \begin{bmatrix} -2 & 0 \\ 0 & -4 \end{bmatrix} $$ (1.15)

As far as the Hessian is negative definite, we conclude that the critical point x∗ is a local maximum.
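As a quick numerical sanity check, the classification above can be reproduced with a few lines of code. The following sketch (an illustration assuming NumPy is available; it is not part of the original notes) evaluates the gradient (1.12) and the Hessian (1.15) at the critical point and classifies it from the signs of the Hessian eigenvalues:

import numpy as np

def grad_J(x):
    # gradient from (1.12)
    return np.array([-2.0 * (x[0] - 2.0), -4.0 * (x[1] - 1.0)])

def hess_J(x):
    # Hessian from (1.15); constant here since J is quadratic
    return np.array([[-2.0, 0.0], [0.0, -4.0]])

x_star = np.array([2.0, 1.0])
assert np.allclose(grad_J(x_star), 0.0)   # (1.13): critical point
eig = np.linalg.eigvalsh(hess_J(x_star))
if np.all(eig > 0):
    print("local minimum")
elif np.all(eig < 0):
    print("local maximum")                # printed for this example
else:
    print("saddle point or inconclusive")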

1.4 Lagrange multipliers


Optimal control problems which will be tackled involve the minimization of a cost function subject to constraints on the state vector and the control. The necessary condition given above is only applicable to unconstrained minimization problems; Lagrange multipliers provide a method of converting a constrained minimization problem into an unconstrained minimization problem of higher order. Optimization can then be performed using the above necessary condition. A constrained optimization problem is a problem of the form:
Maximize (or minimize) cost function J(x) subject to the condition g(x) = 0
The most popular technique to solve this constrained optimization problem is the Lagrange multiplier technique. Necessary conditions for optimality of J at a point x∗ are that x∗ satisfies g(x) = 0 and that the gradient of J is zero in all directions along the surface g(x) = 0; this condition is satisfied if the gradient of J is normal to the surface at x∗. As far as the gradient of g(x) is normal to the surface, including at x∗, this condition is satisfied if the gradient of J is parallel (that is, proportional) to the gradient of g(x) at x∗, or equivalently:

$$ \left(\frac{\partial J(x)}{\partial x} + \lambda^T \frac{\partial g(x)}{\partial x}\right)_{x=x^*} = 0 $$ (1.16)

It is worth noticing that the following relations hold:

$$ \begin{cases} J = a + b^T x = a + x^T b \Rightarrow \dfrac{\partial J}{\partial x} = \begin{bmatrix} \frac{\partial J}{\partial x_1} \\ \vdots \\ \frac{\partial J}{\partial x_n} \end{bmatrix} = b \\[2ex] J = a + \dfrac{b^T x}{\|x\|^n} \Rightarrow \dfrac{\partial J}{\partial x} = \dfrac{b}{\|x\|^n} - n\, x\, \dfrac{x^T b}{\|x\|^{n+2}} \end{cases} $$ (1.17)
As an illustration, consider the cost function J(x) = (x1 − 1)2 + (x2 − 2)2: this is the equation of a circle centered at (1, 2) with radius √J(x). It is clear that J(x) is minimal when (x1, x2) is situated at the center of the circle; in this case J(x∗) = 0. Nevertheless, if we impose on (x1, x2) to belong to the straight line defined by x2 − 2x1 − 6 = 0, then J(x) will be minimized as soon as the circle of radius √J(x) is tangent to the straight line, that is, if the gradient of J(x) is normal to the surface at x∗. Parameter λ is called the Lagrange multiplier and has the dimension of the number of constraints expressed through g(x). The necessary condition for optimality can be obtained as the solution of the following unconstrained optimization problem, where L(x, λ) is the Lagrange function:

$$ L(x, \lambda) = J(x) + \lambda^T g(x) $$ (1.18)

Setting to zero the gradient of the Lagrange function with respect to x leads to (1.16), whereas setting to zero the derivative of the Lagrange function with respect to the Lagrange multiplier λ leads to the constraint g(x) = 0. As a consequence, a necessary condition for x∗ to be a local extremum of the cost function J subject to the constraint g(x) = 0 is that the first variation of the Lagrange function (its gradient) at x∗ is zero:

$$ \left(\frac{\partial L(x, \lambda)}{\partial x}\right)_{x=x^*} = 0 \Leftrightarrow \left(\frac{\partial J(x)}{\partial x} + \lambda^T \frac{\partial g(x)}{\partial x}\right)_{x=x^*} = 0 $$ (1.19)

The bordered Hessian is the (n + m) × (n + m) symmetric matrix which is used for the second-derivative test. If there are m constraints represented by g(x) = 0, then there are m border columns at the top-right and m border rows at the bottom-left (the transpose of the top-right block), and the south-east corner of the bordered Hessian is an m × m block of zeros, represented by 0m×m. The bordered Hessian Hb(p) is defined by:

$$ H_b(p) = \begin{bmatrix} \frac{\partial^2 L(x)}{\partial x_1 \partial x_1} - p & \frac{\partial^2 L(x)}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 L(x)}{\partial x_1 \partial x_n} & \frac{\partial g_1(x)}{\partial x_1} & \cdots & \frac{\partial g_m(x)}{\partial x_1} \\ \frac{\partial^2 L(x)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x)}{\partial x_2 \partial x_2} - p & \cdots & \frac{\partial^2 L(x)}{\partial x_2 \partial x_n} & \frac{\partial g_1(x)}{\partial x_2} & \cdots & \frac{\partial g_m(x)}{\partial x_2} \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\ \frac{\partial^2 L(x)}{\partial x_n \partial x_1} & \frac{\partial^2 L(x)}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 L(x)}{\partial x_n \partial x_n} - p & \frac{\partial g_1(x)}{\partial x_n} & \cdots & \frac{\partial g_m(x)}{\partial x_n} \\ \frac{\partial g_1(x)}{\partial x_1} & \cdots & \cdots & \frac{\partial g_1(x)}{\partial x_n} & & & \\ \cdots & \cdots & \cdots & \cdots & & 0_{m \times m} & \\ \frac{\partial g_m(x)}{\partial x_1} & \cdots & \cdots & \frac{\partial g_m(x)}{\partial x_n} & & & \end{bmatrix}_{x=x^*} $$ (1.20)

The sufficient condition for the critical point x∗ to be an extremum is that the values of p obtained from det(Hb(p)) = 0 must all be of the same sign.

− If all the values of p are strictly negative, then it is a local maximum.

− If all the values of p are strictly positive, then it is a local minimum.

− However, if some values of p are zero or of a different sign, then the critical point x∗ is a saddle point.

1.5 Example
Find the local maxima/minima for the following cost function:

J(x) = x1 + 3x2 (1.21)



Subject to the constraint:

$$ g(x) = x_1^2 + x_2^2 - 10 = 0 $$ (1.22)

First let's compute the Lagrange function of this problem:

$$ L(x, \lambda) = J(x) + \lambda^T g(x) = x_1 + 3x_2 + \lambda\left(x_1^2 + x_2^2 - 10\right) $$ (1.23)

A necessary condition for x∗ to be a local extremum is that the first variation of the Lagrange function at x∗ is zero for all δx:

$$ \frac{\partial L(x^*, \lambda)}{\partial x} = \begin{bmatrix} 1 + 2\lambda x_1 \\ 3 + 2\lambda x_2 \end{bmatrix} = 0 \quad \text{s.t.} \quad x_1^2 + x_2^2 - 10 = 0 $$ (1.24)

As a consequence, the Lagrange multiplier λ shall be chosen as follows:

$$ \begin{cases} x_1 = -\frac{1}{2\lambda} \\ x_2 = -\frac{3}{2\lambda} \end{cases} \Rightarrow x_1^2 + x_2^2 - 10 = \frac{1}{4\lambda^2} + \frac{9}{4\lambda^2} - 10 = 0 \Leftrightarrow 10 - 40\lambda^2 = 0 \Leftrightarrow \lambda = \pm\frac{1}{2} $$ (1.25)
Using the values of the Lagrange multiplier within (1.24), we then obtain 2 critical points:

$$ \lambda = \frac{1}{2} \Rightarrow x_1^* = \begin{bmatrix} -1 \\ -3 \end{bmatrix} \quad \text{and} \quad \lambda = -\frac{1}{2} \Rightarrow x_2^* = \begin{bmatrix} 1 \\ 3 \end{bmatrix} $$ (1.26)

− For λ = 1/2 the bordered Hessian is:

$$ H_b(p) = \begin{bmatrix} 2\lambda - p & 0 & 2x_1 \\ 0 & 2\lambda - p & 2x_2 \\ 2x_1 & 2x_2 & 0 \end{bmatrix}_{x=x^*} = \begin{bmatrix} 1-p & 0 & -2 \\ 0 & 1-p & -6 \\ -2 & -6 & 0 \end{bmatrix} $$ (1.27)

Thus:

$$ \det(H_b(p)) = -40 + 40p $$ (1.28)

We conclude that the critical point (−1, −3) is a local minimum because det(Hb(p)) = 0 for p = +1, which is strictly positive.

− For λ = −1/2 the bordered Hessian is:

$$ H_b(p) = \begin{bmatrix} 2\lambda - p & 0 & 2x_1 \\ 0 & 2\lambda - p & 2x_2 \\ 2x_1 & 2x_2 & 0 \end{bmatrix}_{x=x^*} = \begin{bmatrix} -1-p & 0 & 2 \\ 0 & -1-p & 6 \\ 2 & 6 & 0 \end{bmatrix} $$ (1.29)

Thus:

$$ \det(H_b(p)) = 40 + 40p $$ (1.30)

We conclude that the critical point (+1, +3) is a local maximum because det(Hb(p)) = 0 for p = −1, which is strictly negative.
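The two conclusions can be checked numerically. The sketch below (an illustration assuming NumPy; not part of the original notes) builds the bordered Hessians (1.27) and (1.29) at each critical point and recovers the root p of det(Hb(p)) = 0, whose sign gives the nature of the extremum:

import numpy as np

def Hb(x, lam, p):
    # bordered Hessian (1.20) for n = 2 variables and m = 1 constraint
    return np.array([[2*lam - p, 0.0,       2*x[0]],
                     [0.0,       2*lam - p, 2*x[1]],
                     [2*x[0],    2*x[1],    0.0   ]])

for lam, x_star in [(0.5, np.array([-1.0, -3.0])),
                    (-0.5, np.array([1.0, 3.0]))]:
    # det(Hb(p)) is linear in p here: d(p) = d0 + (d1 - d0) p
    d0 = np.linalg.det(Hb(x_star, lam, 0.0))
    d1 = np.linalg.det(Hb(x_star, lam, 1.0))
    p_root = -d0 / (d1 - d0)
    kind = "minimum" if p_root > 0 else "maximum"
    print(x_star, "-> p =", round(p_root, 6), "-> local", kind)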

1.6 Euler-Lagrange equation


Historically, the Euler-Lagrange equation came with the study of the tautochrone (or isochrone curve) problem. Lagrange solved this problem in 1755 and sent the solution to Euler. Their correspondence ultimately led to the calculus of variations2.
The problem considered was to find the expression of x(t) which minimizes the following performance index J(x(t)), where F(x(t), ẋ(t)) is a real-valued twice continuous function:

$$ J(x(t)) = \int_0^{t_f} F(x(t), \dot{x}(t))\, dt $$ (1.31)

Furthermore the initial and final values of x(t) are imposed:

$$ \begin{cases} x(0) = x_0 \\ x(t_f) = x_f \end{cases} $$ (1.32)

Let x∗(t) be a candidate for the minimization of J(x(t)). In order to see whether x∗(t) is indeed an optimal solution, this candidate is perturbed by a small amount δx(t):

$$ \begin{cases} x(t) = x^*(t) + \delta x(t) \\ \dot{x}(t) = \dot{x}^*(t) + \delta\dot{x}(t) \end{cases} $$ (1.33)
The change δJ in the value of the performance index is obtained thanks to the calculus of variations:

$$ \delta J = \int_0^{t_f} \delta F(x(t), \dot{x}(t))\, dt = \int_0^{t_f} \left(\frac{\partial F}{\partial x}^T \delta x + \frac{\partial F}{\partial \dot{x}}^T \delta\dot{x}\right) dt $$ (1.34)

Integrating $\frac{\partial F}{\partial \dot{x}}^T \delta\dot{x}$ by parts leads to the following expression:

$$ \frac{d}{dt}\left(\frac{\partial F}{\partial \dot{x}}^T \delta x\right) = \left(\frac{d}{dt}\frac{\partial F}{\partial \dot{x}}^T\right)\delta x + \frac{\partial F}{\partial \dot{x}}^T \delta\dot{x} \ \Rightarrow\ \delta J = \int_0^{t_f}\left(\frac{\partial F}{\partial x}^T \delta x - \frac{d}{dt}\frac{\partial F}{\partial \dot{x}}^T \delta x\right) dt + \left[\frac{\partial F}{\partial \dot{x}}^T \delta x\right]_0^{t_f} $$ (1.35)

Because δx is a perturbation around the optimal state vector x∗(t), we shall set to zero the first variation δJ whatever the value of the variation δx:

$$ \delta J = 0 \quad \forall\ \delta x $$ (1.36)

This leads to the following necessary conditions for optimality:

$$ \begin{cases} \frac{\partial F}{\partial x}^T \delta x - \frac{d}{dt}\frac{\partial F}{\partial \dot{x}}^T \delta x = 0 \\ \left[\frac{\partial F}{\partial \dot{x}}^T \delta x\right]_0^{t_f} = 0 \end{cases} $$ (1.37)

2 https://en.wikipedia.org/wiki/Euler-Lagrange_equation

As far as the initial and final values of x(t) are imposed, no variation is permitted on δx:

$$ \begin{cases} x(0) = x_0 \\ x(t_f) = x_f \end{cases} \Rightarrow \begin{cases} \delta x(0) = 0 \\ \delta x(t_f) = 0 \end{cases} $$ (1.38)

On the other hand, it is worth noticing that if the final value were not imposed, we would have $\left.\frac{\partial F}{\partial \dot{x}}\right|_{t=t_f} = 0$.
Thus the first variation δJ of the functional cost reads:

$$ \delta J = \int_0^{t_f} \left(\frac{\partial F}{\partial x}^T \delta x - \frac{d}{dt}\frac{\partial F}{\partial \dot{x}}^T \delta x\right) dt $$ (1.39)

In order to set to zero the first variation δJ whatever the value of the variation δx, the following second-order partial differential equation has to be solved:

$$ \frac{\partial F}{\partial x}^T - \frac{d}{dt}\frac{\partial F}{\partial \dot{x}}^T = 0 $$ (1.40)

Or, by taking the transpose:

$$ \frac{d}{dt}\frac{\partial F}{\partial \dot{x}} - \frac{\partial F}{\partial x} = 0 $$ (1.41)

We retrieve the well-known Euler-Lagrange equation of classical mechanics. The Euler-Lagrange equation is a second-order Ordinary Differential Equation (ODE) that x shall satisfy to minimize $\int_0^{t_f} F(x(t), \dot{x}(t))\, dt$. The Euler-Lagrange equation is usually quite difficult to solve.
Nevertheless, because F(x(t), ẋ(t)) does not depend explicitly on time t, the Beltrami identity3 provides a first integral of the Euler-Lagrange equation. Denoting by C a constant, the first integral of the Euler-Lagrange equation reads as follows:

$$ \frac{d}{dt}\frac{\partial F}{\partial \dot{x}} - \frac{\partial F}{\partial x} = 0 \Leftrightarrow F - \frac{\partial F}{\partial \dot{x}}^T \dot{x} = C $$ (1.42)
Indeed, multiplying both sides of the Euler-Lagrange equation by ẋᵀ we get:

$$ \frac{d}{dt}\frac{\partial F}{\partial \dot{x}} - \frac{\partial F}{\partial x} = 0 \Rightarrow \dot{x}^T \frac{d}{dt}\frac{\partial F}{\partial \dot{x}} - \dot{x}^T \frac{\partial F}{\partial x} = 0 $$ (1.43)

Since F(x(t), ẋ(t)) does not depend explicitly on time t, we have:

$$ \frac{dF(x(t), \dot{x}(t))}{dt} = \frac{\partial F}{\partial x}^T \frac{\partial x}{\partial t} + \frac{\partial F}{\partial \dot{x}}^T \frac{\partial \dot{x}}{\partial t} = \dot{x}^T \frac{\partial F}{\partial x} + \frac{\partial F}{\partial \dot{x}}^T \frac{\partial \dot{x}}{\partial t} \ \Rightarrow\ \dot{x}^T \frac{\partial F}{\partial x} = \frac{dF}{dt} - \frac{\partial F}{\partial \dot{x}}^T \frac{\partial \dot{x}}{\partial t} $$ (1.44)

Using this expression of $\dot{x}^T \frac{\partial F}{\partial x}$ in (1.43) reads:

$$ \dot{x}^T \frac{d}{dt}\frac{\partial F}{\partial \dot{x}} - \left(\frac{dF}{dt} - \frac{\partial F}{\partial \dot{x}}^T \frac{\partial \dot{x}}{\partial t}\right) = 0 \ \Leftrightarrow\ \frac{d}{dt}\left(\frac{\partial F}{\partial \dot{x}}^T \dot{x} - F\right) = 0 $$ (1.45)

3 https://en.wikipedia.org/wiki/Beltrami_identity

Denoting by C a constant, the first integral of the Euler-Lagrange equation finally reads as the Beltrami identity:

$$ F - \frac{\partial F}{\partial \dot{x}}^T \dot{x} = C $$ (1.46)

Alternatively, the Euler-Lagrange equation could be transformed into a set of first-order Ordinary Differential Equations, which may be more convenient to manipulate, by introducing a control u(t) defined by ẋ(t) = u(t) and by using the Hamiltonian function H, as will be seen in the next sections.
Example 1.1. Let's find the shortest distance between two points P1 = (x1, y1) and P2 = (x2, y2) in the euclidean plane.
The length of the path between the two points is defined by:

$$ J(y(x)) = \int_{P_1}^{P_2} \sqrt{dx^2 + dy^2} = \int_{x_1}^{x_2} \sqrt{1 + (y'(x))^2}\, dx $$ (1.47)

For that example F(y(x), y′(x)) reads:

$$ F\left(y(x), y'(x)\right) = \sqrt{1 + \left(\frac{dy(x)}{dx}\right)^2} = \sqrt{1 + (y'(x))^2} $$ (1.48)

The initial and final values of y(x) are imposed as follows:

$$ \begin{cases} y(x_1) = y_1 \\ y(x_2) = y_2 \end{cases} $$ (1.49)

The Euler-Lagrange equation for this example reads:

$$ \frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y} = 0 \Leftrightarrow \frac{d}{dx}\left(\frac{y'(x)}{\sqrt{1 + (y'(x))^2}}\right) = 0 $$ (1.50)

From the preceding relation it is clear that, denoting by c1 a constant, y′(x) shall satisfy the following first-order differential equation:

$$ \frac{y'(x)}{\sqrt{1 + (y'(x))^2}} = c_1 \Rightarrow (y'(x))^2 = c_1^2\left(1 + (y'(x))^2\right) \Rightarrow (y'(x))^2 = \frac{c_1^2}{1 - c_1^2} \Rightarrow y'(x) = a = \text{constant} $$ (1.51)

Alternatively, the Beltrami identity reads as follows:

$$ \begin{cases} F(y, y') = \sqrt{1 + (y')^2} \\ F - \frac{\partial F}{\partial y'}\, y' = C \end{cases} \Rightarrow \sqrt{1 + (y')^2} - \frac{(y')^2}{\sqrt{1 + (y')^2}} = C \Rightarrow 1 = C\sqrt{1 + (y')^2} \Rightarrow y'(x) = a = \text{constant} $$ (1.52)

Thus, the shortest distance between two fixed points in the euclidean plane is a curve with constant slope, that is, a straight line:

$$ y(x) = a\, x + b $$ (1.53)

With initial and final values imposed on y(x), we finally get for y(x) the Lagrange polynomial of degree 1:

$$ \begin{cases} y(x_1) = y_1 \\ y(x_2) = y_2 \end{cases} \Rightarrow y(x) = y_1\,\frac{x - x_2}{x_1 - x_2} + y_2\,\frac{x - x_1}{x_2 - x_1} $$ (1.54)

1.7 Fundamentals of optimal control theory


1.7.1 Problem to be solved
We first consider optimal control problems for general nonlinear time-invariant systems of the form:

$$ \begin{cases} \dot{x} = f(x, u) \\ x(0) = x_0 \end{cases} $$ (1.55)

where x ∈ Rn and u ∈ Rm are the state variables and control inputs, respectively, f(x, u) is a continuous nonlinear function and x0 the initial condition. The goal is to find a control u that minimizes the following performance index:

$$ J(u(t)) = G(x(t_f)) + \int_0^{t_f} F(x(t), u(t))\, dt $$ (1.56)

where:

− t is the current time and tf the final time;

− J(u(t)) is the integral cost function;

− F(x(t), u(t)) is the scalar running cost function;

− G(x(tf)) is the scalar terminal cost function.

Note that the state equation serves as a constraint for the optimization of the performance index J(u(t)). In addition, notice that the use of function G(x(tf)) is optional; indeed, if the final state x(tf) is imposed then there is no need to insert the expression G(x(tf)) in the cost to be minimized.

1.7.2 Bolza, Mayer and Lagrange problems


The problem defined above is known as the Bolza problem. In the special case where F(x(t), u(t)) = 0 the problem is known as the Mayer problem; on the other hand, if G(x(tf)) = 0 the problem is known as the Lagrange problem.
The Bolza problem is equivalent to the Lagrange problem and in fact leads to it with the following change of variable:

$$ \begin{cases} J_1(u(t)) = \int_0^{t_f} \left(F(x(t), u(t)) + x_{n+1}(t)\right) dt \\ \dot{x}_{n+1}(t) = 0 \\ x_{n+1} = \frac{G(x(t_f))}{t_f} \ \forall t \end{cases} $$ (1.57)

It also leads to the Mayer problem if one sets:

$$ \begin{cases} J_2(u(t)) = G(x(t_f)) + x_0(t_f) \\ \dot{x}_0(t) = F(x(t), u(t)) \\ x_0(0) = 0 \end{cases} $$ (1.58)

1.7.3 First order necessary conditions


The optimal control problem is then a constrained optimization problem, with the cost being a functional of u(t) and the state equation providing the constraint equations. This optimal control problem can be converted to an unconstrained optimization problem of higher dimension by the use of Lagrange multipliers. An augmented performance index is then constructed by adding a vector of Lagrange multipliers λ times each constraint imposed by the differential equations driving the dynamics of the plant; these constraints are added to the performance index by the addition of an integral to form the augmented performance index Ja:

$$ J_a(u(t)) = G(x(t_f)) + \int_0^{t_f} \left(F(x(t), u(t)) + \lambda^T(t)\left(f(x, u) - \dot{x}\right)\right) dt $$ (1.59)

Let u∗(t) be a candidate for the optimal input vector and let the corresponding state vector be x∗(t):

$$ \dot{x}^*(t) = f(x^*(t), u^*(t)) $$ (1.60)

In order to see whether u∗(t) is indeed an optimal solution, this candidate optimal input is perturbed by a small amount δu, which leads to a perturbation δx in the optimal state vector x∗(t):

$$ \begin{cases} u(t) = u^*(t) + \delta u(t) \\ x(t) = x^*(t) + \delta x(t) \end{cases} $$ (1.61)

Assuming that the final time tf is known, the change δJa in the value of the augmented performance index is obtained thanks to the calculus of variations4:

$$ \begin{aligned} \delta J_a &= \frac{\partial G(x(t_f))}{\partial x(t_f)}^T \delta x(t_f) + \int_0^{t_f} \left(\frac{\partial F}{\partial x}^T \delta x + \frac{\partial F}{\partial u}^T \delta u + \lambda^T(t)\left(\frac{\partial f}{\partial x}\delta x + \frac{\partial f}{\partial u}\delta u - \frac{d\,\delta x}{dt}\right)\right) dt \\ &= \frac{\partial G(x(t_f))}{\partial x(t_f)}^T \delta x(t_f) + \int_0^{t_f} \left(\left(\frac{\partial F}{\partial x}^T + \lambda^T(t)\frac{\partial f}{\partial x}\right)\delta x + \left(\frac{\partial F}{\partial u}^T + \lambda^T(t)\frac{\partial f}{\partial u}\right)\delta u - \lambda^T(t)\frac{d\,\delta x}{dt}\right) dt \end{aligned} $$ (1.62)

In the preceding equation:

− $\frac{\partial G(x(t_f))}{\partial x(t_f)}^T$, $\frac{\partial F}{\partial u}^T$ and $\frac{\partial F}{\partial x}^T$ are row vectors;
4 Ferguson J., Brief Survey of the History of the Calculus of Variations and its Applications (2004), arXiv:math/0402357

− $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial u}$ are matrices;

− $\frac{\partial f}{\partial x}\delta x$, $\frac{\partial f}{\partial u}\delta u$ and $\frac{d\,\delta x}{dt}$ are column vectors.

Then we introduce the functional H, known as the Hamiltonian function, which is defined as follows:

$$ H(x, u, \lambda) = F(x, u) + \lambda^T(t)\, f(x, u) $$ (1.63)

Then:

$$ \begin{cases} \frac{\partial H}{\partial x}^T = \frac{\partial F}{\partial x}^T + \lambda^T(t)\frac{\partial f}{\partial x} \\ \frac{\partial H}{\partial u}^T = \frac{\partial F}{\partial u}^T + \lambda^T(t)\frac{\partial f}{\partial u} \end{cases} $$ (1.64)

Equation (1.62) becomes:

$$ \delta J_a = \frac{\partial G(x(t_f))}{\partial x(t_f)}^T \delta x(t_f) + \int_0^{t_f} \left(\frac{\partial H}{\partial x}^T \delta x + \frac{\partial H}{\partial u}^T \delta u - \lambda^T(t)\frac{d\,\delta x}{dt}\right) dt $$ (1.65)

Let's concentrate on the last term within the integral, which we integrate by parts:

$$ \int_0^{t_f} \lambda^T(t)\frac{d\,\delta x}{dt}\, dt = \left[\lambda^T(t)\delta x\right]_0^{t_f} - \int_0^{t_f} \dot{\lambda}^T(t)\delta x\, dt = \lambda^T(t_f)\delta x(t_f) - \lambda^T(0)\delta x(0) - \int_0^{t_f} \dot{\lambda}^T(t)\delta x\, dt $$ (1.66)

As far as the initial state is imposed, the variation of the initial condition is null; consequently we have δx(0) = 0 and:

$$ \int_0^{t_f} \lambda^T(t)\frac{d\,\delta x}{dt}\, dt = \lambda^T(t_f)\delta x(t_f) - \int_0^{t_f} \dot{\lambda}^T(t)\delta x\, dt $$ (1.67)

Using (1.67) within (1.65) leads to the following expression for the first variation of the augmented functional cost:

$$ \delta J_a = \left(\frac{\partial G(x(t_f))}{\partial x(t_f)}^T - \lambda^T(t_f)\right)\delta x(t_f) + \int_0^{t_f} \left(\frac{\partial H}{\partial u}^T \delta u + \left(\frac{\partial H}{\partial x}^T + \dot{\lambda}^T(t)\right)\delta x\right) dt $$ (1.68)

In order to set the first variation of the augmented functional cost δJa to zero, the time-dependent Lagrange multipliers λ(t), which are also called costate functions, are chosen as follows:

$$ \dot{\lambda}^T(t) + \frac{\partial H}{\partial x}^T = 0 \Leftrightarrow \dot{\lambda}(t) = -\frac{\partial H}{\partial x} $$ (1.69)

This equation is called the adjoint equation. As far as it is a differential equation, we need to know the value of λ(t) at a specific value of time t to be able to compute its solution (also called its trajectory):

− Assuming that the final value x(tf) is specified to be xf, then the variation δx(tf) in (1.68) is zero and λ(tf) is set such that x(tf) = xf.

− Assuming that the final value x(tf) is not specified, then the variation δx(tf) in (1.68) is not equal to zero and the value of λ(tf) is set by imposing that the following difference vanishes at final time tf:

$$ \frac{\partial G(x(t_f))}{\partial x(t_f)}^T - \lambda^T(t_f) = 0 \Leftrightarrow \lambda(t_f) = \frac{\partial G(x(t_f))}{\partial x(t_f)} $$ (1.70)

This is the boundary condition, also known as the transversality condition, which sets the final value of the Lagrange multipliers.

Hence in both situations the first variation of the augmented functional cost (1.68) can be written as:

$$ \delta J_a = \int_0^{t_f} \frac{\partial H}{\partial u}^T \delta u\, dt $$ (1.71)

Moreover, if there is no constraint on input u(t), then δu is free and the first variation of the augmented functional cost δJa in (1.71) is set to zero through the following necessary condition for optimality:

$$ \delta J_a = 0 \Rightarrow \frac{\partial H}{\partial u}^T = 0 \Leftrightarrow \frac{\partial H}{\partial u} = 0 $$ (1.72)

1.8 Example: brachistochrone problem


1.8.1 Problem overview
A classic optimal control problem is the brachistochrone problem: it consists in computing the curve of fastest descent for a point of mass m which slides without friction and with constant gravitational acceleration g to a fixed end point in the shortest time5.
The control parameter is the slope γ(t) of the curve. Variable y(t) is the horizontal position of the point, z(t) its vertical position in the down direction and v(t) its velocity.

1.8.2 System dynamics


First let's focus on the dynamics of the system using Lagrangian mechanics. Let q be the vector of generalized coordinates. We choose:

$$ q := \begin{bmatrix} q_1(t) \\ q_2(t) \end{bmatrix} = \begin{bmatrix} y(t) \\ z(t) \end{bmatrix} $$ (1.73)

5 https://apmonitor.com/wiki/index.php/Apps/BrachistochroneProblem

The kinetic energy T(q, q̇) and potential energy V(q) read as follows (remember that the vertical position is oriented downward):

$$ \begin{cases} T(q, \dot{q}) = \frac{1}{2} m \left(\dot{y}(t)^2 + \dot{z}(t)^2\right) \\ V(q) = -m\, g\, z(t) \end{cases} $$ (1.74)

The Lagrangian L (for classical mechanics) is defined as the difference between kinetic and potential energy:

$$ L = T(q, \dot{q}) - V(q) = \frac{1}{2} m \left(\dot{y}(t)^2 + \dot{z}(t)^2\right) + m\, g\, z(t) $$ (1.75)
The dynamics of the system is then obtained by applying the Euler-Lagrange equation:

$$ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0 \Rightarrow \begin{cases} \ddot{y}(t) = 0 \\ \ddot{z}(t) = g \end{cases} $$ (1.76)
Now we introduce the following kinematic relations related to the velocity v(t) of the point and to the slope γ(t) of the curve on which the point slides:

$$ \begin{cases} \dot{y}(t) = v(t)\cos(\gamma(t)) \\ \dot{z}(t) = v(t)\sin(\gamma(t)) \end{cases} $$ (1.77)

Taking the time derivative of the square of the velocity v(t) and using relations (1.76), we get the following expression of the time derivative of the velocity v(t):

$$ v(t)^2 = \dot{y}(t)^2 + \dot{z}(t)^2 \Rightarrow v(t)\dot{v}(t) = \dot{y}(t)\ddot{y}(t) + \dot{z}(t)\ddot{z}(t) = v(t)\sin(\gamma(t))\, g \Rightarrow \dot{v}(t) = g\sin(\gamma(t)) $$ (1.78)

Finally the dynamics of the system is of dimension 3 and reads as follows:

$$ \begin{cases} \dot{y}(t) = v(t)\cos(\gamma(t)) \\ \dot{z}(t) = v(t)\sin(\gamma(t)) \\ \dot{v}(t) = g\sin(\gamma(t)) \end{cases} $$ (1.79)

In order to reduce the size of the system, it is worth noticing that ż(t) and v̇(t) both depend on sin(γ(t)). So we can write:

$$ \dot{v}(t) = g\,\frac{\dot{z}(t)}{v(t)} \Leftrightarrow v(t)\dot{v}(t) = g\,\dot{z}(t) $$ (1.80)

That is, after integration:

$$ \frac{1}{2}v(t)^2 - \frac{1}{2}v(0)^2 = g\, z(t) - g\, z(0) \Leftrightarrow v(t) = \sqrt{2g\left(z(t) - z(0)\right) + v(0)^2} $$ (1.81)

Then the dynamics of the system is reduced to dimension 2:

$$ \begin{cases} \dot{y}(t) = \cos(\gamma(t))\sqrt{2g\, z + l_0} \\ \dot{z}(t) = \sin(\gamma(t))\sqrt{2g\, z + l_0} \\ l_0 = v(0)^2 - 2g\, z(0) \end{cases} $$ (1.82)


Constant l0 depends on the initial conditions v(0) and z(0).
The dynamics of the system is reduced one step further through the use of the infinitesimal element of curvilinear abscissa ds. Indeed, on one side we have:

$$ ds = \sqrt{dy^2 + dz^2} = \sqrt{1 + (z')^2}\, dy $$ (1.83)

where:

$$ z' := \frac{dz}{dy} $$ (1.84)
On the other side, from (1.82) we have:

$$ ds = \sqrt{dy^2 + dz^2} = \sqrt{\dot{y}^2 + \dot{z}^2}\, dt = \sqrt{2g\, z + l_0}\, dt $$ (1.85)
Thus by equating (1.83) and (1.85) we get the following relation between dy and dt:

$$ \sqrt{2g\, z + l_0}\, dt = \sqrt{1 + (z')^2}\, dy \Rightarrow dt = \sqrt{\frac{1 + (z')^2}{2g\, z + l_0}}\, dy $$ (1.86)
Finally the system is reduced to dimension 1 through the following relation:

$$ t' := \frac{dt}{dy} = \sqrt{\frac{1 + (z')^2}{2g\, z + l_0}} $$ (1.87)

1.8.3 Euler-Lagrange approach


Using the Euler-Lagrange formalism, the optimal control problem can be formulated as follows: find z(y) which minimizes

$$ t_f = \int_0^{t_f} dt = \int_0^{y_f} F(z, z')\, dy $$ (1.88)

According to (1.87), the functional F(z, z′) to be minimized reads:

$$ F(z, z') = \sqrt{\frac{1 + (z')^2}{2g\, z + l_0}} $$ (1.89)
Then we have to find z(y) which solves the Euler-Lagrange equation:

$$ \frac{d}{dy}\frac{\partial F}{\partial z'} - \frac{\partial F}{\partial z} = 0 $$ (1.90)
Using the Beltrami identity, the first integral of the Euler-Lagrange equation reads as follows, where C is a constant:

$$ F - \frac{\partial F}{\partial z'}\, z' = C \Leftrightarrow \sqrt{\frac{1 + (z')^2}{2g\, z + l_0}} - \frac{1}{\sqrt{2g\, z + l_0}}\,\frac{(z')^2}{\sqrt{1 + (z')^2}} = C $$ (1.91)

Multiplying both members by $\sqrt{(2g\, z + l_0)\left(1 + (z')^2\right)}$ and simplifying, we get:

$$ C = \frac{1}{\sqrt{(2g\, z + l_0)\left(1 + (z')^2\right)}} $$ (1.92)

We thus obtain the following differential equation:

$$ (2g\, z + l_0)\left(1 + (z')^2\right) = \frac{1}{C^2} \Leftrightarrow 2g\left(z + \frac{l_0}{2g}\right)\left(1 + (z')^2\right) = \frac{1}{C^2} $$ (1.93)

The following reduced height zr(t) is introduced in order to normalize the solution:

$$ z_r := z + \frac{l_0}{2g} \Rightarrow dz_r = dz $$ (1.94)

We finally get:

$$ z_r\left(1 + (z_r')^2\right) = \frac{1}{2g\, C^2} \quad \text{where} \quad z_r' = \frac{dz_r}{dy} $$ (1.95)

The solution of this differential equation is the cycloid curve. The parametric expression of the cycloid curve is the following, where parameter θ varies from 0 to θf:

$$ \begin{cases} y = R(C)\left(\theta - \sin(\theta)\right) \\ z_r = R(C)\left(1 - \cos(\theta)\right) \end{cases} \quad \text{where} \quad R(C) := \frac{1}{4g\, C^2} $$ (1.96)

The cycloid curve corresponds to the trajectory of a point on a circle of radius R(C) rolling along a straight line.
The values of C and θf shall then be chosen such that the final conditions on y and zr are fulfilled:

$$ \begin{cases} y(t_f) = R(C)\left(\theta_f - \sin(\theta_f)\right) \\ z_r(t_f) = R(C)\left(1 - \cos(\theta_f)\right) \end{cases} $$ (1.97)
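In practice C and θf are found numerically from (1.97). Here is a minimal sketch (assuming SciPy; the terminal values, and the start at rest from the origin so that l0 = 0, are illustrative assumptions):

import numpy as np
from scipy.optimize import fsolve

g = 9.81
y_f, z_f = 2.0, 1.0             # imposed final conditions (illustrative)

def residuals(p):
    R, theta_f = p
    return [R * (theta_f - np.sin(theta_f)) - y_f,  # y(tf) in (1.97)
            R * (1.0 - np.cos(theta_f)) - z_f]      # zr(tf) in (1.97)

R, theta_f = fsolve(residuals, [1.0, 2.0])
C = 1.0 / np.sqrt(4.0 * g * R)  # invert R(C) = 1/(4 g C^2) from (1.96)
print(R, theta_f, C)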

1.8.4 Hamiltonian approach


We will use z′ := u as the control variable. The optimal control problem can be formulated as follows: find u(y) which minimizes

$$ J(u) = \int_0^{t_f} dt = \int_0^{y_f} \sqrt{\frac{1 + u^2}{2g\, z(y) + l_0}}\, dy $$ (1.98)

under the following constraint: z′ = u

The Hamiltonian function H reads:

$$ H(x, u, \lambda) = F(x, u) + \lambda^T(t) f(x, u) = \sqrt{\frac{1 + u^2}{2g\, z + l_0}} + \lambda_z\, u $$ (1.99)

Because there is no constraint on control u, the necessary conditions for optimality read as follows:

$$ \begin{cases} \frac{\partial H}{\partial u} = 0 \\ \frac{\partial H}{\partial z} = -\lambda_z' \end{cases} \Rightarrow \begin{cases} \frac{1}{\sqrt{2g\, z + l_0}}\,\frac{u}{\sqrt{1 + u^2}} + \lambda_z = 0 \\ -g\sqrt{1 + u^2}\,(2g\, z + l_0)^{-3/2} = -\lambda_z' \end{cases} $$ (1.100)

Then we have to find the expression of control u as a function of z and λz and solve the differential equations involving z and λz:

$$ \begin{cases} z' = u(z, \lambda_z) \\ \lambda_z' = g\sqrt{1 + u(z, \lambda_z)^2}\,(2g\, z + l_0)^{-3/2} \end{cases} $$ (1.101)

This could be a tricky task, but let's try it! First, from the first equation of (1.100) we get the expression of 1 + u² as a function of z and λz:

$$ \frac{u}{\sqrt{1 + u^2}} = -\lambda_z\sqrt{2g\, z + l_0} \Rightarrow 1 + u^2 = \frac{1}{1 - \lambda_z^2\,(2g\, z + l_0)} $$ (1.102)

Using this expression in the second equation of (1.100), we get the following expression of λz′:

$$ \lambda_z' = g\sqrt{1 + u^2}\,(2g\, z + l_0)^{-3/2} = g\sqrt{\frac{1}{1 - \lambda_z^2\,(2g\, z + l_0)}}\,(2g\, z + l_0)^{-3/2} $$ (1.103)

As far as the differential equation involving z is concerned, we use the expression of u² to get the following expression of (z′)²:

$$ z' = u \Rightarrow (z')^2 = u^2 = \frac{1}{1 - \lambda_z^2\,(2g\, z + l_0)} - 1 = \frac{\lambda_z^2\,(2g\, z + l_0)}{1 - \lambda_z^2\,(2g\, z + l_0)} $$ (1.104)

From the first equation of (1.100), that is $\frac{1}{\sqrt{2g\, z + l_0}}\frac{u}{\sqrt{1+u^2}} + \lambda_z = 0$, it is clear that u = z′ and λz have opposite signs. Thus we get:

$$ z' = -\lambda_z\sqrt{\frac{2g\, z + l_0}{1 - \lambda_z^2\,(2g\, z + l_0)}} = -\lambda_z\sqrt{2g\, z + l_0}\,\sqrt{\frac{1}{1 - \lambda_z^2\,(2g\, z + l_0)}} $$ (1.105)
Then we get the expression of $\sqrt{\frac{1}{1 - \lambda_z^2\,(2g\, z + l_0)}}$, which we insert in (1.103). We get:

$$ \sqrt{\frac{1}{1 - \lambda_z^2\,(2g\, z + l_0)}} = -\frac{z'}{\lambda_z\sqrt{2g\, z + l_0}} \Rightarrow \lambda_z' = -g\,\frac{z'}{\lambda_z\sqrt{2g\, z + l_0}}\,(2g\, z + l_0)^{-3/2} = -\frac{g}{\lambda_z}\,\frac{z'}{(2g\, z + l_0)^2} $$ (1.106)

We finally get:

$$ \lambda_z'\,\lambda_z = -\frac{g\, z'}{(2g\, z + l_0)^2} $$ (1.107)

Thus the first integral of this differential equation is the following, where C1 denotes a constant:

$$ \lambda_z^2 = \frac{1}{2g\, z + l_0} + C_1 $$ (1.108)
Or, equivalently:

$$ \lambda_z^2\,(2g\, z + l_0) = 1 + C_1\,(2g\, z + l_0) $$ (1.109)



Then this result is used in (1.102) to get the following relation:

$$ 1 + u^2 = \frac{1}{1 - \lambda_z^2\,(2g\, z + l_0)} = \frac{-1}{C_1\,(2g\, z + l_0)} \Rightarrow (2g\, z + l_0)\left(1 + u^2\right) = -\frac{1}{C_1} $$ (1.110)

Having in mind that z′ = u, we retrieve the first integral (1.93) which had been obtained through the Beltrami identity. Then the resolution process is similar to what has been done in the previous section.

1.9 Hamilton-Jacobi-Bellman (HJB) equation


1.9.1 Finite horizon control
Let J∗(x, t) be the optimal cost-to-go function between t and tf:

$$ J^*(x, t) = \min_{u(t)\,\in\,U} \int_t^{t_f} F(x(t), u(t))\, dt $$ (1.111)

The Hamilton-Jacobi-Bellman equation related to the optimal control problem (1.56) under the constraint (1.55) is the following first-order partial derivative equation6:

$$ -\frac{\partial J^*(x, t)}{\partial t} = \min_{u(t)\,\in\,U}\left(F(x, u) + \frac{\partial J^*(x, t)}{\partial x}^T f(x, u)\right) $$ (1.112)

or, equivalently:

$$ -\frac{\partial J^*(x, t)}{\partial t} = H^*\left(\frac{\partial J^*(x, t)}{\partial x}, x(t)\right) $$ (1.113)

where:

$$ H^*(\lambda(t), x(t)) = \min_{u(t)\,\in\,U}\left(F(x, u) + \frac{\partial J^*(x, t)}{\partial x}^T f(x, u)\right) $$ (1.114)

For the time-dependent case, the terminal condition on the optimal cost-to-go function solution of (1.112) reads:

$$ J^*(x, t_f) = G(x(t_f)) $$ (1.115)

It is worth noticing that the Lagrange multiplier λ(t) represents the partial derivative with respect to the state of the optimal cost-to-go function7:

$$ \lambda(t) = \frac{\partial J^*(x, t)}{\partial x} $$ (1.116)
6 da Silva J., de Sousa J., Dynamic Programming Techniques for Feedback Control, Proceedings of the 18th World Congress, Milano (Italy), August 28 - September 2, 2011
7 Alazard D., Optimal Control & Guidance: From Dynamic Programming to Pontryagin's Minimum Principle, lecture notes

1.9.2 Principle of optimality, dynamic programming


The preceding results lead to the so-called dynamic programming approach
which has been introduced by Bellman8 in 1957. This is a very powerful
approach which encompasses both necessary and sucient conditions for
optimality. Contrary to the Lagrange multipliers approach, the dynamic
programming solves the constrained optimal problem directly. Behind the
dynamic programming is the principle of optimality, which states that from
any point on an optimal state space trajectory, the remaining trajectory is
optimal for the corresponding problem initiated at that point. This result can
be quite easily applied for discrete time system but for continuous time system
it involves to nd the solution of a partial derivative equation which may be
dicult in practice.
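As an illustration of how easily the principle of optimality applies in discrete time, the sketch below (assuming NumPy; the discretized double integrator and the weights are illustrative choices, anticipating the discrete-time LQ regulator of a later chapter) performs Bellman's backward recursion for a discrete-time LQ problem. Starting from the terminal cost P_N = Qf, each backward step keeps the cost-to-go quadratic and reduces to the discrete Riccati recursion:

import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])   # discretized double integrator
B = np.array([[0.005], [0.1]])
Q = np.eye(2); R = np.array([[1.0]]); Qf = 10.0 * np.eye(2)
N = 50

P = Qf                                    # terminal cost-to-go
gains = []
for _ in range(N):                        # backward in time
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # u_k = -K_k x_k
    P = Q + A.T @ P @ (A - B @ K)         # updated quadratic cost-to-go
    gains.append(K)
gains.reverse()                           # gains[k] applies at stage k
print(gains[0])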

1.9.3 Infinite horizon control

For the infinite horizon control problem, the problem is to find the control u(t) which minimizes the following cost-to-go function:

$$ J(x) = \int_0^\infty F(x(t), u(t))\, dt $$ (1.117)

under the following nonlinear time-invariant dynamics of the form:

$$ \begin{cases} \dot{x} = f(x, u) \\ x(0) = x_0 \end{cases} $$ (1.118)
Then, denoting by J∗(x) the optimal cost function (which now no longer depends on time t), the Hamilton-Jacobi-Bellman equation related to this optimal control problem reads:

$$ 0 = \min_{u(t)\,\in\,U}\left(F(x, u) + \frac{\partial J^*(x)}{\partial x}^T f(x, u)\right) $$ (1.119)

Lower bounds on the optimal cost are obtained by integrating the corresponding inequality:

$$ 0 \le F(x, u) + \frac{\partial J(x)}{\partial x}^T f(x, u) \quad \forall\ x, u $$ (1.120)
It is worth noticing that:

$$ \int_0^\infty \frac{\partial J(x)}{\partial x}^T f(x, u)\, dt = \int_0^\infty \frac{\partial J(x)}{\partial x}^T \dot{x}(t)\, dt = \int_{x(0)}^{x(\infty)} \frac{\partial J(x)}{\partial x}^T dx = J(x(\infty)) - J(x(0)) $$ (1.121)
Thus, assuming that x(∞) = 0, we get:

$$ J(x(0)) - J(0) = -\int_0^\infty \frac{\partial J(x)}{\partial x}^T f(x, u)\, dt \le \int_0^\infty F(x, u)\, dt $$ (1.122)

8 Bellman R., Dynamic Programming, Princeton University Press, 1957

Moreover, the optimal cost J ∗ (x) has a decay rate given by −F (x, u∗ ), which
is negative. Thus J ∗ (x) may serve as a Lyapunov function to prove that the
optimal control law is stabilizing9 .

1.9.4 Application of the HJB equation to linear time-invariant systems

We consider in this section linear time-invariant systems, where x(t) is the state vector and u(t) is the control vector of dimension m. Furthermore, we assume that the cost-to-go function J(x, t) to be minimized is quadratic:

$$ \begin{cases} \dot{x}(t) = Ax(t) + Bu(t) \\ J(x, t) = \int_t^{t_f} \left(x^T(t)Qx(t) + u^T(t)Ru(t)\right) dt \end{cases} $$ (1.123)


where:

$$ \begin{cases} Q = Q^T \ge 0 \\ R = R^T > 0 \end{cases} $$ (1.124)
Assuming that the final state at t = tf is set to zero, a candidate solution J∗(x, t) of the Hamilton-Jacobi-Bellman (HJB) partial differential equation is the following quadratic function:

$$ J^*(x, t) := x^T P(t) x \quad \text{where} \quad P(t) = P^T(t) \ge 0 $$ (1.125)

Thus:

$$ \begin{cases} \frac{\partial J^*(x,t)}{\partial t} = x^T \dot{P}(t) x \\ \frac{\partial J^*(x,t)}{\partial x} = 2P(t)x(t) \end{cases} $$ (1.126)
Finally, assuming unconstrained control, that is u(t) ∈ Rm, the Hamilton-Jacobi-Bellman (HJB) equation (1.112) reads as follows:

$$ -x^T \dot{P}(t) x = \min_{u(t)\,\in\,\mathbb{R}^m}\left(F(x, u) + \frac{\partial J^*(x)}{\partial x}^T f(x, u)\right) = \min_{u(t)\,\in\,\mathbb{R}^m}\left(x^T(t)Qx(t) + u^T(t)Ru(t) + 2x^T(t)P(t)\left(Ax(t) + Bu(t)\right)\right) $$ (1.127)


To get the minimum over u(t) ∈ Rm, we set the derivative of its argument with respect to u to zero:

$$ \frac{\partial}{\partial u}\left(x^T Q x + u^T R u + 2x^T P\left(Ax + Bu\right)\right) = 0 \Rightarrow 2\left(Ru + B^T P x\right) = 0 $$ (1.128)

We finally get:

$$ u(t) = -R^{-1}B^T P(t)\, x(t) $$ (1.129)
Thus the Hamilton-Jacobi-Bellman (HJB) partial differential equation reads:

$$ -x^T \dot{P}(t)x = x^T Qx - x^T P(t)BR^{-1}B^T P(t)x + 2x^T P(t)Ax $$ (1.130)


9 Anders Rantzer and Mikael Johansson, Piecewise Linear Quadratic Optimal Control, IEEE Transactions on Automatic Control, Vol. 45, No. 4, April 2000

Then, using the fact that P(t) = Pᵀ(t) and that xᵀP(t)Ax is a scalar, we can write:

$$ 2x^T P(t)Ax = x^T P(t)Ax + x^T A^T P(t)x = x^T\left(P(t)A + A^T P(t)\right)x $$ (1.131)

Then the Hamilton-Jacobi-Bellman (HJB) partial differential equation becomes:

$$ -x^T \dot{P}(t)x = x^T Qx - x^T P(t)BR^{-1}B^T P(t)x + x^T\left(P(t)A + A^T P(t)\right)x $$ (1.132)


Because this equation must be true ∀x, we conclude that P(t) = Pᵀ(t) shall solve the following Riccati differential equation:

$$ -\dot{P}(t) = A^T P(t) + P(t)A - P(t)BR^{-1}B^T P(t) + Q $$ (1.133)

For time-invariant systems with infinite horizon (tf → ∞), the optimal cost-to-go function J∗(x, t) is independent of time t: J∗(x, t) = J∗(x). Thus matrix P(t) becomes a constant matrix:

$$ t_f \to \infty \Rightarrow J^*(x, t) = J^*(x) \Rightarrow P(t) = P $$ (1.134)
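In the infinite-horizon case, setting Ṗ(t) = 0 in (1.133) turns the Riccati differential equation into the algebraic Riccati equation AᵀP + PA − PBR⁻¹BᵀP + Q = 0. A minimal sketch (assuming SciPy; the double integrator plant is an illustrative choice) computing P and the optimal gain of (1.129):

import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (illustrative)
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)     # steady-state solution of (1.133)
K = np.linalg.solve(R, B.T @ P)          # u(t) = -K x(t) = -R^-1 B^T P x(t)
print(K)
assert np.all(np.linalg.eigvals(A - B @ K).real < 0)  # closed loop stable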

1.10 Pontryagin's principle


In this section we consider the optimal control problem with possible control-state constraints. More specifically, we consider the problem of finding a control u that minimizes the following performance index:

$$ J(u(t)) = G(x(t_f)) + \int_0^{t_f} F(x(t), u(t))\, dt $$ (1.135)

under the following constraints:

− Dynamics and boundary conditions:

$$ \begin{cases} \dot{x} = f(x, u) \\ x(0) = x_0 \end{cases} $$ (1.136)

− Mixed control-state constraints:

$$ c(x, u) \le 0, \quad \text{where} \quad c(x, u): \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R} $$ (1.137)

Usually a slack variable α(t), which is actually a new control variable, is introduced in order to convert the preceding inequality constraint into an equality constraint:

$$ c(x, u) + \alpha^2(t) = 0, \quad \text{where} \quad \alpha(t): \mathbb{R} \to \mathbb{R} $$ (1.138)

To solve this problem we introduce the augmented Hamiltonian function Ha(x, u, λ, µ, α), which is defined as follows10:

$$ H_a(x, u, \lambda, \mu, \alpha) = H(x, u, \lambda) + \mu\left(c(x, u) + \alpha^2\right) = F(x, u) + \lambda^T(t)f(x, u) + \mu(t)\left(c(x, u) + \alpha^2\right) $$ (1.139)

Then Pontryagin's principle states that the optimal control u∗ must satisfy the following conditions:

10 Hull D. G., Optimal Control Theory for Applications, Springer (2003)

− Adjoint equation and transversality condition:

$$ \begin{cases} \dot{\lambda}(t) = -\frac{\partial H_a}{\partial x} \\ \lambda(t_f) = \frac{\partial G(x(t_f))}{\partial x(t_f)} \end{cases} $$ (1.140)

− Local minimum condition for the augmented Hamiltonian:

$$ \begin{cases} \frac{\partial H_a}{\partial u} = 0 \\ \frac{\partial H_a}{\partial \alpha} = 0 \Rightarrow 2\mu\alpha = 0 \end{cases} $$ (1.141)

− Sign of multiplier µ(t) and complementarity condition: the equation ∂Ha/∂α = 0 implies 2µα = 0. Thus either µ = 0, which is an off-boundary arc, or α = 0, which is an on-boundary arc:

− For the off-boundary arc where µ = 0, control u is obtained from ∂Ha/∂u = 0 and α from the equality constraint c(x, u) + α² = 0;

− For the on-boundary arc where α = 0, control u is obtained from the equality constraint c(x, u) = 0. Indeed, there always exists a smooth function ub(x), called the boundary control, which satisfies:

$$ c(x, u_b(x)) = 0 $$ (1.142)

Then multiplier µ is obtained from ∂Ha/∂u = 0:

$$ 0 = \frac{\partial H_a}{\partial u} = \frac{\partial H}{\partial u} + \mu\,\frac{\partial c(x, u)}{\partial u} \Rightarrow \mu = -\left.\frac{\partial H / \partial u}{\partial c(x, u) / \partial u}\right|_{u=u_b(x)} $$ (1.143)

The Weierstrass conditions (proposed in 1879) for a variational extremum state that the optimal controls u∗ and α∗ within the augmented Hamiltonian function Ha must satisfy the following condition for a minimum at every point of the optimal path:

$$ H_a(x^*, u^*, \lambda^*, \mu^*, \alpha^*) - H_a(x^*, u, \lambda^*, \mu^*, \alpha) < 0 $$ (1.144)

Since c(x, u) + α²(t) = 0, the Weierstrass conditions for a variational extremum can be rewritten as a function of the Hamiltonian function H and the inequality constraint:

$$ \begin{cases} H(x^*, u^*, \lambda^*) - H(x^*, u, \lambda^*) < 0 \\ c(x^*, u^*) \le 0 \end{cases} $$ (1.145)

or, equivalently:

$$ u^* = \arg\min_{u(t)\,\in\,U} H(x^*, u, \lambda^*) $$ (1.146)

where U denotes the set of admissible values for the control u (here u(t) ∈ U as soon as c(x∗, u) ≤ 0). The last relation is the so-called Pontryagin's principle.

1.11 Hamiltonian over time


1.11.1 General result
From Pontryagin's principle, special conditions for the Hamiltonian can be derived11. When the final time tf is fixed and the Hamiltonian H does not depend explicitly on time, that is, when ∂H/∂t = 0, then the Hamiltonian functional H remains constant along an optimal trajectory:

$$ H(x^*, u^*, \lambda^*) = \text{constant} $$ (1.147)

Moreover, if the terminal time tf is free, then along the optimal trajectory we have:

$$ H(x^*, u^*, \lambda^*) = 0 \quad \text{when } t_f \text{ is free} $$ (1.148)

1.11.2 Autonomous system without constraint on input


We will show that the Hamiltonian H is constant along the optimal trajectory
in the particular case of autonomous system assuming no constraint on input
u(t). For an autonomous system, the function f () is not an explicit function of
time. From (1.63) we get:

dH ∂H T dx ∂H T du ∂H T dλ
= + + (1.149)
dt ∂x dt ∂u dt ∂λ dt

According to (1.55), (1.63) and (1.69) we have:

$$ \begin{cases} \dot{\lambda}^T(t) = -\frac{\partial H}{\partial x}^T \\ \frac{\partial H}{\partial \lambda}^T = f^T = \dot{x}^T \end{cases} \Rightarrow \frac{dH}{dt} = -\dot{\lambda}^T(t)\frac{dx}{dt} + \frac{\partial H}{\partial u}^T \frac{du}{dt} + \dot{x}^T \frac{d\lambda}{dt} $$ (1.150)

Having in mind that the Hamiltonian H is a scalar functional, we get:

$$ \dot{\lambda}^T(t)\frac{dx}{dt} = \dot{x}^T(t)\frac{d\lambda}{dt} \Rightarrow \frac{dH}{dt} = \frac{\partial H}{\partial u}^T \frac{du}{dt} $$ (1.151)

Finally, assuming no constraint on input u(t), we use (1.72) to obtain relation (1.147):

$$ \frac{\partial H}{\partial u} = 0 \Rightarrow \frac{dH}{dt} = 0 \Rightarrow H(x^*, u^*, \lambda^*) = \text{constant} $$ (1.152)
Example 1.2. As in example 1.1, we consider again the problem of finding the shortest distance between two points P1 = (x1, y1) and P2 = (x2, y2) in the euclidean plane.
Setting u(x) = y′(x), the length of the path between the two points is defined by:

$$ J(u(x)) = \int_{P_1}^{P_2} \sqrt{dx^2 + dy^2} = \int_{x_1}^{x_2} \sqrt{1 + u(x)^2}\, dx $$ (1.153)
11 https://en.wikipedia.org/wiki/Hamiltonian_(control_theory)

Here J(u(x)) is the performance index to be minimized under the following
constraints:

y′(x) = u(x)
y(x1) = y1    (1.154)
y(x2) = y2

Let λ(x) be the Lagrange multiplier, which is here a scalar. The Hamiltonian
H reads:

H = √(1 + u²(x)) + λ(x) u(x)    (1.155)

The necessary conditions for optimality are the following:

λ′(x) = −∂H/∂y = 0 ⇔ λ′(x) = 0
∂H/∂u = 0 ⇔ u(x)/√(1 + u²(x)) + λ(x) = 0    (1.156)

Denoting by c a constant, we get from the first equation of (1.156):

λ(x) = c    (1.157)

Using this relation in the second equation of (1.156) leads to the following
expression of u(x), where the constant a is introduced:

u(x)/√(1 + u²(x)) + c = 0 ⇒ u²(x) = c²/(1 − c²)
⇒ u(x) = √(c²/(1 − c²)) := a = constant    (1.158)

Thus, the shortest distance between two fixed points in the euclidean plane
is a curve with constant slope, that is a straight-line:

y(x) = a x + b    (1.159)

We obviously retrieve the result of example 1.1.


Moreover, we can check that over the optimal trajectory the Hamiltonian H
is constant (not null because here the final value of x is set to x2). Indeed:

λ(x) = c and u(x) = √(c²/(1 − c²)) ⇒ H = √(1 + u²(x)) + λ(x)u(x) = constant    (1.160)

1.11.3 Free final time


It is worth noticing that if final time tf is not specified, and after having noticed
that f(tf) − ẋ(tf) = 0, the following term shall be added to δJa in (1.62):

(F(tf) + (∂G(x(tf))/∂x)ᵀ f(tf)) δtf    (1.161)

In this case the first variation of the augmented performance index with
respect to δtf is zero as soon as:

F(tf) + (∂G(x(tf))/∂x)ᵀ f(tf) = 0    (1.162)

As far as boundary conditions (1.70) apply, we get:

λ(tf) = ∂G(x(tf))/∂x(tf) ⇒ F(tf) + λᵀ(tf) f(tf) = 0    (1.163)

The preceding equation is called the transversality condition. We recognize in
F(tf) + λᵀ(tf) f(tf) the value of the Hamiltonian function H(t) defined in (1.63)
at final time tf. Because the Hamiltonian H(t) is constant along an optimal
trajectory for an autonomous system (see (1.147)), it is concluded that H(t) = 0
along an optimal trajectory for an autonomous system when final time tf is free.
Alternatively we can introduce a new variable, denoted s for example, which
is related to time t as follows:

t(s) = t0 + (tf − t0 )s ∀ s ∈ [0, 1] (1.164)

From the preceding equation we get:

dt = (tf − t0 ) ds (1.165)

Then the optimal control problem with respect to time t where the final
time tf is free is changed into an optimal control problem with respect to the
new variable s and an additional state tf(s) which is constant with respect to s.
The optimal control problem reads:
Minimize:

J(u(s)) = G(x(1)) + ∫₀¹ (tf(s) − t0) F(x(s), u(s)) ds    (1.166)

Under the following constraints (a short numerical sketch of this rescaling is
given after the list):

− Dynamics and boundary conditions:

d/ds x(s) = (dx(t)/dt)(dt/ds) = (tf(s) − t0) f(x(s), u(s))
d/ds tf(s) = 0    (1.167)
x(0) = x0

− Mixed control-state constraints:

c(x(s), u(s)) ≤ 0, where c(x(s), u(s)) : Rⁿ × Rᵐ → R    (1.168)
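The rescaled problem can be integrated over the fixed interval s ∈ [0, 1] once
tf(s) is treated as an extra constant state, as in (1.167). The following minimal
Python sketch illustrates this on assumed data: the double-integrator dynamics,
the constant control profile and the guessed value of tf are illustrative
placeholders, not part of the original problem.

```python
import numpy as np
from scipy.integrate import solve_ivp

t0 = 0.0

def f(x, u):
    # assumed example dynamics (a double integrator), not from the text
    return np.array([x[1], u])

def u_of_s(s):
    return 1.0  # placeholder control profile

def augmented_rhs(s, y):
    x, tf = y[:-1], y[-1]
    dxds = (tf - t0) * f(x, u_of_s(s))  # dx/ds, first line of (1.167)
    return np.append(dxds, 0.0)         # d tf/ds = 0, second line of (1.167)

y0 = np.array([0.0, 0.0, 2.0])          # x(0) = x0 and a guessed tf = 2
sol = solve_ivp(augmented_rhs, (0.0, 1.0), y0, rtol=1e-8)
print(sol.y[:2, -1])                    # state at s = 1, i.e. at t = tf
```

In a shooting method, the guessed tf stored in the last state would then be
adjusted until the free-final-time optimality condition H(tf) = 0 is met.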

1.12 Bang-bang control


1.12.1 Pontryagin's principle application
Bang-bang control is a term used to indicate that the control u switches abruptly
between two values. It appears when the control u is restricted to lie between
a lower and an upper bound. We apply Pontryagin's principle in the following
cases:

− For a problem where the Hamiltonian function H is linear in the scalar
control u we can write:

H = a + σ u    (1.169)

When the scalar control u is limited between umin and umax, Pontryagin's
principle provides the following necessary condition for optimality:

umin ≤ u(t) ≤ umax ⇒ u(t) = umax            if σ(t) = ∂H/∂u < 0
                     u(t) = umin            if σ(t) = ∂H/∂u > 0
                     u(t) ∈ [umin, umax]    if σ(t) = ∂H/∂u = 0    (1.170)

− For multi-input systems, suppose that the Hamiltonian function H is
related to the control vector u(t) as follows:

H = a + bᵀu + ∥u∥ where b ≠ 0    (1.171)

The Cauchy–Schwarz inequality may be applied to get12:

bᵀu ≥ −∥b∥∥u∥ ⇒ H ≥ a + ∥u∥ (1 − ∥b∥)    (1.172)

and the equality is obtained when u is proportional to b:

u = −α b/∥b∥ where α ≥ 0    (1.173)

Moreover Pontryagin's principle provides the following necessary condition
for optimality12, assuming that ∥u∥ ≤ umax:

∥u∥ ≤ umax ⇒ α = umax         if σ < 0
             α = 0            if σ > 0    where σ = 1 − ∥b∥
             α ∈ [0, umax]    if σ = 0    (1.174)

Function σ is usually called the switching function. Thus optimal control
u(t) switches at times when switching function σ switches from negative to
positive (or vice-versa). This type of control where the control is always set to
boundary values is called bang-bang control.
In addition, as in the unconstrained control case, the Hamiltonian
functional H remains constant along an optimal trajectory for an autonomous
system when there are constraints on input u(t). Indeed in that situation
control u(t) is piecewise constant (it is set either to its minimum or maximum
value) and consequently du/dt is zero except at the switching instants. From
(1.151) we get dH/dt = 0.

12 Bertrand R., Epenoy R., New smoothing techniques for solving bang-bang optimal control
problems - Numerical results and statistical interpretation, Optimal Control Applications and
Methods, 23(4):171-197, July 2002, DOI:10.1002/oca.709

Last but not least, assume that the performance index to be minimized reads
as follows where λ0 > 0:

J(u(t)) = (λ0/2) ∫₀^{tf} ∥u(t)∥ dt    (1.175)

Then, as suggested by Bertrand & Epenoy12, it could be valuable to deduce
the solution of the initial problem from the successive solutions of an auxiliary
problem through an homotopic approach by defining the following perturbed
performance index:

Jϵ(u(t)) = (λ0/2) ∫₀^{tf} (∥u(t)∥ − ϵ h(∥u(t)∥)) dt    (1.176)

Parameter ϵ is assumed to be in the interval ]0, 1] and function h is a
continuous function satisfying h(w) ≥ 0 ∀ w ∈ [0, 1]. For example, one could
choose h(w) = w − w²; with this choice ∥u(t)∥ − ϵ h(∥u(t)∥) = ∥u(t)∥² for
ϵ = 1 and ∥u(t)∥ − ϵ h(∥u(t)∥) = ∥u(t)∥ for ϵ = 0.
If h(w) → ∞ as w approaches 1 or 0, then h is called a barrier function,
otherwise it is a penalty function.
The homotopic (or continuation) approach12 consists in solving the
perturbed problem with ϵ = 1. Then, after defining a decreasing sequence of ϵ
values (ϵ1 = 1 > ϵ2 > · · · > ϵn > 0), the current optimal control problem
associated with ϵ = ϵk where k = 2, · · · , n is solved with the solution of the
previous one as a starting point.

1.12.2 Example 1
Consider a simple mass m which moves on the x-axis and is subject to a force
f (t)13 . Equation of motion reads:

mÿ(t) = f (t) (1.177)

We set control u(t) as:

u(t) = f(t)/m    (1.178)

Consequently the equation of motion reduces to:

ÿ(t) = u(t)    (1.179)

The state space realization of this system is the following:

x1(t) = y(t), x2(t) = ẏ(t) ⇒ ẋ1(t) = x2(t), ẋ2(t) = u(t) ⇔ f(x, u) = [x2(t); u(t)]    (1.180)

We will assume that the initial position of the mass is zero and that the
movement starts from rest:

y(0) = 0
ẏ(0) = 0    (1.181)
13 Linear Systems: Optimal and Robust Control, 1st Edition, by Alok Sinha, CRC Press

We will assume that control u(t) is subject to the following constraint:

umin ≤ u(t) ≤ umax    (1.182)

First we are looking for the optimal control u(t) which enables the mass to
cover the maximum distance in a fixed time tf.
The objective of the problem is to maximize y(tf). This corresponds to
minimizing the opposite of y(tf); consequently the cost J(u(t)) reads as follows,
where F(x, u) = 0 when compared to (1.56):

J(u(t)) = G(x(tf)) = −y(tf) := −x1(tf)    (1.183)

As F(x, u) = 0, the Hamiltonian for this problem reads:

H(x, u, λ) = λᵀ(t) f(x, u) = [λ1(t)  λ2(t)] [x2(t); u(t)] = λ1(t)x2(t) + λ2(t)u(t)    (1.184)

Adjoint equations read:

λ̇(t) = −∂H/∂x ⇔ λ̇1(t) = −∂H/∂x1 = 0
               λ̇2(t) = −∂H/∂x2 = −λ1(t)    (1.185)

Solutions of the adjoint equations are the following, where c and d are constants:

λ1(t) = c
λ2(t) = −ct + d    (1.186)

As far as the final state x(tf) is not specified, the values of constants c and d
are determined by transversality condition (1.70):

λ(tf) = ∂G(x(tf))/∂x(tf) ⇒ λ1(tf) = ∂(−x1(tf))/∂x1(tf) = −1
                           λ2(tf) = ∂(−x1(tf))/∂x2(tf) = 0    (1.187)

Consequently:

c = −1 and d = −tf ⇒ λ1(t) = −1 and λ2(t) = t − tf    (1.188)
Thus the Hamiltonian H reads as follows:

H(x, u, λ) = λ1(t)x2(t) + λ2(t)u(t) = −x2(t) + (t − tf)u(t)    (1.189)

Then ∂H/∂u = t − tf ≤ 0 ∀ 0 ≤ t ≤ tf. Applying (1.170) leads to the expression
of control u(t):

∂H/∂u ≤ 0 ⇒ u(t) = umax ∀ 0 ≤ t ≤ tf    (1.190)

This is common sense when the objective is to cover the maximum
distance in a fixed time without any constraint on the vehicle velocity at the
final time.

The optimal state trajectory can be easily obtained by solving the
state equations with given initial conditions:

ẋ1 = x2 and ẋ2 = umax ⇒ x1(t) = ½ umax t² and x2(t) = umax t    (1.191)

The Hamiltonian along the optimal trajectory has the following value:

H(x, u, λ) = λ1(t)x2(t) + λ2(t)u(t) = −umax t + (t − tf)umax = −umax tf    (1.192)

As expected the Hamiltonian along the optimal trajectory is constant. The
minimum value of the performance index is:

J(u(t)) = −x1(tf) = −½ umax tf²    (1.193)

Alternatively, we can write J(u(t)) as follows:

J(u(t)) = −y(tf) = −∫₀^{tf} (dy/dt) dt = ∫₀^{tf} (−x2(t)) dt    (1.194)

The Hamiltonian for this equivalent J(u(t)) now reads:

H(x, u, λ) = −x2(t) + λᵀ(t)f(x, u) = −x2(t) + λ1(t)x2(t) + λ2(t)u(t)    (1.195)

Adjoint equations become:

λ̇(t) = −∂H/∂x ⇔ λ̇1(t) = −∂H/∂x1 = 0
               λ̇2(t) = −∂H/∂x2 = 1 − λ1(t)    (1.196)

Solutions of the adjoint equations are the following, where c and d are constants:

λ1(t) = c
λ2(t) = (1 − c)t + d    (1.197)

Because the final value of x(tf) is not specified, we have G(x(tf)) = 0 and
the transversality condition (1.70) now reads:

G(x(tf)) = 0 ⇒ λ(tf) = ∂G(x(tf))/∂x(tf) = 0 ⇒ λ1(tf) = 0 and λ2(tf) = 0    (1.198)

Consequently:

c = 0 and d = −tf ⇒ λ1(t) = 0 and λ2(t) = t − tf    (1.199)

Obviously, we retrieve the same expressions for λ1(t) and λ2(t) as those
obtained previously, and we finally get the same bang-bang optimal control.
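A quick numerical check of this example, with assumed values umax = 1 and
tf = 3: along the optimal trajectory (1.191) the Hamiltonian (1.192) stays equal
to −umax tf and the distance covered is umax tf²/2.

```python
import numpy as np

u_max, tf = 1.0, 3.0
t = np.linspace(0.0, tf, 7)
x2 = u_max * t                      # velocity from (1.191)
lam2 = t - tf                       # costate lambda_2 from (1.188)
H = -x2 + lam2 * u_max              # Hamiltonian (1.192)
print(np.allclose(H, -u_max * tf))  # True: H is constant on the optimal arc
print(0.5 * u_max * tf**2)          # distance covered, i.e. -J in (1.193)
```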

1.12.3 Example 2
We re-use the preceding example but now we are looking for the optimal control
u(t) which enables the mass to cover the maximum distance in a fixed time tf
with the additional constraint that the final velocity is equal to zero:

x2(tf) = 0    (1.200)

The solution of this problem starts as in the previous case and leads to the
solution of the adjoint equations, where c and d are constants:

λ1(t) = c
λ2(t) = −ct + d    (1.201)

The difference when compared with the previous case is that now the final
velocity is equal to zero, that is x2(tf) = 0. Consequently transversality
condition (1.70) involves only state x1 and reads as follows:

λ(tf) = ∂G(x(tf))/∂x(tf) ⇔ λ1(tf) = ∂(−x1(tf))/∂x1(tf) = −1    (1.202)

Taking (1.202) into account in (1.201) leads to:

λ1(t) = −1
λ2(t) = t + d    (1.203)

The Hamiltonian H reads as follows:

H(x, u, λ) = λ1(t)x2(t) + λ2(t)u(t) = −x2(t) + (t + d)u(t)    (1.204)

Thus ∂H/∂u = t + d = λ2(t) ∀ 0 ≤ t ≤ tf, where the value of constant d is not
known: it can be either d < −tf, d ∈ [−tf, 0] or d > 0. Figure 1.1 plots the
three possibilities.

− The possibility d < −tf leads to u(t) = umin ∀t ∈ [0, tf] according to
(1.170), that is y(t) := x1(t) = 0.5 umin t² when taking into account initial
conditions (1.181). Thus there is no way to achieve the constraint that
the velocity is zero at instant tf and the possibility d < −tf is ruled out;

− Similarly, the possibility d > 0 leads to u(t) = umax ∀t ∈ [0, tf], that
is y(t) := x1(t) = 0.5 umax t² when taking into account initial conditions
(1.181). Thus the possibility d > 0 is also ruled out.

Hence d shall be chosen between −tf and 0. According to (1.170) and Figure
1.1 we have:

u(t) = umax ∀ 0 ≤ t ≤ ts
u(t) = umin ∀ ts < t ≤ tf    (1.205)

Instant ts is the switching instant, that is the time at which ∂H/∂u = λ2(t)
changes sign. Solving the state equations with initial velocity set to zero yields
the expression of x2(t):

Figure 1.1: Three possibilities for the values of ∂H/∂u = λ2(t)

ẋ2 = umax ∀ 0 ≤ t ≤ ts
ẋ2 = umin ∀ ts < t ≤ tf
x2(0) = 0    (1.206)
x2(ts) = umax ts
x2(t) = umax ts + umin(t − ts) ∀ ts < t ≤ tf

Imposing x2(tf) = 0 leads to the value of the switching instant ts:

x2(tf) = 0 ⇒ umax ts + umin(tf − ts) = 0
⇒ ts = umin tf / (umin − umax) = −umin tf / (umax − umin)    (1.207)

From Figure 1.1 it is clear that at t = ts we have λ2(ts) = 0. Using the fact
that λ2(t) = t + d, we finally get the value of constant d:

λ2(t) = t + d and λ2(ts) = 0 ⇒ d = −ts    (1.208)

Furthermore the Hamiltonian along the optimal trajectory has the following
value:

∀ 0 ≤ t ≤ ts:  H(x, u, λ) = λ1(t)x2(t) + λ2(t)u(t)
                          = −umax t + (t − ts)umax = −ts umax
∀ ts < t ≤ tf: H(x, u, λ) = −umax ts − umin(t − ts) + (t − ts)umin
                          = −ts umax    (1.209)
As expected the Hamiltonian along the optimal trajectory is constant.
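The switching instant (1.207) is easy to check numerically. The sketch below
uses assumed values umax = 1, umin = −0.5 and tf = 3, applies the bang-bang
law (1.205) and verifies that the final velocity vanishes as required by (1.200).

```python
import numpy as np
from scipy.integrate import solve_ivp

u_max, u_min, tf = 1.0, -0.5, 3.0
ts = -u_min * tf / (u_max - u_min)   # switching instant (1.207), here 1.0

def rhs(t, x):
    u = u_max if t <= ts else u_min  # bang-bang law (1.205)
    return [x[1], u]

sol = solve_ivp(rhs, (0.0, tf), [0.0, 0.0], max_step=1e-3)
print(ts)
print(sol.y[1, -1])                  # x2(tf) ~ 0, the constraint (1.200)
```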

1.13 Singular arc - Legendre-Clebsch condition


The case where ∂H/∂u does not yield a definite value for the control u(t) is
called singular control. Usually singular control arises when a multiplier σ(t) of
the control u(t) (which is called the switching function) in the Hamiltonian H
vanishes over a finite length of time t1 ≤ t ≤ t2:

σ(t) := ∂H/∂u = 0 ∀ t1 ≤ t ≤ t2    (1.210)

The singular control can be determined by the condition that the switching
function σ(t) and its time derivatives vanish along the so-called singular arc.
Hence over a singular arc we have:

dᵏσ(t)/dtᵏ = 0 ∀ t1 ≤ t ≤ t2, ∀ k ∈ N    (1.211)

At some derivative order the control u(t) does appear explicitly and its
value is thereby determined. Furthermore it can be shown that the control u(t)
appears at an even derivative order. So the derivative order at which the control
u(t) does appear explicitly will be denoted 2q. Thus:

k := 2q ⇒ d²qσ(t)/dt²q := A(t, x, λ) + B(t, x, λ) u = 0    (1.212)

The previous equation gives an explicit equation for the singular control,
once the Lagrange multiplier λ has been obtained through the relation
λ̇(t) = −∂H/∂x.
The singular arc will be optimal if it satisfies the following generalized
Legendre-Clebsch condition, which is also known as the Kelley condition14,
where 2q is the (always even) value of k at which the control u(t) explicitly
appears in dᵏσ(t)/dtᵏ for the first time:

(−1)^q ∂/∂u (d²qσ(t)/dt²q) ≥ 0    (1.213)

Note that for the regular arc the second order necessary condition for
optimality to achieve a minimum cost is the positive semi-definiteness of the
Hessian matrix of the Hamiltonian along an optimal trajectory. This condition
is obtained by setting q = 0 in the generalized Legendre-Clebsch condition
(1.213):

q = 0 ⇒ ∂σ(t)/∂u = ∂²H/∂u² = Huu ≥ 0    (1.214)
This inequality is also termed regular Legendre-Clebsch condition.

14 Douglas M. Pargett & Mark D. Ardema, Flight Path Optimization at Constant
Altitude, Journal of Guidance, Control and Dynamics, July 2007, 30(4):1197-1201, DOI:
10.2514/1.28954
Chapter 2

Finite Horizon Linear Quadratic Regulator

2.1 Problem to be solved


The Linear Quadratic Regulator (LQR ) is an optimal control problem where
the state equation of the plant is linear, the performance index is quadratic and
the initial conditions are known. We discuss in this chapter linear quadratic
regulation in the case where the final time which appears in the cost to be
minimized is finite, whereas the next chapter will focus on the infinite horizon
case. The optimal control problem to be solved is the following: assume a plant
driven by a linear dynamical equation of the form:

ẋ(t) = Ax(t) + Bu(t)
x(0) = x0    (2.1)

Where:

− A is the state (or system) matrix

− B is the input matrix

− x(t) is the state vector of dimension n

− u(t) is the control vector of dimension m

Then we have to find the control u(t) which minimizes the following quadratic
performance index:

J(u(t)) = ½ (x(tf) − xf)ᵀ S (x(tf) − xf) + ½ ∫₀^{tf} (xᵀ(t)Qx(t) + uᵀ(t)Ru(t)) dt    (2.2)

where the final time tf is set and xf is the final state to be reached. The
performance index reflects the fact that a trade-off is made between the
rate of variation of x(t) and the magnitude of the control input u(t). Matrices
S and Q shall be chosen to be symmetric positive semi-definite and matrix R
symmetric positive definite:

S = Sᵀ ≥ 0
Q = Qᵀ ≥ 0    (2.3)
R = Rᵀ > 0

Notice that the use of matrix S is optional; indeed, if the final state xf is
imposed then there is no need to insert the expression
½ (x(tf) − xf)ᵀ S (x(tf) − xf) in the cost to be minimized.

2.2 Positive definite and positive semi-definite matrix

A positive definite matrix M is denoted M > 0. We remind that a real n × n
symmetric matrix M = Mᵀ is called positive definite if and only if we have
either:
− xᵀMx > 0 for all x ≠ 0;
− All eigenvalues of M are strictly positive;
− All of the leading principal minors are strictly positive (the leading
principal minor of order k is the minor of order k obtained by deleting
the last n − k rows and columns);
− Matrix M can be written as follows where matrix M^0.5 is square,
symmetric and invertible:

M = M^0.5 M^0.5 where (M^0.5)ᵀ = M^0.5    (2.4)

Matrix M^0.5 is called the square root of matrix M. By getting the modal
decomposition of matrix M, that is M = VDV⁻¹ where V is the matrix
whose columns are the eigenvectors of M and D is the diagonal matrix
whose diagonal elements are the corresponding positive eigenvalues, the
square root M^0.5 of M is given by M^0.5 = VD^0.5 V⁻¹, where D^0.5 is
any diagonal matrix whose elements are the square roots of the diagonal
elements of D1.

1 https://en.wikipedia.org/wiki/Square_root_of_a_matrix
Similarly a positive semi-definite matrix M is denoted M ≥ 0. We remind
that a n × n real symmetric matrix M = Mᵀ is called positive semi-definite if
and only if we have either:
− xᵀMx ≥ 0 for all x ≠ 0;
− All eigenvalues of M are non-negative;
− All of the principal (not only leading) minors are non-negative (the
principal minor of order k is the minor of order k obtained by deleting
n − k rows and the n − k columns with the same position as the rows.
For instance, in a principal minor where you have deleted rows 1 and 3,
you should also delete columns 1 and 3);
− Matrix M can be written as (M^0.5)ᵀ M^0.5 where matrix M^0.5 is full
row rank.
Furthermore a real symmetric matrix M is called negative (semi-)definite if
−M is positive (semi-)definite.

Example 2.1. Check that M1 = M1ᵀ = [1 2; 2 3] is not positive definite and
that M2 = M2ᵀ = [1 −2; −2 5] is positive definite. ■
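A numerical check of example 2.1 via the eigenvalue and leading-principal-minor
criteria stated above:

```python
import numpy as np

M1 = np.array([[1.0, 2.0], [2.0, 3.0]])
M2 = np.array([[1.0, -2.0], [-2.0, 5.0]])

print(np.linalg.eigvalsh(M1))   # one eigenvalue is negative: M1 is not > 0
print(np.linalg.eigvalsh(M2))   # all eigenvalues > 0: M2 > 0

# leading principal minors (Sylvester's criterion) tell the same story:
print(M1[0, 0], np.linalg.det(M1))   # 1 > 0 but det(M1) = -1 < 0
print(M2[0, 0], np.linalg.det(M2))   # 1 > 0 and det(M2) = 1 > 0
```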

2.3 Hamiltonian matrix


For this optimal control problem, the Hamiltonian (1.63) reads:

H(x, u, λ) = ½ (xᵀ(t)Qx(t) + uᵀ(t)Ru(t)) + λᵀ(t)(Ax(t) + Bu(t))    (2.5)

The necessary condition for optimality (1.72) yields:

∂H/∂u = Ru(t) + Bᵀλ(t) = 0    (2.6)

Taking into account that R is a symmetric matrix, we get:

u(t) = −R⁻¹Bᵀλ(t)    (2.7)

Eliminating u(t) in equation (2.1) reads:

ẋ(t) = Ax(t) − BR⁻¹Bᵀλ(t)
x(0) = x0    (2.8)

The dynamics of the Lagrange multipliers λ(t) is given by (see (1.69)):

λ̇(t) = −∂H/∂x = −Qx(t) − Aᵀλ(t)    (2.9)

The final values of the Lagrange multipliers are given by (1.70). Using the
fact that S is a symmetric matrix we get:

λ(tf) = ∂/∂x(tf) [½ (x(tf) − xf)ᵀ S (x(tf) − xf)] = S (x(tf) − xf)    (2.10)

Taking into account that matrices Q and S are symmetric matrices,
equations (2.9) and (2.10) are written as follows:

λ̇(t) = −Qx(t) − Aᵀλ(t)
λ(tf) = S (x(tf) − xf)    (2.11)

Equations (2.8) and (2.11) represent a two-point boundary value problem.
Combining (2.8) and (2.11) into a single state equation yields:

[ẋ(t); λ̇(t)] = [A  −BR⁻¹Bᵀ; −Q  −Aᵀ] [x(t); λ(t)] = H [x(t); λ(t)]    (2.12)

where we have introduced the Hamiltonian matrix H defined by:

H = [A  −BR⁻¹Bᵀ; −Q  −Aᵀ]    (2.13)

By definition, a matrix H is said to be a Hamiltonian matrix as soon as
the following property holds:

(JH)ᵀ = JH    (2.14)

where J is the following skew-symmetric matrix:

J = −Jᵀ = [0  I; −I  0]    (2.15)

2.4 Optimal control

2.4.1 State vector expression
Solving (2.12) yields:

[x(t); λ(t)] = e^{Ht} [x(0); λ(0)]    (2.16)

In the previous equation, the value of λ(0) is not known. On the other
hand, x(tf) or λ(tf) is known, depending on whether the final state is imposed
or weighted. Thus by replacing t by t − tf in the previous equation we obtain:

[x(t); λ(t)] = e^{Ht} [x(0); λ(0)] = e^{H(t−tf)} [x(tf); λ(tf)]    (2.17)

Then the exponential matrix e^{H(t−tf)} is partitioned as follows:

e^{H(t−tf)} := [Y1(t)  X1(t); Y2(t)  X2(t)]    (2.18)

Then (2.17) yields:

x(t) = Y1(t) x(tf) + X1(t) λ(tf)
λ(t) = Y2(t) x(tf) + X2(t) λ(tf)    (2.19)

Furthermore notice the following relations obtained when t = tf:

e^{H(t−tf)}|t=tf = I := [Y1(tf)  X1(tf); Y2(tf)  X2(tf)]
⇒ Y1(tf) = X2(tf) = I and X1(tf) = Y2(tf) = 0    (2.20)

2.4.2 Lagrange multipliers for imposed final state

Assume that the final state x(tf) is imposed:

x(tf) := xf    (2.21)

Then (2.19) can be manipulated to get rid of the unknown vector λ(tf):

λ(tf) = X1⁻¹(t) (x(t) − Y1(t) xf)
λ(tf) = X2⁻¹(t) (λ(t) − Y2(t) xf)    (2.22)

Then equating the two expressions of λ(tf) we get:

X2⁻¹(t) (λ(t) − Y2(t) xf) = X1⁻¹(t) (x(t) − Y1(t) xf)
⇔ λ(t) = X2(t) X1⁻¹(t) (x(t) − Y1(t) xf) + Y2(t) xf
⇔ λ(t) = X2(t) X1⁻¹(t) x(t) − (X2(t) X1⁻¹(t) Y1(t) − Y2(t)) xf    (2.23)

In order to factor x(t) and xf, let P(t) and F(t) be the following matrices:

P(t) := X2(t) X1⁻¹(t)
F(t) := P(t) Y1(t) − Y2(t)    (2.24)

We finally get:

λ(t) = P(t) x(t) − F(t) xf    (2.25)

2.4.3 Lagrange multipliers for weighted final state

In the case where the final state x(tf) is expected to be close to the final value xf,
the final condition λ(tf) is given by (2.10):

λ(tf) = S (x(tf) − xf)    (2.26)

Then (2.19) can be manipulated to get rid of the unknown vector x(tf):

x(t) = Y1(t) x(tf) + X1(t) λ(tf)
     = Y1(t) x(tf) + X1(t) S (x(tf) − xf)
     = (Y1(t) + X1(t) S) x(tf) − X1(t) S xf
⇒ x(tf) = (Y1(t) + X1(t) S)⁻¹ (x(t) + X1(t) S xf)

and λ(t) = Y2(t) x(tf) + X2(t) λ(tf)
         = Y2(t) x(tf) + X2(t) S (x(tf) − xf)
         = (Y2(t) + X2(t) S) x(tf) − X2(t) S xf
⇒ x(tf) = (Y2(t) + X2(t) S)⁻¹ (λ(t) + X2(t) S xf)    (2.27)

Then equating the two expressions of x(tf):

(Y2(t) + X2(t) S)⁻¹ (λ(t) + X2(t) S xf) = (Y1(t) + X1(t) S)⁻¹ (x(t) + X1(t) S xf)    (2.28)

Thus:

λ(t) = (Y2(t) + X2(t) S) (Y1(t) + X1(t) S)⁻¹ (x(t) + X1(t) S xf) − X2(t) S xf    (2.29)

In order to factor x(t) and xf, let PS(t) and FS(t) be the following matrices:

PS(t) := (Y2(t) + X2(t) S) (Y1(t) + X1(t) S)⁻¹
FS(t) := (X2(t) − PS(t) X1(t)) S    (2.30)

We finally get:

λ(t) = PS(t) x(t) − FS(t) xf    (2.31)

2.4.4 Limit values when the final state weighting matrix increases

In order to assess what happens when the final state weighting matrix ∥S∥ → ∞,
we first recall the Neumann series:

(I − T)⁻¹ = Σ_{k=0}^{∞} Tᵏ    (2.32)

Applying this result to the right term of PS(t) reads:

(Y1(t) + X1(t) S)⁻¹ = ((I + Y1(t)(X1(t) S)⁻¹) X1(t) S)⁻¹
                    = (X1(t) S)⁻¹ (I + Y1(t)(X1(t) S)⁻¹)⁻¹
                    = (X1(t) S)⁻¹ Σ_{k=0}^{∞} (−Y1(t)(X1(t) S)⁻¹)ᵏ
                    ≈ (X1(t) S)⁻¹ (I − Y1(t)(X1(t) S)⁻¹)
                    = S⁻¹ X1⁻¹(t) (I − Y1(t)(X1(t) S)⁻¹)    (2.33)

Thus PS(t) can be approximated as follows when ∥S∥ → ∞:

PS(t) ≈ (Y2(t) + X2(t) S) S⁻¹ X1⁻¹(t) (I − Y1(t)(X1(t) S)⁻¹)
      ≈ (Y2(t) S⁻¹ X1⁻¹(t) + X2(t) X1⁻¹(t)) (I − Y1(t)(X1(t) S)⁻¹)
      ≈ Y2(t) S⁻¹ X1⁻¹(t) + X2(t) X1⁻¹(t) (I − Y1(t) S⁻¹ X1⁻¹(t))    (2.34)

When ∥S∥ → ∞ we retrieve the expression of P(t) in (2.24) by using the
order 0 approximation of PS(t). Indeed:

∥S∥ → ∞ ⇒ PS(t) ≈ X2(t) X1⁻¹(t) I = X2(t) X1⁻¹(t) = P(t)    (2.35)

As far as FS(t) is concerned, we also retrieve the expression of F(t) in (2.24)
when ∥S∥ → ∞ by using the order 1 approximation of PS(t). Indeed:

∥S∥ → ∞ ⇒ PS(t) X1(t) ≈ Y2(t) S⁻¹ X1⁻¹(t) X1(t) + X2(t) X1⁻¹(t) (X1(t) − Y1(t) S⁻¹)
                       ≈ Y2(t) S⁻¹ + X2(t) − X2(t) X1⁻¹(t) Y1(t) S⁻¹
⇒ X2(t) − PS(t) X1(t) ≈ X2(t) X1⁻¹(t) Y1(t) S⁻¹ − Y2(t) S⁻¹
⇒ FS(t) := (X2(t) − PS(t) X1(t)) S
         ≈ X2(t) X1⁻¹(t) Y1(t) − Y2(t)
         = P(t) Y1(t) − Y2(t)
         = F(t)    (2.36)

Figure 2.1: Finite horizon closed-loop optimal control

2.4.5 Closed-loop block diagram

Finally, using (2.7), optimal control u(t) reads as follows when the final state
is imposed (when the final state is weighted, P(t) and F(t) have to be replaced
by PS(t) and FS(t), respectively):

u(t) = −R⁻¹Bᵀλ(t)
     = −R⁻¹Bᵀ (P(t) x(t) − F(t) xf)    (2.37)
     := −K(t)x(t) + R⁻¹Bᵀ F(t) xf

where:

K(t) = R⁻¹BᵀP(t)    (2.38)

The preceding expression leads to the closed-loop block diagram shown in
Figure 2.1.
It is worth noticing that P(tf) = X2(tf)X1⁻¹(tf) → ∞ because X1(tf) = 0
when the final value xf of x(tf) is imposed, as indicated by (2.20). This is in
line with the final value of P(t) as indicated by (2.10) when the final state is
close to zero:

xf = 0 ⇒ λ(tf) = P(tf)x(tf) = Sx(tf) ⇒ P(tf) = S    (2.39)

Consequently, when it is desired that the final value x(tf) tends towards xf,
then S → ∞. Thus S = P(tf) is singular when the final value x(tf) is set to
xf. In that case, and to avoid the numerical difficulty when t = tf, we shall set
u(tf) = 0. Thus the optimal control reads:

u(t) = −K(t) x(t) + R⁻¹Bᵀ F(t) xf ∀ 0 ≤ t < tf
u(t) = 0 for t = tf    (2.40)

2.5 Riccati differential equation

When the final value xf is set to zero, we have seen that the Lagrange
multipliers λ(t) linearly depend on the state vector x(t) through the time
dependent matrix P(t):

xf = 0 ⇒ λ(t) = P(t)x(t)    (2.41)

Using (2.9) and (2.10), we can compute the time derivative of the Lagrange
multipliers λ(t) = P(t)x(t) as follows:

λ̇(t) = Ṗ(t)x(t) + P(t)ẋ(t) = −Qx(t) − Aᵀλ(t)
λ(tf) = P(tf)x(tf) = Sx(tf)    (2.42)

Then substituting (2.1), (2.7) and (2.41) within (2.42) we get:

ẋ = Ax(t) + Bu(t), u(t) = −R⁻¹Bᵀλ(t) and λ(t) = P(t)x(t)
⇒ Ṗ(t)x(t) + P(t)(Ax(t) − BR⁻¹BᵀP(t)x(t)) = −Qx(t) − AᵀP(t)x(t)
  P(tf)x(tf) = Sx(tf)    (2.43)

Because the previous equation is true for all x(t) and x(tf) we obtain the
following equation, which is known as the Riccati differential equation:

AᵀP(t) + P(t)A − P(t)BR⁻¹BᵀP(t) + Q = −Ṗ(t)
P(tf) = S    (2.44)

From a computational point of view, the Riccati differential equation (2.44)
may be integrated backward. The kernel P(t) is stored for each value of t and
then used to compute K(t) and u(t).
Alternatively, the analytic solution of the Riccati differential equation (2.44)
is given either by P(t) := X2(t) X1⁻¹(t) in (2.24) when the final state xf = 0 is
imposed, or by PS(t) := (Y2(t) + X2(t) S)(Y1(t) + X1(t) S)⁻¹ in (2.30) when
the final state xf = 0 is weighted by matrix S = Sᵀ ≥ 0. The key point to
solve the Riccati differential equation is the partition of matrix e^{H(t−tf)} shown
in (2.18):

e^{H(t−tf)} := [Y1(t)  X1(t); Y2(t)  X2(t)]    (2.45)

It is worth noticing that the Riccati differential equation can be written in
a compact form as follows, where H denotes the Hamiltonian matrix defined in
(2.13):

−Ṗ(t) = AᵀP(t) + P(t)A − P(t)BR⁻¹BᵀP(t) + Q
⇔ −Ṗ(t) = [P(t)  −In] H [In; P(t)]    (2.46)
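The two routes described above can be compared numerically. The sketch below
uses assumed plant data (a double integrator with Q = 0, R = 1 and S = 10 I):
it integrates the Riccati differential equation (2.44) backward from P(tf) = S
and checks the result against the closed form PS(t) of (2.30) built from the
partition (2.45).

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.zeros((2, 2))
R = np.array([[1.0]])
S = np.diag([10.0, 10.0])
tf, n = 2.0, 2

H = np.block([[A, -B @ np.linalg.inv(R) @ B.T], [-Q, -A.T]])

def riccati_rhs(t, p):
    P = p.reshape(n, n)
    dP = -(A.T @ P + P @ A - P @ B @ np.linalg.inv(R) @ B.T @ P + Q)
    return dP.ravel()

# backward integration of (2.44) from P(tf) = S down to t = 0
sol = solve_ivp(riccati_rhs, (tf, 0.0), S.ravel(), rtol=1e-10, atol=1e-12)
P0_ode = sol.y[:, -1].reshape(n, n)

# closed form at t = 0 from the partition (2.45) of expm(H (t - tf))
M = expm(H * (0.0 - tf))
Y1, X1 = M[:n, :n], M[:n, n:]
Y2, X2 = M[n:, :n], M[n:, n:]
P0_exp = (Y2 + X2 @ S) @ np.linalg.inv(Y1 + X1 @ S)   # PS(0), see (2.30)

print(np.allclose(P0_ode, P0_exp, atol=1e-6))         # True
```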

2.6 Examples
2.6.1 Example 1
Given the following scalar plant:

ẋ(t) = a x(t) + b u(t)
x(0) = x0    (2.47)

Find the control u(t) which minimizes the following performance index where
xf = 0, S ≥ 0 and ρ > 0:

J(u(t)) = ½ xᵀ(tf) S x(tf) + ½ ∫₀^{tf} ρ u²(t) dt    (2.48)

The Hamiltonian matrix H defined in (2.13) reads:

H = [A  −BR⁻¹Bᵀ; −Q  −Aᵀ] = [a  −b²/ρ; 0  −a]    (2.49)

Denoting by s the Laplace variable, the exponential matrix e^{Ht} is obtained
thanks to the inverse Laplace transform, which is denoted L⁻¹:

e^{Ht} = L⁻¹((sI − H)⁻¹)
       = L⁻¹([s − a   b²/ρ; 0   s + a]⁻¹)
       = L⁻¹((1/((s − a)(s + a))) [s + a   −b²/ρ; 0   s − a])
       = L⁻¹([1/(s − a)   −b²/(ρ(s² − a²)); 0   1/(s + a)])
       = [e^{at}   −b²(e^{at} − e^{−at})/(2ρa); 0   e^{−at}]    (2.50)
Following (2.18), the partition of e^{H(t−tf)} reads:

e^{H(t−tf)} = [e^{a(t−tf)}   −b²(e^{a(t−tf)} − e^{−a(t−tf)})/(2ρa); 0   e^{−a(t−tf)}]
           := [Y1(t)  X1(t); Y2(t)  X2(t)]    (2.51)

Because the final state xf = 0 is weighted by matrix S ≥ 0, we finally get the
solution of the Riccati differential equation thanks to PS(t) in (2.30):

PS(t) := (Y2(t) + X2(t) S)(Y1(t) + X1(t) S)⁻¹
       = S e^{−a(t−tf)} (e^{a(t−tf)} − S b²(e^{a(t−tf)} − e^{−a(t−tf)})/(2ρa))⁻¹
       = S / (e^{2a(t−tf)} + S b²(1 − e^{2a(t−tf)})/(2ρa))    (2.52)

Finally the optimal control reads:

u(t) = −K(t)x(t) = −R⁻¹BᵀPS(t)x(t) = −(b/ρ) PS(t)x(t)
     = −b S x(t) / (ρ e^{2a(t−tf)} + S b²(1 − e^{2a(t−tf)})/(2a))    (2.53)

If we want to ensure that the optimal control drives x(tf) exactly to xf = 0,
we let S → ∞ to weight heavily x(tf) in the performance index J(u(t)). Then:

PS(t) → P(t) = X2(t) X1⁻¹(t) = 2ρa / (b²(1 − e^{2a(t−tf)}))  as S → ∞    (2.54)

and:

u(t) = −R⁻¹BᵀP(t)x(t) = −2a x(t) / (b(1 − e^{2a(t−tf)}))    (2.55)
2.6.2 Example 2
Given the following plant, which actually represents a double integrator:

[ẋ1(t); ẋ2(t)] = [0 1; 0 0] [x1(t); x2(t)] + [0; 1] u(t)    (2.56)

Find the control u(t) which minimizes the following performance index where
xf = 0 and S = Sᵀ ≥ 0:

J(u(t)) = ½ xᵀ(tf) S x(tf) + ½ ∫₀^{tf} u²(t) dt    (2.57)

Weighting matrix S reads as follows:

S = Sᵀ = [sp 0; 0 sv] ≥ 0    (2.58)

The Hamiltonian matrix H defined in (2.13) reads:

H = [A  −BR⁻¹Bᵀ; −Q  −Aᵀ] = [0 1 0 0; 0 0 0 −1; 0 0 0 0; 0 0 −1 0]    (2.59)

In order to compute e^{Ht} we use the following relation where L⁻¹ stands for
the inverse Laplace transform:

e^{Ht} = L⁻¹((sI − H)⁻¹)    (2.60)

We get:

sI − H = [s −1 0 0; 0 s 0 1; 0 0 s 0; 0 0 1 s]
⇒ (sI − H)⁻¹ = [1/s  1/s²  1/s⁴  −1/s³; 0  1/s  1/s³  −1/s²; 0 0 1/s 0; 0 0 −1/s² 1/s]
⇒ e^{Ht} = L⁻¹((sI − H)⁻¹) = [1  t  t³/6  −t²/2; 0  1  t²/2  −t; 0 0 1 0; 0 0 −t 1]    (2.61)

Following (2.18), the partition of e^{H(t−tf)} reads:

e^{H(t−tf)} = [1  (t−tf)  (t−tf)³/6  −(t−tf)²/2;
               0  1  (t−tf)²/2  −(t−tf);
               0  0  1  0;
               0  0  −(t−tf)  1] := [Y1(t)  X1(t); Y2(t)  X2(t)]    (2.62)

Because the final state xf = 0 is weighted by matrix S ≥ 0, we finally get the
solution of the Riccati differential equation thanks to PS(t) in (2.30):

PS(t) := (Y2(t) + X2(t) S)(Y1(t) + X1(t) S)⁻¹
       = [sp  0; −sp(t−tf)  sv] [1 + sp(t−tf)³/6   (t−tf) − sv(t−tf)²/2;  sp(t−tf)²/2   1 − sv(t−tf)]⁻¹
       = (1/∆) [sp  0; −sp(t−tf)  sv] [1 − sv(t−tf)   (tf−t) + sv(t−tf)²/2;  −sp(t−tf)²/2   1 + sp(t−tf)³/6]    (2.63)

where:

∆ = (1 + sp(t−tf)³/6)(1 − sv(t−tf)) − sp ((t−tf)²/2)((t−tf) − sv(t−tf)²/2)    (2.64)
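The polynomial partition (2.62) can be confirmed against a numerical matrix
exponential, here with assumed values tf = 1.5 and t = 0.4:

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[0.0, 1.0, 0.0,  0.0],
              [0.0, 0.0, 0.0, -1.0],
              [0.0, 0.0, 0.0,  0.0],
              [0.0, 0.0, -1.0, 0.0]])
tf, t = 1.5, 0.4
d = t - tf

M_num = expm(H * d)
M_ana = np.array([[1.0,   d, d**3 / 6.0, -d**2 / 2.0],
                  [0.0, 1.0, d**2 / 2.0,          -d],
                  [0.0, 0.0,        1.0,         0.0],
                  [0.0, 0.0,         -d,         1.0]])
print(np.allclose(M_num, M_ana))   # True: matches (2.62)
```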

2.7 Second order necessary condition for optimality

It is worth noticing that the second order necessary condition for optimality to
achieve a minimum cost is the positive semi-definiteness of the Hessian matrix
of the Hamiltonian along an optimal trajectory (see (1.214)). This condition is
always satisfied as soon as R > 0. Indeed we get from (2.6):

∂²H/∂u² = Huu = R > 0    (2.65)

2.8 Minimum cost achieved


The minimum cost achieved is given by:

J* = J(u*(t)) = ½ xᵀ(0)P(0)x(0)    (2.66)

Indeed, from the Riccati equation (2.44), we deduce that:

xᵀ (Ṗ + PA + AᵀP − PBR⁻¹BᵀP + Q) x = 0
⇔ xᵀṖx + xᵀPAx + xᵀAᵀPx − xᵀPBR⁻¹BᵀPx + xᵀQx = 0
⇔ xᵀṖx + xᵀPAx + (xᵀPAx)ᵀ − xᵀPBR⁻¹BᵀPx + xᵀQx = 0    (2.67)

Taking into account the fact that P = Pᵀ > 0, R = Rᵀ > 0 as well as (2.1),
(2.37) with xf = 0 and (2.38), it can be shown that:

xᵀPBR⁻¹BᵀPx = −xᵀPBu* = −xᵀPBR⁻¹Ru* = u*ᵀRu*
xᵀPAx = xᵀP(Ax + Bu* − Bu*) = xᵀPẋ − xᵀPBu* = xᵀPẋ + u*ᵀRu*
⇒ xᵀṖx + xᵀPAx + (xᵀPAx)ᵀ − xᵀPBR⁻¹BᵀPx
  = xᵀṖx + xᵀPẋ + ẋᵀPx + u*ᵀRu*
  = d/dt (xᵀPx) + u*ᵀRu*    (2.68)

As a consequence equation (2.67) can be written as follows:

d/dt (xᵀ(t)P(t)x(t)) + xᵀ(t)Qx(t) + u*ᵀ(t)Ru*(t) = 0    (2.69)

And the performance index (2.2) to be minimized can be re-written as:

J(u*(t)) = ½ xᵀ(tf)Sx(tf) + ½ ∫₀^{tf} (xᵀ(t)Qx(t) + u*ᵀ(t)Ru*(t)) dt
⇔ J(u*(t)) = ½ xᵀ(tf)Sx(tf) − ½ ∫₀^{tf} d/dt (xᵀ(t)P(t)x(t)) dt
⇔ J(u*(t)) = ½ (xᵀ(tf)Sx(tf) − xᵀ(tf)P(tf)x(tf) + xᵀ(0)P(0)x(0))    (2.70)

Then taking into account the boundary condition P(tf) = S we finally get
(2.66).

2.9 Application to minimum energy control problem

Minimum energy control problems appear when Q := 0.

2.9.1 Moving a linear system close to a final state with minimum energy
Let's consider the following dynamical system:

ẋ(t) = Ax(t) + Bu(t) (2.71)

We are looking for the control u(t) which moves the system from the initial
state x(0) = x0 to a final state which should be close to a given value x(tf) = xf
at final time t = tf. We will assume that the performance index to be minimized
is the following quadratic performance index where R is a symmetric positive
definite matrix:

J(u(t)) = ½ (x(tf) − xf)ᵀ S (x(tf) − xf) + ½ ∫₀^{tf} uᵀ(t)Ru(t) dt    (2.72)

For this optimal control problem, the Hamiltonian (2.5) is:

H(x, u, λ) = ½ uᵀ(t)Ru(t) + λᵀ(t)(Ax(t) + Bu(t))    (2.73)

The necessary condition for optimality (2.6) yields:

∂H/∂u = Ru(t) + Bᵀλ(t) = 0    (2.74)

We get:

u(t) = −R⁻¹Bᵀλ(t)    (2.75)

Eliminating u(t) in equation (2.71) reads:

ẋ(t) = Ax(t) − BR⁻¹Bᵀλ(t)    (2.76)

The dynamics of the Lagrange multipliers λ(t) is given by (2.9):

λ̇(t) = −∂H/∂x = −Aᵀλ(t)    (2.77)

We get from the preceding equation:

λ(t) = e^{−Aᵀt} λ(0)    (2.78)

The value of λ(0) will influence the final value of the state vector x(t). Indeed
let's integrate the linear differential equation:

ẋ(t) = Ax(t) − BR⁻¹Bᵀλ(t) = Ax(t) − BR⁻¹Bᵀ e^{−Aᵀt} λ(0)    (2.79)

This leads to the following expression of the state vector x(t):

x(t) = e^{At} x0 − e^{At} ∫₀ᵗ e^{−Aτ} BR⁻¹Bᵀ e^{−Aᵀτ} dτ λ(0)    (2.80)

Or:

x(t) = e^{At} x0 − e^{At} Wc(t) λ(0)    (2.81)

where matrix Wc(t) is defined as follows:

Wc(t) = ∫₀ᵗ e^{−Aτ} BR⁻¹Bᵀ e^{−Aᵀτ} dτ    (2.82)

Now using (2.10) we set λ(tf) as follows:

λ(tf) = S (x(tf) − xf)    (2.83)

Using (2.78) and (2.81) we get:

λ(tf) = e^{−Aᵀtf} λ(0)
x(tf) = e^{Atf} x0 − e^{Atf} Wc(tf) λ(0)    (2.84)

And the transversality condition (2.83) is rewritten as follows:

λ(tf) = S (x(tf) − xf)
⇔ e^{−Aᵀtf} λ(0) = S (e^{Atf} x0 − e^{Atf} Wc(tf) λ(0) − xf)    (2.85)

Solving the preceding linear equation in λ(0) gives the following expression:

(e^{−Aᵀtf} + S e^{Atf} Wc(tf)) λ(0) = S (e^{Atf} x0 − xf)
⇔ λ(0) = (e^{−Aᵀtf} + S e^{Atf} Wc(tf))⁻¹ S (e^{Atf} x0 − xf)    (2.86)

Using the expression of λ(0) in (2.78) leads to the expression of the Lagrange
multiplier λ(t):

λ(t) = e^{−Aᵀt} (e^{−Aᵀtf} + S e^{Atf} Wc(tf))⁻¹ S (e^{Atf} x0 − xf)    (2.87)

Finally control u(t) is obtained thanks to equation (2.75):

u(t) = −R⁻¹Bᵀλ(t)    (2.88)

It is clear from the expression of λ(t) that the control u(t) explicitly depends
on the initial state x0.

2.9.2 Moving a linear system exactly to a final state with minimum energy

We are now looking for the control u(t) which moves the system from the initial
state x(0) = x0 to a given final state x(tf) = xf at final time t = tf. We will
assume that the performance index to be minimized is the following quadratic
performance index where R is a symmetric positive definite matrix:

J = ½ ∫₀^{tf} uᵀ(t)Ru(t) dt    (2.89)

To solve this problem the same reasoning applies as in the previous
example. As far as control u(t) is concerned this leads to equation (2.75). The
change is that now the final value of the state vector x(t) is imposed to be
x(tf) = xf. So there is no final value condition for the Lagrange multipliers:
λ(tf), or equivalently λ(0), has to be set such that x(tf) = xf. We have seen
in (2.81) that the state vector x(t) has the following expression:

x(t) = e^{At} x0 − e^{At} Wc(t) λ(0)    (2.90)

where matrix Wc(t) is defined as follows:

Wc(t) = ∫₀ᵗ e^{−Aτ} BR⁻¹Bᵀ e^{−Aᵀτ} dτ    (2.91)

Then we set λ(0) as follows, where c0 is a constant vector:

λ(0) = Wc⁻¹(tf) c0    (2.92)

We get:

x(t) = e^{At} x0 − e^{At} Wc(t) Wc⁻¹(tf) c0    (2.93)

Constant vector c0 is used to satisfy the final value of the state vector x(t).
Setting x(tf) = xf leads to the value of constant vector c0:

x(tf) = xf ⇒ c0 = x0 − e^{−Atf} xf    (2.94)

Thus:

λ(0) = Wc⁻¹(tf) (x0 − e^{−Atf} xf)    (2.95)

Using (2.95) in (2.78) leads to the expression of the Lagrange multiplier λ(t):

λ(t) = e^{−Aᵀt} λ(0) = e^{−Aᵀt} Wc⁻¹(tf) (x0 − e^{−Atf} xf)    (2.96)

Finally the control u(t) which moves with minimum energy the system
from the initial state x(0) = x0 to a given final state x(tf) = xf at final time
t = tf has the following expression:

u(t) = −R⁻¹Bᵀλ(t)
     = −R⁻¹Bᵀ e^{−Aᵀt} λ(0)    (2.97)
     = R⁻¹Bᵀ e^{−Aᵀt} Wc⁻¹(tf) (e^{−Atf} xf − x0)

It is clear from the preceding expression that the control u(t) explicitly
depends on the initial state x0. When comparing the initial value λ(0) of the
Lagrange multiplier obtained in (2.95) in the case where the final state is
imposed to be x(tf) = xf with the expression of the initial value of the
Lagrange multiplier obtained in (2.86) in the case where the final state x(tf) is
close to a given final state xf, we can see that the expression in (2.95)
corresponds to the limit of the initial value (2.86) when matrix S moves
towards infinity (note that (e^{Atf})⁻¹ = e^{−Atf}):

limS→∞ (e^{−Aᵀtf} + S e^{Atf} Wc(tf))⁻¹ S (e^{Atf} x0 − xf)
= limS→∞ (S e^{Atf} Wc(tf))⁻¹ S (e^{Atf} x0 − xf)
= limS→∞ Wc⁻¹(tf) e^{−Atf} S⁻¹ S (e^{Atf} x0 − xf)
= Wc⁻¹(tf) e^{−Atf} (e^{Atf} x0 − xf)
= Wc⁻¹(tf) (x0 − e^{−Atf} xf)    (2.98)
2.9.3 Example
Given the following scalar plant:

ẋ(t) = a x(t) + b u(t)
x(0) = x0    (2.99)

Find the optimal control for the following cost functionals and final state
constraints; that is, we wish to compute a finite horizon minimum energy
control with either a fixed or a weighted final state xf.

− When the final state x(tf) is set to a fixed value xf and the cost functional
is set to:

J = ½ ∫₀^{tf} ρ u²(t) dt    (2.100)

− When the final state x(tf) shall be close to a fixed value xf so that the
cost functional is modified as follows, where S is a positive scalar (S > 0):

J = ½ (x(tf) − xf)ᵀ S (x(tf) − xf) + ½ ∫₀^{tf} ρ u²(t) dt    (2.101)

In both cases the two-point boundary value problem which shall be solved
depends on the solution of the following differential equation where the
Hamiltonian matrix H appears:

[ẋ(t); λ̇(t)] = [A  −BR⁻¹Bᵀ; −Q  −Aᵀ] [x(t); λ(t)] = [a  −b²/ρ; 0  −a] [x(t); λ(t)] = H [x(t); λ(t)]    (2.102)

The solution of this differential equation reads:

[x(t); λ(t)] = e^{Ht} [x(0); λ(0)]    (2.103)

Denoting by s the Laplace variable, the exponential of matrix Ht is obtained
thanks to the inverse Laplace transform denoted L⁻¹:

e^{Ht} = L⁻¹((sI − H)⁻¹)
       = L⁻¹([s − a   b²/ρ; 0   s + a]⁻¹)
       = L⁻¹((1/((s − a)(s + a))) [s + a   −b²/ρ; 0   s − a])
       = L⁻¹([1/(s − a)   −b²/(ρ(s² − a²)); 0   1/(s + a)])
⇔ e^{Ht} = [e^{at}   −b²(e^{at} − e^{−at})/(2ρa); 0   e^{−at}]    (2.104)

That is:

[x(t); λ(t)] = e^{Ht} [x(0); λ(0)] = [e^{at}   −b²(e^{at} − e^{−at})/(2ρa); 0   e^{−at}] [x(0); λ(0)]    (2.105)

− If the final state x(tf) is set to the value xf then the value λ(0) is obtained
by solving the first equation of (2.105):

x(tf) = xf = e^{atf} x(0) − b²(e^{atf} − e^{−atf})/(2ρa) λ(0)
⇒ λ(0) = (2ρa / (b²(e^{atf} − e^{−atf}))) (e^{atf} x(0) − xf)    (2.106)

And:

x(t) = e^{at} x(0) + ((e^{at} − e^{−at})/(e^{atf} − e^{−atf})) (xf − e^{atf} x(0))
λ(t) = e^{−at} λ(0) = (−2ρa e^{−at}/(b²(e^{atf} − e^{−atf}))) (xf − e^{atf} x(0))    (2.107)

The optimal control u(t) is given by:

u(t) = −R⁻¹Bᵀλ(t) = −(b/ρ) λ(t) = (2a e^{−at}/(b(e^{atf} − e^{−atf}))) (xf − e^{atf} x(0))    (2.108)

Interestingly enough, the open-loop control is independent of the control
weighting ρ.

− If the final state x(tf) is expected to be close to the final value xf then
we have to mix the two equations of (2.105) and the constraint λ(tf) =
S (x(tf) − xf) to compute the value of λ(0):

λ(tf) = S (x(tf) − xf)
⇒ e^{−atf} λ(0) = S (e^{atf} x(0) − b²(e^{atf} − e^{−atf})/(2ρa) λ(0) − xf)
⇔ λ(0) = S (e^{atf} x(0) − xf) / (e^{−atf} + S b²(e^{atf} − e^{−atf})/(2ρa))    (2.109)

Obviously, when S → ∞ we obtain for λ(0) the same expression as (2.106).

2.10 Finite horizon LQ regulator with cross-term in the performance index

Consider the following time invariant state differential equation:

ẋ(t) = Ax(t) + Bu(t)
x(0) = x0    (2.110)

Where:

− A is the state (or system) matrix

− B is the input matrix

− x(t) is the state vector of dimension n

− u(t) is the control vector of dimension m



We will assume that the pair (A, B) is controllable. The purpose of this
section is to explicit the control u(t) which minimizes the following quadratic
performance index with cross-terms:

J(u(t)) = ½ ∫₀^{tf} (xᵀ(t)Qx(t) + uᵀ(t)Ru(t) + 2xᵀ(t)Su(t)) dt    (2.111)

With the constraint on the terminal state:

x(tf) = 0    (2.112)

Matrices S and Q are symmetric positive semi-definite and matrix R is
symmetric positive definite:

S = Sᵀ ≥ 0
Q = Qᵀ ≥ 0    (2.113)
R = Rᵀ > 0

It can be seen that:

xᵀ(t)Qx(t) + uᵀ(t)Ru(t) + 2xᵀ(t)Su(t) = xᵀ(t)Qm x(t) + vᵀ(t)Rv(t)    (2.114)

Where:

Qm = Q − SR⁻¹Sᵀ
v(t) = u(t) + R⁻¹Sᵀx(t)    (2.115)

Hence the cost (2.111) to be minimized can be rewritten as:

J(u(t)) = ½ ∫₀^{tf} (xᵀ(t)Qm x(t) + vᵀ(t)Rv(t)) dt    (2.116)

Furthermore (2.110) is rewritten as follows, where v(t) appears as the control
vector rather than u(t). Using u(t) = v(t) − R⁻¹Sᵀx(t) in (2.110) leads to the
following state equation:

ẋ(t) = Ax(t) + B (v(t) − R⁻¹Sᵀx(t))
     = (A − BR⁻¹Sᵀ) x(t) + Bv(t)    (2.117)
     = Am x(t) + Bv(t)

We will assume that symmetric matrix Qm is positive semi-definite:

Qm = Q − SR⁻¹Sᵀ ≥ 0    (2.118)

The Hamiltonian matrix H reads:

H = [Am  −BR⁻¹Bᵀ; −Qm  −Amᵀ] = [A − BR⁻¹Sᵀ  −BR⁻¹Bᵀ; −Q + SR⁻¹Sᵀ  −Aᵀ + SR⁻¹Bᵀ]    (2.119)

The problem can be solved through the following Hamiltonian system whose
state is obtained by extending the state x(t) of system (2.110) with the costate λ(t):

[ẋ(t); λ̇(t)] = [A − BR⁻¹Sᵀ  −BR⁻¹Bᵀ; −Q + SR⁻¹Sᵀ  −Aᵀ + SR⁻¹Bᵀ] [x(t); λ(t)] := H [x(t); λ(t)]    (2.120)

Ntogramatzidis2 has shown the following results: let P1 and P2 be the
positive semi-definite solutions of the following continuous time algebraic Riccati
equations:

0 = AᵀP1 + P1A − (S + P1B) R⁻¹ (S + P1B)ᵀ + Q
0 = −AᵀP2 − P2A − (S − P2B) R⁻¹ (S − P2B)ᵀ + Q    (2.121)

Notice that the pair (A, B) has been replaced by (−A, −B) in the second
equation. We will denote by K1 and K2 the following infinite horizon gain
matrices:

K1 = R⁻¹ (Sᵀ + BᵀP1)
K2 = R⁻¹ (Sᵀ − BᵀP2)    (2.122)

Then the optimal control reads:

u(t) = −K(t)x(t) ∀ 0 ≤ t < tf
u(t) = 0 for t = tf    (2.123)

Where:

K(t) = R⁻¹ (Sᵀ + BᵀP(t))
P(t) = X2(t) X1⁻¹(t)    (2.124)

And:

X1(t) = e^{(A−BK1)t} − e^{(A−BK2)(t−tf)} e^{(A−BK1)tf}
X2(t) = P1 e^{(A−BK1)t} + P2 e^{(A−BK2)(t−tf)} e^{(A−BK1)tf}    (2.125)

Matrix P(t) satisfies the following Riccati differential equation:

−Ṗ(t) = AᵀP(t) + P(t)A − (S + P(t)B) R⁻¹ (S + P(t)B)ᵀ + Q    (2.126)

Furthermore the optimal state x(t) and costate λ(t) have the following
expressions:

x(t) = X1(t) X1⁻¹(0) x0
λ(t) = X2(t) X1⁻¹(0) x0    (2.127)

2 Lorenzo Ntogramatzidis, A simple solution to the finite-horizon LQ problem with zero
terminal state, Kybernetika, 39(4):483-492, January 2003
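The construction (2.121)-(2.125) can be sketched with scipy, whose
solve_continuous_are accepts the cross-weight as its s argument; the plant
data below are assumed for illustration, and the second call solves the
(−A, −B) equation of (2.121).

```python
import numpy as np
from scipy.linalg import solve_continuous_are, expm

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
S = np.array([[0.1], [0.1]])
tf = 2.0

P1 = solve_continuous_are(A, B, Q, R, s=S)    # first ARE of (2.121)
P2 = solve_continuous_are(-A, -B, Q, R, s=S)  # second ARE of (2.121)
Rinv = np.linalg.inv(R)
K1 = Rinv @ (S.T + B.T @ P1)                  # gains (2.122)
K2 = Rinv @ (S.T - B.T @ P2)

def K_of_t(t):
    E1 = expm((A - B @ K1) * t)
    E2 = expm((A - B @ K2) * (t - tf)) @ expm((A - B @ K1) * tf)
    X1, X2 = E1 - E2, P1 @ E1 + P2 @ E2       # (2.125)
    P = X2 @ np.linalg.inv(X1)                # (2.124); X1 is singular at t = tf
    return Rinv @ (S.T + B.T @ P)

print(K_of_t(0.5))
```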

2.11 Extension to nonlinear systems affine in control

We consider the following finite horizon optimal control problem consisting in
finding the control u that minimizes the following performance index where q(x)
is positive semi-definite and R = Rᵀ > 0:

J(u(t)) = G(x(tf)) + ½ ∫₀^{tf} (q(x) + uᵀ(t)Ru(t)) dt    (2.128)

under the constraint that the system is nonlinear but affine in control:

ẋ(t) = f(x) + g(x) u(t)
x(0) = x0    (2.129)

Assuming no constraint, the control u*(t) that minimizes the performance index
J(u(t)) is defined by:

u*(t) = −R⁻¹gᵀ(x)λ(t)    (2.130)

where:

λ̇(t) = −(½ ∂q(x)/∂x + (∂(f(x) + g(x)u*)/∂x)ᵀ λ(t))
      = −(½ ∂q(x)/∂x + (∂(f(x) − g(x)R⁻¹gᵀ(x)λ(t))/∂x)ᵀ λ(t))    (2.131)

For boundary value problems, efficient minimization of the Hamiltonian is
possible3.

3 Todorov E. and Tassa Y., Iterative Local Dynamic Programming, IEEE ADPRL, 2009
Chapter 3

Infinite Horizon Linear Quadratic Regulator (LQR)

3.1 Problem to be solved

We recall that we consider the following linear time invariant system, where x(t)
is the state vector of dimension n, u(t) is the control vector of dimension m and
z(t) is the controlled output (that is not the actual output of the system but
the output of interest for the design):

ẋ(t) = Ax(t) + Bu(t)
z(t) = N x(t)    (3.1)
x(0) = x0

We recall hereafter the performance index which was under consideration
in the previous chapter dealing with the finite horizon Linear Quadratic
Regulator (LQR) when the final state xf is set:

J(u(t)) = ½ ∫₀^{tf} (xᵀ(t)Qx(t) + uᵀ(t)Ru(t)) dt    (3.2)

where Q = NᵀN ≥ 0 (thus Q is symmetric and positive semi-definite) and
R = Rᵀ > 0 is a symmetric and positive definite matrix.
In this chapter we will focus on the case where the final time tf tends toward
infinity (tf → ∞). The performance index to be minimized turns out to be:

J(u(t)) = ½ ∫₀^{∞} (xᵀ(t)Qx(t) + uᵀ(t)Ru(t)) dt    (3.3)

The results presented in this chapter can be envisioned as the results of the
previous chapter as ∥S∥ → ∞ (xf := 0 here) and tf → ∞. When the final
time tf is set to infinity, the Kalman gain K(t) which has been computed in the
previous chapter becomes constant. As a consequence, the control is easier to
implement as far as it is no longer necessary to integrate the differential Riccati
equation and to store the gain K(t) before applying the control. In practice,
infinity means that the final time tf becomes large when compared to the time
constants of the plant.

3.2 Stabilizability and detectability

We will assume in the following that (A, B) is stabilizable and (A, N) is
detectable. We recall that the pair (A, B) is said to be stabilizable if the
uncontrollable eigenvalues of A, if any, have negative real parts. Thus even
though not all system modes are controllable, the ones that are not
controllable do not require stabilization.
Similarly the pair (A, N) is said to be detectable if the unobservable
eigenvalues of A, if any, have negative real parts. Thus even though not all
system modes are observable, the ones that are not observable do not require
stabilization. We may use the Kalman test to check the controllability of the
system:

rank [B  AB  · · ·  Aⁿ⁻¹B] = n where n = size of state vector x    (3.4)

Or equivalently the Popov-Belevitch-Hautus (PBH) test, which shall be
applied to all eigenvalues of A, denoted λi, to check the controllability of the
system, or only to the eigenvalues which are not contained in the open left half
plane to check the stabilizability of the system:

rank [A − λi I  B] = n   ∀ λi for controllability
                         ∀ λi s.t. Re(λi) ≥ 0 for stabilizability    (3.5)

Similarly we may use the Kalman test to check the observability of the system:

rank [N; NA; · · · ; NAⁿ⁻¹] = n where n = size of state vector x    (3.6)

Or equivalently the Popov-Belevitch-Hautus (PBH) test, which shall be
applied to all eigenvalues of A, denoted λi, to check the observability of the
system, or only to the eigenvalues which are not contained in the open left half
plane to check the detectability of the system:

rank [A − λi I; N] = n   ∀ λi for observability
                         ∀ λi s.t. Re(λi) ≥ 0 for detectability    (3.7)
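The rank tests (3.4)-(3.7) translate directly into code. The helper below is a
sketch for the controllability/stabilizability pair; the dual tests (3.6)-(3.7)
follow by applying the same functions to (Aᵀ, Nᵀ).

```python
import numpy as np

def kalman_controllable(A, B):
    # Kalman rank test (3.4)
    n = A.shape[0]
    blocks = [np.linalg.matrix_power(A, k) @ B for k in range(n)]
    return np.linalg.matrix_rank(np.hstack(blocks)) == n

def pbh_stabilizable(A, B):
    # PBH test (3.5), restricted to eigenvalues with Re >= 0
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if lam.real >= 0:
            M = np.hstack([A - lam * np.eye(n), B])
            if np.linalg.matrix_rank(M) < n:
                return False
    return True

# assumed example: the unstable mode is controllable, the stable one is not
A = np.array([[1.0, 0.0], [0.0, -2.0]])
B = np.array([[1.0], [0.0]])
print(kalman_controllable(A, B))   # False: the -2 mode is uncontrollable
print(pbh_stabilizable(A, B))      # True: that mode is already stable
```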

3.3 Algebraic Riccati equation

When final time tf tends toward infinity, the matrix P(t) turns into a constant
symmetric positive semi-definite matrix denoted P. The Riccati equation
(2.44) reduces to an algebraic equation, which is known as the algebraic
Riccati equation (ARE):

AᵀP + PA − PBR⁻¹BᵀP + Q = 0    (3.8)

It is worth noticing that the algebraic Riccati equation (3.8) may have several
solutions. The solution of the optimal control problem only retains the positive
semi-definite solution of the algebraic Riccati equation.
The convergence of limtf→∞ P(t) → P, where P ≥ 0 is some positive
semi-definite symmetric constant matrix, is guaranteed by the stabilizability
assumption (Pᵀ is indeed a solution of the algebraic Riccati equation (3.8)).
Since the matrix P = Pᵀ ≥ 0 is constant, the optimal gain K(t) also turns out
to be a constant denoted K. The optimal gain K and the optimal stabilizing
control u(t) are then defined as follows:

u(t) = −Kx(t)
K = R⁻¹BᵀP    (3.9)

The need for the detectability assumption is to ensure that the optimal
control computed using limtf→∞ P(t) generates a feedback gain
K = R⁻¹BᵀP that stabilizes the plant, i.e. all the eigenvalues of A − BK lie
in the open left half plane. In addition, it can be shown that the minimum
cost achieved is given by:

J* = ½ xᵀ(0)Px(0)    (3.10)

To get this result, first we notice that the Hamiltonian (1.63) reads:

H(x, u, λ) = ½ (xᵀ(t)Qx(t) + uᵀ(t)Ru(t)) + λᵀ(t)(Ax(t) + Bu(t))    (3.11)

The necessary condition for optimality (1.72) yields:

∂H/∂u = Ru(t) + Bᵀλ(t) = 0    (3.12)

Taking into account that R is a symmetric matrix, we get:

u(t) = −R⁻¹Bᵀλ(t)    (3.13)

Eliminating u(t) in equation (3.1) reads:

ẋ(t) = Ax(t) − BR⁻¹Bᵀλ(t)    (3.14)

The dynamics of the Lagrange multipliers λ(t) is given by (1.69):

λ̇(t) = −∂H/∂x = −Qx(t) − Aᵀλ(t)    (3.15)

The key point in the LQR design is that the Lagrange multipliers λ(t) are now
assumed to linearly depend on the state vector x(t) through a constant
symmetric positive semi-definite matrix denoted P:

λ(t) = Px(t) where P = Pᵀ ≥ 0    (3.16)

By taking the time derivative of the Lagrange multipliers λ(t) and using
again equation (3.1) we get:

λ̇(t) = Pẋ(t) = P(Ax(t) + Bu(t)) = PAx(t) + PBu(t)    (3.17)

Then using the expression of control u(t) provided in (3.13) as well as (3.16)
we get:

λ̇(t) = PAx(t) − PBR⁻¹Bᵀλ(t) = PAx(t) − PBR⁻¹BᵀPx(t)    (3.18)

Finally using (3.18) within (3.15) and using λ(t) = Px(t) (see (3.16)) we get:

−PAx(t) + PBR⁻¹BᵀPx(t) = Qx(t) + AᵀPx(t)
⇔ (AᵀP + PA − PBR⁻¹BᵀP + Q) x(t) = 0    (3.19)

As far as this equality stands for every value of the state vector x(t), we
retrieve the algebraic Riccati equation (3.8):

AᵀP + PA − PBR⁻¹BᵀP + Q = 0    (3.20)
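A minimal infinite-horizon LQR computation on an assumed double integrator:
solve the ARE (3.8), form the constant gain (3.9) and check that A − BK is
stable.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)            # Q = N^T N >= 0
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)   # positive semi-definite solution of (3.8)
K = np.linalg.inv(R) @ B.T @ P         # constant optimal gain (3.9)
print(K)
print(np.linalg.eigvals(A - B @ K))    # closed-loop poles, open left half plane
```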

3.4 Extension to nonlinear systems affine in control

We consider the following infinite horizon optimal control problem consisting
in finding the control u that minimizes the following performance index where
q(x) is positive semi-definite:

J(u(t)) = ½ ∫₀^{∞} (q(x) + uᵀ(t)u(t)) dt    (3.21)

under the constraint that the system is nonlinear but affine in control:

ẋ = f(x) + g(x) u(t)
x(0) = x0    (3.22)

We assume that vector field f is such that f(0) = 0. Thus (xe := 0, ue := 0)
is an equilibrium point for the nonlinear system affine in control. Consequently
f(x) = F(x) x for some, possibly not unique, continuous function F : Rⁿ →
Rⁿˣⁿ. The classical optimal control design methodology relies on the solution
of the Hamilton-Jacobi-Bellman (HJB) equation (1.112):

0 = min u(t)∈U (½ (q(x) + uᵀu) + (∂J*(x)/∂x)ᵀ (f(x) + g(x) u))    (3.23)

Assuming no constraint, the minimum of the preceding Hamilton-Jacobi-
Bellman (HJB) equation with respect to u is attained for the optimal control
u*(t) defined by:

u*(t) = −gᵀ(x) (∂J*(x)/∂x)    (3.24)

Then replacing u by u* = −gᵀ(x)(∂J*(x)/∂x), the Hamilton-Jacobi-Bellman
(HJB) equation reads:

0 = ½ (q(x) + (∂J*(x)/∂x)ᵀ g(x)gᵀ(x)(∂J*(x)/∂x))
  + (∂J*(x)/∂x)ᵀ (f(x) − g(x)gᵀ(x)(∂J*(x)/∂x))    (3.25)

We finally get:

½ q(x) + (∂J*(x)/∂x)ᵀ f(x) − ½ (∂J*(x)/∂x)ᵀ g(x)gᵀ(x)(∂J*(x)/∂x) = 0    (3.26)

In the linearized case the solution of the optimal control problem is a linear
static state feedback of the form u = −BᵀP̄ x, where P̄ is the symmetric positive
definite solution of the algebraic Riccati equation:

AᵀP̄ + P̄A − P̄BBᵀP̄ + Q = 0    (3.27)

where:

A = ∂f(x)/∂x |x=0
B = g(0)    (3.28)
Q = ½ ∂²q(x)/∂x² |x=0

Following Sassano and Astolfi1, there exists a matrix R = Rᵀ > 0, a
neighbourhood of the origin Ω ⊆ R²ⁿ and k̄ ≥ 0 such that for all k ≥ k̄ the
function V(x, ξ) is positive definite and satisfies the following partial
differential inequality:

½ q(x) + Vx(x, ξ) f(x) + Vξ(x, ξ) ξ̇ − ½ Vx(x, ξ) g(x)gᵀ(x) Vxᵀ(x, ξ) ≤ 0    (3.29)

where:

V(x, ξ) = P(ξ)x + ½ (x − ξ)ᵀ R (x − ξ)
ξ̇ = −k Vξᵀ(x, ξ) ∀ (x, ξ) ∈ Ω    (3.30)

The C¹ mapping P : Rⁿ → R¹ˣⁿ, P(0) = 0ᵀ, is defined as follows:

½ q(x) + P(x)f(x) − ½ P(x)g(x)gᵀ(x)P(x)ᵀ + σ(x) = 0    (3.31)

where σ(x) = xᵀΣ(x)x with Σ : Rⁿ → Rⁿˣⁿ, Σ(0) = 0.
Furthermore P(x) is tangent at x = 0 to P̄:

∂P(x)ᵀ/∂x |x=0 = P̄    (3.32)

Since P(x) is tangent at x = 0 to the solution P̄ of the algebraic Riccati
equation, the function P(x)x : Rⁿ → R is locally quadratic around the origin
and moreover has a local minimum at x = 0.
Let Ψ(ξ) be the Jacobian matrix of the mapping P(ξ) and Φ : Rⁿ × Rⁿ → Rⁿˣⁿ
a continuous matrix valued function such that:

P(ξ) = ξᵀΨ(ξ)ᵀ
P(x) − P(ξ) = (x − ξ)ᵀΦ(x, ξ)ᵀ    (3.33)

1 Sassano M. and Astolfi A., Dynamic approximate solutions of the HJ inequality and of the
HJB equation for input-affine nonlinear systems. IEEE Transactions on Automatic Control,
57(10):2490-2503, 2012.

Then the approximate regional dynamic optimal control is found to be1 :

u = −g(x)T VxT (x, ξ)


= −g(x)T P (ξ)T + R(x − ξ)

(3.34)
= −g(x)T P (x)T + R(x − ξ) − P (x)T −P (ξ)T


= −g(x)T P (x)T + R − Φ(x, ξ) (x − ξ)




where:
ξ̇ = −k VξT (x, ξ) = −k Ψ(ξ)T x − R x − ξ (3.35)


Such control has been applied to internal combustion engine test benches2 .

3.5 Solving the algebraic Riccati equation


3.5.1 Hamiltonian matrix based solution
It can be shown that if the pair (A, B) is stabilizable and the pair (A, N) is
detectable, with Q = NᵀN positive semi-definite and R positive definite, then
P is the unique positive semi-definite (symmetric) solution of the algebraic
Riccati equation (ARE) (3.8).
Combining (3.14) and (3.15) into a single state equation yields:

[ẋ(t); λ̇(t)] = [A  −BR⁻¹Bᵀ; −Q  −Aᵀ] [x(t); λ(t)] := H [x(t); λ(t)]    (3.36)

We have seen that the following 2n × 2n matrix H is called the Hamiltonian
matrix:

H = [A  −BR⁻¹Bᵀ; −Q  −Aᵀ]    (3.37)

It can be shown that the Hamiltonian matrix H has n eigenvalues in the open
left half plane and n eigenvalues in the open right half plane. The eigenvalues
are symmetric with respect to the imaginary axis: if λ is an eigenvalue of H
then −λ is also an eigenvalue of H. In addition H has no pure imaginary
eigenvalues.
Furthermore, let the 2n × n matrix [X1; X2] have columns that comprise all the
eigenvectors of H corresponding to the n eigenvalues in the open left half plane.
Then X1 is invertible and the positive semi-definite solution of the algebraic
Riccati equation (ARE) is:

P = X2 X1⁻¹    (3.38)

Similarly the negative semi-definite solution of the algebraic Riccati equation
(ARE) is built thanks to the eigenvectors associated with the n eigenvalues in
the open right half plane (i.e. the unstable invariant subspace). Once again the
solution of the optimal control problem only retains the positive semi-definite
solution of the algebraic Riccati equation.
2 Passenbrunner T., Sassano M., del Re L., Optimal Control with Input Constraints applied
to Internal Combustion Engine Test Benches, 9th IFAC Symposium on Nonlinear Control
Systems, September 4-6, 2013, Toulouse, France

In addition it can be shown that the eigenvalues of A − BK where K =


R−1 BT P (that are the eigenvalues of the closed-loop plant) are equal to the n
eigenvalues in the open left half plane of the Hamiltonian matrix H.
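The eigenvector construction of (3.38) is a few lines of numpy; on an assumed
double integrator it reproduces the solution returned by scipy's dedicated ARE
solver.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
n = 2

H = np.block([[A, -B @ np.linalg.inv(R) @ B.T], [-Q, -A.T]])   # (3.37)
w, V = np.linalg.eig(H)
stable = V[:, w.real < 0]            # the n eigenvectors of the stable subspace
X1, X2 = stable[:n, :], stable[n:, :]
P = np.real(X2 @ np.linalg.inv(X1))  # (3.38); imaginary residue is numerical

print(np.allclose(P, solve_continuous_are(A, B, Q, R)))   # True
print(np.linalg.eigvals(A - B @ np.linalg.inv(R) @ B.T @ P))  # stable eigenvalues of H
```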

3.5.2 Proof of the results on the Hamiltonian matrix


We recall that, by definition, a matrix H is said to be a Hamiltonian matrix
as soon as the following property holds:

(JH)ᵀ = JH ⇔ (HJ)ᵀ = HJ where J = [0  I; −I  0]    (3.39)

Matrix J has the following properties:

JᵀJ = JJᵀ = [I  0; 0  I] and JJ = JᵀJᵀ = −[I  0; 0  I]    (3.40)

In addition the following relation holds:

HJ = (HJ)ᵀ ⇒ JᵀHJ = JᵀJᵀHᵀ = −Hᵀ    (3.41)

Let λ be an eigenvalue of the Hamiltonian matrix H associated with eigenvector
x. We get:

Hx = λx
⇒ HJJᵀx = λx
⇒ JᵀHJJᵀx = λJᵀx    (3.42)
⇔ −HᵀJᵀx = λJᵀx
⇔ HᵀJᵀx = −λJᵀx

Thus −λ is an eigenvalue of Hᵀ with the corresponding eigenvector Jᵀx.
Using the fact that det(M) = det(Mᵀ) we get:

det(−λI − Hᵀ) = det((−λI − H)ᵀ) = det(−λI − H)    (3.43)

As a consequence we conclude that −λ is also an eigenvalue of H.

To show that H has no eigenvalue on the imaginary axis suppose:

A −BR−1 BT x1
      
x1 x
H = =λ 1 (3.44)
x2 −Q −AT x2 x2

Where x1 and x2 are not both zero and


λ + λ∗ = 0 (3.45)

where λ∗ stands for the complex conjugate of λ. We seek a contradiction.


Let's denote by x∗ the transpose conjugate of vector x.

− Equation (3.44) gives:

Ax1 − BR−1 BT x2 = λx1 ⇒ x∗2 BR−1 BT x2 = x∗2 Ax1 − λx∗2 x1 (3.46)


70 Chapter 3. Innite Horizon Linear Quadratic Regulator (LQR)

− Taking into account that Q is a real symmetric matrix, equation (3.44)


also gives:

−Qx1 − AT x2 = λx2 ⇒ λxT2 = −xT1 Q − xT2 A ⇒ λ∗ x∗2 = −x∗1 Q − x∗2 A (3.47)


Denoting M = BR−1 BT and taking into account (3.47) into (3.46) yields:
 ∗
x2 BR−1 BT x2 = x∗2 Ax1 − λx∗2 x1
x∗2 A = −x∗1 Q − λ∗ x∗2 (3.48)
⇒ x∗2 Mx2 = −x∗1 Qx1 − λ∗ x∗2 x1 − λx∗2 x1 = −x∗1 Qx1 − (λ∗ + λ) x∗2 x1

Using (3.45) we nally get:

x∗2 Mx2 = −x∗1 Q x1 (3.49)

Since R and Q are positive semi-denite matrices, and consequently also


M = BR−1 BT , this implies: 
Mx2 = 0
(3.50)
Qx1 = 0
Then using (3.46) we get:
  
Ax1 = λ x1 A − λI
⇒ x1 = 0 (3.51)
Qx1 = 0 Q

If x1 ̸= 0 then this contradicts observability of the pair (Q, A) by


 the Popov-
Belevitch-Hautus test. Similarly if x2 ̸= 0 then x2 M A + λ I = 0 which
∗ ∗

contradicts the observability of the pair (A, M).

3.5.3 Solving general algebraic Riccati and Lyapunov equations


The general algebraic Riccati equation reads as follows where all matrices are
square of dimension n × n:

AX + XB + C + XDX = 0 (3.52)
Matrices A, B, C and D are known whereas matrix X has to be determined.
The general algebraic Lyapunov equation is obtained as a special case of the
algebraic Riccati by setting D = 0.
The general algebraic Riccati equation can be solved3 by considering the
following 2n × 2n matrix H:
 
B D
H= (3.53)
−C −A

Let the eigenvalues of matrix H be denoted λ1 , i = 1, · · · , 2n, and the


corresponding eigenvectors be denoted v i . Furthermore let M be the 2n × 2n
matrix composed of all real eigenvectors of matrix H; for complex conjugate
eigenvectors, the corresponding columns of matrix M are changed into the real
3
Optimal Control of Singularly Perturbed Linear Systems with Applications: High
Accuracy Techniques, Z. Gajic and M. Lim, Marcel Dekker, New York, 2001
3.5. Solving the algebraic Riccati equation 71

and imaginary parts of such eigenvectors. Note that there are many ways to
form matrix M.
Then we can write the following relation:
 
Λ1 0
(3.54)
 
HM = MΛ = M1 M2
0 Λ2

Matrix M1 contains the n rst columns of M whereas matrix M2 contains


the n last columns of M.
Matrices Λ1 and Λ2 are diagonal matrices formed by the eigenvalues of H
as soon as there are distinct; for eigenvalues with multiplicity greater than 1,
the corresponding part in matrix Λ represents the Jordan form.
Thus we have: 
HM1 = M1 Λ1
(3.55)
HM2 = M2 Λ2

We will focus our attention on the rst equation and split matrix M1 as
follows:  
M11
M1 = (3.56)
M12

Using the expression of H in (3.53), the relation HM1 = M1 Λ1 reads as


follows: 
BM11 + DM12 = M11 Λ1
HM1 = M1 Λ1 ⇒ (3.57)
−CM11 − AM12 = M12 Λ1

Assuming that matrix M11 is not singular, we can check that a solution X
of the general algebraic Riccati equation (3.52) reads:

X = M12 M−1
11 (3.58)

Indeed:

 BM11 + DM12 = M11 Λ1
CM11 + AM12 = −M12 Λ1
X = M12 M−1

11
⇒ AX + XB + C + XDX = AM12 M−1 −1
11 + M12 M11 B + C
+M12 M−111 DM12 M11
−1
−1
= (AM12 + CM11 ) M11
+M12 M−1 11 (BM11 + DM12 ) M11
−1

= −M12 Λ1 M11 + M12 M11 M11 Λ1 M−1


−1 −1
11
=0
(3.59)
It is worth noticing that each selection of eigenvectors within matrix M1
leads to a new solution of the general algebraic Riccati equation (3.52).
Consequently the solution to the general algebraic Riccati equation (3.52) is
not unique. The same statement holds for dierent choice of matrix M2 and
the corresponding solution of (3.52) is obtained from X = M21 M−1 22 .
72 Chapter 3. Innite Horizon Linear Quadratic Regulator (LQR)

3.6 Application to the optimal control of any scalar


LTI plant
We consider the following scalar linear time invariant plant where
x(t) ∈ R, u(t) ∈ R:
ẋ(t) = a x(t) + b u(t) where b ̸= 0

(3.60)
z(t) = c1 x(t) where c1 ̸= 0
We wish to minimize the following performance index:
1 ∞ 2
Z
J(u(t)) = z (t) + ρ u2 (t) dt where ρ > 0 (3.61)
2 0
It is easy to check that pair (a, b) is controllable (meaning that b ̸= 0) and
that pair (a, c1 ) is observable (meaning that c1 ̸= 0).
In order Rto match the considered performance index with the general

expression 12 0 xT (t)Qx(t) + uT (t)Ru(t) dt of the performance index, we


dene weights Q and R as follows:


z 2 (t) T
 = z (t)z(t) = (c1 x(t))T (c1 x(t)) = xT (t)cT1 c1 x(t)
Q := cT1 c1 = c21 (3.62)

R := ρ
The Hamiltonian matrix H reads:
 " 2
#
A −BR−1 BT x(t) a − bρ
 
H= = (3.63)
−Q −AT λ(t) −c21 −a
The eigenvalues of H are obtained by solving:
" #!
b2
s−a
det (sI − H) = 0 ⇒ det ρ =0
c21 s+a
(c1 b)2 (3.64)
⇔ (s − a)(s + a) − ρ =0
(c1 b)2
⇔ s2 − a2 − ρ =0
Thus the two eigenvalues of H read:
 q
 λ1 = + a2 + (c1 b)2
ρ
q (3.65)
 λ = − a2 + (c1 b)2
2 ρ

We check that the eigenvalues of H are symmetric with respect to the


imaginary axis.
The eigenvectors v 1 and v 2 corresponding to eigenvalues λ1 and λ2 ,
respectively, are obtained as follows:
" 2
# 
a − bρ
 
v11 v
Hv 1 = λ1 v 1 ⇒ 2
= λ1 11
−c1 −a v 12 v12
( 2
a v11 − bρ v12 = λ1 v11
⇔ (3.66)
−c21 v11 − a v12 = λ1 v12
( 2
v11 (a − λ1 ) = bρ v12

−c21 v11 = v12 (a + λ1 )
3.6. Application to the optimal control of any scalar LTI plant 73

From the rst equation we can choose for example the following components
for eigenvector v 1 :
  1  s
(c1 b)2

v11
v1 = = a−λρ
1 where λ1 = + a2 + (3.67)
v12 b2 ρ

We can check that this choice for v11 and v12 is compatible with the second
equation. Indeed:
−c21 2
−c21 v11 = v12 (a + λ1 ) ⇒ a−λ1 = ρ
b2
(a + λ1 ) ⇒ a2 − λ21 = − (c1ρb) (3.68)

Changing λ1 by λ2 leads to a possible choice of the components of eigenvector


v2: s
1
(c1 b)2
   
v21
v2 = = a−λ2
ρ where λ2 = − a2 + (3.69)
v22 b2 ρ
As far as λ2 is the eigenvalue of H in the left half plane, we conclude that
λ2 will be the closed-loop eigenvalue once the optimal control has been applied
(notice that we don't know so far the expression of the optimal control !).
As far as v 2 is the eigenvector of H corresponding to the eigenvalue in the
left half plane, we split it as follows:
   1  s
X1 (c1 b)2
v2 = = a−λ ρ
2 where λ2 = − a2 + (3.70)
X2 b2 ρ

Then the solution of the algebraic Riccati equation which leads to the
computation of the optimal control reads:
s
−1 ρ (c1 b)2
P = X2 X1 = 2 (a − λ2 ) where λ2 = − a2 + (3.71)
b ρ

Thus: s !
ρ (c1 b)2
P= 2 a+ a2 + (3.72)
b ρ

We will check those results by using the algebraic Ricatti equation, which
reads:
AT P + PA − PBR−1 BT P + Q = 0
2
⇔ 2aP − bρ P2 + c21 = 0 (3.73)
b2 2 2
⇔ ρ P − 2aP − c1 = 0
The roots of this quadratic equation are:
 r
(c b)2
2a+ 4a2 +4 1ρ
  q 
 ρ (c1 b)2
 P1 =


b2
= b2
a + a2 + ρ >0
2
r ρ
(3.74)
(c1 b)2
4a2 +4
 
 2a− q
ρ ρ (c1 b)2

 P2 = = a − a2 + <0

 2 b2 ρ
2 bρ
74 Chapter 3. Innite Horizon Linear Quadratic Regulator (LQR)

It is clear that P1 is the positive denite solution of the algebraic Riccati


equation (ARE). Thus we retrieve the result (3.72):
s !
ρ (c b) 2
1
P := P1 = 2 a + a2 + (3.75)
b ρ
Furthermore we are now in position to compute the feedback gain K:
s ! s !
b ρ (c b)2 1 (c b)2
−1 T 1 1
K=R B P= a + a2 + = a + a2 + (3.76)
ρ b2 ρ b ρ
Finally, the eigenvalue of the feedback loop reads:
 q 
1 (c1 b)2
spec (A − BK) = A − BK = a − b b a + a2 + ρ
q (3.77)
2
= − a2 + (c1ρb)

We obviously retrieve the result (3.69) obtained through the Hamiltonian


matrix H.

3.7 Hamiltonian matrix properties


Let H be the following Hamiltonian matrix:
 
A −G
H= where G = GT , Q = QT (3.78)
−Q −AT
By denition, a matrix H is said to be an Hamiltonian matrix as soon as
the following property holds:
(JH)T = JH (3.79)
where J is the following skew-symmetric matrix:
 
0 I
T
J = −J = (3.80)
−I 0
Any matrix S ∈ R2n×2n satisfying the following relation is called a symplectic
matrix:
ST JS = SJST = J (3.81)
If H has no eigenvalues on the imaginary axis, then the invariant subspace
X belonging to the n (counting multiplicities) eigenvalues in the open left half
plane is called the stable invariant subspace of H. If the columns of X form
an orthonormal basis for X , then X JX is orthogonal and the following
Hamiltonian block-Schur decomposition is obtained4 :
 
T  T −G
(3.82)
 
X JX H X JX =
0 −TT
4
Peter Benner, Daniel Kressner, Volker Mehrmann, Skew-Hamiltonian and Hamiltonian
Eigenvalue Problems: Theory, Algorithms and Applications, January 2005, Proceedings of
the Conference on Applied Mathematics and Scientic Computing (pp.3-39), DOI: 10.1007/1-
4020-3197-1_1
3.7. Hamiltonian matrix properties 75

where T ∈ Rn×n is an upper triangular matrix (we said that T has a real
Schur form):

t11 t12 · · · t1n

 0 t21 . . .
 
(3.83)

T= .
 
 .. . . . . . .


0 0 · · · tnn
Moreover, given Hamiltonian matrix H dened in (3.78), there is always a
corresponding algebraic Riccati equation (ARE)4 :
 
A −G
H= where G = GT , Q = QT
−Q −AT (3.84)
Corresponding ARE : A P + PA − PGP + Q = 0
T

Assume that P = PT is a symmetric solution of the algebraic Riccati


equation (ARE). Then it is easy to see that the following relation holds:
    
In 0 In 0 A − GP −G
H = (3.85)
P In P In 0 − (A − GP)T
     
In In In
Hence H = (A − GP). Thus the columns of span the
P P P
H-invariant subspace corresponding to λ (H) ∩ λ (A − GP). This implies that
AREs can be solved by computing H-invariant subspaces.
Finally, let G = BR−1 BT and Q = NT N. Thus Hamiltonian matrix (3.78)
reads:
G = BR−1 BT −BR−1 BT
    
A −G A
⇒H= =
Q = NT N −Q −AT −NT N −AT
(3.86)
T 0.5 T
where R = R > 0 ⇒ R = R
T 0.5 R and R −0.5 = R −0.5 .
Then GP = BR−1 BT P := BK where K = R−1 BT P and relation(3.85)
reads as follows:

−BR−1 BT
  
A In 0
−NT N −AT P In
−BR−1 BT
  
In 0 A − BK
= (3.87)
P In 0 − (A − BK)T
 
In 0
From the preceding relation, and using the fact that det = 1,
P In
we get:

det (sI − H) = (−1)n β(s) β(−s) where β(s) := det (sI − A + BK) (3.88)

Furthermore let (A, B, N) be the realization of a strictly proper transfer


matrix F(s):

A BR−0.5
 
F(s) = := N (sI − A)−1 BR−0.5 (3.89)
N 0
76 Chapter 3. Innite Horizon Linear Quadratic Regulator (LQR)

Figure 3.1: Closed-loop Hamiltonian transfer function

Then is can be shown that the following relation holds:


 
−1  −1 0
T
(3.90)

I + F(s)F (−s) = −N 0 (sI − H) +I
NT

To get this result, consider Figure 3.1. The relation between e(s) and r(s)
is obtained by reading Figure 3.1 against the arrows:
e(s) = r(s) − F(s)FT (−s)e(s)
−1 (3.91)
⇒ e(s) = I + F(s)FT (−s) r(s)

On the other hand, the realization of FT (−s) is obtained from the realization
of F(s) as follows:

A BR−0.5
 
F(s) = := N (sI − A)−1 BR−0.5
N 0
 T −1 T
⇒ FT (−s) = N (−sI − A)−1 BR−0.5 = −R−0.5 BT sI − −AT N
T T
 
−A N
=
−R−0.5 BT 0
(3.92)
Thus, in the time domain we have:
A BR−0.5 ẋ1 = Ax1 + BR−0.5 u
   

 F(s) = ⇒
 N 0 y = Nx1
T T (3.93)
ẋ2 = −AT x2 + NT e
  
−A N
 FT (−s) = ⇒


−R−0.5 BT 0 u = −R−0.5 BT x2
From Figure 3.1 we see that e = r − y . Thus the realization of Figure 3.1
reads as follows:
−BR−1 BT
       
 ẋ1 A x1 0
e = r − y 
= + r
−NT N  −AT NT

ẋ2 x2
 
u = −R −0.5 T
B x2 ⇒   x1
y = Nx1  e = −N 0 + Ir
 

x2
(3.94)
In the frequency domain we get:
   
−1 0
(3.95)
 
e(s) = −N 0 (sI − H) + I r(s)
NT
3.8. Discrete time LQ regulator 77

When identifying (3.91) with (3.95) we get relation (3.90). An alternate


relation can also be obtained by replacing FT (−s)F(s) in Figure 3.1 by
F(s)FT (−s). Then we get:

BR−0.5
 
−1  −1
T −0.5 T + I (3.96)

I + F (−s)F(s) = 0 −R B (sI − H)
0

Having in mind that for any square invertible matrix Y we have −1


 XY−0.5Z=
X adj(Y)Z BR
(here X = 0 −R−0.5 BT , Y = (sI − H) and Z = ),
 
det(Y)
0
we conclude that relation (3.90) indicates thatthe eigenvalues of Hamiltonian
matrix H are the roots of det I + F(s)FT (−s) .
Moreover, let:
N adj (sI − A) BR−0.5 Nol (s)
F(s) = N (sI − A)−1 BR−0.5 = := (3.97)
det (sI − A) D(s)
Then:
T
 
ol (s) Nol (−s)
det I + F(s)FT (−s) = det I + ND(s)

D(−s)

D(s)D(−s) I+Nol (s)NT
 (3.98)
ol (−s)
= det D(s)D(−s)

Consequently, the eigenvalues of the


 Hamiltonian matrix H are the roots of
det D(s)D(−s) I + Nol (s)NTol (−s) :

det (sI − H)|s=λ = 0 ⇔ det D(s)D(−s) I + Nol (s)NTol (−s) = 0 (3.99)



s=λ

Alternatively, the preceding relation indicates that D(λ)D(−λ) is an


eigenvalue of matrix −Nol (λ)NTol (−λ). This remark may be used for design
purposes, especially to select matrix N to achieve some specied closed-loop
eigenvalues (we recall that weighting matrix Q is given by Q = NT N).

3.8 Discrete time LQ regulator


3.8.1 Finite horizon LQ regulator
There is an equivalent theory for discrete time systems. Indeed, for the system:

x(k + 1) = Ax(k) + Bu(k)
(3.100)
x(0) = x0
with an equivalent performance criteria:
N −1
1 T 1X T
J(u(k)) = x (N )Sx(N ) + x (k)Qx(k) + uT (k)Ru(k) (3.101)
2 2
k=0

Where Q ≥ 0 is a constant positive denite matrix and R > 0 a constant


positive denite matrix. The optimal control is given by:

u(k) = −K(k)x(k) (3.102)


78 Chapter 3. Innite Horizon Linear Quadratic Regulator (LQR)

Where: −1 T
K(k) = R + BT P(k + 1)B B P(k + 1)A (3.103)
And P(k) is given by the solution of the discrete time Riccati equation:
 −1 T
P(k) = AT P(k + 1)A + Q − AT P(k + 1)B R + BT P(k + 1)B B P(k + 1)A
P(N ) = S
(3.104)

3.8.2 Finite horizon LQ regulator with zero terminal state


We consider the following performance criteria to be minimized:
N −1
1X T
J(u(k)) = x (k)Qx(k) + uT (k)Ru(k) + 2xT (k)Su(k) (3.105)
2
k=0

With the constraint on terminal state:

x(N ) = 0 (3.106)

We will assume that matrices R > 0 and Q − SR−1 ST ≥ 0 are symmetric.


Ntogramatzidis2 has shown the results presented hereafter: denote by P1 and P2
the positive denite solutions of the following continuous time algebraic Riccati
equations:
 −1 T
0 = P1 + AT P1 B + S R + BT P1 B B P 1 A + ST
 


−AT P1 A − Q


−1 T  (3.107)
0 = P2 + ATb P2 Bb + Sb Rb + BTb P2 Bb Bb P2 Ab + STb



−ATb P2 Ab − Qb

Where:
Ab = A−1


 Bb = −A−1 B



Qb = A−T QA−1 (3.108)
R = R − ST A−1 B − BT A−T S + BT A−T QA−1 B

 b



Sb = A−T S − A−T QA−1 B
We will denote by K1 and K2 the following innite horizon gain matrices:
( −1 T
K1 = R + BT P1 B B P 1 A + ST

−1 T (3.109)
K2 = Rb + BTb P2 Bb Bb P2 Ab + STb


Then the optimal control is:



−K(k)x(k) ∀ 0 ≤ k < N
u(k) = (3.110)
0 for k = N
Where:
( −1 T
K(k) = R + BT P(k + 1)B B P(k + 1)A + ST

(3.111)
P(k) = X2 (k)X−1
1 (k)
3.8. Discrete time LQ regulator 79

And:
(
X1 (k) = (A − BK1 )k − (Ab − Bb K2 )(k−N ) (A − BK1 )N
(3.112)
X2 (k) = P1 (A − BK1 )k + P2 (Ab − Bb K2 )(k−N ) (A − BK1 )N

Matrix P(k) satisfy the following Riccati dierence equation:


−1 T
P(k) + AT P(k + 1)B + S R + BT P(k + 1)B B P(k + 1)A + ST
 

− AT P(k + 1)A − Q = 0 (3.113)

Furthermore the optimal state x(k) and costate λ(k) have the following
expressions:

x(k + 1) = (A − BK1 ) e1 (k) − (Ab − Bb K2 ) e2 (k)
(3.114)
λ(k + 1) = P1 (A − BK1 ) e1 (k) + P2 (Ab − Bb K2 ) e2 (k)

Where:
(
e1 (k) = (A − BK1 )k X−11 (0)x0
(k−N ) (3.115)
e2 (k) = (Ab − Bb K2 ) (A − BK1 )N X−1
1 (0)x0

3.8.3 Innite horizon LQ regulator


For the innite horizon problem N → ∞. We will assume that the performance
criteria to be minimized is:

1X T
J(u(k)) = x (k)Qx(k) + uT (k)Ru(k) (3.116)
2
k=0

Then matrix P satises the following discrete time algebraic Riccati


equation:
−1 T
P + AT PB R + BT PB B PA − AT PA − Q = 0 (3.117)

And the discrete time control u(k) is given by:

u(k) = −Kx(k) (3.118)

Where: −1 T
K = R + BT PB B PA (3.119)
If (A, B) is stabilizable, then the closed-loop system is stable, meaning that
all the eigenvalues of (A−BK), with K given by (3.119), will lie within the unit
disk (i.e. have magnitudes less than 1). Let's dene the following symplectic
matrix5 :
A−1 A−1 G
 
H= (3.120)
QA−1 AT + QA−1 G
5
Alan J. Laub, A Schur Method for Solving Algebraic Riccati equations, IEEE
Transactions On Automatic Control, VOL. AC-24, NO. 6, December 1979
80 Chapter 3. Innite Horizon Linear Quadratic Regulator (LQR)

Where:
G = BR−1 BT (3.121)
A symplectic matrix is a matrix which satises:
 
0 I
H JH = J where J =
T
and J−1 = −J (3.122)
−I 0

This implies:

HT J = JH−1 ⇔ J−1 HT J = H−1

A + GA−T Q −GA−T (3.123)


 
−1
⇒H =
−A−T Q A−T

Where A−T = (A−1 )T . Under detectability and stabilizability assumptions,


it can be shown that the eigenvalues of the closed-loop plant (that are the
eigenvalues of A−BK) are equal to the n eigenvalues inside the unit circle of the
Hamiltonian matrix H. The optimal control stabilizes the plant. Furthermore if
X1
the 2n×n matrix has columns that comprise all the eigenvectors associated
X2
with the n eigenvalues of the Hamiltonian matrix H outside the unit circle
(unstable eigenvalues) then X1 is invertible and the positive denite solution of
the algebraic Riccati equation (ARE) is:

P = X2 X−1
1 (3.124)

Thus matrix P for the optimal steady state feedback can be computed thanks
to the unstable (eigenvalues outside the unit circle) eigenvectors of H or the
stable (eigenvalues inside the unit circle) eigenvectors of H−1 .

3.9 Robustness property


3.9.1 Hsu-Chen theorem
Let's consider a linear plant controlled through a state feedback as follows:

ẋ(t) = Ax(t) + Bu(t)
(3.125)
u(t) = −Kx(t) + r(t)

The dynamics of the closed-loop system reads:

ẋ(t) = (A − BK) x(t) + Br(t) (3.126)

In order to compute the closed-loop transfer matrix between X(s) and R(s)
we take the Laplace transform of (3.125) assuming no initial condition:

sX(s) = AX(s) + B(−KX(s) + R(s))


⇒ X(s)(sI − A + BK) = BR(s) (3.127)
⇒ X(s) = (sI − A + BK)−1 BR(s)
3.9. Robustness property 81

Figure 3.2: Full-state feedback control

On the other hand, let Φ(s) be resolvent of the state (transition) matrix A.
Matrix Φ(s) is dened as follows:

Φ(s) = (sI − A)−1 (3.128)

The block diagram of the full-state feedback control is shown in Figure 3.2.
We get:
X(s) = Φ(s)B(R(s) − KX(s))
(3.129)
= (I + Φ(s)BK)−1 Φ(s)BR(s)
Using the fact that (AB)−1 = B−1 A−1 we get:

X(s) = (Φ−1 (s)(I + Φ(s)BK))−1 BR(s)


= (Φ−1 (s) + BK)−1 BR(s) (3.130)
= (sI − A + BK)−1 BR(s)

The open-loop characteristic polynomial is given by:

det (sI − A) = det Φ−1 (s) (3.131)




Whereas the closed-loop characteristic polynomial is given by:

det (sI − A + BK) (3.132)

Sylvester's determinant theorem6 states that the following relation holds


where M1 is an m × n matrix and M2 an n × m matrix (so that M1 and M2
have dimensions allowing them to be multiplied in either order forming a square
matrix):
det (Im + M1 M2 ) = det (In + M2 M1 ) (3.133)
Sylvester's determinant theorem may be proven using the Schur's formula,
which is recalled hereafter:
 
A11 A12
det = det (A22 ) det A11 − A12 A−1

A21
A21 A22 22 (3.134)
−1
= det (A11 ) det A22 − A21 A11 A12


6
https://en.wikipedia.org/wiki/Determinant
82 Chapter 3. Innite Horizon Linear Quadratic Regulator (LQR)
 
Im −M1
Thus if M = , we get:
M2 In
 
Im −M1
det (M) = det = det (Im + M1 M2 )
M2 In (3.135)
= det (In + M2 M1 )

In addition, for square matrices M3 and M4 of equal size, the determinant


of the matrix product equals the product of their determinants:

det (M3 M4 ) = det (M3 ) det (M4 ) (3.136)

Then we get:
  
det (sI − A + BK) = det (sI − A) I + (sI − A)−1 BK
= det ((sI − A) (I + Φ(s)BK)) (3.137)
= det (sI − A) det (I + Φ(s)BK)
= det (sI − A) det (I + KΦ(s)B)

We nally get the following relation, which is known as the Hsu-Chen


theorem7 :

det (sI − A + BK) = det (sI − A) det (I + KΦ(s)B) (3.138)

The roots of det (sI − A + BK) are the eigenvalues of the closed-loop
system. Consequently they are related to the stability of the closed-loop
system.
Moreover the roots of det (I + KΦ(s)B) are exactly the roots of
det (sI − A + BK). Indeed, as far as Φ(s) = (sI − A)−1 , the inverse of
(sI − A) is computed as the adjugate of matrix (sI − A) divided by
det (sI − A) which nally becomes the denominator of det (I + KΦ(s)B):

 
det(I + KΦ(s)B) = det I + K (sI − A)−1 B
 
adj(sI−A)
= det I + K det(sI−A) B
(3.139)
 
= det det(sI−A)I+K adj(sI−A)B
det(sI−A)
= det(det(sI−A)I+K adj(sI−A)B)
det(sI−A)
⇒ det (sI − A + BK) = det (det (sI − A) I + K adj (sI − A) B)

Thus:
det (sI − A + BK) = 0 ⇔ det (I + KΦ(s)B) = 0 (3.140)

Consequently, the eigenvalues of full-state feedback loop are the roots of


det(I + KΦ(s)B).
7
Pole-shifting techniques for multivariable feedback systems, Retallack D.G., MacFarlane
A.G.J., Proceedings of the Institution of Electrical Engineers, 1970
3.9. Robustness property 83

Figure 3.3: Nyquist contour

3.9.2 Generalized (MIMO) Nyquist stability criterion


Let's recall the generalized (MIMO) Nyquist stability criterion which will be
applied in the next section to the LQR design through Kalman equality.
We remind that the Nyquist plot of det(I + KΦ(s)B) is the image of det(I +
KΦ(s)B) as s goes clockwise around the Nyquist contour: this includes the
entire imaginary axis (s = jω ) and an innite semi-circle around the right half
plane as shown in Figure 3.3.
The generalized (MIMO) Nyquist stability criterion states that the number
of unstable closed-loop poles (that are the roots of det(sI−A+BK)) is equal to
the number of unstable open-loop poles (that are the roots of det (sI − A)) plus
the number of encirclements of the critical point (0, 0) by the Nyquist plot of
det(I+KΦ(s)B); the encirclement is counted positive in the clockwise direction
and negative otherwise.
An easy way to determine the number of encirclements of the critical point
is to draw a line out from the critical point, in any directions. Then by counting
the number of times that the Nyquist plot crosses the line in the clockwise
direction (i.e. left to right) and by subtracting the number of times it crosses
in the counterclockwise direction then the number of clockwise encirclements
of the critical point is obtained. A negative number indicates counterclockwise
encirclements.
It is worth noticing that for Single-Input Single-Output (SISO) systems K
is a row vector whereas B is a column vector. Consequently KΦ(s)B is a scalar
and we have:

det(I + KΦ(s)B) = det(1 + KΦ(s)B) = 1 + KΦ(s)B (3.141)

Thus for Single-Input Single-Output (SISO) systems the number of


encirclements of the critical point (0, 0) by the Nyquist plot of
84 Chapter 3. Innite Horizon Linear Quadratic Regulator (LQR)

det(I + KΦ(s)B) is equivalent to the number of encirclements of the critical


point (−1, 0) by the Nyquist plot of KΦ(s)B.
In the context of output feedback the control u(t) = −Kx(t) is replaced
by u(t) = −Ky(t) where y(t) is the output of the plant: y(t) = Cx(t). As a
consequence the control u(t) reads u(t) = −KCx(t) and state feedback gain K
is replaced by output feedback gain KC in equation (3.138):

det(sI − A + BKC) = det (sI − A) det(I + KCΦ(s)B) (3.142)

This equation involves the transfer function CΦ(s)B between the output
Y (s) and the control U (s) of the plant without any feedback and is used in the
Nyquist stability criterion for Single-Input Single-Output (SISO) systems.
It is also worth noticing that (I + KCΦ(s)B)−1 is attached to the so called
sensitivity function of the closed-loop whereas CΦ(s)B is attached to the open-
loop transfer function from the process' input U (s) to the plant output Y (s).

3.9.3 Kalman equality


Let's consider the full-state feedback control is shown in Figure 3.2. Kalman
has shown the following result, known as Kalman equality :

(I + L(−s))T R (I + L(s)) = R + (Φ(−s)B)T Q (Φ(s)B) (3.143)

where L(s) is the loop gain and K the optimal feedback gain (obtained
through the algebraic Riccati equation):

L(s) = KΦ(s)B (3.144)

The proof of the Kalman equality is provided hereafter. Consider the


algebraic Riccati equation (3.8):
PA + AT P − PBR−1 BT P + Q = 0 (3.145)

Because K = R−1 BT P, P = PT and R = RT , the previous equation can


be re-written as:

P (sI − A) − (−sI − A)T P + KT RK = Q (3.146)

Using the fact that Φ(s) = (sI − A)−1 we get:


T
PΦ−1 (s) + Φ−1 (−s) P + KT RK = Q (3.147)

Left multiplying by BT ΦT (−s) and right multiplying by Φ(s)B yields:

BT ΦT (−s)PB + BT PΦ(s)B + BT ΦT (−s)KT RKΦ(s)B =


BT ΦT (−s)QΦ(s)B (3.148)
3.9. Robustness property 85

Adding R to both sides of equation (3.148) as using the fact that RK = BT P


we get:

R + BT ΦT (−s)KT R + RKΦ(s)B + BT ΦT (−s)KT RKΦ(s)B =


R + BT ΦT (−s)QΦ(s)B (3.149)
The previous equation can be re-written as:
(I + KΦ(−s)B)T R (I + KΦ(s)B) = R + (Φ(−s)B)T Q (Φ(s)B) (3.150)
This completes the proof. ■
Let R−0.5 be the root-square of matrix R−1 :
T
R−1 = R−0.5 R−0.5 (3.151)
T
Multiplying Kalman equality (3.143) by R−0.5 on the left side and by


R−0.5 on the right side we get:


T
R−0.5 (I + L(−s))T R (I + L(s)) R−0.5
T  
= R−0.5 R + (Φ(−s)B)T Q (Φ(s)B) R−0.5 (3.152)
T
Matrix R0.5 = R0.5 is the root square of matrix R. By getting the modal
decomposition of matrix R, that is R = VDV−1 where V is the matrix whose
columns are the eigenvectors of R and D is the diagonal matrix whose diagonal
elements are the corresponding positive eigenvalues, the square root R0.5 of R is
given by R0.5 = VD0.5 V−1 , where D0.5 is any diagonal matrix whose elements
are the square root of the diagonal elements of D. Thus we get:
R = VDV−1 ⇒ R−1 = VD−1 V−1
T
⇒ R−0.5 RR−0.5 = R−0.5 RR−0.5
= VD−0.5 V−1 VDV−1 VD−0.5 V−1 (3.153)
 

= VD−0.5 DD−0.5 V−1


=I
Thus:
T  
R−0.5 R + (Φ(−s)B)T Q (Φ(s)B) R−0.5
T
= I + Φ(−s)BR−0.5 Q Φ(s)BR−0.5 (3.154)


On the other hand, we have:


T
R−0.5 (I + L(−s))T R (I + L(s)) R−0.5
T
= R−0.5 + L(−s)R−0.5 R R−0.5 + L(s)R−0.5

T (3.155)
= R−0.5 + L(−s)R−0.5 R0.5 R0.5 R−0.5 + L(s)R−0.5

T
= I + R0.5 L(−s)R−0.5 I + R0.5 L(s)R−0.5


Finally, let Q := NT N. Then Kalman equality (3.143) can equivalently be


written as follows:
T
I + R0.5 L(−s)R−0.5 I + R0.5 L(s)R−0.5

T
= I + NΦ(−s)BR−0.5 NΦ(s)BR−0.5 (3.156)

86 Chapter 3. Innite Horizon Linear Quadratic Regulator (LQR)

3.9.4 Robustness of Linear Quadratic Regulator


The robustness of the LQR design can be assessed through the Kalman equality
(3.143). We will specialize Kalman equality to the specic case where the plant
is a Single Input - Single Output (SISO) system. Then KΦ(s)B and R are
scalars. Setting Q = NT N, and using the fact that NΦ(s)B is scalar for SISO
plants, Kalman equality (3.143) reduces as follows:
1
Q = NT N ⇒ (1 + L(−s)) (1 + L(s)) = 1 + R (NΦ(−s)B) (NΦ(s)B)
(3.157)
Transfer function L(s) is the loop gain, which is scalar for SISO plants:

L(s) = KΦ(s)B (3.158)

Substituting s = jω yields:
1
∥1 + L(jω)∥2 = 1 + ∥NΦ(jω)B∥2 (3.159)
R
Therefore:
∥1 + L(jω)∥ ≥ 1 ∀ω ∈ R (3.160)
For SISO plants, the sensitivity function S(s) and the complementary
sensitivity function T(s) are dened as follows:
1
(
S(s) = 1+L(s)
L(s) (3.161)
T(s) = 1 − S(s) = 1+L(s)

Substituting s = jω , Kalman's inequality guarantees that:



∥S(jω)∥ ≤ 1
(3.162)
∥T(jω)∥ ≤ 2

Those inequalities are represented in Figure 3.4.


We recall that:

− A small sensitivity function is desirable for good disturbance rejection.


Generally, this is especially important at low frequencies.

− A complementary sensitivity function close to one is desirable for good


reference tracking. Generally, this is especially important at low
frequencies.

− A small complementary sensitivity function is desirable for good noise


rejection. Generally, this is especially important at high frequencies.

Furthermore, let's introduce the real part X(ω) and the imaginary part Y (ω)
of L(jω):
L(jω) = X(ω) + jY (ω) (3.163)
Then ∥1 + L(jω)∥2 reads as follows:

∥1 + L(jω)∥2 = ∥1 + X(ω) + jY (ω)∥2 = (1 + X(ω))2 + Y (ω)2 (3.164)


3.9. Robustness property 87

Figure 3.4: Upper bounds of sensitivity function S(s) and complementary


sensitivity function T(s) through LQR design

Consequently inequality (3.160) reads as follows:

∥1 + L(jω)∥ ≥ 1
⇔ ∥1 + L(jω)∥2 ≥ 1 (3.165)
⇔ (1 + X(ω))2 + Y (ω)2 ≥ 1
As a consequence, the Nyquist plot of L(jω) will be outside the circle of
unit radius centered at (−1, 0). Thus applying the generalized (MIMO) Nyquist
stability criterion and knowing that the LQR design always leads to a stable
closed-loop plant, the implications of Kalman inequality are the following:

− If the open-loop system has no unstable pole, then the Nyquist plot of
L(jω) does not encircle the critical point (−1, 0). This corresponds to a
positive gain margin of +∞ as depicted as depicted in Figure 3.5.

− On the other hand if Φ(s) has unstable poles, the Nyquist plot of L(jω)
encircles the critical point (−1, 0) a number on times which corresponds
to the number of unstable open-loop poles. This corresponds to a negative
gain margin which is always lower or equal to 20 log10 (0.5) = −6 dB as
depicted in Figure 3.6.

In both situations, if the process' phase increases by 60 degrees its Nyquist


plots rotates by 60 degrees but the number of encirclements still does not change.
Thus the LQR design always leads to a phase margin which is always greater
or equal to 60 degrees.
Last but not least, it can be seen in Figure 3.5 and Figure 3.6 that at
high-frequency the loop gain L(jω) can have at most −90 degrees phase for
high-frequencies and therefore the roll-o rate is at most −20 dB/decade.
88 Chapter 3. Innite Horizon Linear Quadratic Regulator (LQR)

Figure 3.5: Nyquist plot of L(s): example where the open-loop system has
no unstable pole

Figure 3.6: Nyquist plot of L(s): example where the open-loop system has
unstable poles
3.9. Robustness property 89

Unfortunately those nice properties are lost as soon as the performance index
J(u(t)) contains state / control cross-terms 8 :
Z tf
1
J(u(t)) = xT (t)Qx(t) + uT (t)Ru(t) + 2xT (t)Su(t) dt (3.166)
2 0

This is especially the case for LQG (Linear Quadratic Gaussian) regulator
where the plant dynamics as well as the output measurement are subject to
stochastic disturbances and where a state estimator has to be used.

8
Doyle J.C., Guaranteed margins for LQG regulators, IEEE Transactions on Automatic
Control, Volume: 23, Issue: 4, Aug 1978
90 Chapter 3. Innite Horizon Linear Quadratic Regulator (LQR)
Chapter 4

Design methods

4.1 Symmetric Root Locus


4.1.1 Characteristics polynomials
Let's consider the following state space realization (A, B, N):

ẋ(t) = Ax(t) + Bu(t)
(4.1)
z(t) = N x(t)
We will assume that (A, B, N) is minimal, or equivalently that (A, B) is
controllable and (A, N) is observable, or equivalently that the following loop
gain (or open-loop) transfer function is irreducible:
N adj (sI − A) B N (s)
G(s) = N (sI − A)−1 B = = (4.2)
det (sI − A) D(s)
The polynomial D(s) = det (sI − A) is the loop gain characteristics
polynomial, which is assumed to be of degree n, and polynomial matrix N (s)
is the numerator of N (sI − A)−1 B. From the fact that the numerator of G(s)
involves adj (sI − A) it is clear that the degree of its numerator N (s), which
will be denoted m, is strictly lower than the degree of its denominator D(s),
which will be denoted n:
deg(N (s)) = m < deg(D(s)) = n (4.3)
It can be shown that for single-input single-ouput (SISO) systems we have
the following relation where N (s) is the polynomial (not matrix) numerator of
the transfer function:
 
sI − A −B
det
N 0 N (s)
G(s) = N (sI − A)−1 B = = (4.4)
det (sI − A) D(s)
Now let's assume that the system is closed thanks to the following output
(not state !) feedback control u(t):
u(t) = −kp Ko z(t) + Fr(t) (4.5)
Where:
92 Chapter 4. Design methods

− kp is a scaling factor

− Ko is the output (not state !) feedback matrix gain

− F is the pre-lter gain

Then the state matrix of the closed-loop system reads A − kp BKo N and the
polynomial det (sI − A + kp BKo N) is the closed-loop characteristics
polynomial.

4.1.2 Root Locus reminder


The root locus technique1 has been developed in 1948 by Walter R. Evans (1920-
1999). This is a graphical method for sketching in the s-plane the locus of roots
of the following polynomial when parameter kp varies to 0 to innity:

det (sI − A + kp BKo N) = D(s) + kp N (s) (4.6)

Usually polynomial D(s) + kp N (s) represents the denominator of a


closed-loop transfer function. Polynomial D(s) + kp N (s) represents here the
denominator of the closed-loop transfer function when control u(t) reads:

u(t) = −kp Ko y(t) + Fr(t) (4.7)

It is worth noticing that the roots of D(s) + kp N (s) are also the roots of
1 + kp N (s)
D(s) :

N (s)
D(s) + kp N (s) = 0 ⇔ 1 + kp = 0 ⇔ L(s) := kp F (s) = −1 (4.8)
D(s)

Without loss of generality let's dene transfer function F (s) as follows:


Qm≤n
N (s) j=1 (s − zj )
F (s) = = a Qn (4.9)
D(s) i=1 (s − pi )

Transfer function L(s) = kp F (s) is called the loop transfer function. In the
SISO case the numerator of the loop transfer function L(s) is scalar as well as
its denominator.
Equation L(s) = −1 can be equivalently split into two equations:

|L(s)| = 1
(4.10)
arg (L(s)) = (2k + 1) π, k = 0, ±1, · · ·

The magnitude condition can always be satised by a suitable choice of kp .


On the other hand the phase condition does not depend on the value of kp but
only on the sign of kp . Thus we have to nd all the points in the s-plane that
satisfy the phase condition. When scalar gain kp varies from zero to innity (i.e.
kp is positive), the root locus technique is based on the following rules:
1
Walter R. Evans , Graphical Analysis of Control Systems, Transactions of the American
Institute of Electrical Engineers, vol. 67, pp. 547 - 551, 1948
4.1. Symmetric Root Locus 93

− The root locus is symmetrical with respect to the horizontal real axis
(because roots are either real or complex conjugate);
− The number of branches is equal to the number of poles of the loop transfer
function. Thus the root locus has n branches;
− The root locus starts at the n poles of the loop transfer function;
− The root locus ends at the zeros of the loop transfer function. Thus m
branches of the root locus end on the m zeros of F (s) and there are (n−m)
asymptotic branches;
− Assuming that coecient a in F (s) is positive, a point s∗ on the real
axis belongs to the root locus as soon as there is an odd number of poles
and zeros on its right. Conversely assuming that coecient a in F (s) is
negative, a point s∗ on the real axis belongs to the root locus as soon as
there is an even number of poles and zeros on its right. Be careful to take
into account the multiplicity of poles and zeros in the counting process;
− The (n − m) asymptotic branches of the root locus which diverge to ∞
are asymptotes.

 The angle δk of each asymptote with the real axis is dened by:
π + arg(a) + 2kπ
δk = ∀ k = 0, . . . , n − m − 1 (4.11)
n−m
 Denoting by pi the n poles of the loop transfer function (that are the
roots of D(s)) and by zj the m zeros of the loop transfer function
(that are the roots of N (s)), the asymptotes intersect the real axis
at a point (called pivot or centroid) given by:
Pn Pm≤n
i=1 pi − j=1 zj
σ= (4.12)
n−m
− The breakaway / break-in points are located on the real axis and always
have a vertical tangent. They are located at the roots sb of the following
equation as soon as there is an odd (if coecient a in F (s) is positive) or
even (if coecient a in F (s) is negative) number of poles and zeros on its
right (Be careful to take into account the multiplicity of poles and zeros
in the counting process):
   
d 1 d D(s)
ds F (s) s=s = ds N (s) s=s =0
(4.13)
b b
⇔ D′ (sb )N (sb ) − D(sb )N ′ (sb ) =0
Indeed from the fact that breakaway / break-in points have always a
vertical tangent we can write:
N (s) dkp D′ (s)N (s) − D(s)N ′ (s)
1 + kp F (s) = 1 + kp =0⇒ =− =0
D(s) dp N 2 (s)
(4.14)
From this relation we get (4.13).
94 Chapter 4. Design methods

− On the imaginary axis we have s = jω . Thus the value of the (positive)


critical gain beyond which the closed-loop system becomes unstable is the
value of kp (kp ≥ 0) such that the root locus of F (s) crosses the imaginary
axis. In that situation at least one pole of the closed-loop system is purely
imaginary. As far as D(s) + kp N (s) represents the denominator of the
closed-loop transfer function the critical gain can be obtained by replacing
s by jω and by solving:

1 + kp F (jω) = 0 ⇔ D(jω) + kp N (jω) = 0 (4.15)

The previous equation is then split into its real and imaginary part and
provides a system of 2 equations which lead to the value of the critical
gain and the oscillation frequency at the critical gain. It is worth noticing
that the Routh criterion can be used for the same purpose.

− Note that if the degree of polynomial D(s) is greater than or equal to


the degree of polynomial N (s) plus 2, meaning that the relative degree
of transfer function F (s) is greater than or equal to 2 (n − m ≥ 2), then
the sum of the poles of the feedback system is independent of the value of
parameter kp , and therefore is equal to the sum of the poles of the open
loop system when kp = 0. This property is known as the centroid theorem.
To get this result, we have simply to expand D(s) + kp N (s) taking into
account n − m ≥ 2:
Qm≤n−2
N (s) j=1 (s−zj )
F (s) = D(s) = a Q n
(s−pi )
i=1
⇒ D(s) + kp N (s) =
Qn Qm≤n−2 (4.16)
i=1 (s − pi ) + kp a j=1 (s − zj )
= sn − (r1 + r2 + · · · + rn ) sn−1 + · · ·

Assuming that n − m ≥ 2, the coecient of the term sn−1 in polynomial


D(s) + kp N (s) does not depend on parameter kp . Because this coecient
is obtained has the opposite of the sum r1 + r2 + · · · + rn of the roots
of polynomial D(s) + kp N (s), we conclude the sum of the poles of the
feedback system is independent of the value of parameter kp .

4.1.3 Chang-Letov design procedure


The purpose of this section is to have some insight on how to drive the modes of
the closed-loop plant thanks to the LQR design applied to SISO plants. More
precisely, we focus on single-input plants for which the cost to be minimized is
dened as in (3.3):

1 ∞ T
Z
x (t)Qx(t) + uT (t)Ru(t) dt (4.17)

J(u(t)) =
2 0

Nevertheless, weight matrix Q is here dened as follows, where matrix N is


a design matrix:
Q = NT N (4.18)
4.1. Symmetric Root Locus 95

Figure 4.1: Full-state feedback control with ctitious output z

Let z(t) := N x(t) be the controlled output: this is a ctitious output which
represents the output of interest for the design. The controlled output z(t) is
expressed as a linear function of the state vector x(t) as:

z(t) := N x(t) (4.19)

Thus the cost to be minimized can be rewritten as follows:


R∞
J(u(t)) = 21 R0 xT (t)NT Nx(t) + Ru 2 (t) dt

∞ (4.20)
= 21 0 z T (t)z(t) + Ru2 (t) dt


Furthermore the cost to be minimized is now constrained by the dynamics


of the system with the following state space representation:

ẋ(t) = Ax(t) + Bu(t)
(4.21)
z(t) = Nx(t)

From this state space representation we obtain the following open-loop


transfer function which is written as the ratio between a numerator N (s) and
a denominator D(s):
N (s)
N (sI − A)−1 B = (4.22)
D(s)
We recall that the cost (4.20) is minimized by choosing the following control
law, where P is the solution of the algebraic Riccati equation:

u(t) = −Kx(t)
(4.23)
K = R−1 BT P

This leads to a full-state feedback control with ctitious output z which is


represented in Figure 4.1 where Φ(s) = (sI − A)−1 .
Let D(s) be the open-loop characteristics polynomial and β(s) be the closed-
loop characteristic polynomial:

D(s) = det (sI − A)
(4.24)
β(s) = det (sI − A + BK)

In the single control case which is under consideration, it can be shown (see
section 4.1.4) that the characteristic polynomial of the closed-loop system is
96 Chapter 4. Design methods

linked with the numerator and the denominator of the loop transfer function as
follows:
1
β(s) β(−s) = D(s)D(−s) + N (s)N (−s) (4.25)
R
This relation can be associated with the root locus of
G(s)G(−s) = N (s)N (−s)
D(s)D(−s) where ctitious gain kp = R varies from 0 to ∞. This
1

leads to the so-called Chang-Letov design procedure, which enables to nd the
closed-loop poles based on the open-loop poles and zeros of G(s)G(−s). The
dierence with the root locus of G(s) is that both the open-loop poles and
zeros and their reections about the imaginary axis have to be taken into
account (this is due to the multiplication by G(−s)). The actual closed-loop
poles are those located in the left half plane with negative real part; indeed
optimal control leads always to a stabilizing gain. It is worth noticing that
matrix N is actually a design parameter which is used to shape the root locus.

4.1.4 Proof of the symmetric root locus result


The proof of (4.25) can be done as follows: taking the determinant of the
Kalman equality (3.143) and having in mind that det(MT ) = det(M) and that
for SISO systems R is scalar yields:
   
det (I + KΦ(−s)B)T R (I + KΦ(s)B) = det R + (Φ(−s)B)T Q (Φ(s)B)
T
   
⇔ det (I + KΦ(−s)B)T (I + KΦ(s)B) = det I + (Φ(−s)B)RQ(Φ(s)B)
T
   
⇔ det (I + KΦ(−s)B)T det (I + KΦ(s)B) = det I + (Φ(−s)B)RQ(Φ(s)B)
T
 
⇔ det (I + KΦ(−s)B) det (I + KΦ(s)B) = det I + (Φ(−s)B)RQ(Φ(s)B)
(4.26)
Where:
adj (sI − A)
Φ(s) = (sI − A)−1 = (4.27)
det (sI − A)

Furthermore it has been seen in (3.138) that thanks to the Hsu-Chen theorem
we have:
det (sI − A + BK)
det (I + KΦ(s)B) = (4.28)
det (sI − A)

Let D(s) be the open-loop characteristics polynomial and β(s) be the closed-
loop characteristic polynomial:

D(s) = det (sI − A)
(4.29)
β(s) = det (sI − A + BK)

As a consequence, using (4.28) in the left part of (4.26) yields:


!
β(s) β(−s) (Φ(−s)B)T Q (Φ(s)B)
= det I + (4.30)
D(s)D(−s) R
4.2. Asymptotic properties of LQR applied to SISO plants 97

In the single control case R and I are scalars (I = 1). Using Q = NT N


(4.30) becomes:
(NΦ(−s)B)T (NΦ(s)B)
 
β(s) β(−s)
= det 1 +
D(s)D(−s) R
(4.31)
(NΦ(−s)B)T (NΦ(s)B)
=1+ R

We recognize in NΦ(s)B = N (sI − A)−1 B the open-loop transfer function


G(s) which is the ratio between numerator polynomial N (s) and denominator
polynomial D(s) :
N (s)
G(s) = NΦ(s)B = N (sI − A)−1 B = (4.32)
D(s)
Using (4.32) in (4.31) yields:
β(s) β(−s) 1 N (s)N (−s)
=1+ R
D(s)D(−s) D(s)D(−s)
1
(4.33)
⇔ β(s) β(−s) = D(s)D(−s) + R N (s)N (−s)

This completes the proof. ■

4.2 Asymptotic properties of LQR applied to SISO


plants
We will see that Kalman equality allows for loop shaping through LQR design
for SISO plants. Lectures from professor Faryar Jabbari (Henry Samueli School
of Engineering, University of California) and professor Perry Y. Li (University
of Minnesota) are the primary sources of this section.
We recall that Φ(s) = (sI − A)−1 where dim(A) = n × n, which is also the
dimension of weight Q := NT N.

4.2.1 Closed-loop poles location


Relation (4.25) reads:
1
β(s) β(−s) = D(s)D(−s) + N (s)N (−s) (4.34)
R
From (4.34) we can get the following results:
− When R is large, i.e. 1/R is small so that the control energy is weighted
very heavily in the performance index, the roots of β(s), that are the
closed-loop poles, approach the stable open-loop poles or the negative of
the unstable open-loop poles:
β(s) β(−s) ≈ D(s)D(−s) as R → ∞ (4.35)

− When R is small (i.e. R → 0) then 1/R is large and the control is


cheap. Then the roots of β(s), that are the closed-loop poles, approach
the stable open-loop zeros or the negative of the non-minimum phase
open-loop zeros:
98 Chapter 4. Design methods

1
β(s) β(−s) ≈ N (s)N (−s) as R → 0 (4.36)
R
Equation (4.36) shows that any roots of β(s) β(−s) that remains nite as
R → 0 must tend toward the roots of N (s)N (−s). But from (4.3) we know
that the degree of N (s)N (−s), say 2m, is less than the degree of β(s) β(−s),
which is 2n. Therefore m roots of β(s) are the roots of N (s)N (−s) in the open
left half plane (stable roots). The remaining n − m roots of β(s) asymptotically
approach innity in the left half plane. For very large s we can ignore all but
the highest power of s in (4.34) so that the magnitude (or modulus) of the roots
that tend toward innity shall satisfy the following approximate relation:

b2m
(−1)n s2(n−m) ≈ (−1)m (4.37)
R
where we denote:

β(s) = det (sI − A + BK) = sn + βn−1 sn−1 + · · · + β1 s + β0



(4.38)
N (s) = bm sm + bm−1 sm−1 + · · · + b1 s + b0

The roots of β(−s) are the reection across the imaginary of the roots of
β(s). Now express s in the exponential form:

s = r ejθ (4.39)

We get from (4.37):

b2m b2
(−1)n r2n ej2nθ ≈ (−1)m r2m ej2mθ ⇒ r2(n−m) ≈ m (4.40)
R R
Therefore, the remaining n−m zeros of β(s) lie on a circle of radius r dened
by:
  1
b n−m
r≈ √m (4.41)
R
The particular pattern to which the 2(n − m) solutions of (4.41) lie is known
as the Butterworth conguration. The angle of the 2(n − m) branches which
diverge to ∞ are obtained by adapting relation (4.11) to the case where transfer
function reads G(s)G(−s) = N (s)N (−s)
D(s)D(−s) .

4.2.2 Shape of the magnitude of the loop gain


We recall that the loop gain L(s) is dened as follows:

L(s) := KΦ(s)B (4.42)

When Q = NT N, and assuming a SISO plant, Kalman equality (3.143)


becomes:

(1 + L(−s))T (1 + L(s)) = 1 + 1
(Φ(−s)B)T NT N (Φ(s)B)
R (4.43)
=1+ 1
R (NΦ(−s)B)T (N (Φ(s)B))
4.2. Asymptotic properties of LQR applied to SISO plants 99

Denoting by λ(X(jω)) the eigenvalues of matrix X(jω) and by σ(X(jω)) its


singular values (that are the root square of the strictly positive eigenvalues of
either XT (−jω)X(jω) or X(jω)XT (−jω)), the preceding equality implies:
   
λ (1 + L(−jω))T (1 + L(jω)) =1+ R 1
λ (NΦ(−jω)B)T (N (Φ(jω)B))
q
1 2
⇔ σ (1 + L(jω)) = 1 + R σ (NΦ(jω)B)
(4.44)
For the range of frequencies for which σ (NΦ(jω)B) ≫ 1 (typically low
frequencies) equation (4.44) shows that:
1
σ (L(jω)) ≈ √ σ (NΦ(jω)B) (4.45)
R
For SISO system matrices N and K have the same dimension. Denoting by
|K| the absolute value of each element of K, and using the fact that L(s) :=
KΦ(s)B, we get from the previous equation :
1 |N|
σ (L(jω)) ≈ √ σ (NΦ(jω)B) ⇒ |K| ≈ √ where Q = NT N (4.46)
R R
Assuming that z = N x, then NΦ(s)B represents the transfer function from
the control signal u(t) to the controlled output z(t). As a consequence:
− The shape of the magnitude of the loop gain L(s) is determined by the
magnitude of the transfer function from the control input u(t) to the
controlled output z(t);

− Parameter R, that is weight R, moves the magnitude Bode plot up and
down.
Note that although the magnitude of L(s) mimics the magnitude of NΦ(s)B,
the phase of the loop gain L(s) always leads to a stable closed-loop with an
appropriate phase margin. At high-frequency, it has been seen in Figure 3.5
and Figure 3.6 that the loop gain L(jω) can have at most −90 degrees phase
for high-frequencies and therefore the roll-o rate is at most −20 dB/decade.
In practice, this means that for ω ≫ 1, and for some constant a , we have the
following approximation (remind that Φ(s) = (sI − A)−1 = det(sI−A)
adj(sI−A)
so that
the degree of the denominator of L(s) is n and the degree of its numerator is at
most n − 1):
a a 1
|L(jω)| ≈ √ where √ = lim s |L(s)| ≈ lim s √ |NΦ(s)B| (4.47)
ω R R s→∞ s→∞ R
Thus:
a = lim s |NΦ(s)B| (4.48)
s→∞
Therefore the cross-over frequency ωc is approximately given by:
a a
|L(jωc )| = 1 ≈ √ ⇒ ωc ≈ √ (4.49)
ωc R R
Consequently:
100 Chapter 4. Design methods

− LQR controllers always exhibit a high-frequency magnitude decay of −20


dB/decade. The (slow) −20 dB/decade magnitude decrease is the main
shortcoming of state-feedback LQR controllers because it may not be
sucient to clear high-frequency upper bounds on the loop gain needed
to reject disturbances and/or for robustness with respect to process
uncertainty.

− The cross-over frequency is proportional to 1/ R and generally small
values for R result in faster step responses.

4.2.3 Weighting matrices selection


The preceding results motivates the following design rule extended to the
case of multiple input multiple output systems:
− Modal point of view: assuming that all states are available for control,
choose N (remind that Q = NT N ⇒ Q0.5 = N) such that n − 1√zeros of
NΦ(s)B are at the desired pole location. Then use cheap control R → 0
to design LQ system so that n−1 poles of the closed-loop system approach
these desired locations. It is worth noticing that for SISO plants the roots
of NΦ(s)B are also the roots of:
 
sI − A −B
det =0 (4.50)
N 0

− Frequency point of view: alternatively we have seen that at low frequencies


|N|
|K| ≈ √ R
so that the loop gain is approximately |L(s)| ≈ √1R |NΦ(s)B|.
So the shape of the magnitude of the loop gain L(s) is determined by
the magnitude of NΦ(s)B, that is the transfer function from the control
input u(t) to the controlled output z(t). In addition, we have seen that at
high frequency |NΦ(jω)B| ≈ ω√a R , where a = lims→∞ s|NΦ(s)B| is some

constant. So we can choose√ R to pick the bandwidth ωc which is where
|L(jω)| = 1. Thus choose R ≈ ωac where ωc is the desired bandwidth.
Thus contrary to the Chang-Letov design procedure for Single-Input Single-
Output (SISO) systems where scalar R was the design parameter the following
design rules for Multi Input Multi Output (MIMO) systems use matrix Q as
the design parameter. We may also use the fact that if λi is a stable eigenvalue
(i.e. eigenvalue inthe open left half  plane)  of the Hamiltonian matrix H =
A −BR−1 BT

X1i
with eigenvector then λi is also an eigenvalue of
−Q −AT X2i
A − BK with eigenvector X1i . Therefore in the single input case we can use
this result by nding the eigenvalues of H and then realizing that the stable
eigenvalues are the poles of the optimal closed-loop plant.
Alternatively, a simpler choice for matrices Q and R is given by the Bryson's
rule who proposed to take Q and R as diagonal matrices such that:
( 1
qii = max. acceptable value of z
2
1
i
(4.51)
rjj = max. acceptable value of u
2
j
4.2. Asymptotic properties of LQR applied to SISO plants 101

Diagonal matrices Q and R are associated to the following performance


index where ρ is a free parameter to be set by the designer:
!
1 ∞ X
Z X
J(u(t)) = 2
qii zi (t) + ρ2 2
rjj uj (t) dt (4.52)
2 0
i i

If after simulation |zi (t)| is to large then increase qii ; similarly if after
simulation |uj (t)| is to large then increase rjj .

4.2.4 Poles assignment in optimal regulator using root locus


Let λi be an eigenvalue of the open-loop state matrix A corresponding to
eigenvector v i . This open-loop eigenvalue will not be modied by state
feedback gain K by setting in (4.120) the m × 1 vector pi to zero and the n × 1
eigenvector v Ki to the open-loop eigenvector v i corresponding to eigenvalue λi :

(
Av i = λi v i
  −1
K = ··· 0m×1 ··· ··· vi ···
| {z } |{z} (4.53)
ith column ith column

⇒ (A − BK) v i = λi v i

Coming back to the general case, let v 1 , · · · , v n be the eigenvectors of the


open-loop state matrix A. Matrix V is dened as follows:

(4.54)
 
V= v1 · · · vn

Note that if λi and λj := λ̄i are complex conjugate eigenvalues, then the
corresponding eigenvectors v i and v j are also complex conjugate:

λj = λ̄i ⇔ v j = v̄ i (4.55)

In order to get a real valued matrix V, v i and v j shall be changed into the real
part and imaginary part of v i , that is Re(v i ) and Im(v i ), respectively.
Let λ1 , · · · , λr be the r ≤ n eigenvalues that are desired to be changed by
state feedback gain K and v 1 , · · · , v r the corresponding eigenvectors of the state
matrix A. Similarly let λr+1 , · · · , λn be the n − r eigenvalues that are desired to
be kept invariant by state feedback gain K and v r+1 , · · · , v n the corresponding
eigenvectors of the state matrix A. Assuming that matrix V is invertible, matrix
M is dened and split as follows where Mr is an r × n matrix and Mn−r is an
(n − r) × n matrix:
 
−1
−1 Mr
(4.56)

M=V = v 1 · · · v r v r+1 · · · v n =
Mn−r

Shieh & al.7 have shown that, once weighting matrix R = RT > 0 is
set, the characteristic polynomial β(s) of the closed-loop transfer function is
102 Chapter 4. Design methods

linked with the numerator and the denominator of the loop transfer function
Φ(s)B = (sI − A)−1 B as follows:

Φ(s)B = (sI − A)−1 B = adj(sI−A)B


:= Nol (s)
det(sI−A) D(s) (4.57)
⇒ β(s) β(−s) = D(s) D(−s) + kp N rl (s) (N rl (−s))T

where: 
 Nol (s) = adj (sI − A) B

−1
N rl (s) = q T0 Mr Nol (s) R0.5 (4.58)
 q ∈ Rr×1

0
T
Matrix R = R
0.5 0.5 is the root square of matrix R. By getting the modal
decomposition of matrix R, that is R = VDV−1 where V is the matrix whose
columns are the eigenvectors of R and D is the diagonal matrix whose diagonal
elements are the corresponding positive eigenvalues, the square root R0.5 of R is
given by R0.5 = VD0.5 V−1 , where D0.5 is any diagonal matrix whose elements
are the square root of the diagonal elements of D2 .
Relation (4.57) can be associated with root locus of the ctitious transfer
N T (s) N (−s)
function G(s)G(−s) = D(s) rl
D(−s) where ctitious gain kp varies from 0 to ∞.
rl

The arbitrary nonzero r × 1 column vector q 0 is used to shape the locus. It is


worth noticing that Mr Nol (s) and D(s) share λr+1 , · · · , λn as common roots,
and thus pole / zero simplication within G(s) and G(−s) shall be done before
drawing the root locus.
Once the positive scalar kp has been selected on the root locus such that the
r closed-loop eigenvalues λK1 , · · · , λKr are appropriately placed (note that for
kp = 0, the corresponding branches start at λ1 , · · · , λr ), the weighting matrix
Q has the following expression:
 
Q = kp MTr q 0 q T0 Mr (4.59)

The following relation also holds:

(4.60)
 
Q v r+1 · · · v n = 0

The preceding results is a generalization of the Chang-Letov design


procedure seen in section 4.1.3. This may be used as follows: rst choose R
and set Q = 0 to get the minimum energy optimal control law. Then identify
the r closed-loop eigenvalues which do not meet design specications. Finally
compute Q as previously seen such that all eigenvalues are appropriately
placed.

4.3 Poles shifting in optimal regulator


4.3.1 Mirror property
The purpose of this section is to underline the relation between the weighting
matrix Q and the closed-loop eigenvalues of the optimal regulator.
2
https://en.wikipedia.org/wiki/Square_root_of_a_matrix
4.3. Poles shifting in optimal regulator 103

We recall the expression of the 2n × 2n Hamiltonian matrix H:

A −BR−1 BT
 
H= (4.61)
−Q −AT
which corresponds to the following algebraic Riccati equation:

AT P + PA − PBR−1 BT P + Q = 0 (4.62)

The characteristic polynomial of matrix H in (4.61) is given by3 :

det (sI − H) = det (sI − A) det (I − QS(s)) det sI + AT (4.63)




Where the term S(s) is dened by:


−1
S(s) = (sI − A)−1 BR−1 BT sI + AT (4.64)

Setting Q = QT = 2αP, where α ≥ 0 is a design parameter, the algebraic


Riccati equation reads:

Q = QT = 2αP ⇒ (A + αI)T P + P (A + αI) − PBR−1 BT P = 0 (4.65)

which corresponds to the following Hamiltonian matrix H:

A + αI −BR−1 BT
 
H= (4.66)
0 − (A + αI)T

Let λi be the open-loop eigenvalues, that are the eigenvalues of matrix A.


As far as the eigenvalues of a matrix are the same than the eigenvalues of its
transpose, we can see that the 2n eigenvalues of the preceding Hamiltonian
matrix is the set {λi + α} ∪ {− (λi + α)}, i = 1, · · · , n.
Because the eigenvalues λαi of A + αI − BK are the n eigenvalues of the
Hamiltonian matrix H with negative real part, and denoting Re(λ) the real part
of λ, there are two possibilities:

Re(λi + α) = Re(λi ) + α ≤ 0 ⇒ λαi = λi + α
(4.67)
Re (− (λi + α)) = −Re(λi ) − α < 0 ⇒ λαi = −λi − α

Finally let λKi be the closed-loop eigenvalues, that are the eigenvalues of
matrix A − BK. The eigenvalues of A − BK are obtained from the eigenvalues
of A + αI − BK by subtracting α to λαi . Thus from (4.67), and given a
controllable pair (A, B), a positive denite symmetric matrix R and a positive
real constant α, the algebraic Riccati equation (4.65) where Q = QT = 2αP
has a unique positive denite solution P = PT > 0 such that λKi have the
following property:

 Re(λi ) ≤ −α ⇒ λKi = λi
Re(λi ) > −α ⇒ λKi = −λi − 2α ∀ i = 1, · · · , n (4.68)
Im(λKi ) = Im(λi )

104 Chapter 4. Design methods

Figure 4.2: Mirror property of LQR design when Q = QT = 2αP

This mirror property is illustrated in Figure 4.2.


Consequently, and denoting by Re(λKi ) the real part of λKi , it can be shown
that the positive denite real symmetric solution P of (4.65) is such that the
following mirror property holds4 :

 Re(λKi ) ≤ −α
Im(λKi ) = Im(λi ) ∀ i = 1, · · · , n (4.69)
(α + λi )2 = (α + λKi )2

Once the algebraic Riccati equation (4.65) is solved in P the classical LQR
design is applied: 
u(t) = −Kx(t)
(4.70)
K = R−1 BT P
It is worth noticing that the algebraic Riccati equation (4.65) can be changed
into a Lyapunov equation by pre- and post-multiplying (4.65) by P−1 and setting
X := P−1 :

(A + αI)T P + P (A + αI) − PBR−1 BT P = 0
⇒ P−1 (A + αI)T + (A + αI) P−1 − BR−1 BT = 0    (4.71)
X := P−1 ⇒ X (A + αI)T + (A + αI) X = BR−1 BT
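The mirror-property design is easy to sketch numerically: since Q = 2αP turns (4.65) into a standard algebraic Riccati equation with shifted state matrix A + αI and zero state weighting, an off-the-shelf Riccati solver can be used. The following Python sketch is a minimal example, where the plant data and the value α = 3 are assumptions chosen for illustration:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Assumed plant: open-loop eigenvalues are {-10, +1}
A = np.array([[0.0, 1.0], [10.0, -9.0]])
B = np.array([[0.0], [1.0]])
R = np.array([[1.0]])
alpha = 3.0                      # design parameter: mirror axis at Re(s) = -alpha

# Equation (4.65) is a standard ARE for A + alpha*I with zero state weighting
P = solve_continuous_are(A + alpha * np.eye(2), B, np.zeros((2, 2)), R)
K = np.linalg.solve(R, B.T @ P)  # K = R^-1 B^T P, see (4.70)
Q = 2.0 * alpha * P              # the state weighting actually minimized

print(np.linalg.eigvals(A - B @ K))
# Per (4.68): -10 (already left of -alpha) is kept, +1 is mirrored to -1-2*alpha = -7
```

Note that in this example P turns out to be only positive semi-definite (singular), because the eigenvalue −10 already lies to the left of −α and is left unchanged by the design.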

Matrix R remains the degree of freedom of the design, and it seems that it could be used, for example, to set the damping ratio of the complex conjugate dominant poles. Unfortunately (4.66) indicates that the eigenvalues of the Hamiltonian matrix H, which are closely related to the eigenvalues of the closed-loop system, are independent of matrix R. Thus matrix R has no influence on the location of the closed-loop poles in that situation.
Furthermore it is worth remembering that the larger the displacement of the closed-loop eigenvalues with respect to the open-loop eigenvalues, the larger the control effort. Thus specifying very fast dominant poles may lead to unacceptable control effort.
3 Y. Ochi and K. Kanai, Pole placement in optimal regulator by continuous pole-shifting, Journal of Guidance Control and Dynamics, Vol. 18, No. 6 (1995), pp. 1253-1258
4 M. H. Amin, Optimal pole shifting for continuous multivariable linear systems, Int. Journal of Control, Vol. 41, No. 3 (1985), pp. 701-707

4.3.2 Reduced-order model

The preceding result can be used to recursively shift to the left the real parts of all the poles of a system to any positions while preserving their imaginary parts. Let A ∈ Rn×n be the state matrix of the system to be controlled and B ∈ Rn×m the input matrix. We assume that all the eigenvalues of A are distinct, that (A, B) is controllable and that the symmetric positive definite weighting matrix R for the control is given. The purpose of this section is to compute the state weighting matrix Q which leads to the desired closed-loop eigenvalues by shifting recursively the actual eigenvalues of the state matrix. It is worth noticing that, through the shifting process, real eigenvalues remain real eigenvalues whereas complex conjugate eigenvalues remain complex conjugate eigenvalues.
The core idea of the method is to consider the transformation zi = CT x where C is appropriately chosen. This leads to the following reduced-order model, where matrix Λ corresponds to the diagonal (or Jordan) form of state matrix A:

zi = CT x ⇒ żi = Λ zi + G u  where  CT A = Λ CT ⇔ AT C = C ΛT  and  G = CT B    (4.72)

In this new basis the performance index becomes:

Ji = (1/2) ∫0∞ ( ziT Q̃i zi + uT R u ) dt  where  Q = C Q̃i CT    (4.73)

4.3.3 Shifting one real eigenvalue

Let λi be an eigenvalue of A. We will first assume that λi is real. We wish to shift λi to λKi.
Let v be a left eigenvector of A: vT A = λi vT. In other words, v is a (right) eigenvector of AT corresponding to λi: AT v = λi v. Then we define zi as follows:

zi := CT x where C = v    (4.74)

Using the fact that v is a (right) eigenvector of AT (zi = vT x), we can write:

żi = vT A x + vT B u = λi vT x + vT B u = λi zi + G u  where  G := vT B = CT B    (4.75)

Then setting u := −R−1 GT P̃ zi, where the scalar P̃ > 0 is a design parameter, and having in mind that zi is scalar (thus λi I = λi), we get:

żi = ( λi − G R−1 GT P̃ ) zi    (4.76)

Let λKi be the desired eigenvalue of the preceding reduced-order model. Then we shall have:

λKi = λi − G R−1 GT P̃    (4.77)

Thus, matrix P̃ reads:

P̃ = (λi − λKi) / (G R−1 GT)    (4.78)

The state weighting matrix Qi that will shift the open-loop eigenvalue λi to the closed-loop eigenvalue λKi is obtained through the identification ziT Q̃i zi = xT Qi x. We finally get:

zi = vT x := CT x ⇒ Qi = C Q̃i CT    (4.79)

Once matrix P̃ has been computed, matrix Q̃i is obtained thanks to the corresponding algebraic Riccati equation:

0 = P̃ λi + λi P̃ − P̃ G R−1 GT P̃ + Q̃i ⇔ Q̃i = −2 λi P̃ + P̃ G R−1 GT P̃    (4.80)
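As an illustration, the following Python sketch applies (4.74)-(4.80) to shift one real eigenvalue while leaving the other open-loop eigenvalue untouched; the plant data and the target eigenvalue are assumptions chosen for illustration:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Assumed plant: open-loop eigenvalues are {-10, +1}
A = np.array([[0.0, 1.0], [10.0, -9.0]])
B = np.array([[0.0], [1.0]])
R = np.array([[1.0]])
lam, lamK = 1.0, -2.0                  # shift the real eigenvalue +1 to -2

# Left eigenvector of A for lam, i.e. right eigenvector of A^T, see (4.74)
w, V = np.linalg.eig(A.T)
v = np.real(V[:, np.argmin(np.abs(w - lam))]).reshape(-1, 1)

G = v.T @ B                            # reduced-order input matrix (4.75)
GRG = float(G @ np.linalg.solve(R, G.T))
Ptilde = (lam - lamK) / GRG            # scalar solution (4.78)
Qtilde = -2.0 * lam * Ptilde + Ptilde * GRG * Ptilde   # (4.80)
Q = Qtilde * (v @ v.T)                 # back to the original basis (4.79)

# Standard LQR with the weighting matrix just built
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)
print(np.linalg.eigvals(A - B @ K))    # expected: {-10, -2}
```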

4.3.4 Shifting a pair of complex conjugate eigenvalues

The procedure to shift a pair of complex conjugate eigenvalues follows the same idea: let λi and λ̄i be a pair of complex conjugate eigenvalues of A. We wish to shift λi and λ̄i to λKi and λ̄Ki.
Let v and v̄ be a pair of left eigenvectors of A. In other words, v and v̄ are a pair of (right) eigenvectors of AT corresponding to λi and λ̄i:

[ vT ; v̄T ] A = [ λi , 0 ; 0 , λ̄i ] [ vT ; v̄T ] ⇔ AT [ v , v̄ ] = [ v , v̄ ] [ λi , 0 ; 0 , λ̄i ]    (4.81)

In order to manipulate real values, we will use the real part and the imaginary part of the preceding equation. Denoting λi := a + j b, that is a := Re(λi) and b := Im(λi), the preceding relation is equivalently replaced by the following one:

AT [ v , v̄ ] = [ v , v̄ ] [ λi , 0 ; 0 , λ̄i ] ⇔ AT [ Re(v) , Im(v) ] = [ Re(v) , Im(v) ] [ a , −b ; b , a ]    (4.82)

Then define zi as follows:

zi := CT x where C = [ Re(v) , Im(v) ]    (4.83)

Using the fact that v and v̄ are a pair of (right) eigenvectors of AT, we get:

żi = Ai zi + G u    (4.84)

where:

G = CT B  and  Ai = [ a , −b ; b , a ]    (4.85)

Then setting u = −R−1 GT P̃ zi, where the 2 × 2 positive definite matrix P̃ is a design parameter, we get:

żi = Λi zi  where  Λi = Ai − G R−1 GT P̃  and  P̃ = P̃T = [ p̃1 , p̃2 ; p̃2 , p̃3 ] > 0    (4.86)

Thus the closed-loop eigenvalues are the eigenvalues of matrix Λi. Here the design process becomes a little bit more involved because the parameters p̃1, p̃2 and p̃3 of matrix P̃ shall be chosen to meet the desired complex conjugate closed-loop eigenvalues λKi and λ̄Ki while minimizing the trace of P̃ (indeed it can be shown that min(Ji) = min(tr(P̃))). The design process has been described by Arar & Sawan5.
Alternatively, we can choose the three coefficients q̃1, q̃2 and q̃3 of matrix Q̃i = Q̃iT ≥ 0 such that the eigenvalues with negative real part of the following Hamiltonian matrix H̃i correspond to the desired eigenvalues λKi and λ̄Ki, as proposed by Fujinaka & Omatu6.
Thus the problem consists in finding matrix Q̃i:

Q̃i = [ q̃1 , q̃2 ; q̃2 , q̃3 ] = Q̃iT ≥ 0    (4.87)

such that:

det(sI − H̃i) = (s − λKi)(s − λ̄Ki)(s + λKi)(s + λ̄Ki) = s4 + c2 s2 + c0    (4.88)

where:

H̃i = [ Ai , −G R−1 GT ; −Q̃i , −Ai ]    (4.89)

Once matrix Q̃i has been computed, matrix Q is obtained as follows:

Q = C Q̃i CT    (4.90)

4.3.5 Sequential pole shifting via reduced-order models

When the imaginary part of the shifted eigenvalues is preserved, that is when Im(λKi) = Im(λi) and Im(λ̄Ki) = Im(λ̄i), the design process can be simplified by using the mirror property underlined by Amin4 and presented in Section 4.3.1: given a controllable pair (Λi, G), a positive definite symmetric matrix R and a positive real constant α, the following algebraic Riccati equation has a unique positive definite solution P̃ = P̃T > 0:

(Λi + αI)T P̃ + P̃ (Λi + αI) − P̃ G R−1 GT P̃ = 0    (4.91)
5 Abdul-Razzaq S. Arar, Mahmoud E. Sawan, Optimal pole placement with prescribed eigenvalues for continuous systems, Journal of the Franklin Institute, Volume 330, Issue 5, September 1993, Pages 985-994
6 Toru Fujinaka, Sigeru Omatu, Pole Placement Using Optimal Regulators, IEEJ Transactions on Electronics Information and Systems 121(1):240-245, January 2001, DOI: 10.1541/ieejeiss1987.121.1_240

Moreover the feedback control law u = −Ki x shifts the pair of complex conjugate eigenvalues (λi, λ̄i) of matrix A to a pair of complex conjugate eigenvalues (λKi, λ̄Ki) as follows, assuming α + Re(λi) ≥ 0:

Pi = C P̃ CT
Qi = 2α Pi ⇒ Re(λKi) = −(2α + Re(λi)) and Im(λKi) = Im(λi)    (4.92)
Ki = R−1 BT Pi = R−1 GT P̃ CT
The design process proposed by Amin4 to shift several eigenvalues recursively
is the following:

1. Set i = 1 and A1 = A.

2. Let λi be the eigenvalue of matrix Ai which is desired to be shifted:

− Assume that λi is real. We wish to shift λi to λKi ≤ λi. Then compute a (right) eigenvector v of AiT corresponding to λi. In other words vT is the left eigenvector of Ai: vT Ai = λi vT. Then compute C, G, α and Λi defined by:

C = v
G = CT B
α = −(λKi + λi)/2 ≥ 0 where λKi ≤ λi ∈ R    (4.93)
Λi = λi ∈ R

− Now assume that λi = a + jb is complex. We wish to shift λi and λ̄i to λKi and λ̄Ki where:

Re(λKi) ≤ Re(λi) := a
Im(λKi) = Im(λi) := b    (4.94)

This means that the shifted poles shall have the same imaginary parts as the original ones. Then compute (right) eigenvectors (v1, v2) of AiT corresponding to λi and λ̄i. In other words (v1, v2) are the left eigenvectors of Ai: [ v1T ; v2T ] Ai = [ λi , 0 ; 0 , λ̄i ] [ v1T ; v2T ]. Then compute C, G, α and Λi defined by:

v1 = v̄2 ⇒ C = [ Re(v1) , Im(v1) ]
G = CT B
α = −Re(λKi + λi)/2 ≥ 0    (4.95)
λi = a + jb ∈ C ⇒ Λi = [ a , −b ; b , a ]

3. Compute P̃ = P̃T > 0, which is defined as the unique positive definite solution of the following algebraic Riccati equation:

(Λi + αI)T P̃ + P̃ (Λi + αI) − P̃ G R−1 GT P̃ = 0    (4.96)

Alternatively, P̃ can be defined as follows:

P̃ = X−1    (4.97)

where X is the solution of the following Lyapunov equation:

(Λi + αI) X + X (Λi + αI)T = G R−1 GT    (4.98)

4. Compute Pi, Qi and Ki as follows:

Pi = C P̃ CT
Qi = 2α Pi    (4.99)
Ki = R−1 BT Pi = R−1 GT P̃ CT

5. Set i = i + 1 and Ai = Ai−1 − B Ki−1. Go to step 2 if some other open-loop eigenvalues have to be shifted.

Once the loop is finished, compute P = Σi Pi, Q = Σi Qi and K = Σi Ki. Gain K is such that the eigenvalues of A − BK are located at the desired values λKi. Furthermore Q is the weighting matrix for the state vector and P is the positive definite solution of the corresponding algebraic Riccati equation.
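The following Python sketch applies this recursion in the case where all the eigenvalues to be shifted are real; the plant data and the target eigenvalue are assumptions chosen for illustration, and the complex conjugate case would use the 2 × 2 blocks of (4.95) instead:

```python
import numpy as np

def shift_real_eigenvalues(A, B, R, shifts):
    """Amin's recursion; 'shifts' is a list of (lam, lamK) pairs with lamK <= lam real."""
    n, m = A.shape[0], B.shape[1]
    Ai = A.copy()
    P, Q, K = np.zeros((n, n)), np.zeros((n, n)), np.zeros((m, n))
    for lam, lamK in shifts:
        w, V = np.linalg.eig(Ai.T)                # right eigenvectors of Ai^T
        v = np.real(V[:, np.argmin(np.abs(w - lam))]).reshape(-1, 1)
        G = v.T @ B                               # reduced-order input matrix
        alpha = -(lamK + lam) / 2.0               # mirror axis, eq. (4.93)
        GRG = float(G @ np.linalg.solve(R, G.T))
        Pt = 2.0 * (lam + alpha) / GRG            # scalar ARE (4.96)
        Pi = Pt * (v @ v.T)                       # eq. (4.99)
        Ki = np.linalg.solve(R, B.T @ Pi)
        P, Q, K = P + Pi, Q + 2.0 * alpha * Pi, K + Ki
        Ai = Ai - B @ Ki                          # step 5: updated state matrix
    return P, Q, K

# Assumed example: shift the unstable eigenvalue +1 of A to -2, keep -10
A = np.array([[0.0, 1.0], [10.0, -9.0]])
B = np.array([[0.0], [1.0]])
R = np.array([[1.0]])
P, Q, K = shift_real_eigenvalues(A, B, R, [(1.0, -2.0)])
print(np.linalg.eigvals(A - B @ K))               # expected: {-10, -2}
```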

4.4 Frequency domain approach


4.4.1 Non optimal pole assignment
We have seen in (3.138) that thanks to the Hsu-Chen theorem the closed-loop
characteristic polynomial det(sI − A + BK) reads as follows:

det (sI − A + BK) = det (sI − A) det(I + KΦ(s)B) (4.100)


Let D(s) = det (sI − A) be the plant characteristic polynomial, and Nol(s) = adj (sI − A) B be the adjugate matrix of sI − A times matrix B:

Φ(s)B = (sI − A)−1 B = adj (sI − A) B / det (sI − A) := Nol(s) / D(s)    (4.101)
Consequently (4.100) reads:

det (sI − A + BK) = det (D(s)I + KNol (s)) (4.102)

If λKi is a desired closed-loop eigenvalue then the following relation holds:

det (D(s)I + KNol(s)) |s=λKi = 0    (4.103)

Consequently it is desired that matrix D(s)I + KNol(s) |s=λKi be singular. Let ωi be a vector belonging to the kernel of D(s)I + KNol(s) |s=λKi. Thus replacing s by λKi we can write:

(D(λKi)I + KNol(λKi)) ωi = 0    (4.104)

Actually, vector ωi ≠ 0 can be used as a design parameter.
Alternatively, drawing a parallel with the fact that λi is an eigenvalue of matrix A whenever det (sI − A) |s=λi = 0, we conclude that D(λKi) is an eigenvalue of matrix −KNol(λKi), and thus ωi is an eigenvector of −KNol(λKi) corresponding to the eigenvalue D(λKi). This remark can be extended to the output feedback case where Nol(s) = C adj (sI − A) B.
In order to get gain K the preceding relation is rewritten as follows:

KNol (λKi )ω i = −D(λKi )ω i (4.105)

This relation alone does not yield the gain K, since Nol(λKi)ωi is a vector, which is not invertible. Nevertheless, denoting by n the order of state matrix A, we can apply this relation for the n desired closed-loop eigenvalues. We get:

K [ vK1 · · · vKn ] = − [ p1 · · · pn ]    (4.106)

where vectors vKi and pi are given by:

vKi = Nol(λKi) ωi
pi = D(λKi) ωi    (4.107)

We finally get the following expression of gain K:

K = − [ p1 · · · pn ] [ vK1 · · · vKn ]−1    (4.108)
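A minimal numerical sketch of this (non optimal) pole assignment formula is given below for a single input plant, where the design parameters ωi are simply set to 1 (the plant data and the desired poles are assumptions; complex poles would require conjugate pairs of ωi so that K remains real):

```python
import numpy as np

# Assumed plant and desired closed-loop poles
A = np.array([[0.0, 1.0], [10.0, -9.0]])
B = np.array([[0.0], [1.0]])
desired = [-10.0, -2.0]
n = A.shape[0]

D = np.real(np.poly(np.linalg.eigvals(A)))   # D(s) = det(sI - A)

vK, p = [], []
for lamK in desired:
    omega = 1.0                              # design parameter (scalar here)
    # Nol(s) = adj(sI - A) B = D(s) (sI - A)^-1 B; shift s slightly if D(lamK) = 0
    s = lamK + 1e-6 if abs(np.polyval(D, lamK)) < 1e-9 else lamK
    Nol = np.polyval(D, s) * np.linalg.solve(s * np.eye(n) - A, B)
    vK.append(Nol.flatten() * omega)         # v_Ki = Nol(lamK) omega_i, (4.107)
    p.append(np.polyval(D, lamK) * omega)    # p_i = D(lamK) omega_i

K = -np.array([p]) @ np.linalg.inv(np.array(vK).T)   # gain formula (4.108)
print(K, np.linalg.eigvals(A - B @ K))       # expected poles: {-10, -2}
```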

4.4.2 Assignment of weighting matrices Q and R


The starting point is the Kalman equality (3.143) that we recall hereafter:

(I + KΦ(−s)B)T R (I + KΦ(s)B) = R + (Φ(−s)B)T Q (Φ(s)B) (4.109)

Using the fact that det (XY) = det (X) det (Y) leads to the following result:

det((I + KΦ(−s)B)T) det (R) det (I + KΦ(s)B) = det(R + (Φ(−s)B)T Q (Φ(s)B))
⇔ det((I + KΦ(−s)B)T) det (I + KΦ(s)B) = det(I + R−1 (Φ(−s)B)T Q (Φ(s)B))    (4.110)
On the other hand, let D(s) be the open-loop characteristic polynomial and β(s) be the closed-loop characteristic polynomial:

D(s) = det (sI − A)
β(s) = det (sI − A + BK)    (4.111)

As in the previous section, let Nol(s) be the following polynomial matrix:

Nol(s) := adj (sI − A) B    (4.112)

Then we get:

Φ(s) := (sI − A)−1 ⇒ Φ(s) B = (sI − A)−1 B := Nol(s) / D(s)    (4.113)
Furthermore the Hsu-Chen equality (3.138) reads as follows with those notations:

det (sI − A + BK) = det (sI − A) det (I + KΦ(s)B)
⇔ det (I + KΦ(s)B) = det (sI − A + BK) / det (sI − A) := β(s) / D(s)    (4.114)

Finally, using the fact that det(XT) = det (X), relation (4.110) becomes:

(β(−s) β(s)) / (D(−s) D(s)) = det(I + R−1 (Φ(−s)B)T Q (Φ(s)B))    (4.115)
We finally get the following result, where det(I + R−1 (Φ(−s)B)T Q (Φ(s)B)) is a rational fraction whose denominator is D(s) D(−s), that is the denominator of Φ(−s)Φ(s):

β(s) β(−s) = D(s) D(−s) det(I + R−1 (Φ(−s)B)T Q (Φ(s)B))
= D(s) D(−s) det(I + R−1 (Nol(−s)/D(−s))T Q (Nol(s)/D(s)))    (4.116)
= det(D(s) D(−s) I + R−1 Nol(−s)T Q Nol(s))

Thus the closed-loop eigenvalues are the roots λKi with negative real part such that:

det(D(s) D(−s) I + R−1 Nol(−s)T Q Nol(s)) |s=λKi = 0    (4.117)

Relation (4.117) indicates that there exist eigenvectors ωi ≠ 0 such that for a given closed-loop eigenvalue λKi the following relation holds7:

( D(−λKi)D(λKi) I + R−1 (Nol(−λKi))T Q Nol(λKi) ) ωi = 0    (4.118)

Let n be the order of state matrix A. Once an eigenvector ωi ≠ 0 has been obtained for each λKi, relation (4.108) can be used to compute the optimal gain K as follows:

K = − [ p1 · · · pn ] [ vK1 · · · vKn ]−1    (4.119)

where vectors vKi and pi are given as in the non optimal pole assignment problem:

vKi = Nol(λKi) ωi
pi = D(λKi) ωi    (4.120)
7 L.S. Shieh, H.M. Dib, R.E. Yates, Sequential design of linear quadratic state regulators via the optimal root-locus techniques, IEE Proceedings D - Control Theory and Applications, Volume 135, Issue 4, July 1988, DOI: 10.1049/ip-d.1988.0040

It is worth noticing that if the open-loop eigenvalue λi is desired to be kept in the closed-loop, then applying (4.118) with the relations D(λi) = 0 and λi = λKi implies that pi = 0 and that vKi shall be chosen such that Q vKi = 0. Indeed:

( D(−λKi)D(λKi) I + R−1 (Nol(−λKi))T Q Nol(λKi) ) ωi = 0
D(λi) = 0 and λi = λKi    (4.121)
⇒ pi = D(λKi) ωi = D(λi) ωi = 0 and Q Nol(λi) ωi = Q vKi = 0

On the other hand, if λKi and ωi are set, then Q and R shall be chosen such that (4.118) holds ∀ i. Once matrix R = RT > 0 has been set, matrix Q can be assumed to be a real diagonal matrix whose coefficients qi shall be computed to comply with (4.118):

Q = QT = diag(q1, · · · , qn) ∈ Rn×n    (4.122)

Nevertheless, take care that the computed matrix Q = QT may not be positive semi-definite in that case.
Finally, for a single input system, ωi ≠ 0 and R > 0 are scalars and (4.118) reduces as follows:

ωi ≠ 0 ∈ R ⇒ D(−λKi)D(λKi) + (Nol(−λKi))T (Q/R) Nol(λKi) = 0 ∀ i    (4.123)

Example 4.1. We consider the following state equation:

ẋ = [ 0 , 1 ; 10 , −9 ] x + [ 0 ; 1 ] u    (4.124)

We wish to design an optimal state feedback controller such that the closed-loop poles are located at {λK1 = −10, λK2 = −2}.
To solve this problem, we first observe that the eigenvalues of A are {λ1 = −10, λ2 = 1}. Thus the problem consists in preserving λ1 = −10 in the state feedback loop while shifting λ2 = 1 towards λK2 = −2. Because we are looking for an optimal state feedback controller, we have to select matrices Q = QT ≥ 0 and R > 0 to achieve those specifications.
The characteristic polynomial D(s) of state matrix A reads:

D(s) = det (sI − A) = s2 + 9s − 10 ⇒ D(λK1) = D(−10) = 0 and D(λK2) = D(−2) = −24    (4.125)

In addition, let Nol(s) be the following polynomial matrix:

Nol(s) := adj (sI − A) B = [ s+9 , 1 ; 10 , s ] [ 0 ; 1 ] = [ 1 ; s ]

⇒ Nol(λK1) = Nol(−10) = [ 1 , −10 ]T
  Nol(−λK1) = Nol(10) = [ 1 , 10 ]T    (4.126)
  Nol(λK2) = Nol(−2) = [ 1 , −2 ]T
  Nol(−λK2) = Nol(2) = [ 1 , 2 ]T

Because we focus on a single input system, we use relation (4.123) to select Q = QT ≥ 0 and R > 0. Furthermore we will assume that Q/R is a diagonal matrix:

Q/R := [ qr1 , 0 ; 0 , qr2 ]    (4.127)

We get:

D(−λKi)D(λKi) + (Nol(−λKi))T (Q/R) Nol(λKi) = 0
⇔ qr1 − 100 qr2 = 0 and qr1 − 4 qr2 − 288 = 0 ⇔ [ 1 , −100 ; 1 , −4 ] [ qr1 ; qr2 ] = [ 0 ; 288 ]    (4.128)

We finally get:

[ qr1 ; qr2 ] = [ 1 , −100 ; 1 , −4 ]−1 [ 0 ; 288 ] = [ 300 ; 3 ]
⇒ Q/R := [ qr1 , 0 ; 0 , qr2 ] = [ 300 , 0 ; 0 , 3 ]    (4.129)
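This result is easily checked numerically: solving the LQR problem with Q = diag(300, 3) and R = 1 should place the closed-loop poles at {−10, −2}. A minimal sketch:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [10.0, -9.0]])
B = np.array([[0.0], [1.0]])
Q = np.diag([300.0, 3.0])             # weighting matrix found in (4.129), with R = 1
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)
print(np.linalg.eigvals(A - B @ K))   # expected: {-10, -2}
```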

4.5 Poles assignment in optimal regulator through matrix inequalities

In this section a method for designing a linear quadratic regulator with prescribed closed-loop poles is presented.
Let Λcl = {λ1, λ2, · · · , λn} be a set of prescribed closed-loop eigenvalues, where Re(λi) < 0 and λi ∈ Λcl implies that the complex conjugate of λi, which is denoted λ∗i, also belongs to Λcl. The problem consists in finding a state feedback controller u = −Kx such that the eigenvalues of A − BK, which are denoted λ(A − BK), belong to Λcl:

λ(A − BK) = Λcl    (4.130)

while minimizing the quadratic performance index J(u(t)) for some Q > 0 and R > 0.
J(u(t)) = (1/2) ∫0∞ ( xT(t)Qx(t) + uT(t)Ru(t) ) dt    (4.131)

We provide in this section the material written by He, Cai and Han8.
Assume that (A, B) is controllable. Then the pole assignment problem is solvable if and only if there exist two matrices X1 ∈ Rn×n and X2 ∈ Rn×n such that the following matrix inequalities are satisfied:

FT X2T X1 + X1T X2 F + X2T B R−1 BT X2 ≤ 0
X1T X2 = X2T X1 > 0    (4.132)

where F is any matrix such that λ(F) = Λcl and (X1, X2) satisfies the following generalized Sylvester matrix equation9:

AX1 − X1 F = B R−1 BT X2    (4.133)

If (X1, X2) is a feasible solution to the above two inequalities, then the weighting matrix Q in the quadratic performance index J(u(t)) can be chosen as follows:

Q = −AT X2 X1−1 − X2 F X1−1    (4.134)

In addition, the solution of the corresponding algebraic Riccati equation reads:

P = X2 X1−1    (4.135)

The starting point to get this result is the fact that there must exist an
eigenvector matrix X such that the following formula involving Hamiltonian
matrix H holds:
HX = XF (4.136)

Splitting the 2n × n matrix X into 2 square n × n matrices X1 and X2 and using the expression of the 2n × 2n Hamiltonian matrix H leads to the following relation:

X = [ X1 ; X2 ] ⇒ [ A , −BR−1 BT ; −Q , −AT ] [ X1 ; X2 ] = [ X1 ; X2 ] F    (4.137)

The preceding relation is expanded as follows:

AX1 − BR−1 BT X2 = X1 F and −QX1 − AT X2 = X2 F
⇔ AX1 − X1 F = BR−1 BT X2 and Q = −AT X2 X1−1 − X2 F X1−1    (4.138)

8 Hua-Feng He, Guang-Bin Cai and Xiao-Jun Han, Optimal Pole Assignment of Linear Systems by the Sylvester Matrix Equations, Hindawi Publishing Corporation, Abstract and Applied Analysis, Volume 2014, Article ID 301375, http://dx.doi.org/10.1155/2014/301375
9 Bin Zhou, Guangren Duan, An explicit solution to right factorization with application in eigenstructure assignment, Journal of Control Theory and Applications 08/2005; 3(3):275-279, DOI: 10.1007/s11768-005-0049-7

Since X1 is nonsingular, matrix Q is positive definite if and only if X1T Q X1 is positive definite. Using the first equation of (4.138) in the second one, we get:

X1T Q X1 = −X1T AT X2 − X1T X2 F
= −(AX1)T X2 − X1T X2 F
= −(X1 F + BR−1 BT X2)T X2 − X1T X2 F    (4.139)
= −(FT X2T X1 + X1T X2 F + X2T BR−1 BT X2)

We recall Schur's formula:

det [ A11 , A12 ; A21 , A22 ] = det(A22) det(A11 − A12 A22−1 A21) = det(A11) det(A22 − A21 A11−1 A12)    (4.140)

Using Schur's formula and denoting S = X1T X2, we finally get:

X1T Q X1 ≥ 0 ⇔ [ FT S + SF , X2T B ; BT X2 , −R ] ≤ 0    (4.141)

4.6 Model matching

4.6.1 Cross-term in the performance index

Assume that the output z(t) of interest is expressed as a linear combination of the state vector x(t) and the control u(t): z(t) = N x(t) + D u(t). Thus the cost to be minimized reads:

J(u(t)) = (1/2) ∫0∞ ( zT(t)z(t) + uT(t)R1u(t) ) dt where z(t) = N x(t) + D u(t) and R1 = R1T > 0
⇒ J(u(t)) = (1/2) ∫0∞ ( (xT(t)NT + uT(t)DT)(N x(t) + D u(t)) + uT(t)R1u(t) ) dt    (4.142)
Then we get a more general form of the quadratic performance index. Indeed the quadratic performance index can be rewritten as:

J(u(t)) = (1/2) ∫0∞ [ x ; u ]T [ Q , S ; ST , R ] [ x ; u ] dt = (1/2) ∫0∞ ( xT(t)Qx(t) + uT(t)Ru(t) + 2xT(t)Su(t) ) dt    (4.143)

where:

Q = QT := NT N ≥ 0
R = RT := DT D + R1 > 0    (4.144)
S := NT D
It can be seen that:

xT(t)Qx(t) + uT(t)Ru(t) + 2xT(t)Su(t) = xT(t)Qm x(t) + vT(t)Rv(t)    (4.145)

where:

Qm = Q − S R−1 ST
v(t) = u(t) + R−1 ST x(t)    (4.146)
Hence cost (4.143) can be rewritten as:

J(u(t)) = (1/2) ∫0∞ ( xT(t)Qm x(t) + vT(t)Rv(t) ) dt    (4.147)

Moreover the plant dynamics ẋ(t) = Ax(t) + Bu(t) is modified as follows:

ẋ = Ax(t) + Bu(t) = Ax(t) + B ( v(t) − R−1 ST x(t) ) = Am x(t) + Bv(t)  where  Am = A − B R−1 ST    (4.148)

Assuming that Qm (which is symmetric) is positive definite, we then get a standard LQR problem for which the optimal state feedback control law is given from (3.9):

v(t) = −R−1 BT Px(t) ⇒ u(t) = −Kx(t)  where  K = R−1 (PB + S)T    (4.149)

where matrix P is the positive definite matrix which solves the following algebraic Riccati equation (see (3.8)):

PAm + AmT P − PBR−1 BT P + Qm = 0    (4.150)

It is worth noticing that the robustness properties of the LQ state feedback are lost if the cost to be minimized contains a state-control cross-term, as is the case here.
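As an illustration, the sketch below (with assumed matrices N, D and R1) builds Q, R and S from (4.144) and solves the cross-term problem either through the change of variable (4.146)-(4.148) or directly with scipy's Riccati solver, which accepts a cross-weighting term; both routes should give the same gain K:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Assumed plant and output of interest z = N x + D u
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
N = np.array([[1.0, 0.0]])
D = np.array([[0.5]])
R1 = np.array([[1.0]])

Q = N.T @ N                       # (4.144)
R = D.T @ D + R1
S = N.T @ D

# Route 1: ARE solver with cross term s = S
P1 = solve_continuous_are(A, B, Q, R, s=S)
K1 = np.linalg.solve(R, (P1 @ B + S).T)   # K = R^-1 (PB + S)^T, see (4.149)

# Route 2: change of variable (4.146)-(4.148), then standard LQR
Am = A - B @ np.linalg.solve(R, S.T)
Qm = Q - S @ np.linalg.solve(R, S.T)
P2 = solve_continuous_are(Am, B, Qm, R)
K2 = np.linalg.solve(R, (P2 @ B + S).T)

print(np.allclose(K1, K2))        # both routes coincide
```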

4.6.2 Implicit reference model


Let Ar be the desired closed-loop state matrix of the system and e(t) be the
following error vector:
e(t) := ẋ(t) − Ar x(t) (4.151)

In this section we consider the problem of finding the control u(t) which minimizes the following performance index:

J(u(t)) = (1/2) ∫0∞ eT(t)e(t) dt = (1/2) ∫0∞ (ẋ(t) − Ar x(t))T (ẋ(t) − Ar x(t)) dt    (4.152)

Expanding ẋ(t) we get:

J(u(t)) = (1/2) ∫0∞ (Ax(t) + Bu(t) − Ar x(t))T (Ax(t) + Bu(t) − Ar x(t)) dt
= (1/2) ∫0∞ ((A − Ar) x(t) + Bu(t))T ((A − Ar) x(t) + Bu(t)) dt    (4.153)

We get a cost to be minimized which contains a state-control cross-term:

J(u(t)) = (1/2) ∫0∞ ( xT(t)Qx(t) + uT(t)Ru(t) + 2xT(t)Nu(t) ) dt    (4.154)
where:

Q = (A − Ar)T (A − Ar)
R = BT B    (4.155)
N = (A − Ar)T B
Then we can re-use the results of section 4.6.1. Let P be the positive definite matrix which solves the following algebraic Riccati equation:

PAm + AmT P − PBR−1 BT P + Qm = 0    (4.156)

where:

Qm = Q − N R−1 NT
Am = A − B R−1 NT    (4.157)
v(t) = u(t) + R−1 NT x(t)

The stabilizing control u(t) is then defined in a similar fashion to (4.149):

v(t) = −R−1 BT Px(t) ⇒ u(t) = −Kx(t)  where  K = R−1 (PB + N)T    (4.158)

It is worth noticing that the robustness properties of the LQ state feedback are lost because the cost to be minimized contains a state-control cross-term here.
Furthermore let V be the change of basis matrix to the Jordan form Λr of
the desired closed-loop state matrix Ar :

Λr = V−1 Ar V (4.159)

Let Acl be the state matrix of the closed-loop, which is written using matrix V as follows:

ẋ(t) = Acl x(t) = VΛcl V−1 x(t)    (4.160)

Assuming that the desired Jordan form Λr is a diagonal matrix and using the fact that V−1 = VT, the product eT(t)e(t) in (4.152) reads as follows:

eT(t)e(t) = xT(t) (Acl − Ar)T (Acl − Ar) x(t) = xT(t) V (Λcl − Λr)T (Λcl − Λr) VT x(t)    (4.161)

From the preceding equation it is clear that minimizing the cost J(u(t)) = (1/2) ∫0∞ eT(t)e(t) dt consists in finding the control u(t) which minimizes the gap between the desired eigenvalues (which are set in Λr) and the actual eigenvalues of the closed-loop.

4.7 Optimal output feedback

4.7.1 Reformulation of the state feedback optimal control problem

We consider in this section the following plant:

ẋ(t) = Ax(t) + Bu(t)    (4.162)

We seek the control which minimizes the following performance index J, where Q = QT ≥ 0 and R = RT > 0:

J = ∫0∞ ( xT(t)Qx(t) + uT(t)Ru(t) ) dt    (4.163)

We will assume that we seek a stabilizing static state feedback gain K:

u(t) = −Kx(t)    (4.164)

Using the relation u(t) = −Kx(t), the performance index J reads:

J = ∫0∞ ( xT(t)Qx(t) + (Kx(t))T RKx(t) ) dt = ∫0∞ xT(t) ( Q + KT RK ) x(t) dt    (4.165)

Moreover, the dynamics of the plant where u(t) = −Kx(t) reads:

u(t) = −Kx(t) ⇒ ẋ(t) = (A − BK) x(t) (4.166)

Thus, after integration, and denoting by x(0) the initial value of x(t), we
get:
x(t) = e(A−BK) t x(0) (4.167)
Consequently, the performance index J reads:

J = ∫0∞ xT(t) ( Q + KT RK ) x(t) dt = ∫0∞ x(0)T e(A−BK)T t ( Q + KT RK ) e(A−BK) t x(0) dt    (4.168)

Using the property of the trace, i.e. tr (XYZ) = tr (YZX), the performance index J is written as follows:

J = ∫0∞ x(0)T e(A−BK)T t ( Q + KT RK ) e(A−BK) t x(0) dt
= tr ( ∫0∞ e(A−BK)T t ( Q + KT RK ) e(A−BK) t x(0)x(0)T dt )    (4.169)

Because the initial value x(0) of the state vector is usually unknown, the
product x(0)x(0)T is removed from the performance index J . We get:

J = tr (P) (4.170)

where matrix P is the so-called Gramian:

P = ∫0∞ e(A−BK)T t ( Q + KT RK ) e(A−BK) t dt = PT > 0    (4.171)

When multiplying P by e(A−BK)T t0 on the left and by e(A−BK) t0 on the right, we get ∀ t0 ∈ R:

e(A−BK)T t0 P e(A−BK) t0 = ∫0∞ e(A−BK)T (t+t0) ( Q + KT RK ) e(A−BK) (t+t0) dt
= ∫t0∞ e(A−BK)T t ( Q + KT RK ) e(A−BK) t dt    (4.172)
Differentiation with respect to t0, using the facts that matrix A − BK is assumed to be stable (that is limt→∞ e(A−BK)T t = limt→∞ e(A−BK) t = 0) and that g(x) = ∫a(x)b(x) f(τ) dτ ⇒ g′(x) = f(b(x)) b′(x) − f(a(x)) a′(x), yields:

(A − BK)T e(A−BK)T t0 P e(A−BK) t0 + e(A−BK)T t0 P e(A−BK) t0 (A − BK) = −e(A−BK)T t0 ( Q + KT RK ) e(A−BK) t0    (4.173)


Finally setting t0 = 0 leads to the following Lyapunov equation:

(A − BK)T P + P (A − BK) = −( Q + KT RK )    (4.174)


Alternatively, and following Lewis et al.10, assume that there exists a positive definite matrix P = PT > 0 such that the following equality holds:

(d/dt) ( xT(t)Px(t) ) = −xT(t) ( Q + KT RK ) x(t)    (4.175)

Then the performance index J defined in (4.165) reads:

J = ∫0∞ xT(t) ( Q + KT RK ) x(t) dt = −∫0∞ (d/dt) ( xT(t)Px(t) ) dt
= −[ xT(t)Px(t) ]t=0t→∞ = xT(0) P x(0) − limt→∞ xT(t) P x(t)    (4.176)
Assuming that the closed-loop is stable so that x(t) vanishes with time, we get the following relation, where tr (X) denotes the trace of matrix X:

limt→∞ x(t) = 0 ⇒ J = xT(0) P x(0) = tr ( P x(0) xT(0) )    (4.177)

Furthermore, when using (4.162) and (4.163) in (4.175), we can write:

−xT(t) ( Q + KT RK ) x(t) = (d/dt) ( xT(t)Px(t) ) = ẋT(t)Px(t) + xT(t)Pẋ(t)
= (Ax(t) + Bu(t))T Px(t) + xT(t)P (Ax(t) + Bu(t))
= ((A − BK) x(t))T Px(t) + xT(t)P ((A − BK) x(t))    (4.178)
= xT(t) ( (A − BK)T P + P (A − BK) ) x(t)

Since this relation shall hold for all values of x(t), we shall have:

−( Q + KT RK ) = (A − BK)T P + P (A − BK)    (4.179)

We retrieve the Lyapunov equation (4.174).


Consequently, the dynamic optimization control problem can be converted into the following equivalent static optimization control problem:

Find P = PT > 0 and K which minimize tr (P)
under the constraint AclT P + PAcl + Q + KT RK = 0    (4.180)
where Acl := A − BK

10 Lewis F., Vrabie D., Syrmos V., Optimal Control, John Wiley & Sons, 3rd Edition, 2012

The constraint to be satisfied is simply the algebraic Riccati equation. Indeed, by adding and subtracting the terms PBK and (BK)T P within the algebraic Riccati equation we get:

AT P + PA − PBR−1 BT P + Q = 0
⇔ (A − BK)T P + P (A − BK) + KT BT P + PBK − PBR−1 BT P + Q = 0
K = R−1 BT P ⇒ PBK = PBR−1 BT P, so these two terms cancel    (4.181)
BT P = RK ⇒ (A − BK)T P + P (A − BK) + Q + KT RK = 0

4.7.2 Output feedback optimal control problem

The preceding result can be extended to the output feedback optimal control problem. Indeed, consider the following plant:

ẋ(t) = Ax(t) + Bu(t)
y(t) = C x(t)    (4.182)

We wish to find a stabilizing static output feedback gain K, u(t) = −Ky(t), which minimizes the following performance index J, where Q = QT ≥ 0 and R = RT > 0:

J = ∫0∞ ( xT(t)Qx(t) + uT(t)Ru(t) ) dt where u(t) = −Ky(t) = −KCx(t)    (4.183)
Let:

QK = Q + CT KT RKC = QKT ≥ 0    (4.184)
Then the performance index J reads:

J = ∫0∞ ( xT(t)Qx(t) + (KCx(t))T RKCx(t) ) dt = ∫0∞ xT(t) ( Q + CT KT RKC ) x(t) dt = ∫0∞ xT(t) QK x(t) dt    (4.185)

Following Lewis et al.10, assume that there exists a positive definite matrix P = PT > 0 such that the following equality holds:

(d/dt) ( xT(t)Px(t) ) = −xT(t) QK x(t)    (4.186)

The performance index J now reads:

J = −∫0∞ (d/dt) ( xT(t)Px(t) ) dt = −[ xT(t)Px(t) ]t=0t→∞ = xT(0) P x(0) − limt→∞ xT(t) P x(t)    (4.187)

Assuming that the closed-loop is stable so that x(t) vanishes with time, we get the following relation, where tr (X) denotes the trace of matrix X:

limt→∞ x(t) = 0 ⇒ J = xT(0) P x(0) = tr ( P x(0) xT(0) )    (4.188)

Furthermore, when using (4.182) and (4.183) in (4.186), we can write:

−xT(t) QK x(t) = (d/dt) ( xT(t)Px(t) ) = ẋT(t)Px(t) + xT(t)Pẋ(t)
= (Ax(t) + Bu(t))T Px(t) + xT(t)P (Ax(t) + Bu(t))
= ((A − BKC) x(t))T Px(t) + xT(t)P ((A − BKC) x(t))    (4.189)
= xT(t) ( (A − BKC)T P + P (A − BKC) ) x(t)

Since this relation shall hold for all values of x(t), we shall have:

−QK = (A − BKC)T P + P (A − BKC)    (4.190)

Consequently, the dynamical optimization control problem can be converted into the following equivalent static optimization control problem:

Find P = PT > 0 and K which minimize tr (P)
under the constraint (A − BKC)T P + P (A − BKC) + QK = 0    (4.191)

4.7.3 Solution of the output feedback optimal control problem

First, we recall the following property: let XYZ be a square matrix where matrices X, Y and Z are of appropriate dimensions. Then the following properties hold10:

∂ tr(XY)/∂Y = XT
∂ tr(XYZ)/∂Y = XT ZT    (4.192)
∂ tr(XYT Z)/∂Y = ∂ tr(ZT YXT)/∂Y = ZX

Example 4.2. In order to illustrate the first relation, we consider the following matrices:

X = [ 2 , 3 ; 4 , 5 ]
Y = [ y11 , y12 ; y21 , y22 ]    (4.193)

Then:

XY = [ 2 , 3 ; 4 , 5 ] [ y11 , y12 ; y21 , y22 ] = [ 2y11 + 3y21 , 2y12 + 3y22 ; 4y11 + 5y21 , 4y12 + 5y22 ]
⇒ tr (XY) = 2y11 + 3y21 + 4y12 + 5y22    (4.194)

Thus:

∂ tr (XY)/∂Y = [ ∂ tr(XY)/∂y11 , ∂ tr(XY)/∂y12 ; ∂ tr(XY)/∂y21 , ∂ tr(XY)/∂y22 ] = [ 2 , 4 ; 3 , 5 ] = XT    (4.195)



Now, we are in position to solve the output feedback optimal control problem thanks to the Lagrange multiplier approach. We define the (scalar) Hamiltonian H as follows, where Λ = ΛT is a symmetric n × n matrix of Lagrange multipliers to be determined:

H = tr (P) + tr ( Λ ( (A − BKC)T P + P (A − BKC) + QK ) )    (4.196)

The necessary conditions to solve this optimization problem with respect to matrices K, Λ and P read as follows:

∂H/∂K = 0 ⇒ 2 ( RKCΛCT − BT PΛCT ) = 0
∂H/∂Λ = 0 ⇒ (A − BKC)T P + P (A − BKC) + QK = 0    (4.197)
∂H/∂P = 0 ⇒ I + (A − BKC) Λ + Λ (A − BKC)T = 0
From the first equation, and assuming that CΛCT is nonsingular, the static output feedback gain K can be computed as a function of the Lagrange multipliers Λ:

∂H/∂K = 0 ⇒ K = R−1 BT PΛCT ( CΛCT )−1    (4.198)

It is worth noticing that for the static state feedback case where C = I, the static state feedback gain K no longer depends on the Lagrange multipliers Λ:

C = I ⇒ K = R−1 BT P    (4.199)

Moreover, in the state feedback case the Lyapunov equation specifying the constraint turns into the algebraic Riccati equation:

C = I ⇒ K = R−1 BT P ⇒ QK = Q + KT RK = Q + PBR−1 BT P
⇒ 0 = (A − BKC)T P + P (A − BKC) + QK
= (A − BK)T P + P (A − BK) + Q + PBR−1 BT P    (4.200)
= AT P + PA − PBR−1 BT P + Q
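The necessary conditions (4.197)-(4.198) suggest a simple fixed-point iteration (in the spirit of the Levine-Athans algorithm): given a stabilizing gain K, solve the two Lyapunov equations for P and Λ, then update K from (4.198). The sketch below assumes an initial stabilizing gain K0 is available and uses a damped update to help convergence; convergence is only local:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov as lyap

def lq_output_feedback(A, B, C, Q, R, K0, n_iter=200, step=0.5):
    """Fixed-point iteration on the necessary conditions (4.197)-(4.198)."""
    K = K0.copy()                       # K0 must stabilize A - B K0 C
    n = A.shape[0]
    for _ in range(n_iter):
        Acl = A - B @ K @ C
        QK = Q + C.T @ K.T @ R @ K @ C
        P = lyap(Acl.T, -QK)            # Acl^T P + P Acl + QK = 0
        L = lyap(Acl, -np.eye(n))       # Acl L + L Acl^T + I = 0
        Knew = np.linalg.solve(R, B.T @ P @ L @ C.T) @ np.linalg.inv(C @ L @ C.T)
        K = (1.0 - step) * K + step * Knew
    return K, P
```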

4.7.4 Poles placement in a specified region

In this section, we still wish to find a stabilizing static output feedback gain K, u(t) = −Ky(t), which minimizes the performance index J defined in (4.183). We add the constraint that the closed-loop poles are situated in the sector region C(α, θ) shown in Figure 4.3.
Let QK be defined as in (4.184) and let Acl be the closed-loop state matrix:

Acl = A − BKC    (4.201)

We have seen in the previous section that there must exist a positive definite matrix P = PT > 0 such that the following equality holds:

AclT P + PAcl + QK = 0    (4.202)



Figure 4.3: Sector region C(α, θ)

Then the minimum cost J∗ reads as follows, where tr (X) denotes the trace of matrix X:

J∗ = xT(0) P x(0) = tr ( P x(0) xT(0) )    (4.203)
Following Yuan et al.11, let matrix Ã be defined as follows, where ⊗ denotes the Kronecker product and α ≤ 0:

Aα = A − αI
Ã = [ sin(θ)Aα , cos(θ)Aα ; −cos(θ)Aα , sin(θ)Aα ] := [ sin(θ) , cos(θ) ; −cos(θ) , sin(θ) ] ⊗ Aα    (4.204)

Then it can be shown11 that all the eigenvalues of matrix A will be situated inside the sector region C(α, θ) shown in Figure 4.3 if and only if all the eigenvalues of matrix Ã are situated in the left half plane. Denoting by Ψ(s) the polynomial whose companion matrix is Ã, this can be tested for example with the Routh-Hurwitz criterion.
This result can be extended to the case where we add the constraint that all the eigenvalues of state matrix A shall have a real part greater than αm. This will be the case if and only if all the eigenvalues of the following block diagonal matrix are situated in the left half plane:

[ Ã , 0 ; 0 , αm I − A ]    (4.205)

Moreover let:

[ B̃1 , B̃2 ] := [ sin(θ) , cos(θ) ; −cos(θ) , sin(θ) ] ⊗ B
Ãcl = Ã − [ B̃1 , B̃2 ] ( I2 ⊗ KC )    (4.206)

11 Yuan L., Achenie L., Jiang W., Linear Quadratic Optimal Output Feedback Control For Systems With Poles In A Specified Region, International Journal of Control, Vol. 64(6), pp. 1151-1164, 1996

Then the optimal control problem for poles placement in a specified region reads as follows11:

Find P = PT > 0 and K which minimize tr (P) under the constraints:
(A − BKC)T P + P (A − BKC) + QK = 0    (4.207)
ÃclT P + P Ãcl < 0

where:

Ãcl = [ sin(θ) , cos(θ) ; −cos(θ) , sin(θ) ] ⊗ (A − αI − BKC)    (4.208)

4.8 Frequency shaped LQ control

Often system performances are specified in the frequency domain. The purpose of this section is to shift the time domain nature of the LQR problem into the frequency domain, as proposed by Gupta in 198012. This is done thanks to Parseval's theorem, which enables the performance index J to be minimized to be written as follows, where ω represents the frequency (in rad/sec):

J = ∫0∞ ( xT(t)Qx(t) + uT(t)Ru(t) ) dt = (1/2π) ∫0∞ ( xT(−jω)Qx(jω) + uT(−jω)Ru(jω) ) dω    (4.209)

Then the constant weighting matrices Q and R are modified to be functions of the frequency ω in order to place distinct penalties on the state and control cost at various frequencies:

Q = Q(ω) = WqT(−jω)Wq(jω)
R = R(ω) = WrT(−jω)Wr(jω)    (4.210)
For the existence of the solution of the LQ regulator, matrix R(ω) shall be of full rank. Since we seek to minimize the quadratic cost J, large terms in the integrand incur greater penalties than small terms and more effort is exerted to make them small. Thus if there is, for example, a high frequency region where the model of the plant presents unmodeled dynamics, and if the control weight Wr(jω) is chosen to have large magnitude over this region, then the resulting controller will not exert substantial energy in this region. This in turn limits the controller bandwidth.
Let us define the following vectors to carry the dynamics of the weights in the frequency domain, where s denotes the Laplace variable:

z(s) = Wq(s) x(s)
v(s) = Wr(s) u(s)    (4.211)
In order to simplify the process of selecting useful weights, it is common to choose the weighting matrices to be scalar functions multiplying the identity matrix:

Wq(s) = wq(s) I
Wr(s) = wr(s) I    (4.212)
12 Narendra K. Gupta, Frequency-Shaped Cost Functionals: Extension of Linear Quadratic Gaussian Design Methods, Journal of Guidance Control and Dynamics, 3(6):529-535, 1980, DOI: 10.2514/3.19722

The performance index J to be minimized in (4.209) becomes:

J = (1/2π) ∫0∞ ( zT(−jω)z(jω) + vT(−jω)v(jω) ) dω    (4.213)

Using Parseval's theorem we get in the time domain:

J = ∫0∞ ( zT(t)z(t) + vT(t)v(t) ) dt    (4.214)

Let the state space model of the first equation of (4.211) be the following, where z(t) is the output and x(t) the input of this MIMO system:

χ̇q(t) = Aq χq(t) + Bq x(t)
z(t) = Nq χq(t) + Dq x(t)    (4.215)
⇒ z(s) = ( Nq (sI − Aq)−1 Bq + Dq ) x(s) = Wq(s) x(s)

Similarly, let the state space model of the second equation of (4.211) be the following, where v(t) is the output and u(t) the input of this MIMO system:

χ̇r(t) = Ar χr(t) + Br u(t)
v(t) = Nr χr(t) + Dr u(t)    (4.216)
⇒ v(s) = ( Nr (sI − Ar)−1 Br + Dr ) u(s) = Wr(s) u(s)

Then it can be shown from (4.215) and (4.216) that:

zT(t)z(t) + vT(t)v(t) = (Nq χq(t) + Dq x(t))T (Nq χq(t) + Dq x(t)) + (Nr χr(t) + Dr u(t))T (Nr χr(t) + Dr u(t))    (4.217)

That is:

zT(t)z(t) + vT(t)v(t) = [ x(t) ; χq(t) ; χr(t) ]T Qf [ x(t) ; χq(t) ; χr(t) ] + 2 [ x(t) ; χq(t) ; χr(t) ]T Nf u(t) + uT(t) Rf u(t)    (4.218)

where:

Qf = [ DqT Dq , DqT Nq , 0 ; NqT Dq , NqT Nq , 0 ; 0 , 0 , NrT Nr ]
Nf = [ 0 ; 0 ; NrT Dr ]    (4.219)
Rf = DrT Dr

Then define the augmented state vector xa(t):

xa(t) = [ x(t) ; χq(t) ; χr(t) ]    (4.220)
And the augmented state space model:

ẋa(t) = Aa xa(t) + Ba u(t)    (4.221)

where:

Aa = [ A , 0 , 0 ; Bq , Aq , 0 ; 0 , 0 , Ar ]
Ba = [ B ; 0 ; Br ]    (4.222)
Using (4.218), the performance index J defined in (4.214) is written as follows:

J = ∫0∞ ( xaT(t) Qf xa(t) + 2 xaT(t) Nf u(t) + uT(t) Rf u(t) ) dt    (4.223)

Since a cross-term appears in the performance index J, the results obtained in section 4.6.1 will be used: the algebraic Riccati equation (4.150) reads as follows:

PAm + AmT P − PBa Rf−1 BaT P + Qm = 0    (4.224)

where:

Qm = Qf − Nf Rf−1 NfT
Am = Aa − Ba Rf−1 NfT    (4.225)

Denoting by P the positive definite matrix which solves the algebraic Riccati equation (4.224), the stabilizing control u(t) is then defined in a similar fashion to (4.149):

u(t) = −K xa(t) where K = Rf−1 ( PBa + Nf )T    (4.226)
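The construction of the augmented problem is mechanical and is sketched below. The plant and the first-order scalar weights wq(s) = 1 + 4.5/(s + 0.5) (large at low frequency) and wr(s) = 10 − 99/(s + 10) (large at high frequency) are assumptions chosen for illustration; scipy's Riccati solver directly accepts the cross-term Nf:

```python
import numpy as np
from scipy.linalg import solve_continuous_are, block_diag

# Assumed plant
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
n, m = A.shape[0], B.shape[1]

# Assumed scalar weights: wq(s) = (s+5)/(s+0.5), wr(s) = (10s+1)/(s+10)
Aq, Bq, Nq, Dq = -0.5 * np.eye(n), np.eye(n), 4.5 * np.eye(n), np.eye(n)
Ar, Br, Nr, Dr = -10.0 * np.eye(m), np.eye(m), -99.0 * np.eye(m), 10.0 * np.eye(m)

# Augmented model (4.222) and weights (4.219)
Aa = np.block([[A, np.zeros((n, n)), np.zeros((n, m))],
               [Bq, Aq, np.zeros((n, m))],
               [np.zeros((m, n)), np.zeros((m, n)), Ar]])
Ba = np.vstack([B, np.zeros((n, m)), Br])
Qf = block_diag(np.block([[Dq.T @ Dq, Dq.T @ Nq], [Nq.T @ Dq, Nq.T @ Nq]]), Nr.T @ Nr)
Nf = np.vstack([np.zeros((2 * n, m)), Nr.T @ Dr])
Rf = Dr.T @ Dr

# Cross-term ARE (4.224)-(4.226)
P = solve_continuous_are(Aa, Ba, Qf, Rf, s=Nf)
K = np.linalg.solve(Rf, (P @ Ba + Nf).T)
print(np.linalg.eigvals(Aa - Ba @ K))   # closed-loop poles of the augmented loop
```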

4.9 Optimal transient stabilization

We provide in this section the material written by L. Qiu and K. Zhou13. Let's consider the feedback stabilization system in Figure 4.4, where F(s) is the plant and K(s) is the controller. The known transfer function F(s) of the plant is assumed to be strictly proper with a monic polynomial in the denominator (a monic polynomial is a polynomial in which the leading coefficient, that is the nonzero coefficient of highest degree, is equal to 1):

F(s) = N(s)/D(s) = ( bn−1 sn−1 + · · · + b1 s + b0 ) / ( sn + an−1 sn−1 + · · · + a1 s + a0 )    (4.227)
13 Li Qiu and Kemin Zhou, Preclassical Tools for Postmodern Control: An Optimal and Robust Control Theory for Undergraduate Education, IEEE Control Systems Magazine, August 2013

Figure 4.4: Feedback system for stabilization

Similarly, the unknown transfer function K(s) of the controller is assumed to be strictly proper with a monic polynomial in the denominator:

K(s) = q(s)/p(s) = ( qm−1 sm−1 + · · · + q1 s + q0 ) / ( sm + pm−1 sm−1 + · · · + p1 s + p0 )    (4.228)

The closed-loop characteristic polynomial β(s) is:

β(s) = N(s)q(s) + D(s)p(s) = sn+m + βn+m−1 sn+m−1 + · · · + β1 s + β0    (4.229)

For given coprime polynomials N(s) and D(s), as well as an arbitrarily chosen closed-loop characteristic polynomial β(s), the computation of p(s) and q(s) amounts to solving the following Diophantine equation:

β(s) = N(s)q(s) + D(s)p(s)    (4.230)

This linear equation in the coefficients of p(s) and q(s) has a solution for arbitrary β(s) if and only if m ≥ n; the solution is unique if and only if m = n. Now consider the following performance measure J(ρ, µ), where ρ and µ are positive numbers giving relative weights to the outputs y1(t) and y2(t) and to the inputs w1(t) and w2(t) respectively, and δ(t) is the Dirac delta function:

J(ρ, µ) = (1/2) ∫0∞ ( y12(t) + ρ y22(t) ) dt |w1(t)=µδ(t), w2(t)=0 + (1/2) ∫0∞ ( y12(t) + ρ y22(t) ) dt |w1(t)=0, w2(t)=δ(t)    (4.231)

The design procedure to obtain the controller which minimizes performance


measure J(ρ, µ) is the following:

− Find polynomial dµ (s) (also called spectral factor ) which is formed with
the n roots with negative real parts of D(s)D(−s) + µ2 N (s)N (−s):

D(s)D(−s) + µ2 N (s)N (−s) = dµ (s)dµ (−s) (4.232)



− Find polynomial dρ (s) (also called spectral factor ) which is formed with
the n roots with negative real parts of D(s)D(−s) + ρN (s)N (−s):

D(s)D(−s) + ρN (s)N (−s) = dρ (s)dρ (−s) (4.233)

− Then the optimal controller K(s) = q(s)/p(s) is the unique nth order
strictly proper transfer function such that:

D(s)p(s) + N (s)q(s) = dµ (s)dρ (s) (4.234)
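The two spectral factorizations and the final Diophantine equation are easy to carry out numerically with polynomial arithmetic. The sketch below (the plant and the weights ρ, µ are assumptions chosen for illustration) extracts the stable roots of the spectral polynomials and then matches coefficients in (4.234); for this first-order example the Diophantine equation is solved by hand, while the general case would require a Sylvester-type linear system:

```python
import numpy as np

def flip_sign(c):
    """Coefficients of c(-s) given those of c(s) (highest degree first)."""
    d = len(c) - 1
    return np.array([ci * (-1) ** (d - k) for k, ci in enumerate(c)])

def spectral_factor(c):
    """Monic polynomial built from the roots of c(s) with negative real part."""
    stable = [r for r in np.roots(c) if r.real < 0]
    return np.real(np.poly(stable))

# Assumed plant F(s) = N(s)/D(s) and weights
N = np.array([1.0])                    # N(s) = 1
D = np.array([1.0, -1.0])              # D(s) = s - 1 (unstable plant)
rho, mu = 4.0, 3.0

d_mu = spectral_factor(np.polyadd(np.polymul(D, flip_sign(D)),
                                  mu**2 * np.polymul(N, flip_sign(N))))
d_rho = spectral_factor(np.polyadd(np.polymul(D, flip_sign(D)),
                                   rho * np.polymul(N, flip_sign(N))))
target = np.polymul(d_mu, d_rho)       # right-hand side of (4.234)

# Solve D(s)p(s) + N(s)q(s) = target with p(s) = s + p0 and q(s) = q0 (n = 1):
# (s - 1)(s + p0) + q0 = s^2 + (p0 - 1)s + (q0 - p0)
c2, c1, c0 = target
p0 = c1 + 1.0
q0 = c0 + p0
print("K(s) = %.3f / (s + %.3f)" % (q0, p0))
```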


Chapter 5

Linear Quadratic Tracker (LQT)

5.1 Introduction

The regulator problem that has been tackled in the previous chapters is in fact a special case of a wider class of problems where the outputs of the system are required to follow a desired trajectory in some optimal sense. As underlined in the book of Anderson and Moore, trajectory following problems can be conveniently separated into three different problems which depend on the nature of the desired output trajectory:

− If the plant outputs are to follow a class of desired trajectories, for example all polynomials up to a certain order, the problem is referred to as a servo (servomechanism) problem;

− When the plant outputs are to follow the response of another plant (or model), the problem is referred to as a model following problem;

− If the desired output trajectory is a particular prescribed function of time, the problem is called a tracking problem.

This chapter is devoted to the presentation of some results common to all three of these problems, with particular attention being given to the tracking problem.

5.2 Control with feedforward


We will consider in this section the following linear system, where x(t) is the
state-vector, u(t) the control and y(t) the controlled output (that is the output
of interest): 
ẋ(t) = Ax(t) + Bu(t)
(5.1)
y(t) = C x(t)
Control with feedforward gain allows set point regulation. We will assume
that control u(t) has the following expression where F is the feedforward gain
and where r(t) is the commanded value for the output y(t):

u(t) = −K x(t) + Fr(t) (5.2)



The optimal control problem is then split into two separate problems which
are solved individually to form the suboptimal control:
− First the commanded value r(t) is set to zero and the gain K is computed
to solve the Linear Quadratic Regulator (LQR) problem;

− Then the feedforward gain F is computed such that the steady-state value
of output y(t) is equal to the commanded value r(t) := y c .

r(t) := y c (5.3)

Using the expression (5.2) of the control u(t) within the state space
realization (5.1) of the linear system leads to:

ẋ(t) = Ax(t) + Bu(t) = (A − BK) x(t) + BFy c
(5.4)
y(t) = C x(t)

Then matrix F is computed such that the steady-state value of the output y(t) is yc. Assuming that ẋ = 0, which corresponds to the steady-state, the preceding equations become:

0 = (A − BK) x + BFyc ⇔ x = −(A − BK)−1 BFyc
y = Cx    (5.5)

That is:

y = −C(A − BK)−1 BFyc    (5.6)

Setting y to yc and assuming that the size of the output vector y(t) is the same as the size of the control vector u (square plant) leads to the following expression of the feedforward gain F:

y = yc ⇒ F = −( C (A − BK)−1 B )−1    (5.7)

For a square plant the feedforward gain F is nothing but the inverse of the closed-loop static gain (the closed-loop static gain is obtained by setting the Laplace variable s to 0 in the expression of the closed-loop transfer function).
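A minimal sketch of this two-step design (LQR gain first, then feedforward gain), with assumed plant data, reads:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Assumed plant with one controlled output
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Step 1: LQR gain computed with r = 0
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

# Step 2: feedforward gain (5.7), inverse of the closed-loop static gain
F = -np.linalg.inv(C @ np.linalg.solve(A - B @ K, B))
# With u = -K x + F yc the steady-state output equals yc
```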

5.3 Finite horizon Linear Quadratic Tracker


We will consider in this section the following linear system, where x(t) is the
state-vector, u(t) the control and y(t) the measured output:

ẋ(t) = Ax(t) + Bu(t)
(5.8)
y(t) = C x(t)

It is now desired to find an optimal control law in such a way that the controlled output y(t) tracks or follows a reference output r(t). Hence the performance index is defined as:

J(u(t)) = (1/2) eT(tf)Se(tf) + (1/2) ∫0tf ( eT(t)Qe(t) + uT(t)Ru(t) ) dt    (5.9)

where e(t) is the trajectory error defined as:

e(t) := y(t) − r(t) = C x(t) − r(t)    (5.10)

The Hamiltonian H is then defined as:

H(x, u, λ) = (1/2) eT(t)Q e(t) + (1/2) uT(t)Ru(t) + λT(t) (Ax(t) + Bu(t))    (5.11)
The optimality condition (1.72) yields:
∂H
= 0 = Ru(t) + BT λ(t) ⇒ u(t) = −R−1 BT λ(t) (5.12)
∂u
Equation (1.69) yields:

λ̇(t) = −∂H/∂x = −( (∂e(t)/∂x)T Q e(t) + AT λ(t) ) = −( CT Q (C x(t) − r(t)) + AT λ(t) )
⇔ λ̇(t) = −AT λ(t) − CT QC x(t) + CT Q r(t)    (5.13)

with the terminal condition (1.70):

λ(tf) = ∂( (1/2) eT(tf)Se(tf) )/∂x(tf) = ( ∂(Cx(tf) − r(tf))/∂x(tf) )T Se(tf) = CT S (Cx(tf) − r(tf))    (5.14)

In order to get the closed-loop control law, expression (2.25) is modified through a feedforward term g(t) to be determined:

λ(t) = P(t)x(t) − g(t)    (5.15)

Using (5.15), the terminal condition (5.14) can be written as:

P(tf)x(tf) − g(tf) = CT S (Cx(tf) − r(tf))    (5.16)

which implies by identification:

P(tf) = CT SC
g(tf) = CT Sr(tf)    (5.17)

Furthermore from (5.12) and (5.15) the control law reads:

u(t) = −R−1 BT λ(t) = −R−1 BT ( P(t)x(t) − g(t) ) = −R−1 BT P(t) x(t) + R−1 BT g(t)    (5.18)

From the preceding equation it is clear that the optimal control is the sum
of two components:
− a state-feedback component: −K(t) x(t) where K(t) = R−1 BT P(t);

− and a feedforward component: v(t) := R−1 BT g(t)

In addition, dierentiating (5.15) yields:

λ̇(t) = Ṗ(t)x(t) + P(t)ẋ(t) − ġ(t) (5.19)

Using (5.13) we get:

−AT λ(t) − CT QC x(t) + CT Q r(t) = Ṗ(t)x(t) + P(t) (Ax(t) + Bu(t)) − ġ(t)    (5.20)

Using (5.8), (5.15) and (5.18) to express u(t) as a function of x(t) and g(t), we finally get:

( Ṗ(t) + AT P(t) + P(t)A − P(t)BR−1 BT P(t) + CT QC ) x(t) − ġ(t) − ( AT − P(t)BR−1 BT ) g(t) − CT Q r(t) = 0    (5.21)

The solution of (5.21) can be obtained by solving the preceding differential equation with the final conditions (5.17) as two separate problems:

−Ṗ(t) = AT P(t) + P(t)A − P(t)BR−1 BT P(t) + CT QC
P(tf) = CT SC    (5.22)

and:

−ġ(t) = ( AT − P(t)BR−1 BT ) g(t) + CT Q r(t) := (A − BK(t))T g(t) + CT Q r(t)
g(tf) = CT Sr(tf)    (5.23)

Thus the implementation of the tracker (5.18) in real-time involves a standard optimal feedback regulator and a feedforward controller:

− The feedback regulator term requires the backward-in-time solution of the differential Riccati equation (5.22). This differential Riccati equation is independent of the reference signal r(t) and its solution has been studied in section 2.5.

− For particular applications where the reference signal r(t) is known a priori, the feedforward term g(t) can also be computed off-line by integrating, backwards in time, the differential equation (5.23). Backward integration is achieved when the time is reversed, that is by setting τ = tf − t (thus the minus sign on the left of equality (5.23) is omitted). Then the initial value of the feedforward term g(0) is known and during the actual control run g(0) can be used to solve a forward differential equation instead1. A numerical sketch of the backward pass is given below.

1 Rocio Alba-Flores and Enrique Barbieri, Real-time Infinite Horizon Linear-Quadratic Tracking Controller for Vibration Quenching in Flexible Beams, 2006 IEEE Conference on Systems, Man, and Cybernetics, October 8-11, 2006, Taipei, Taiwan
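The following Python sketch integrates (5.22) and (5.23) backward in time with a simple fixed-step Euler scheme; the plant data, horizon and reference are assumptions chosen for illustration, and a stiff ODE solver would be preferred in practice:

```python
import numpy as np

# Assumed data
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q = np.array([[10.0]]); R = np.array([[1.0]]); S = np.array([[5.0]])
tf, dt = 5.0, 1e-3
Nsteps = int(tf / dt)
r = lambda t: np.array([[1.0]])       # assumed reference trajectory

P = C.T @ S @ C                       # terminal conditions (5.17)
g = C.T @ S @ r(tf)
Ps, gs = [P], [g]
for k in range(Nsteps):               # backward pass, t goes tf -> 0
    t = tf - k * dt
    dP = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T) @ P + C.T @ Q @ C
    dg = (A - B @ np.linalg.solve(R, B.T @ P)).T @ g + C.T @ Q @ r(t)
    P = P + dt * dP                   # -dP/dt = rhs  =>  P(t - dt) = P(t) + dt*rhs
    g = g + dt * dg
    Ps.append(P); gs.append(g)
Ps.reverse(); gs.reverse()            # now indexed forward in time, 0 ... tf

# On-line control at sample k: u = -R^-1 B^T (Ps[k] x - gs[k]), see (5.18)
```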

5.4 Infinite horizon Linear Quadratic Tracker

5.4.1 General result

When an infinite horizon is considered, the performance index (5.9) is changed as follows, where the tracking error e(t) is defined in (5.10):

J(u(t)) = (1/2) ∫0∞ ( eT(t)Qe(t) + uT(t)Ru(t) ) dt    (5.24)

Assuming that (A, B) is stabilizable and (A, √Q C) is detectable, there exists a unique steady-state solution of equations (5.22), obtained through the corresponding algebraic Riccati equation. Assuming that we want to achieve perfect tracking of r(t), the control law (5.18) can be written as:

u(t) = −R−1 BT Px(t) + R−1 BT g(t)    (5.25)

Matrix P is the positive definite solution of the following algebraic Riccati equation, which is derived from (5.22) by setting Ṗ = 0:

Ṗ = 0 ⇒ 0 = AT P + PA − PBR−1 BT P + CT QC    (5.26)
On the other hand, the feedforward term g(t) is derived from (5.23):

ġ(t) = −(A − BK)T g(t) − CT Q r(t)
limtf→∞ g(tf) = CT Sr(tf)    (5.27)

However, it is worth noticing that all the eigenvalues of the state matrix −(A − BK)T are situated in the right half plane (thus unstable), because gain K is such that all the eigenvalues of A − BK are stable. Thus the feedforward term g(t) is not bounded in general. Nevertheless the infinite horizon tracker can be approximated over a finite control interval [0, tf] by using the steady-state gain K and the auxiliary function g(t) where tf is large enough.

5.4.2 Asymptotically stable linear reference model


We will assume hereafter that the reference signal r(t) is given as the output of
the following asymptotically stable linear reference model where Ar is known
and has all its eigenvalues in the left-half plane:
ṙ(t) = Ar r(t) (5.28)
Then, by combining (5.8) and (5.28), the following augmented plant can be built with state-vector xa(t) := [ x(t) ; r(t) ]. We will denote Aa, Ba and Ca the related matrices corresponding to the state-space representation of this augmented system:

[ ẋ(t) ; ṙ(t) ] = [ A , 0 ; 0 , Ar ] [ x(t) ; r(t) ] + [ B ; 0 ] u(t) := Aa xa(t) + Ba u(t)
e(t) = y(t) − r(t) = [ C , −I ] xa(t) := Ca xa(t)    (5.29)

Then minimization of the performance index (5.24) is achieved by applying classical results on LQR problems, and the control u(t) reads as follows2:

u(t) = −Ka [ x(t) ; r(t) ] where Ka = R−1 BaT Pa    (5.30)

where Pa is the positive definite solution of the following algebraic Riccati equation:

0 = AaT Pa + Pa Aa − Pa Ba R−1 BaT Pa + CaT QCa    (5.31)

5.4.3 Constant reference tracking

Finally, let's assume a constant value for r(t), which will be denoted rss. Then at steady-state (5.8) reads as follows, where xss denotes the steady-state value of x(t):

0 = Axss + Buss
yss = C xss    (5.32)

If we impose yss := rss, the preceding relations read:

yss := rss ⇒ [ 0 ; rss ] = [ A , B ; C , 0 ] [ xss ; uss ]    (5.33)

Assuming that matrix [ A , B ; C , 0 ] is square and invertible, we get:

[ xss ; uss ] = [ A , B ; C , 0 ]−1 [ 0 ; rss ] := [ M11 , M12 ; M21 , M22 ] [ 0 ; rss ] = [ M12 ; M22 ] rss    (5.34)

Then let x̃(t) be the error between the actual state-vector x(t) and its steady-state value xss, and ũ(t) the error between the actual control u(t) and its steady-state value uss:

x̃(t) := x(t) − xss
ũ(t) := u(t) − uss    (5.35)

Then using (5.35) the dynamics of x̃(t) reads:

dx̃(t)/dt = ẋ(t) = Ax(t) + Bu(t) = A (x̃(t) + xss) + B (ũ(t) + uss)    (5.36)

It is clear from (5.33) that Axss + Buss = 0. We finally get:

Axss + Buss = 0 ⇒ dx̃(t)/dt = Ax̃(t) + Bũ(t)    (5.37)
u(t) (5.37)
2 Hamidreza Modares, Frank L. Lewis, Online Solution to the Linear Quadratic Tracking Problem of Continuous-time Systems using Reinforcement Learning, 52nd IEEE Conference on Decision and Control, December 10-13, 2013, Florence, Italy

In addition, the tracking error defined in (5.10) becomes:

e(t) := y(t) − rss = C x(t) − C xss = C x̃(t)    (5.38)

Then we consider the minimization of the following performance index:

J(ũ(t)) = (1/2) ∫0∞ ( eT(t)Qe(t) + ũT(t)Rũ(t) ) dt    (5.39)

The minimization of J(ũ(t)) is achieved by applying classical results on LQR problems, and the control ũ(t) reads as follows:

ũ(t) = −Kx̃(t) where K = R−1 BT P    (5.40)

Matrix P is the positive definite solution of the following algebraic Riccati equation:

0 = AT P + PA − PBR−1 BT P + CT QC    (5.41)

The actual control is finally obtained thanks to (5.35):

u(t) = ũ(t) + uss = −K x̃(t) + uss = −K (x(t) − xss) + uss    (5.42)

That is, using (5.34):

u(t) = −K x(t) + K xss + uss = −K x(t) + [ K , I ] [ xss ; uss ]
= −K x(t) + [ K , I ] [ A , B ; C , 0 ]−1 [ 0 ; rss ]    (5.43)
= −K x(t) + (K M12 + M22) rss = −K x(t) + F rss where F := K M12 + M22
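The gain F := K M12 + M22 is easily computed numerically; the sketch below (with assumed plant data) also checks that it coincides with the inverse of the closed-loop static gain of Section 5.2:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Assumed plant (single input, single controlled output)
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Qe = np.array([[10.0]])               # weight on the tracking error e = y - rss
R = np.array([[1.0]])

P = solve_continuous_are(A, B, C.T @ Qe @ C, R)   # ARE (5.41)
K = np.linalg.solve(R, B.T @ P)

# Steady-state map (5.34): [xss; uss] = [A B; C 0]^-1 [0; rss]
n = A.shape[0]
Minv = np.linalg.inv(np.block([[A, B], [C, np.zeros((1, 1))]]))
M12, M22 = Minv[:n, n:], Minv[n:, n:]

F = K @ M12 + M22                     # feedforward gain of (5.43)
F_check = -np.linalg.inv(C @ np.linalg.solve(A - B @ K, B))  # inverse DC gain (5.7)
print(np.allclose(F, F_check))        # both expressions coincide
```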

5.5 Plant augmented with integrator


5.5.1 Integral augmentation
An alternative way to make the steady-state error exactly equal to zero in response to a step in the commanded value r(t) = yc is to replace the feedforward gain F by an integrator, which will cancel the steady-state error whatever the input step (the system's type is augmented to be of type 1). The advantage of adding an integrator is that it eliminates the need to determine the feedforward gain F, which could be difficult because of the uncertainty in the model. By augmenting the system with the integral error, the LQR routine will choose the value of the integral gain automatically.
The integrator is denoted T/s, where T ≠ 0 is a constant which may be used to adjust the speed of response of the closed-loop system. Let xi be the additional component of the state-vector, which is proportional to the integral of the error e(t) = r(t) − y(t). Adding an integrator augments the system's dynamics as follows:
ẋ(t) = Ax(t) + Bu(t)
y(t) = C x(t)
ẋi(t) = Te(t) = T ( r(t) − y(t) ) = Tr(t) − TC x(t)

⇒ d/dt [ x(t) ; xi(t) ] = [ A , 0 ; −TC , 0 ] [ x(t) ; xi(t) ] + [ B ; 0 ] u(t) + [ 0 ; T ] r(t)    (5.44)
y(t) = [ C , 0 ] [ x(t) ; xi(t) ]

Then, the suboptimal control is found by solving the LQR regulation


problem where r = 0:

− The augmented state space model reads:

d/dt [ x(t) ; xi(t) ] = ẋa(t) = Aa xa(t) + Ba u(t) where Aa = [ A , 0 ; −TC , 0 ] and Ba = [ B ; 0 ]    (5.45)

− The performance index J(u(t)) to be minimized is the following:

J(u(t)) = (1/2) ∫0∞ ( xaT(t)Qa xa(t) + uT(t)Ru(t) ) dt    (5.46)

where, denoting by Na a design matrix, matrix Qa is defined as follows:

Qa = NaT Na    (5.47)

Note that the design matrix Na shall be chosen such that the pair (Aa, Na) is detectable.

Assuming that the pair (Aa, Ba) is stabilizable and the pair (Aa, Na) is detectable, the algebraic Riccati equation can be solved. This leads to the following expression of the control u(t) (here the feedforward gain F is no longer needed):

u(t) = −Ka xa(t) = −R−1 BaT P xa(t) = −R−1 [ BT , 0 ] [ P11 , P12 ; P21 , P22 ] [ x(t) ; xi(t) ]
= −R−1 BT P11 x(t) − R−1 BT P12 xi(t) := −Kp x(t) − Ki xi(t)    (5.48)

Obviously, the term Kp = R−1 BT P11 represents the proportional gain of the controller whereas the term Ki = R−1 BT P12 represents the integral gain of the controller.
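A minimal sketch of this design (assumed plant data, T = 1) builds the augmented model (5.45), solves the LQR problem and splits the gain into its proportional and integral parts:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Assumed plant and integrator constant
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
T = 1.0
n, m, p = 2, 1, 1

# Augmented model (5.45)
Aa = np.block([[A, np.zeros((n, p))], [-T * C, np.zeros((p, p))]])
Ba = np.vstack([B, np.zeros((p, m))])
Qa = np.diag([10.0, 1.0, 50.0])   # design choice: last entry weights the integral state
R = np.array([[1.0]])

P = solve_continuous_are(Aa, Ba, Qa, R)
Ka = np.linalg.solve(R, Ba.T @ P)
Kp, Ki = Ka[:, :n], Ka[:, n:]     # proportional and integral gains of (5.48)
print(np.linalg.eigvals(Aa - Ba @ Ka))   # augmented closed-loop poles
```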

Figure 5.1: Plant augmented with integrator

The state space equation of the closed-loop system is obtained by setting u(t) = −Ka xa(t) = −Kp x(t) − Ki xi(t) in (5.44):

d/dt [ x(t) ; xi(t) ] = (Aa − Ba Ka) xa(t) + [ 0 ; T ] r(t) = [ A − BKp , −BKi ; −TC , 0 ] [ x(t) ; xi(t) ] + [ 0 ; T ] r(t)
y(t) = [ C , 0 ] [ x(t) ; xi(t) ]    (5.49)
xi(t) = T ∫0t ( r(τ) − y(τ) ) dτ

The corresponding block diagram is shown in Figure 5.1, where Φa(s) = (sI − Aa)−1.

5.5.2 Proof of the cancellation of the steady-state error through integral augmentation

In order to prove that the integrator cancels the steady-state error when r(t) is a step input, let us compute the final value of the error e(t) using the final value theorem, where s denotes the Laplace variable:

limt→∞ e(t) = lims→0 sE(s)    (5.50)

When r(t) is a step input with amplitude one, we have:

r(t) = 1 ∀ t ≥ 0 ⇒ R(s) = 1/s    (5.51)
Using the feedback u = −Ka xa, the dynamics of the closed-loop system is the following, where e(t) denotes here the integrator input T(r(t) − y(t)):

ẋa = (Aa − Ba Ka) xa + [ 0 ; T ] r(t)
⇒ e(t) = T ( r(t) − y(t) ) = T ( r(t) − [ C , 0 ] xa )    (5.52)
Using the Laplace transform, and denoting by I the identity matrix, we get:

Xa(s) = (sI − Aa + Ba Ka)−1 [ 0 ; T ] R(s)
E(s) = T ( R(s) − [ C , 0 ] Xa(s) )    (5.53)

Inserting (5.51) in (5.53) we get:

E(s) = T ( I − [ C , 0 ] (sI − Aa + Ba Ka)−1 [ 0 ; T ] ) (1/s)    (5.54)
Then the final value theorem (5.50) takes the following expression:

limt→∞ e(t) = lims→0 sE(s) = lims→0 T ( I − [ C , 0 ] (sI − Aa + Ba Ka)−1 [ 0 ; T ] )
= T ( I − [ C , 0 ] (−Aa + Ba Ka)−1 [ 0 ; T ] )    (5.55)
Let us focus on the inverse of the matrix −Aa + Ba Ka. First we write Ka as Ka = [ Kp , Ki ], where Kp and Ki represent respectively the proportional and the integral gains. Then using (5.45) we get:

−Aa + Ba Ka = [ −A , 0 ; TC , 0 ] + [ B ; 0 ] [ Kp , Ki ] = [ −A + BKp , BKi ; TC , 0 ]    (5.56)

Assuming that X is a square invertible matrix, it can be shown that the inverse of the matrix [ X , Y ; Z , 0 ] is the following:

[ X , Y ; Z , 0 ]−1 = [ 0 , Y (ZY)−1 ; YT (YYT)−1 , W ] where XY (ZY)−1 + YW = 0    (5.57)
Thus:
 −1
−1 −A + BKp BKi
(−Aa + Ba Ka ) =
TC 0
BKi (TCBKi )−1
" #
0
=  −1
(BKi )T BKi (BKi )T W
(5.58)
And:
0 BKi (TCBKi )−1 0
    
0 −1
(−Aa + Ba K) =
∗
T W T
BKi (TCBKi )−1 T

=
WT (5.59)
BKi (TCBKi )−1 T
  
  −1 0  
⇒ C 0 (−Aa + Ba K) = C 0
T WT
= CBKi (TCBKi )−1 T

Figure 5.2: Linear Quadratic Tracker with constant reference signal rss

Consequently, using (5.59) in (5.55), the nal value of the error e(t) becomes:
  
  −1 0
limt→∞ e(t) = T I − C 0 (−Aa + Ba Ka )
  T
= T I − CBKi (TCBKi )−1 T
(5.60)
= T − TCBKi (TCBKi )−1 T
=T−T
=0

As a consequence, the integrator allows to cancel the steady-state error


whatever the input step r(t).

5.6 Tracking with prelter


5.6.1 Tracking without integral augmentation
We consider Figure 5.2 and the problem to design a control u(t) which minimizes
the error e(t) between the output y r (t) of the reference model represented by
Gr (s) and the actual output y(t) of the plant represented by F(s).
The state space realization of Gr (s) and F(s) are assumed to read as follows,
where rss is a constant reference signal and where input matrix B2 has been
introduced to tackle the case where an integrator is inserted in the feedforward
path of the loop, as presented in Section 5.5:
 
ẋ(t) = A1 x(t) + B1 u(t) + B2 rss
 F(s) : y(t) = C x(t)


1
 (5.61)
ẋ r (t) = Ar xr (t) + Br rss
 Gr (s) :


y r (t) = Cr xr (t)

Assuming that no integrator is inserted in the feedforward path of the loop,


matrices A1 , B1 and C1 are related to the state-space representation of the
actual plant:


 A1 := A
B1 := B

(5.62)

 B2 := 0
C1 := C

140 Chapter 5. Linear Quadratic Tracker (LQT)

The tracking error e(t) reads:

e(t) := y(t) − y r (t) = C1 x(t) − Cr xr (t) (5.63)

In order to solve this problem, we rst compute the steady-state values


imposing that at steady-state we shall have y ss = y r . From (5.61) we get:
ss


 0 = A1 xss + B1 uss + B2 rss
0 = Ar xrss + Br rss
y ss = y r ⇔ C1 xss = Cr xrss

ss

    
−B2 A1 0 B1 xss
⇔  −Br  rss =  0 Ar 0   xrss  (5.64)
0 C1 −Cr 0 uss

Let matrix M be dened as follows:


 
A1 0 B1
M :=  0 Ar 0  (5.65)
C1 −Cr 0

Assuming that matrix M is invertible, we get:


   
xss −B2
 xr  = M−1  −Br  rss (5.66)
ss
uss 0

Let:    
−B2 M1
M−1  −Br  :=  M2  (5.67)
0 M3
Thus:
   
xss M1
 xr  =  M2  rss
ss
(5.68)
uss M3

Note that if matrix M is not invertible, its pseudo inverse (MoorePenrose


inverse) M+ can be used instead of its inverse M−1 . We recall that M+ is a
generalization of the inverse of a matrix and is such that MM+ M = M and
M+ MM+ = M+ . It can be computed by using the singular value
decomposition (SVD) of M: If M = UΣV∗ is the singular value
decomposition (SVD) of M, then M+ = VΣ+ U∗ .
Then let x
e(t) the error between the actual state-vector x(t) and its steady-
state value xss , x
er (t) the error between the reference model state-vector xr (t)
and its steady-state value xrss and ue(t) the error between the actual control u(t)
and its steady-state value uss :

 xe(t) := x(t) − xss
e (t) := xr (t) − xrss
x (5.69)
 r
e(t) := u(t) − uss
u
5.6. Tracking with prelter 141

Using (5.69) the dynamics of x


e(t) reads:

e˙ (t) = ẋ(t)
x
= A1 x(t) + B1 u(t) + B2 rss
(5.70)
= A1 (e x(t) + xss ) + B1 (e
u(t) + uss ) + B2 rss
= A1 xe(t) + B1 u
e(t) + A1 xss + B1 uss + B2 rss

Similarly:
e˙ r (t) = ẋr (t)
x
= Ar xr (t) + Br rss
(5.71)
= Ar x er (t) + xrss + Br rss
= Ar x er (t) + Ar xrss + Br rss
It is clear from (5.64) that A1 xss +B1 uss +B2 rss = 0 and Ar xrss +Br rss = 0.
We nally get the following state space equation:
 ˙      
x
e(t) A1 0 x
e(t) B1
= + e(t)
u
e˙ r (t)
x 0 Ar x
e (t) 0
  r (5.72)
x
e(t)
:= Aa + Ba u
e(t)
x
er (t)

In addition, the tracking error dened in (5.63) becomes:

e(t) := y(t) − y r (t)


= C1 x(t) − Cr xr (t)  (5.73)
= C1 (ex(t) + xss ) − Cr x er (t) + xrss
= C1 xe(t) − Cr x er (t) + C1 xss − Cr xrss

Using the last equation of (5.64), we nally get the following output
equation:

C1 xss = Cr xrss ⇒ e(t) = C1 x


e(t) − Cr xer (t) 
  x e(t)
= C1 −Cr
  x er (t) (5.74)
e(t)
x
:= Ca
x
er (t)

Then we consider the minimization of following performance index :


1 ∞ T
Z
J(eu(t)) = e (t)Qe(t) + u eT (t)Re
u(t) dt (5.75)
2 0
The minimization of J(eu(t)) is achieved by applying classical results on LQR
problems and control u
e(t) reads as follows:
 
x
e(t)
e(t) = −Ka
u where Ka = R−1 BTa P (5.76)
x
er (t)

Matrix P is the positive denite solution of the following algebraic Riccati


equation:
0 = ATa P + PAa − PBa R−1 BTa P + CTa QCa (5.77)
142 Chapter 5. Linear Quadratic Tracker (LQT)

Figure 5.3: Linear Quadratic Tracker with prelter

The actual control u(t) is nally obtained thanks to (5.69):


u(t) = u e(t) + uss 
e(t)
x
= −Ka + uss
 xer (t)    (5.78)
x(t) xss
= −Ka + Ka + uss
xr (t) xrss

That is, when splitting Ka as Ka := Kx Kr and using (5.68):


 

   
x(t) M1
u(t) = −Ka +K rss + M3 rss
xr (t)  a M 2
  x(t) (5.79)
:= − Kx Kr + Dpf rss
xr (t)
:= −Kx x(t) + y pf (t)

where:
 y pf (t) = −K

r x (t) + Dpf r ss
 r 
M1 (5.80)
 Dpf := Ka + M3
M2
Consequently the actual optimal control u(t) is the sum of two components:
− a state-feedback component: −Kx x(t);
− and a feedforward component y pf (t) with is obtained as the output of the
a prelter Cpf (s) with the following realization:
(
ẋr (t) = Ar xr (t) + Br rss
Cpf (s) : (5.81)
y pf (t) = −Kr xr (t) + Dpf rss

This is illustrated in Figure 5.3.

5.6.2 Tracking with integral augmentation


Assuming that an integrator is inserted in the feedforward path of the loop, as
presented in Section 5.5, the state vector x(t) of the plant has to be extended
by adding a new component xi (t) in the state vector:
 
x(t)
where ẋi (t) = Te(t) = T r(t) − y(t) (5.82)

x(t) →
xi (t)
5.6. Tracking with prelter 143

Then using (5.44) matrices A1 , B1 and C1 in (5.61) read as follows:


  
 A1 :=
 A 0
 −TC 0



 
B


B1 :=

 0  (5.83)
0


B2 :=


T



  

C1 := C 0

Nevertheless the algebraic Riccati equation (5.77) is not solvable because


pair (Aa , Ca )is no more observable. In order
 to tackle this point, weighting
matrix Ca := C1 −Cr = C 0 −Cr has to be changed, for example
 

as follows:
Ca := [ C I −Cr ]
|{z} (5.84)
0 becomes I

Finally steady-state value of the integral term xi (t) is xiss = 0.


Consequently, tracking with integral augmentation will not the change
steady-state values and (5.68) is still valid (meaning that B2 is assumed to be
zero to compute steady-state values). This implies that the expression of the
structure of the prelter remains unchanged. Finally (5.80) now reads as
follows where matrix 0 has been added after M1 in the expression of Dpf to
take into account the presence of integral term in the state vector of the
augmented plant:
 y pf (t) = −K

r x (t) + Dpf r ss
  r 
M1

(5.85)

 D pf := K a  0  + M3

M2
144 Chapter 5. Linear Quadratic Tracker (LQT)
Chapter 6

Linear Quadratic Gaussian


(LQG) regulator

6.1 Introduction
The design of the Linear Quadratic Regulator (LQR) assumes that the whole
state is available for control and that there is no noise. Those assumptions may
appear unrealistic in practical applications. We will assume in this chapter that
the process to be controlled is described by the following linear time invariant
model where w(t) and v(t) are random processes which represents the process
noise and the measurement noise, respectively:

ẋ(t) = Ax(t) + Bu(t) + w(t)
(6.1)
y(t) = Cx(t) + v(t)

The preceding relation can be equivalently represented by the block diagram


in Figure 6.1.
Linear Quadratic Gaussian (LQG) control deals with the design of a
regulator which minimizes a quadratic cost using the available output and
taking into account the noise into the process and the available output for
control. More precisely the LQG control problem is to nd the optimal control
u(t) which minimizes the following performance index J(u(t)) where E() is the

Figure 6.1: Open-loop linear system with process and measurement noises
146 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

mathematical expectation, Q = QT ≥ 0 and R = RT > 0:


 Z tf 
1
J(u(t)) = E lim T T
x (t)Qx(t) + u (t)Ru(t) dt (6.2)
tf →∞ 2 tf 0

As far as only the output y(t) is now available for control (not the full state
x(t)), the separation principle will be used to design the LQG regulator. Indeed,
the solution of the LQG problem can be split into two steps:

− First an estimator will be used to estimate the full state using the available
output y(t)

− Then an LQ controller will be designed using the state estimation in place


of the true (but unknown) state x(t)

6.2 Luenberger observer


Consider a process with the following state space model where y(t) denotes the
measured output and u(t) the control input:

ẋ(t) = Ax(t) + Bu(t)
(6.3)
y(t) = Cx(t)

We assume that x(t) cannot be measured and the goal of the observer is to
estimate x(t) based on y(t). Luenberger observer (1964) provides an estimation
of the state vector through the following dierential equation where matrices F,
J and L have to be determined:
d
x
b(t) = Fb
x(t) + Ju(t) + Ly(t) (6.4)
dt
The estimation error e(t) is dened as follows:

e(t) = x(t) − x
b(t) (6.5)

Thus using (6.3) and (6.4) its time derivative reads:

b˙ (t) = Ax(t) + Bu(t) − Fb (6.6)



ė(t) = ẋ(t) − x x(t) + Ju(t) + Ly(t)

Using (6.5) and the output equation y(t) = Cx(t) the preceding relation can
be rewritten as follows:
ė(t) = Ax(t) + Bu(t) − F (x(t) − e(t)) − Ju(t) − LCx(t)
(6.7)
= Fe(t) + (A − F − LC) x(t) + (B − J) u(t)

As soon as the purpose of the observer is to move the estimation error e(t)
towards zero independently of control u(t) and true state vector x(t) we choose
matrices F and J as follows:

J=B
(6.8)
F = A − LC
6.3. White noise through Linear Time Invariant (LTI) system 147

Figure 6.2: Luenberger observer

Thus the dynamics of the estimation error e(t) reduces to be:


ė(t) = Fe(t) = (A − LC) e(t) (6.9)
Where matrix L shall be chosen such that all the eigenvalues of A − LC are
situated in the left half plane. Furthermore the Luenberger observer (6.4) can
now be written as follows using (6.8):

b˙ (t) = (A − LC) x
x b(t) + Bu(t) + Ly(t) 
(6.10)
= Abx(t) + Bu(t) + L y(t) − Cb
x(t)
Figure 6.2 shows the structure of the Luenberger observer.

6.3 White noise through Linear Time Invariant (LTI)


system
6.3.1 Assumptions and denitions
Let's consider the following linear time invariant system which is fed by a random
process w(t) of dimension n (which is also the dimension of the state vector x(t)):

ẋ(t) = Ax(t) + Bw(t)
(6.11)
y(t) = Cx(t)
The mean of w(t) will be denoted E [w(t)] and its autocorrelation function
Rw (τ ) will be denoted E w(t)w (t + τ ) where E designates the expectation
T


operator. We said that w(t) is a wide-sense stationary (WSS) random process


when the two following properties hold:
− The mean mw (t) := E [w(t)] of w(t) is independent of t, that is constant;
− The autocorrelation function Rw (t, t + τ ) just depends on the time
dierence τ = (t + τ ) − t:
h i
Rw (t, t + τ ) = E (w(t) − mw (t)) (w(t + τ ) − mw (t + τ ))T
(6.12)
:= Rw (τ )
148 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

We will assume that w(t) is a white noise (which is a special case of a wide-
sense stationary (WSS) random process) with zero mean Gaussian probability
density function (pdf) p(w). The covariance matrix of the Gaussian probability
density function p(w) will be denoted Pw and the Dirac delta function will be
denoted δ(τ ):

− 1 wT P−1

1 w w
 p(w) = (2π)n/2 √det(Pw ) e 2

E [w(t)] = m (6.13)
  w (t) =T 0
Rw (τ ) = E w(t)w (t + τ ) = Pw δ(τ ) where Pw = PTw > 0
 

Because w(t) is a stochastic process, the dierential equation ẋ(t) = Ax(t)+


Bw(t) is called a stochastic dierential equation. Moreover this particular type
of stochastic dierential equation where w(t) comes from the derivative of a
Wiener process is called Langevin equation and can be written more elegantly
as the following Ornstein-Uhlenbeck process where w(t) is a vector of Wiener
process, also called Brownian motion:

dx(t) = Ax(t) dt + B dw(t) (6.14)

6.3.2 Mean and covariance matrix of the state vector


As far as w(t) is a random process it is clear from (6.11) that the state vector
x(t) and the output vector y(t) are also a random processes. When expending
the expression of the state vector obtained for deterministic signals we get:
Z t
At
x(t) = e x0 + eA(t−τ ) B w(τ ) dτ (6.15)
0

Let mx (0) = E [x0 ] be the mean of the initial value x0 of the state vector
x(t) and Px (0) the covariance matrix of the initial value x0 of the state vector.
Then it can be shown that x(t) is a Gaussian random process with:

− Mean mx (t) given by:

mx (t) = E [x(t)] = eAt mx (0) (6.16)

Assuming that mx (0) = 0 we get zero for the mean value of x(t)::

mx (0) = 0 ⇒ mx (t) = 0 (6.17)

− Covariance matrix Px (t) which is dened as follows:


h i
Px (t) = E (x(t) − mx (t)) (x(t) − mx (t))T (6.18)

Assuming that mx (0) = 0 we get:

mx (0) = 0 ⇒ Px (t) = E x(t) x(t)T (6.19)


 
6.3. White noise through Linear Time Invariant (LTI) system 149

Finally, assuming that mx (0) = 0 and the input random process w(t) is
a zero mean white noise with autocorrelation function Rw (τ ) = Pw δ(τ )
and
 is uncorrelated with the initial value x0 of the state vector, that is
E x0 wT (τ ) = 0, then matrix Px (t) reads as follows:


Px (t) = E x(t) x(t)T


 
 R t A(t−τ )  R t A(t−τ ) T 
At
= E e x0 + 0 e At
B w(τ ) dτ e x0 + 0 e B w(τ ) dτ
  R T 
At A Tt R t A(t−τ ) t A(t−τ2 )
= e Px (0)e +E 0 e
1 B w(τ1 ) dτ1 0 e B w(τ2 ) dτ2
T t t T
= eAt Px (0)eA t + 0 0 eA(t−τ ) B E w(τ )wT (τ1 ) BT eA (t−τ1 ) dτ dτ1
R R  
T tRt T
= eAt Px (0)eA t + 0 0 eA(t−τ ) B Pw δ(τ1 − τ ) BT eA (t−τ1 ) dτ dτ1
R

(6.20)
Rt
Using the fact that 0 g(τ1 ) δ(τ1 − τ ) dτ1 = g(τ ), we nally obtain:
Z t
Tt T (t−τ )
Px (t) = eAt Px (0)eA + eA(t−τ ) BPw BT eA dτ (6.21)
0

Because the evaluation of the preceding integral is dicult, we take the


derivative of Px (t) to get the following Lyapunov matrix dierential
equation where Px (t) = PTx (t) ≥ 0:

Ṗx (t) = APx (t) + Px (t)AT + BPw BT (6.22)

Assuming that the system is stable (i.e. all the eigenvalues of the state matrix
A have negative real part) the random process x(t) will become stationary after
a certain amount of time: its mean mx (t) will be zero whereas the value of its
covariance matrix Px (t) turns to be a constant matrix Px = PTx ≥ 0 ∀ t which
solves the following matrix algebraic Lyapunov equation:

APx + Px AT + BPw BT = 0 (6.23)

Thus after a certain amount of time the state vector x(t) as well as the
output vector y(t) are wide-sense stationary (WSS) random processes.

6.3.3 Autocorrelation function of the stationary output vector


As in the previous section, we will assume in the following that mx (0) = 0. Let
Rx (τ ) be the autocorrelation function of the stationary state vector x(t):

Rx (τ ) = E x(t) x(t + τ )T (6.24)


 

The autocorrelation function Ry (τ ) (which may be a matrix for vector


signal) of the output vector y(t) = C x(t) is dened as follows:

Ry (τ ) = E y(t) y(t + τ )T = CRx (τ )CT (6.25)


 
150 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

It is clear from the denition of the autocorrelation function Ry (τ ) that the


stationary value of the covariance matrix Py of y(t) is equal to the value of the
autocorrelation function Ry (τ ) at τ = 0:

Py = E y(t) y(t)T = CPx CT = Ry (τ )|τ =0 (6.26)


 

The power spectral density (psd) Sy (f ) of a stationary process y(t) is given


by the Fourier transform of its autocorrelation function Ry (τ ):
Z +∞
Sy (f ) = Ry (τ )e−j2πf τ dτ (6.27)
−∞

Then we will see in Section 6.3.4 that the following result holds:

Sy (f ) = F(−s) Pw FT (s) s=j2πf


(6.28)

where F(s) is the transfer function of the linear system, which is assumed
to be stable :
F(s) = C (sI − A)−1 B (6.29)
Relation (6.28) indicates that the power spectral density (psd) Sy (f ) of y(t)
can be obtained thanks to the transfer function F(s) of the stable linear system
and the spectral density matrix Pw of the exciting white noise w(t).
Let Sy (s) be the (one-sided) Laplace transform of the autocorrelation
function Ry (τ ):
Z +∞
Sy (s) = L [Ry (τ )] = Ry (τ )e−sτ dτ (6.30)
0

It can be seen that the power spectral density (psd) Sy (f ) of y(t) can be
obtained thanks to the (one-sided) Laplace transform Sy (s) of Ry (τ ) as:

Sy (f ) = Sy (−s)|s=j2πf + Sy (s)|s=j2πf (6.31)

Indeed we can write:


R +∞
Sy (f ) = −∞ Ry (τ )e−j2πf τ dτ
R0 R +∞
= −∞ Ry (τ )e−j2πf τ dτ + 0 Ry (τ )e−j2πf τ dτ
R0
= −∞ Ry (τ )e−sτ dτ
R +∞
+ 0 Ry (τ )e−sτ dτ (6.32)
s=j2πf s=j2πf
R +∞ sτ
R +∞ −sτ
= 0 Ry (−τ )e dτ + 0 Ry (τ )e dτ
s=j2πf s=j2πf

As far as Ry (τ ) is an even function we get:

Ry (−τ ) = Ry (τ )
R +∞ R +∞ (6.33)
⇒ Sy (f ) = 0 Ry (τ )esτ dτ + 0 Ry (τ )e−sτ dτ
s=j2πf s=j2πf

The preceding equations reads:

Sy (f ) = Sy (−s)|s=j2πf + Sy (s)|s=j2πf (6.34)


6.3. White noise through Linear Time Invariant (LTI) system 151

Then, using (6.28), we can write:

Sy (−s) + Sy (s) = F(−s) Pw FT (s) (6.35)

When identifying the stable transfer function Sy (s) in the preceding relation,
we get the autocorrelation function Ry (τ ) ∀τ ≥ 0 thank to the inverse (one-
sided) Laplace transform of Sy (s):

Sy (−s) + Sy (s) = F(−s) Pw FT (s)


(6.36)
⇒ Ry (τ ) = L−1 [Sy (s)] ∀τ ≥ 0 where Sy (s) stable
Finally using the initial value theorem on the (one-sided) Laplace transform
Sy (s) we get the following result:

Py = Ry (τ )|τ =0 = lim s Sy (s) (6.37)


s→∞

Example 6.1. Let F(s) be a rst order system with time constant a and let
w(t) be a white noise with covariance Pw :
1

F(s) = 1+as
(6.38)
Rw (τ ) = E w(t)wT (t + τ ) = Pw δ(τ ) where Pw = PTw > 0
 

One realization of transfer function F(s) is the following:


ẋ(t) = − a1 x(t) + w(t)

(6.39)
y(t) = a1 x(t)
That is: 
ẋ(t) = Ax(t) + Bw(t)
(6.40)
y(t) = Cx(t)
Where:
 A = − a1

B=1 ⇒ F(s) = C(sI − A)−1 B (6.41)


C = a1

As far as a > 0 the system is stable. The covariance matrix Px (t) is dened
as follows: h i
Px (t) = E (x(t) − mx (t)) (x(t) − mx (t))T (6.42)

Where matrix Px (t) is the solution of the following Lyapunov dierential


equation:
2
Ṗx (t) = APx (t) + Px (t)AT + BPw BT = − Px (t) + Pw (6.43)
a
We get:
a  a  2t
Px (t) = Pw + Px (0) − Pw e− a (6.44)
2 2
The stationary value Px of the covariance matrix Px (t) of the state vector
x(t)is obtained as t → ∞:
a
Px = lim Px (t) = Pw (6.45)
t→∞ 2
152 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

Consequently the stationary value Py of the covariance matrix of the output


vector y(t) reads:
1 a Pw
Py = CPx CT = 2
× Pw = (6.46)
a 2 2a
This result can be retrieved thanks to the power spectral density (psd) of the
output vector y(t). Indeed let's compute the power spectral density (psd) Sy (f )
of the output stationary process y(t) of the system:
Z +∞
Sy (f ) = Ry (τ )e−j2πf τ dτ = F(−s) Pw FT (s) s=j2πf
(6.47)
−∞

We get:
Pw Pw
F(−s) Pw FT (s) = (1+as)(1−as) = 1−(as)2
T Pw (6.48)
⇒ Sy (f ) = F(−s) Pw F (s) s=j2πf = 1+(2πf a)2

Furthermore let's decompose Pw


1−(as)2
as the sum Sy (−s) + Sy (s):

Pw Pw 1 Pw 1
2
= + = Sy (−s) + Sy (s) (6.49)
1 − (as) 2 1 − as 2 1 + as

Thus by identication we get for the stable transfer function Sy (s):


Pw 1
Sy (s) = (6.50)
2 1 + as

The autocorrelation function Ry (τ ) is given by the inverse Laplace transform


of Sy (s):
 
−1 −1 Pw 1 Pw − τ
Ry (τ ) = L [Sy (s)] = L = e a ∀τ ≥0 (6.51)
2a 1/a + s 2a

As far as Ry (τ ) is an even function we get:


Pw τ
Ry (τ ) = ea ∀ τ ≤ 0 (6.52)
2a

Thus the autocorrelation function Ry (τ ) for τ ∈ R reads:


Pw − |τ |
Ry (τ ) = e a ∀τ ∈R (6.53)
2a

Finally we use the initial value theorem on the (one-sided) Laplace transform
Sy (s) to get the following result:

Pw
Py = Ry (τ )|τ =0 = lim s Sy (s) = (6.54)
s→∞ 2a

6.3. White noise through Linear Time Invariant (LTI) system 153

6.3.4 Proof of the expression of the autocorrelation function


The autocorrelation function Ry (t, t + τ ) of a non-stationary process reads as
follows:
Ry (t, t + τ ) = E y(t) y T (t + τ )
 

= E Cx(t) xT (t + τ )C T


= C E x(t) xT (t + τ ) CT

R  R T 
t A(t−τ1 ) t+τ A(t+τ −τ2 )
= CE 0 e B w(τ1 ) dτ1 0 e B w(τ2 ) dτ2 CT
R R 
t t+τ T
= C 0 0 eA(t−τ1 ) B E w(τ1 ) wT (τ2 ) BT eA (t+τ −τ2 ) dτ1 dτ2 CT
 

(6.55)
Using the facts that w(t) is Rta white noise, that is
E w(τ1 ) wT (τ2 ) = Pw δ(τ2 − τ1 ), and that 0 g(τ1 ) δ(τ2 − τ1 ) dτ1 = g(τ2 ), we
 

get:
R R 
t t+τ T
Ry (t, t + τ ) = C 0 0 eA(t−τ1 ) B Pw δ(τ2 − τ1 ) BT eA (t+τ −τ2 ) dτ1 dτ2 CT
R 
t T
= C 0 eA(t−τ2 ) B Pw BT eA (t+τ −τ2 ) dτ2 CT
(6.56)
Let ξ := t − τ2 ⇒ dξ = −dτ2 . Then:
R 
0 T
Ry (t, t + τ ) = −C t eAξ B Pw BT eA (ξ+τ ) dξ CT
R
t T
 (6.57)
= C 0 eAξ B Pw BT eA (ξ+τ ) dξ CT

We nally get1 :
Z t   T
Ry (t, t + τ ) = C eAξ B Pw C eA(ξ+τ ) B dξ (6.58)
0

Because t is the upper limit of the integral, the preceding autocorrelation


function Ry (t, t + τ ) is not stationary. But stationary comes at t → ∞ when
the process reaches steady state:
Z ∞   T
lim Ry (t, t + τ ) = C eAξ B Pw C eA(ξ+τ ) B dξ := Ry (τ ) (6.59)
t→∞ 0

Of course (6.59) is valid only if the process has a steady state response, that
is if the process is stable.
Then the power spectral density (psd) Sy (f ) of y(t) is dened as the Fourier
transform of the autocorrelation function Ry (τ ). We get from (6.59):
R +∞ −j2πf τ dτ
Sy (f ) = R
Ry (τ )e
R−∞
+∞ ∞ T 
= −∞ 0 C eAξ B Pw C eA(ξ+τ ) B dξ e−j2πf τ dτ

R  (6.60)
R∞ +∞ T
= 0 C eAξ B Pw −∞ C eA(ξ+τ ) B e−j2πf τ dτ dξ


1
Friedland B., Control System Design: An Introduction to State-Space Methods, Dover
Books on Electrical Engineering (2012)
154 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

Replacing ξ + τ by t in the bracketed integral yields:


R +∞ T R +∞ T
−∞ C eA(ξ+τ ) B e−j2πf τ dτ = C eAt B e−j2πf (t−ξ) dt
−∞
At B T e−j2πf t dt
R +∞
ej2πf ξ

= −∞ C e
= ej2πf ξ FT (j2πf )
(6.61)
where F(j2πf ) is dened as follows:
Z +∞
F(j2πf ) = C eAt B e−j2πf t dt (6.62)
−∞

It is worth noticing that the time response of a causal system is zero ∀t < 0.
So we recognize in F(j2πf ) the transfer function of the linear system when
s = j2πf . Indeed for a linear and causal system we have:
R +∞
F(j2πf ) = −∞ C eAt B e−j2πf t dt
R +∞
= 0 C eAt B e−j2πf t dt
R +∞
= 0 C eAt B e−st dt
s=j2πf
R +∞ −(sI−A)t (6.63)
= 0 Ce B dt
s=j2πf
= C (sI − A)−1 B
s=j2πf
= F(s)|s=j2πf

Returning to (6.60), we get from the preceding results:


R∞
Sy (f ) = R0 C eAξ B Pw ej2πf ξ FT (j2πf ) dξ


= 0 C eAξ B ej2πf ξ dξ Pw FT (j2πf ) (6.64)
= F(−j2πf ) Pw FT (j2πf )

We nally retrieve result (6.28):

Sy (f ) = F(−s) Pw FT (s) s=j2πf


(6.65)

This completes the proof. ■

6.4 Kalman-Bucy lter


6.4.1 Linear Quadratic Estimator
Let's consider the following linear time invariant model where w(t) and v(t) are
random processes which represents the process noise and the measurement noise
respectively: 
ẋ(t) = Ax(t) + Bu(t) + w(t)
(6.66)
y(t) = Cx(t) + v(t)
The Kalman-Bucy lter is a state estimator that is optimal in the sense that
it minimizes the covariance of the estimated error e(t) = x(t) − xb(t) when the
following conditions are met:
6.4. Kalman-Bucy lter 155

− Random vectors w(t) and v(t) are zero mean Gaussian noise. Let p(w)
and p(v) be the probability density function (pdf) of random processes
w(t) and v(t). Then:
 1 T −1
 p(w) = n/2
√1 e− 2 w Pw w
(2π) det(Pw )
1 − 12 v T P−1
(6.67)
 p(v) = √ v v
p/2
e
(2π) det(Pv )

− Random vectors w(t) and v(t) are white noise (i.e. uncorrelated). The
covariance matrices of w(t) and v(t) will be denoted Pw and Pv
respectively:

E w(t)wT (t + τ ) = Pw δ(τ ) where Pw = PTw > 0


  
(6.68)
E v(t)v T (t + τ ) = Pv δ(τ ) where Pv = PTv ≥ 0

− The cross correlation between w(t) and v(t) is zero:

E w(t)v T (t + τ ) = 0
  
(6.69)
E v(t)wT (t + τ ) = 0

The Kalman-Bucy lter is a special form of the Luenberger observer (6.10):

b˙ (t) = Ab (6.70)

x x(t) + Bu(t) + L(t) y(t) − Cbx(t)

Where the time dependent observer gain L(t), also-called Kalman gain, is
given by:
L(t) = Y(t)CT P−1v (6.71)
where matrix Y(t) is the solution of the following dierential Riccati
equation:

Ẏ(t) = AY(t) + Y(t)AT − Y(t)CT P−1


v CY(t) + Pw (6.72)

The suboptimal observer gain L = YCT P−1 v is obtained thanks to the


positive denite steady state solution Y = YT > 0 of the following algebraic
Riccati equation:

L = YCT P−1

v
(6.73)
0 = AY + YAT − YCT P−1 v CY + Pw

For discrete time systems, the following discrete time algebraic Riccati
equation has be be solved to get the suboptimal observer gain, as shown in
section 3.8:
−1
Y + AYCT Pv + CYCT CYAT − AYAT − Pw = 0 (6.74)

Kalman gain shall be tuned when the covariance matrices Pw and Pv are
not known:
− When measurements y(t) are very noisy the coecients of covariance
matrix Pv are high and Kalman gain will be quite small;
156 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

− On the other hand when we do not trust very much the linear time
invariant model of the process the coecients of covariance matrix Pw
are high and Kalman gain will be quite high.
From a practical point of view matrices Pw and Pv are design parameters
which are tuned to achieve the desired properties of the closed-loop.
Moreover, when the Riccati equation (6.72) related to the Kalman-Bucy
lter is identied to the Riccati equation related the Linear-Quadratic-Regulator
(LQR) we get:

Ẏ(t) = AY(t) + Y(t)AT − Y(t)CT P−1


v CY(t) + Pw
:= A T Y(t) + Y(t)A − Y(t)BR−1 BT Y(t) + Q

 Q := Pw ≥ 0
 (6.75)
R := Pv > 0



 B → CT
A → AT

6.4.2 Sketch of the proof


To get this result let's consider the following estimation error e(t):

e(t) = x(t) − x
b(t) (6.76)

Thus using (6.66) and (6.70) its time derivative reads:

b˙ (t)
ė(t) = ẋ(t) − x 
= Ax(t) + Bu(t) + w(t) − Ab x(t) + Bu(t) + L(t) y(t) − Cb
x(t)
= Ae(t) + w(t) − L(t) (Cx(t) + v(t) − Cb x(t))
= (A − L(t)C) e(t) + w(t) − L(t)v(t)
(6.77)
Since v(t) and w(t) are zero mean white noise their weighted sum n(t) =
w(t) − L(t)v(t) is also a zero mean white noise. We get:

n(t) = w(t) − L(t)v(t) ⇒ ė(t) = (A − L(t)C) e(t) + n(t) (6.78)

The covariance matrix Pn of n(t) reads:

Pn = E hn(t)nT (t)
 
i
= E (w(t) − L(t)v(t)) (w(t) − L(t)v(t))T (6.79)
= Pw + L(t)Pv LT (t)

Then the covariance matrix Y(t) of e(t) is obtained thanks to (6.22):

Ẏ(t) = (A − L(t)C) Y(t) + Y(t) (A − L(t)C)T + Pn


(6.80)
= AY(t) + Y(t)AT − L(t)CY(t) − Y(t)CT L(t)T + Pn

By using the expression (6.79) of the covariance matrix Pn of n(t) we get:

Ẏ(t) = AY(t) + Y(t)AT + Pw


− L(t)CY(t) − Y(t)CT L(t)T + L(t)Pv LT (t) (6.81)
6.5. Duality principle 157

Let's complete the square of −L(t)CY(t) − Y(t)CT L(t)T + L(t)Pv LT (t). First
we will focus on the scalar case where we try to minimize the following quadratic
function f (L) where Pv > 0:

f (L) = −2LCY + Pv L2 (6.82)

Completing the square of f (L) means writing f (L) as follows:

f (L) = P−1 2 2 2 −1
v (LPv − YC) − Y C Pv (6.83)

Then it is clear that f (L) is minimal when LPv − YC and that the minimal
value of f (L) is −Y2 C2 P−1
v . This approach can be extended to the matrix case.
When we complete the square of −L(t)CY(t) − Y(t)CT L(t)T + L(t)Pv LT (t)
we get:

− L(t)CY(t) − Y(t)CT L(t)T + L(t)Pv LT (t) =


T
L(t)Pv − Y(t)CT P−1 L(t)Pv − Y(t)CT

v
− Y(t)CT P−1
v CY(t) (6.84)

Using the preceding relation within (6.81) reads:

Ẏ(t) = AY(t) + Y(t)AT + Pw


T
+ L(t)Pv − Y(t)CT P−1 L(t)Pv − Y(t)CT

v
− Y(t)CT P−1
v CY(t) (6.85)

In order to nd the optimum observer gain L(t) which minimizes the covariance
matrix Y(t) we choose L(t) such that Y(t) decreases by the maximum amount
possible at each instant in time. This is accomplished by setting L(t) as follows:

L(t)Pv − Y(t)CT = 0 ⇔ L(t) = Y(t)CT P−1


v (6.86)

Once L(t) is set such that L(t)Pv − Y(t)CT = 0 the matrix dierential
equation (6.85) reads as follows:

Ẏ(t) = AY(t) + Y(t)AT − Y(t)CT P−1


v CY(t) + Pw (6.87)

This is Equation (6.72).

6.5 Duality principle


In the chapter dedicated to the closed-loop solution of the innite horizon Linear
Quadratic Regulator (LQR) problem we have seen that the minimization of the
cost functional J(u(t)):

1 ∞ T
Z
J(u(t)) = x (t)Qx(t) + uT (t)Ru(t)dt (6.88)
2 0
158 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

Under the constraint



ẋ(t) = Ax(t) + Bu(t)
(6.89)
x(0) = x0

This leads to solving the following algebraic Riccati equation where Q =


QT ≥ 0 (thus Q is symmetric and positive semi-denite matrix), and R =
RT > 0 is a symmetric and positive denite matrix:

0 = AT P + PA − PBR−1 BT P + Q (6.90)

The constant suboptimal Kalman gain K and the suboptimal stabilizing


control u(t) are then dened as follows :

u(t) = −Kx(t)
(6.91)
K = R−1 BT P

Then let's compare the preceding relations with the following relations which
are actually those which have been seen in (6.73):

L = YCT P−1

v
(6.92)
0 = YAT + AY − YCT P−1 v CY + Pw

Then it is clear than the duality principle on Table 6.1 between observer and
controller gains apply.

Controller Observer
A AT
B CT
C BT
K LT
P = PT ≥ 0 Y = YT ≥ 0
Q = QT ≥ 0 Pw = PTw ≥ 0
R = RT > 0 Pv = PTv > 0
A − BK AT − CT LT
Table 6.1: Duality principle

6.6 Separation principle


Let e(t) be the state estimation error:

e(t) = x(t) − x
b(t) (6.93)

Using (6.109) we get the following expressions for the dynamics of the state
vector x(t):
ẋ(t) = Ax(t) + Bu(t) + w(t)
= Ax(t) − BKb x(t) + w(t)
(6.94)
= Ax(t) − BK (x(t) − e(t)) + w(t)
= (A − BK) x(t) + BKe(t) + w(t)
6.6. Separation principle 159

In addition using (6.108) and y(t) = Cx(t) + v(t) we get the following
expressions for the dynamics of the estimation error e(t) :

ė(t) = ẋ(t) − xb˙ (t) 


= Ax(t) + Bu(t) + w(t) − Ab x(t) + Bu(t) + L y(t) − Cbx(t)
= (A − LC) e(t) + w(t) − Lv(t)
(6.95)
Thus the closed-loop dynamics is dened as follows:
      
ẋ(t) A − BK BK x(t) w(t)
= + (6.96)
ė(t) 0 A − LC e(t) w(t) − Lv(t)

From equations (6.96) it is clear that the 2n eigenvalues of the closed-loop


are just the union between the n eigenvalues of the state-feedback coming from
the spectrum of A − BK and the n eigenvalues of the state estimator coming
from the spectrum of A − LC . This result is called the separation principle.
More precisely the separation principle states that the optimal control law is
achieved by adopting the following two steps procedure:

− First assume an exact measurement of the full state to solve the


deterministic Linear Quadratic (LQ) control problem which minimizes
the following cost functional J(u(t)):

1 ∞ T
Z
J(u(t)) = x (t)Qx(t) + uT (t)Ru(t)dt (6.97)
2 0

This leads to the following stabilizing control u(t) :

u(t) = −Kx(t) (6.98)

Where the Kalman gain K = R−1 BT P is obtained thanks to the positive


semi-denite solution P of the following algebraic Riccati equation:

0 = AT P + PA − PBR−1 BT P + Q (6.99)

− Then obtain an optimal estimate of the state which minimizes the


following estimated error covariance:
h i
E eT (t)e(t) = E (x(t) − x b(t))T (x(t) − x (6.100)
 
b(t))

This leads to the Kalman-Bucy lter:

d
(6.101)

x x(t) + Bu(t) + L y(t) − Cb
b(t) = Ab x(t)
dt

And the stabilizing control u(t) now reads:

u(t) = −Kb
x(t) (6.102)
160 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

The observer gain L reads:

L = YCT P−1
v (6.103)

where matrix Y = YT > 0 is the positive denite solution of the following


algebraic Riccati equation

0 = AY + YAT − YCT P−1


v CY + Pw (6.104)

It is worth noticing that the optimal state estimate is independent of Q


and R. Moreover the observer dynamics much be faster than the desired
state-feedback dynamics.
Furthermore the dynamics of the state vector x(t) is slightly modied when
compared with an actual state-feedback control u(t) = −Kx(t). Indeed
we have seen in (6.94) that the dynamics of the state vector x(t) is now
modied and depends on e(t) and w(t):

ẋ(t) = (A − BK) x(t) + BKe(t) + w(t) (6.105)

6.7 Controller transfer function


First let's assume that a full state-feedback u(t) = −Kx(t) is applied on the
following system: 
ẋ(t) = Ax(t) + Bu(t) + w(t)
(6.106)
y(t) = Cx(t) + v(t)
Then the dynamics of the closed-loop system is given by:

u(t) = −Kx(t) ⇒ ẋ(t) = (A − BK) x(t) + w(t) (6.107)

If the full state vector x(t) is assumed not to be available the control u(t) =
−Kx(t) cannot be computed. Then an observer has to be added. We recall the
dynamics of the observer (see (6.10)):

b˙ (t) = Ab (6.108)

x x(t) + Bu(t) + L y(t) − Cbx(t)

and the control u(t) = −Kx(t) has to be changed into:

u(t) = −Kb
x(t) (6.109)

Gathering (6.108) and (6.109) leads to the state space representation of the
controller:
b˙ (t)
   
x A K BK x
b(t)
(6.110)
u(t) CK DK y(t)
Where:    
AK BK A − BK − LC L
= (6.111)
CK DK −K 0
The controller transfer function K(s) is the relation between the Laplace
transform of its output, U (s), and the Laplace transform of its input, Y (s). By
6.7. Controller transfer function 161

Figure 6.3: Block diagram of the controller in the time domain and the frequency
domain

taking the Laplace transform of equation (6.108) and (6.109) (and assuming no
initial condition) we get:
(  
sX(s)
b = AX(s)
b + BU (s) + L Y (s) − CX(s)
b
(6.112)
U (s) = −KX(s)
b

We nally get:
U (s) = −K(s)Y (s) (6.113)

where the controller transfer function K(s) reads:

K(s) = K (sI − A + BK + LC)−1 L (6.114)

The preceding relation can be equivalently represented in the time domain


or the frequency domain by the block diagram shown in Figure 6.3 where:

Φ(s) := (sI − A)−1 (6.115)


162 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

6.8 Loop Transfer Recovery


6.8.1 Lack of guaranteed robustness of LQG design
According to the choice of matrix L which drives the dynamics of the error e(t)
the closed-loop may not be stable. So far LQR is shown to have either innite
gain margin (stable open-loop plant) or at least −6 dB gain margin and at
least sixty degrees phase margin. In 1978 John Doyle2 showed that all the nice
robustness properties of LQR design can be lost once the observer is added and
that LQG design can exhibit arbitrarily poor stability margins. Around 1981
Doyle along with Gunter Stein followed this line by showing that the loop shape
itself will, in general, change when a lter is added for estimation. Fortunately
there is a way of designing the Kalman-Bucy lter so that the full state-feedback
properties are recovered at the input of the plant. This is the purpose of the
Loop Transfer Recovery design. The LQG/LTR design method was introduced
by Doyle and Stein in 1981 before the development of H2 and H∞ methods
which is a more general approach to directly handle many types of modeling
uncertainties.

6.8.2 Doyle's seminal example


Consider the following state space realization:
         
ẋ1 (t) 1 1 x1 (t) 0 1
= + u(t) + w(t)


ẋ2 (t) 0 1 x2 (t) 1 1

  x1 (t) (6.116)
 y(t) = 1 0 + v(t)


x2 (t)

where w(t) and v(t) are Gaussian white noise with covariance matrices Pw
and Pv , respectively:   
1 1
 Pw = σ


1 1
(6.117)

 σ>0
Pv = 1

Let J(u(t)) be the following cost functional to be minimized :

1 ∞ T
Z
J(u(t)) = x (t)Qx(t) + uT (t)Ru(t)dt (6.118)
2 0

where:   
1 1
 Q=q


1 1
(6.119)

 q>0
R=1

Applying the separation principle the optimal control law is achieved by


adopting the following two steps procedure:
2
Doyle J.C., Guaranteed margins for LQG regulators, IEEE Transactions on Automatic
Control, Volume: 23, Issue: 4, Aug 1978
6.8. Loop Transfer Recovery 163

− First assume an exact measurement of the full state to solve the


deterministic Linear Quadratic (LQ) control problem which minimizes
the following cost functional J(u(t)):
1 ∞ T
Z
J(u(t)) = x (t)Qx(t) + uT (t)Ru(t)dt (6.120)
2 0

This leads to the following stabilizing control u(t) :

u(t) = −Kx(t) (6.121)

Where the Kalman gain K = R−1 BT P is obtained thanks to the positive


semi-denite solution P of the following algebraic Riccati equation:

0 = AT P + PA − PBR−1 BT P + Q (6.122)

We get:  
∗ ∗
P= (6.123)
α α

And:

K = R−1 BT P = α where α = 2 + (6.124)


  p
1 1 4+q >0

− Then obtain an optimal estimate of the state which minimizes the


following estimated error covariance :
h i
E eT (t)e(t) = E (x(t) − x b(t))T (x(t) − x (6.125)
 
b(t))

This leads to the Kalman-Bucy lter:


d
(6.126)

b(t) = Ab
x x(t) + Bu(t) + L(t) y(t) − Cb
x(t)
dt

And the stabilizing control u(t) now reads:

u(t) = −Kb
x(t) (6.127)

The observer gain L = YCT P−1 v is obtained thanks to the positive semi-
denite solution Y of the following algebraic Riccati equation:

0 = AY + YAT − YCT P−1


v CY + Pw (6.128)

We get:  
β ∗
Y= (6.129)
β ∗

And:

 
1
L = YC T
P−1
v =β where β = 2 + 4+σ >0 (6.130)
1
164 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

Now assume that the input matrix of the plant is multiplied by a scalar gain
∆ (nominally unit) :
 
1
ẋ(t) = Ax(t) + ∆Bu(t) + w(t) (6.131)
1

In order to assess the stability of the closed-loop system we will assume no


exogenous disturbance v(t) and w(t). Then the dynamics of the closed-loop
system reads:
   
ẋ(t) x(t)
= A cl (6.132)
b˙ (t)
x x
b(t)
where, using (6.108) and (6.131):
 
  1 1 0 0
A −∆BK  0 1 −∆α −∆α 
Acl = =  (6.133)
LC A − BK − LC  β 0 1−β 1 
β 0 −α − β 1 − α

The characteristic equation of the closed-loop system is:

det (sI − Acl ) = s4 + p3 s3 + p2 s2 + p1 s + p0 = 0 (6.134)

The evaluation of coecients p3 , p2 , p1 and p0 is quite tedious. Nevertheless


coecient p0 reads;
p0 = 1 + (1 − ∆)αβ (6.135)
The closed-loop system is unstable if:

1
p0 < 0 ⇔ ∆ > 1 + (6.136)
αβ

With large values of α and β even a slight increase in the value of ∆ from
its nominal value will render the closed-loop system to be unstable. Thus the
phase margin of the LQG control-loop can be almost 0. This example clearly
shows that the robustness of the LQG control-loop to modeling uncertainty is
not guaranteed.

6.8.3 Closed-loop eigenvalues and eigenvectors


Let λ be a closed-loop eigenvalue and v the corresponding eigenvector of the LQ
state-feedback:

(λI − (A − BK)) v = 0 where K = R−1 BT P


(6.137)
⇔ λv − Av + BR−1 BT Pv = 0

Thus we can equivalently write:


 
v
BR−1 BT (6.138)
 
λI − A =0
Pv
6.8. Loop Transfer Recovery 165

Furthermore we have seen that matrix P is the positive semi-denite solution


of the following algebraic Riccati equation:

AT P + PA − PBR−1 BT P + Q = 0 (6.139)

Thus, multiplying this equality by eigenvector −v , and adding and


subtracting λPv , we get:
−AT P − PA + PBR−1 BT P − Q v + λPv − λPv

−1 T T
 =0 (6.140)
⇔ P λI − A + BR B P v − Q + λP + A P v = 0

Inserting (6.137) within (6.140) yields:

λI − A + BR−1 BT P v = 0 ⇒  Q + λP TP v = 0
 
+ A
(6.141)

 T
 v
⇔ Q λI + A =0
Pv

Finally, (6.138) and (6.141) together read:

λI − A BR−1 BT
  
v
=0
Q λI + AT Pv
(6.142)
A −BR−1 BT
   
v
⇔ λI − =0
−Q −AT Pv
The preceding relation indicates the equivalence between any closed-loop
eigenvalue λ and the corresponding eigenvector v of the LQ state-feedback
correspond and any  eigenvalue of the Hamiltonian  matrix
A −BR−1 BT
 
v
H := and the corresponding eigenvector where
−Q −AT Pv
matrix P is the positive semi-denite solution of the algebraic Riccati
equation.

6.8.4 Asymptotic behavior of Riccati equation


Now, let W be some unitary matrix (WT W = I) and M be some symmetric
positive denite matrix (M = MT > 0) such that R = ϵ2 M. Then if the transfer
function CΦ(s)B is right invertible with no unstable zeros the following relation
holds:
 R = ϵ2 M

1
Q = CT C ⇒ lim K = M−0.5 WC = R−0.5 WC (6.143)
−1 T ϵ→0 ϵ
K=R B P

Indeed, the algebraic Riccati equation becomes in that case:


( T
R = ϵ2 M = ϵM0.5 ϵM0.5


Q = CT C = CT WT WC = (WC)T (WC)
⇒ 0 = AT P + PA − PBR−1 BT P + Q
−1
= AT P + PA − PB Mϵ2 BT P + (WC)T (WC)
 −0.5 T  −0.5 
= AT P + PA − M ϵ BT P M
BT P + (WC)T (WC)
ϵ
(6.144)
166 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

If the transfer function CΦ(s)B is right invertible with no unstable zeros


then P → 0 as ϵ → 0 and the preceding equation reads as follows, where
AT P + PA has been neglected:
 T  −0.5 
M−0.5 T
0 ≈ − ϵ B P M
ϵ BTP + (WC)T (WC)
ϵ→0 (6.145)
M−0.5 T
⇒ ϵ B P ≈ WC
ϵ→0

We nally get, using the fact that R = ϵ2 M:

1 M−0.5 T 1
K = R−1 BT P = M−0.5 B P ≈ M−0.5 WC = R−0.5 WC (6.146)
ϵ ϵ ϵ→0 ϵ

This completes the proof. ■

6.8.5 Loop Transfer Recovery (LTR) design


Linear Quadratic (LQ) controller and the Kalman-Bucy lter (KF) alone have
very good robustness property. Nevertheless we have seen with Doyle's seminal
example that Linear Quadratic Gaussian (LQG) control which simultaneously
involves a Linear Quadratic (LQ) controller and a Kalman-Bucy lter (KF) does
not have any guaranteed robustness. Therefore the LQG / LTR design tries to
recover a target open-loop transfer function. The target loop transfer function
is either:

− the Linear Quadratic (LQ) control open-loop transfer function, which is


KΦ(s)B

− or the Kalman-Bucy lter (KF) open-loop transfer function, which is


CΦ(s)L.

Let ρ be a parameter design of either design matrix Q or matrix Pw and


F(s) the transfer function of the plant:

F(s) = CΦ(s)B (6.147)

Then two types of Loop Transfer Recovery are possible:

− Input recovery: let ωc be the cut-o frequency (i.e. 0 dB) of the targeted
dynamics. The objective is to tune ρ such that:

lim K(s)F(s) ≈ KΦ(s)B| (6.148)


ρ→∞ s = jω s = jω
ω < ωc ω < ωc

The objective of the input recovery design is shown in Figure 6.4. The
corresponding objective in the state space domain is the following:

ẋ(t) = Ax(t) + Bu(t)
(6.149)
u(t) = −Kx(t) + r(t)
6.8. Loop Transfer Recovery 167

Figure 6.4: Input recovery objective

− Output recovery: let ωc be the cut-o frequency (i.e. 0 dB) of the targeted
dynamics. The objective is to tune ρ such that:

lim F(s)K(s) ≈ CΦ(s)L| (6.150)


ρ→∞ s = jω s = jω
ω < ωc ω < ωc

The objective of the output recovery design is shown in Figure 6.53 . The
corresponding objective in the state space domain is the following:

b˙ (t) = Ab
 
x x(t) + L r(t) − b
y (t)
(6.151)
y (t) = Cb
b x(t)

We recall that initial design matrices Q0 and R0 are set to meet control
requirements whereas initial design matrices Pw0 and Pv0 are set to meet
observer requirements. Let ρ be a parameter design of either design matrix Pw
or matrix Q. Weighting parameter ρ is tuned to make a trade-o between
initial performances and stability margins and is set according to the type of
Loop Transfer Recovery:

− Input recovery: a new observer design with the following design matrices:

Pw = Pw0 + ρ2 BBT

(6.152)
Pv = Pv0
3
Ronaldo Waschburger and Karl Heinz Kienitz, A root locus approach to loop transfer
recovery based controller design, 13th International Conference on Control Automation
Robotics & Vision (ICARCV), 2014
168 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

Figure 6.5: Output recovery objective

− Output recovery: a new controller is designed with the following design


matrices:
Q = Q0 + ρ2 CT C

(6.153)
R = R0

The preceding relation is simply obtained by applying the duality


principle.
From a practical point of view, design parameter ρ is increased until
satisfactory robust properties of the loop transfer function are achieved. It is
worth noticing that to apply Loop Transfer Recovery (LTR) the transfer
function CΦ(s)B shall be minimum phase (i.e. no zero with positive real part)
and square (meaning that the system has the same number of inputs and
outputs).
Example 6.2. Let's the double integrator plant:
  
0 1

 A=
 0 0



0 (6.154)
B=
 1



 
C= 1 0

Let:   
 K =  k1  k2
l1 (6.155)
 L=
l2
Then the controller transfer function is given by (6.114):
K(s) = K (sI − A + BK + LC)−1 L
 −1  
 s + l1 −1 l1
(6.156)

= k1 k2
k1 + l2 s + k2 l2
(k1 l1 +k2 l2 )s+k1 l2
= s2 +(k2 +l1 )s+k2 l1 +k1 +l2
6.9. Proof of the Loop Transfer Recovery condition 169

From (6.153) we set Q and R as follows:



 Q0 := 0  2 
ρ 0
2 T 2 T
Q = Q0 + ρ C C ⇒ Q = ρ C C = (6.157)
0 0
R = R0 := 1

The Kalman gain K = R−1 BT P is then obtained thanks to the positive


semi-denite solution P of the following algebraic Riccati equation:
 
∗ ∗
0= AT P
+ PA − PBR−1 BT P
+Q⇒P= √
ρ 2ρ (6.158)
−1 T
 √   
⇒K=R B P= ρ 2ρ := k1 k2
Consequently:
(k1 l1 +k2 l2 )s+k1 l2
lim K(s) = lim 2
ρ→∞ ρ→∞ s +(k2 +l1 )s+k

2 l1 +k1 +l2
(ρl1 + 2ρl2 )s+ρl2
= lim 2 √ √
ρ→∞ s +( 2ρ+l1 )s+ 2ρl1 +ρ+l2 (6.159)
= lim ρl1 s+ρl
ρ
2
ρ→∞
= l1 s + l2
The transfer function of the plant reads:
1
F(s) = CΦ(s)B = C (sI − A)−1 B = (6.160)
s2
Therefore:
l1 s + l2
lim K(s)F(s) = (6.161)
ρ→∞ s2
Note that:
l1 s + l2
CΦ(s)L = (6.162)
s2
Therefore the loop transfer function has been recovered.

6.9 Proof of the Loop Transfer Recovery condition


6.9.1 Loop transfer function with observer
The Loop Transfer Recovery design procedure tries to recover a target loop
transfer function, here the open-loop full state LQ control, despite the use of
the observer.
The lecture of Faryar Jabbari, from the Henry Samueli School of
Engineering, University of California, is the primary source of this section4 .
We will rst show what happen when adding an observer-based closed-loop
on the following system where y(t) is the actual output of the system (not the
controlled output): 
ẋ(t) = Ax(t) + Bu(t)
(6.163)
y(t) = Cx(t)
4
http://mae2.eng.uci.edu/~fjabbari//me270b/chap9.pdf
170 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

Figure 6.6: Block diagram of open-loop transfer function

Figure 6.7: Broken state-feedback loop

Taking the Laplace transform and assuming no initial condition, we get:


 
sX(s) = AX(s) + BU (s) X(s) = Φ(s)BU (s)
⇒ (6.164)
Y (s) = CX(s) Y (s) = CX(s)

where:
Φ(s) = (sI − A)−1 (6.165)
The preceding relations can be represented by the block diagram in Figure
6.6. Let K be a full state-feedback gain matrix such that the closed-loop system
is asymptotically stable, i.e. the eigenvalues of A − BK lie in the left half s-
plane, and the open-loop transfer function when the loop is broken at the input
point of the given system meets some given frequency dependent specications.
The state feedback control uf with full state available is:

uf (t) = −Kx(t) ⇔ U f (s) = −KX(s) (6.166)

We will focus on the regulator problem and thus r = 0. As shown in Figure


6.7 the loop transfer function is evaluated when the loop is broken at the input
point of the system. The so-called target loop transfer function Lt (s) is dened
as follows:
U f (s) = Lt (s)U (s) where Lt (s) = −KΦ(s)B (6.167)
If the full state vector x(t) is assumed not to be available, the control u(t) =
−Kx(t) cannot be computed. We then add an observer with the following
expression:
b˙ (t) = Ab (6.168)

x x(t) + Bu(t) + L y(t) − Cb x(t)
The observer-based state-feedback control uo is:

uo (t) = −Kb
x(t) (6.169)
6.9. Proof of the Loop Transfer Recovery condition 171

Figure 6.8: Broken observer-based feedback loop

Taking the Laplace transform results in:


(  
sX(s)
b = AX(s)
b + BU (s) + L Y (s) − CX(s)
b
(6.170)
U (s) = −KX(s)
o
b

or, equivalently:
(   
X(s)
b = (sI − A)−1 BU (s) + L Y (s) − CX(s)
b
(6.171)
U (s) = −KX(s)
o
b

Usually equation (6.171) is not the same than (6.166). Relation (6.171) can
be represented by the block diagram in Figure 6.8. The loop transfer function
evaluated when the loop is broken at the input point of the closed-loop system
becomes:
−1
= Φ(s)−1 + LC

X(s)
b (BU (s) + LY (s))
Y (s) = CΦ(s)BU (s)
−1
⇒ X(s)
b = Φ(s)−1 + LC (BU (s) + LCΦ(s)BU (s))
−1 (6.172)
= Φ(s)−1 + LC (B + LCΦ(s)B) U (s)
−1
= Φ(s)−1 + LC

BU (s)
−1
−1
+ Φ(s) + LC LCΦ(s)BU (s)

Finally, the actual loop transfer function La (s) reads as follows:

U o (s) = La (s)U (s)


−1
where La (s) = −K Φ(s)−1 + LC (B + LCΦ(s)) B (6.173)

6.9.2 Loop Transfer Recovery (LTR) condition


Loop Transfer Recovery (LTR) will be achieved when the loop transfer
function with state-feedback and with state-based observer are equal, that is
when La (s) = Lt (s), where loop transfer functions Lt (s) and La (s) are given
by (6.167) and (6.173) respectively.
172 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

The matrix inversion lemma5 is the equation:

(A − BD−1 C)−1 = A−1 + A−1 B(D − CA−1 B)−1 CA−1 (6.174)

Simple manipulations show that:


−1  
Φ(s)−1 + LC = Φ(s) I − L (I + CΦ(s)L)−1 CΦ(s) (6.175)

So we have for the rst term to the right of equation (6.172):


−1  
Φ(s)−1 + LC B = Φ(s) I − L (I + CΦ(s)L)−1 CΦ(s)B (6.176)

And for the second term to the right of equation (6.172):


−1  
Φ(s)−1 + LC LCΦ(s)B = Φ(s) I − L (I + CΦ(s)L)−1 CΦ(s) LCΦ(s)B
 
= Φ(s) L − L (I + CΦ(s)L)−1 CΦ(s)L CΦ(s)B
 
= Φ(s)L I − (I + CΦ(s)L)−1 CΦ(s)L CΦ(s)B
(6.177)
In addition, applying again the matrix inversion lemma to the following
equality, we have:

(I + A)−1 = I − (I + A)−1 A ⇔ (I + A)−1 A = I − (I + A)−1 (6.178)

Thus :
(I + CΦ(s)L)−1 CΦ(s)L = I − (I + CΦ(s)L)−1 (6.179)
Applying this result to equation (6.177) leads to:
−1
Φ(s)−1 + LC LCΦ(s)B = Φ(s)L (I + CΦ(s)L)−1 CΦ(s)B (6.180)

And here comes the light ! Indeed, if we impose:

L (I + CΦ(s)L)−1 = B (CΦ(s)B)−1 (6.181)

Then equations (6.176) and (6.180) become:


 
−1 + LC −1 B = Φ(s) I − L (I + CΦ(s)L)−1 CΦ(s)B
 


 Φ(s)

= Φ(s) (I − B)
−1



−1
Φ(s) + LC LCΦ(s)B = Φ(s)L (I + CΦ(s)L)−1 CΦ(s)B

= Φ(s)B
(6.182)
and thus, when summing those two terms, (6.172) reads:

X(s)
b = Φ(s)BU (s) (6.183)
5
D. J. Tylavsky, G. R. L. Sohie, Generalization of the matrix inversion lemma, Proceedings
of the IEEE, Year: 1986, Volume: 74, Issue: 7, Pages: 1050 - 1052
6.9. Proof of the Loop Transfer Recovery condition 173

We nally get:

U o (s) = −KX(s)
b = −KΦ(s)BU (s) (6.184)

That is, we get for U o (s) the same expression than the expression obtained
through the full state-feedback given in (6.166).
As a conclusion, the Loop Transfer Recovery (LTR) is achieved when the
loop transfer function with state-feedback and with state-based observer are
equal, that is when La (s) = Lt (s). This property is achieved as soon as U o (s)
has the same expression as the full state-feedback U f (s), that is when the
following relation holds:

L (I + CΦ(s)L)−1 = B (CΦ(s)B)−1 (6.185)

6.9.3 Setting the Loop Transfer Recovery design parameter


Condition (6.185) is not an easy condition to satisfy. The traditional approaches
to this problem is to design matrix L of the observer such that the condition is
satised asymptotically and ρ a design parameter.
One way to asymptotically satisfy (6.185) is to set L such that:

L
lim = BW0 (6.186)
ρ→∞ ρ

where W0 is a non-singular matrix.


Indeed in this case we have:

L (I + CΦ(s)L)−1 = L
ρ ρ(I + CΦ(s)L)
−1

L 1 L
−1 (6.187)
= ρ ρ I + CΦ(s) ρ

Thus, as ρ → ∞:
 −1
lim L (I + CΦ(s)L)−1 = lim L 1
+ CΦ(s) Lρ
ρI
ρ→∞ ρ→∞ ρ
 −1
= lim Lρ CΦ(s) Lρ (6.188)
ρ→∞
−1
= BW0 (CΦ(s)BW0 )
= B (CΦ(s)B)−1

Now let's concentrate how (6.186) can be achieved. First we have seen in
(6.143) that if the transfer function CΦ(s)B is right invertible with no unstable
zeros then for some unitary matrix W (WT W = I) and some symmetric positive
denite matrix M (M = MT > 0), the asymptotic value of feedback gain K
reads as follows, where ϵ has been replaced by ρ1 :
 M
 R = ρ2
Q = CT C ⇒ lim K = ρM−0.5 WC = R−0.5 WC (6.189)
 −1 T ρ→∞
K=R B P
174 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

Applying the duality principle we have the same result for the asymptotic
value of the observer gain L:
 M
 Pv = ρ2
P = BBT ⇒ lim L = ρBWM−0.5 = BWP−0.5 (6.190)
 w ρ→∞ v
L = YCT P−1 v

Then, we concentrate on input recovery (6.152). We design a new observer


with the following design matrices:
Pw = Pw0 + ρ2 BBT

(6.191)
Pv = Pv0

Then if we replace Pv by Pv0 and Pw = BBT by Pw = Pw0 + ρ2 BBT in


(6.190), the asymptotic value of the observer gain L reads as follows:

 Pv = Pv0 L
Pw = Pw0 + ρ2 BBT ⇒ lim L = ρBWP−0.5 v0 ⇒ lim = BWPv0−0.5
T −1 ρ→∞ ρ→∞ ρ
L = YC Pv

(6.192)
−0.5
By setting W0 := WPv0 we nally get (6.186):
L
lim = BW0 (6.193)
ρ→∞ ρ

6.10 Robust control design


Robust control problems, and especially H2 robust control problems, are solved
in a dedicated framework presented in Figure 6.9 where:
− G(s) is the transfer function of the generalized plant;
− K(s) is the transfer function of the controller ;

− u is the control vector of the generalized plant G(s) which is computed


by the controller K(s);

− w is the input vector formed by exogenous inputs such as disturbances or


noise;

− y is the vector of output available for the controller K(s);

− z is the performance output vector, also-called the controlled output, that


is the vector that allows to characterize the performance of the closed-
loop system. This is a virtual output used only for design that we wish to
maintain as small as possible.
It is worth noticing that in the standard feedback control loop in Figure 6.9
all reference signals are set to zero.
The H2 control problem consists in nding the optimal controller K(s) which
minimizes ∥Tzw (s)∥2 , that is the H2 norm of the transfer between the exogenous
inputs vector w and the vector of interest variables z .
6.10. Robust control design 175

Figure 6.9: Standard feedback control loop

The general form of the realization of a plant is the following:


    
ẋ(t) A B1 B2 x(t)
 z(t)  =  C1 0 D12   w(t)  (6.194)
y(t) C2 D21 0 u(t)

Linear-quadratic-Gaussian (LQG) control is a special case of H2 optimal


control applied to stochastic system.
Let's consider the following system realization:

ẋ(t) = Ax(t) + B2 u(t) + d(t)
(6.195)
y(t) = C2 x(t) + n(t)

Where d(t) and n(t) are white noise with the intensity of their
autocorrelation function equals to Wd and Wn respectively. Denoting by E()
the mathematical expectation we have:
    
d(t)  T Wd 0
T (6.196)

E d (τ ) n (τ ) = δ(t − τ )
n(t) 0 Wn

The LQG problem consists in nding a controller u(s) = K(s)y(s) such that
the following performance index is minimized:
 Z T 
T T
(6.197)

JLQG = E lim x (t)Qx(t) + u (t)Ru(t) dt
T →∞ 0

Where matrices Q ans R are symmetric and (semi)-positive denite


matrices:
Q = QT ≥ 0

(6.198)
R = RT > 0
This problem can be cast as the H2 optimal control framework in the
following manner. Dene signal z(t) whose norm is to be minimized as follows:
 0.5  
Q 0 x(t)
z(t) = (6.199)
0 R0.5 u(t)
176 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

And represent the stochastic inputs d(t) and n(t) as a function of the vector
w(t) of exogenous disturbances :

Wd0.5
   
d(t) 0
= w(t) (6.200)
n(t) 0 Wn0.5

Where w(t) is a white noise process of unit intensity. Then the LQG cost
function reads as follows:
 Z T 
JLQG = E lim z (t)z(t)dt = ∥Tzw (s)∥22
T
(6.201)
T →∞ 0

And the generalized plant reads as follows:


    
ẋ(t) A B1 B2 x(t)
 z(t)  =  C1 0 D12   w(t)  (6.202)
y(t) C2 D21 0 u(t)

Where:
B1 = Wd0.5 0
  





  0.5 

 Q

 C1 =
0



  (6.203)
0


D12 =


R0.5








D21 = 0 Wn0.5
  

It follows that:

B1 w(t) = Wd0.5 0 w(t) = d(t)


  
(6.204)
D21 w(t) = 0 Wn0.5 w(t) = n(t)

And:

z T (t)z(t) = (C1 x(t) + D12 w(t))T (C1 x(t) + D12 w(t))


(6.205)
= xT (t)Qx(t) + uT (t)Ru(t)

Thus costs (6.197) and (6.201) are equivalent.

6.11 Sensor data fusion


6.11.1 Complementary lter
Sensor data fusion considers the problem to integrate redundant measurement
information from separate sensor systems.
The basic idea of complementary lter consists in taking the measurements
of two sensors, ltering out low-frequency and high-frequency noises of each
sensor, and combining the ltered outputs to get a better estimate of the signal
6.11. Sensor data fusion 177

Figure 6.10: Complementary lter implementing fusion between baro-altimeter


and vertical accelerometer measurements

of interest. An example of two sensors that complement each other are baro-
altimeter and vertical accelerometer.
Let y1 (t) and y2 (t) noisy measurements of some signal y(t), coming for
example from a baro-altimeter and a vertical accelerometer, respectively.
Denoting by v(t) some low frequency zero mean noise process, by w(t) some
high frequency zero mean noise process and by s the Laplace variable, we will
assume that:
 
y1 (t) = y(t) + v(t) Y1 (s) = Y (s) + V (s)
⇔ (6.206)
y2 (t) = ÿ(t) + w(t) Y2 (s) = s2 Y (s) + W (s)

The complementary lter that implements the fusion between the two
measurements is shown in Figure 6.106 .
From Figure 6.10, the expression of Yb (s) reads:
  
Yb (s) = 1s k1 Y1 (s) − Yb (s)
  
+ 1s Y2 (s) + k0 Y1 (s) − Yb (s)
    (6.207)
⇔ 1 + ks1 + ks20 Yb (s) = ks1 + ks20 Y1 (s) + s12 Y2 (s)
⇔ Yb (s) = 2k1 s+k0 Y1 (s) + 2 1
s +k1 s+k0
Y2 (s)
s +k1 s+k0

Using the fact that Y1 (s) = Y (s) + V (s) and Y2 (s) = s2 Y (s) + W (s), we
nally get:
k1 s+k0 1
Yb (s) = Y (s) + V (s) + W (s)
s2 +k1 s+k0 s2 +k1 s+k0
1−F (s) (6.208)
:= Y (s) + F (s) V (s) + s2
W (s)
where:
k1 s + k0 s2
F (s) := ⇔ 1 − F (s) := (6.209)
s2 + k1 s + k0 s2 + k1 s + k0
Transfer function F (s) is a low-pass lter with unity static gain whereas
1 − F (s) is a high-pass lter.
6
W. T. Higgins, A Comparison of Complementary and Kalman Filtering, IEEE
Transactions on Aerospace and Electronic Systems, vol. AES-11, no. 3, pp. 321-325, May
1975, doi: 10.1109/TAES.1975.308081.
178 Chapter 6. Linear Quadratic Gaussian (LQG) regulator

Figure 6.11: Equivalent lter implementing fusion between baro-altimeter and


vertical accelerometer measurements

Thanks to the expression of F(s), relation (6.207) can also be written as follows:
\[
\begin{aligned}
\hat{Y}(s) &= F(s)\,Y_1(s) + \frac{1 - F(s)}{s^2}\,Y_2(s) \\
&= \frac{1}{s^2}\,Y_2(s) + F(s)\left(Y_1(s) - \frac{1}{s^2}\,Y_2(s)\right)
\end{aligned} \tag{6.210}
\]
Relation (6.210) leads to an equivalent version of the complementary filter shown in Figure 6.11, where the low-pass filter F(s) operates only on the noises.

6.11.2 Kalman filter

More generally, sensor data fusion problems involve two measurements, y1(t) and y2(t), one of which serves as an input to the state equation, which is seen as the process model. Denoting v(t) the random process which represents the measurement noise and x̃(t) the noisy state vector, we have:
\[
\begin{cases}
\dot{\tilde{x}}(t) = A\,\tilde{x}(t) + B\,y_2(t) & \text{(process)} \\
y_1(t) = C\,x(t) + v(t) & \text{(measurement)}
\end{cases} \tag{6.211}
\]

Note that the measurement equation uses the actual state vector x(t), not the noisy state vector x̃(t) of the process equation.
Furthermore, denoting w(t) the random process which represents the process noise, and setting y2(t) := u(t) + w(t), we finally get:
\[
y_2(t) := u(t) + w(t) \;\Rightarrow\;
\begin{cases}
\dot{\tilde{x}}(t) = A\,\tilde{x}(t) + B\left(u(t) + w(t)\right) \\
y_1(t) = C\,x(t) + v(t)
\end{cases} \tag{6.212}
\]

Assuming no noise, v(t) = w(t) = 0, x̃(t) changes into its noiseless value x(t) and the actual measurement y1(t) changes into its noiseless value y(t). Then we get the following noiseless state equation:
\[
v(t) = w(t) = 0 \;\Rightarrow\;
\begin{cases}
\dot{x}(t) = A\,x(t) + B\,u(t) \\
y(t) = C\,x(t)
\end{cases} \tag{6.213}
\]

The error equations read as follows, where δx(t) is the error state vector and δy(t) the error output vector. Note that we define δy(t) as δy(t) := y1(t) − Cx̃(t) to be compliant with Figure 6.11; similarly, and in order to get a positive sign in the output equation below, the error state vector is defined as δx(t) := x(t) − x̃(t):
\[
\begin{cases}
\delta x(t) := x(t) - \tilde{x}(t) \\
\delta y(t) := y_1(t) - C\,\tilde{x}(t)
\end{cases}
\;\Rightarrow\;
\begin{cases}
\delta\dot{x}(t) = A\,\delta x(t) - B\,w(t) \\
\delta y(t) = C\,\delta x(t) + v(t)
\end{cases} \tag{6.214}
\]

According to (6.10), the dynamics of the observer, which is actually a Kalman-Bucy filter, reads:
\[
\delta\dot{\hat{x}}(t) = A\,\delta\hat{x}(t) + L(t)\left(\delta y(t) - C\,\delta\hat{x}(t)\right) \tag{6.215}
\]

The time dependent observer gain L(t), also called the Kalman gain, is given as in (6.71):
\[
L(t) = Y(t)\,C^T P_v^{-1} \tag{6.216}
\]
where matrix Y(t) is the solution of the following differential Riccati equation (see (6.72), where Pw has been replaced by B Pw Bᵀ, that is, the covariance of the process noise):
\[
\dot{Y}(t) = A\,Y(t) + Y(t)\,A^T - Y(t)\,C^T P_v^{-1} C\,Y(t) + B\,P_w B^T \tag{6.217}
\]

The steady-state Kalman filter is achieved when Ẏ(t) = 0. Then the preceding differential Riccati equation turns into an algebraic Riccati equation, and matrix Y is the constant positive solution of the following algebraic Riccati equation:
\[
0 = A\,Y + Y A^T - Y\,C^T P_v^{-1} C\,Y + B\,P_w B^T \tag{6.218}
\]
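Numerically, (6.218) can be solved with SciPy; the sketch below is an illustration under assumed plant and noise data (not values from the notes). scipy.linalg.solve_continuous_are solves AᵀX + XA − XB R⁻¹BᵀX + Q = 0, so the filter equation is recovered with the substitutions A → Aᵀ, B → Cᵀ, Q → B Pw Bᵀ, R → Pv:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative data (assumptions, not from the notes)
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Pw = np.array([[1.0]])    # process noise intensity
Pv = np.array([[0.01]])   # measurement noise intensity

# Filter ARE (6.218): 0 = A Y + Y A^T - Y C^T Pv^-1 C Y + B Pw B^T
Y = solve_continuous_are(A.T, C.T, B @ Pw @ B.T, Pv)
L = Y @ C.T @ np.linalg.inv(Pv)   # steady-state Kalman gain (6.216)
print(L)                          # observer gain used in (6.221)
```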

Furthermore the estimate of the actual state vector reads:
\[
\delta\hat{x}(t) = \hat{x}(t) - \tilde{x}(t) \;\Leftrightarrow\; \hat{x}(t) = \tilde{x}(t) + \delta\hat{x}(t) \tag{6.219}
\]

By taking the time derivative of the preceding equation, and using (6.211) and (6.214), we get the state equation for the estimate x̂(t) of the actual state vector:
\[
\begin{aligned}
\dot{\hat{x}}(t) &= \dot{\tilde{x}}(t) + \delta\dot{\hat{x}}(t) \\
&= A\,\tilde{x}(t) + B\,y_2(t) + A\,\delta\hat{x}(t) + L(t)\left(\delta y(t) - C\,\delta\hat{x}(t)\right) \\
&= A\left(\tilde{x}(t) + \delta\hat{x}(t)\right) + B\,y_2(t) + L(t)\left(y_1(t) - C\,\tilde{x}(t) - C\left(\hat{x}(t) - \tilde{x}(t)\right)\right)
\end{aligned} \tag{6.220}
\]
The terms in C x̃(t) cancel each other.

We finally get:
\[
\dot{\hat{x}}(t) = A\,\hat{x}(t) + B\,y_2(t) + L(t)\left(y_1(t) - C\,\hat{x}(t)\right) \tag{6.221}
\]
b(t) (6.221)

6.11.3 Relation between complementary filter and Kalman filter

By taking the Laplace transform of (6.221), and assuming that L(t) is constant, we get:
\[
L(t) := L \;\Rightarrow\;
\begin{aligned}
& s\,\hat{X}(s) = A\,\hat{X}(s) + B\,Y_2(s) + L\left(Y_1(s) - C\,\hat{X}(s)\right) \\
\Rightarrow\;& \hat{X}(s) = \left(sI - (A - LC)\right)^{-1}\left(L\,Y_1(s) + B\,Y_2(s)\right)
\end{aligned} \tag{6.222}
\]

Then let ŷ(t) = C x̂(t). Multiplying (6.222) by C yields:
\[
C\,\hat{X}(s) = \hat{Y}(s) = C\left(sI - (A - LC)\right)^{-1}\left(L\,Y_1(s) + B\,Y_2(s)\right) \tag{6.223}
\]

Comparing the preceding relation with (6.210), we conclude that the complementary filter and the Kalman filter are equivalent. Furthermore the transfer function F(s) of the low-pass filter reads as follows:

\[
F(s) = C\left(sI - (A - LC)\right)^{-1} L \tag{6.224}
\]

Example 6.3. In the specific case of sensor fusion between baro-altimeter and vertical accelerometer presented in (6.206), the state vector can be chosen as follows, assuming no noise:
\[
\begin{cases}
x_1(t) := y(t) \\
x_2(t) := \dot{y}(t)
\end{cases} \tag{6.225}
\]

Then (6.211) reads:
\[
\begin{cases}
\begin{bmatrix} \dot{\tilde{x}}_1(t) \\ \dot{\tilde{x}}_2(t) \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
\begin{bmatrix} \tilde{x}_1(t) \\ \tilde{x}_2(t) \end{bmatrix}
+ \begin{bmatrix} 0 \\ 1 \end{bmatrix} \left(\ddot{y}(t) + w(t)\right)
:= A\,\tilde{x}(t) + B\,y_2(t) \\[2ex]
y_1(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(t) + v(t) := C\,x(t) + v(t)
\end{cases} \tag{6.226}
\]

Let L be the steady-state observer gain, also called the steady-state Kalman gain, which is obtained as follows:
\[
L = Y\,C^T P_v^{-1} \tag{6.227}
\]

where matrix Y is the constant positive solution of the following algebraic Riccati equation:
\[
0 = A\,Y + Y A^T - Y\,C^T P_v^{-1} C\,Y + B\,P_w B^T \tag{6.228}
\]
Then, according to (6.224), the transfer function F(s) of the low-pass filter reads as follows:
\[
L := \begin{bmatrix} k_1 \\ k_0 \end{bmatrix} \;\Rightarrow\;
\begin{aligned}
F(s) &= C\left(sI - (A - LC)\right)^{-1} L \\
&= \begin{bmatrix} 1 & 0 \end{bmatrix}
\begin{bmatrix} s + k_1 & -1 \\ k_0 & s \end{bmatrix}^{-1}
\begin{bmatrix} k_1 \\ k_0 \end{bmatrix} \\
&= \frac{k_1 s + k_0}{s^2 + k_1 s + k_0}
\end{aligned} \tag{6.229}
\]

We then retrieve the expression of F (s) obtained in (6.209).
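As a quick numerical cross-check of (6.229) (a sketch with arbitrary illustrative gains, not part of the notes), one can convert the state-space realization (A − LC, L, C, 0) into a transfer function:

```python
import numpy as np
from scipy import signal

A = np.array([[0.0, 1.0], [0.0, 0.0]])
C = np.array([[1.0, 0.0]])
k1, k0 = 3.0, 2.0                      # illustrative gains
L = np.array([[k1], [k0]])

# F(s) = C (sI - (A - LC))^-1 L
num, den = signal.ss2tf(A - L @ C, L, C, np.zeros((1, 1)))
print(np.round(num, 6))  # [[0. 3. 2.]] -> numerator k1 s + k0
print(np.round(den, 6))  # [1. 3. 2.]   -> denominator s^2 + k1 s + k0
```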


More generally, and following Higgins6 , typical application of the Kalman


lter in navigation systems extends Figure 6.11 as shown in Figure 6.12,
although, as seen before, the actual implementation may be dierent. Note
that the Kalman lter just operates on noises and is not aected by actual
signals that are to be estimated.

Figure 6.12: Typical application of the Kalman filter in inertial navigation

6.12 Euler angles estimation

6.12.1 One dimensional attitude estimation

System description

We consider the simple pendulum in Figure 6.13, fitted with an accelerometer and a gyroscope located in the ball at the end of the arm.

Figure 6.13: Simple pendulum

Accelerometer measurement

First, we compute the position, velocity and acceleration in the inertial frame:

− Inertial position:
\[
\begin{cases} x = L\sin(\theta) \\ z = -L\cos(\theta) \end{cases} \tag{6.230}
\]

− Inertial velocity:
\[
\begin{cases} \dot{x} = L\,\dot{\theta}\cos(\theta) \\ \dot{z} = L\,\dot{\theta}\sin(\theta) \end{cases} \tag{6.231}
\]

− Inertial acceleration:
\[
\begin{cases} \ddot{x} = L\,\ddot{\theta}\cos(\theta) - L\,\dot{\theta}^2\sin(\theta) \\ \ddot{z} = L\,\ddot{\theta}\sin(\theta) + L\,\dot{\theta}^2\cos(\theta) \end{cases} \tag{6.232}
\]

Let ax and az be the x-component and z-component provided by the accelerometer. Because the accelerometer is attached to the body frame, and denoting by R_i^b the rotation matrix from the inertial frame to the body frame, it provides the following data, known as the specific acceleration:
\[
\begin{bmatrix} a_x \\ a_z \end{bmatrix} = R_i^b \left(\begin{bmatrix} \ddot{x} \\ \ddot{z} \end{bmatrix} - \begin{bmatrix} 0 \\ -g \end{bmatrix}\right) \tag{6.233}
\]
where:
\[
R_i^b = \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix} \tag{6.234}
\]
Thus:
\[
\begin{bmatrix} a_x \\ a_z \end{bmatrix} = \begin{bmatrix} L\,\ddot{\theta} + g\sin(\theta) \\ L\,\dot{\theta}^2 + g\cos(\theta) \end{bmatrix} \tag{6.235}
\]

Then, neglecting θ̇² and θ̈, we get:
\[
\frac{a_x}{a_z} = \frac{L\,\ddot{\theta} + g\sin(\theta)}{L\,\dot{\theta}^2 + g\cos(\theta)} \approx \frac{g\sin(\theta)}{g\cos(\theta)} = \tan(\theta) \tag{6.236}
\]
Thus a first approximation of angle θ, provided by the accelerometer, is θa where:
\[
\theta_a \approx \arctan\left(\frac{a_x}{a_z}\right) \tag{6.237}
\]

Attitude estimation problem

For this one dimensional example, the gyroscope provides q := θ̇. The attitude estimation problem then consists in computing an estimate of θ from noisy measurements of ax, az and q.

6.12.2 One dimensional complementary filter

As discussed in section 6.11.1, the basic idea of the complementary filter consists in taking the measurements of two sensors, filtering out the low-frequency and high-frequency noise of each sensor, and combining the filtered outputs to get a better estimate of the signal of interest. Here the two sensors that complement each other are the gyroscope and the accelerometer.
Let y1(t) and y2(t) be noisy measurements of some signal y(t), coming for example from an accelerometer and a gyroscope, respectively. Denoting by v(t) some low frequency zero mean noise process, by w(t) some high frequency zero mean noise process, and by s the Laplace variable, we will assume that:
\[
\begin{cases} y_1(t) = y(t) + v(t) \\ y_2(t) = \dot{y}(t) + w(t) \end{cases}
\;\Leftrightarrow\;
\begin{cases} Y_1(s) = Y(s) + V(s) \\ Y_2(s) = s\,Y(s) + W(s) \end{cases} \tag{6.238}
\]

For the example of section 6.12.1, we have y1 (t) := θa (t) and y2 (t) := q(t).

Figure 6.14: Complementary filter implementing fusion between gyro and accelerometer measurements

The complementary filter that implements the fusion between the two measurements is shown in Figure 6.14⁷.
From Figure 6.14, the expression of Ŷ(s) reads:
\[
\begin{aligned}
& \hat{Y}(s) = \frac{1}{s}\left(Y_2(s) + k_0\left(Y_1(s) - \hat{Y}(s)\right)\right) \\
\Leftrightarrow\;& \left(1 + \frac{k_0}{s}\right)\hat{Y}(s) = \frac{k_0}{s}\,Y_1(s) + \frac{1}{s}\,Y_2(s) \\
\Leftrightarrow\;& \hat{Y}(s) = \frac{k_0}{s + k_0}\,Y_1(s) + \frac{1}{s + k_0}\,Y_2(s)
\end{aligned} \tag{6.239}
\]

Using the fact that Y1(s) = Y(s) + V(s) and Y2(s) = s Y(s) + W(s), we finally get:
\[
\begin{aligned}
\hat{Y}(s) &= Y(s) + \frac{k_0}{s + k_0}\,V(s) + \frac{1}{s + k_0}\,W(s) \\
&:= Y(s) + F(s)\,V(s) + \frac{1 - F(s)}{s}\,W(s)
\end{aligned} \tag{6.240}
\]
where:
\[
F(s) := \frac{k_0}{s + k_0} \quad \Leftrightarrow \quad 1 - F(s) := \frac{s}{s + k_0} \tag{6.241}
\]
Transfer function F(s) is a low-pass filter with unity static gain whereas 1 − F(s) is a high-pass filter.
In the continuous time domain, (6.239) reads:
\[
\begin{aligned}
& \hat{Y}(s) = \frac{k_0}{s + k_0}\,Y_1(s) + \frac{1}{s + k_0}\,Y_2(s) \\
\Leftrightarrow\;& k_0\,\hat{Y}(s) + s\,\hat{Y}(s) = k_0\,Y_1(s) + Y_2(s) \\
\Rightarrow\;& k_0\,\hat{y}(t) + \frac{d}{dt}\hat{y}(t) = k_0\,y_1(t) + y_2(t)
\end{aligned} \tag{6.242}
\]

In order to discretize this continuous time complementary filter, we have to find an approximation of the differentiation operator d/dt. Let Ts be the sampling period and denote z⁻¹ the one-sample delay operator. Several options can be used to approximate the differentiation operator.

⁷ W. T. Higgins, A Comparison of Complementary and Kalman Filtering, IEEE Transactions on Aerospace and Electronic Systems, vol. AES-11, no. 3, pp. 321-325, May 1975, doi: 10.1109/TAES.1975.308081.

Using the backward difference we get:
\[
\frac{d}{dt} \approx \frac{1 - z^{-1}}{T_s} \tag{6.243}
\]
Then (6.242) is approximated as follows at discrete-time t = k Ts:
\[
\begin{aligned}
& k_0\,\hat{y}(kT_s) + \dot{\hat{y}}(kT_s) = k_0\,y_1(kT_s) + y_2(kT_s) \\
\Rightarrow\;& k_0\,\hat{y}(kT_s) + \frac{\hat{y}(kT_s) - \hat{y}((k-1)T_s)}{T_s} \approx k_0\,y_1(kT_s) + y_2(kT_s)
\end{aligned} \tag{6.244}
\]
Usually the sampling period is omitted in the expression of the discrete-time signals, and the preceding relation is written as follows:
\[
k_0\,\hat{y}_k + \frac{\hat{y}_k - \hat{y}_{k-1}}{T_s} \approx k_0\,y_{1k} + y_{2k} \tag{6.245}
\]
We finally get:
\[
\hat{y}_k \approx \frac{1}{1 + k_0 T_s}\left(\hat{y}_{k-1} + T_s\left(k_0\,y_{1k} + y_{2k}\right)\right) \tag{6.246}
\]
Or equivalently:
\[
\hat{y}_k \approx \alpha\left(\hat{y}_{k-1} + T_s\,y_{2k}\right) + (1 - \alpha)\,y_{1k} \quad \text{where} \quad \alpha := \frac{1}{1 + k_0 T_s} \tag{6.247}
\]
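A minimal discrete-time implementation sketch of (6.247) is given below; the function name and the test signals are illustrative assumptions:

```python
import numpy as np

def complementary_filter(y1, y2, Ts, k0, y0=0.0):
    """Discrete complementary filter (6.247): fuse an absolute but noisy
    measurement y1 with a rate measurement y2."""
    alpha = 1.0 / (1.0 + k0 * Ts)
    y_hat = np.empty(len(y1))
    prev = y0
    for k in range(len(y1)):
        # y_hat_k = alpha (y_hat_{k-1} + Ts y2_k) + (1 - alpha) y1_k
        prev = alpha * (prev + Ts * y2[k]) + (1.0 - alpha) * y1[k]
        y_hat[k] = prev
    return y_hat

# Smoke test: constant signal, noisy absolute measurement, zero true rate.
Ts, k0 = 0.01, 5.0
n = 1000
y1 = 1.0 + 0.1 * np.random.randn(n)   # e.g. accelerometer-derived angle
y2 = np.zeros(n)                      # e.g. gyro rate
print(complementary_filter(y1, y2, Ts, k0)[-1])   # close to 1.0
```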

6.12.3 Direct Cosine Matrix (DCM) and kinematic relations

Let x_i be a vector expressed in the inertial frame, x_b the same vector expressed in the body frame and R_i^b(η) the rotation matrix, also called Direct Cosine Matrix (DCM), from the inertial frame to the body frame:
\[
x_b = R_i^b(\eta)\,x_i \tag{6.248}
\]

Rotation matrix R_i^b(η) is obtained by multiplying the rotation matrices around the Euler angles, namely the yaw angle ψ, the pitch angle θ and then the roll angle ϕ, respectively. Denoting cx = cos(x), sx = sin(x) and Ry the rotation matrix dedicated to angle y, we get:
\[
\begin{aligned}
R_i^b(\eta) &= R_\phi R_\theta R_\psi \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & c_\phi & s_\phi \\ 0 & -s_\phi & c_\phi \end{bmatrix}
\begin{bmatrix} c_\theta & 0 & -s_\theta \\ 0 & 1 & 0 \\ s_\theta & 0 & c_\theta \end{bmatrix}
\begin{bmatrix} c_\psi & s_\psi & 0 \\ -s_\psi & c_\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} c_\theta c_\psi & c_\theta s_\psi & -s_\theta \\ (s_\phi s_\theta c_\psi - c_\phi s_\psi) & (s_\phi s_\theta s_\psi + c_\phi c_\psi) & s_\phi c_\theta \\ (c_\phi s_\theta c_\psi + s_\phi s_\psi) & (c_\phi s_\theta s_\psi - s_\phi c_\psi) & c_\phi c_\theta \end{bmatrix}
\end{aligned} \tag{6.249}
\]
It is worth noticing that R_i^b(η) is an orthogonal matrix. Consequently the rotation matrix R_b^i(η) from the body frame to the inertial frame is obtained as follows:
\[
\begin{aligned}
R_b^i(\eta) &:= \left(R_i^b(\eta)\right)^{-1} = \left(R_i^b(\eta)\right)^T \\
&= \begin{bmatrix} c_\theta c_\psi & (s_\phi s_\theta c_\psi - c_\phi s_\psi) & (c_\phi s_\theta c_\psi + s_\phi s_\psi) \\ c_\theta s_\psi & (s_\phi s_\theta s_\psi + c_\phi c_\psi) & (c_\phi s_\theta s_\psi - s_\phi c_\psi) \\ -s_\theta & s_\phi c_\theta & c_\phi c_\theta \end{bmatrix}
\end{aligned} \tag{6.250}
\]
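The sketch below (with a hypothetical helper name) builds the DCM (6.249) from the three elementary rotations and checks the orthogonality property used in (6.250):

```python
import numpy as np

def dcm_inertial_to_body(phi, theta, psi):
    """DCM (6.249) for the yaw-pitch-roll (ZYX) Euler sequence."""
    cph, sph = np.cos(phi), np.sin(phi)
    cth, sth = np.cos(theta), np.sin(theta)
    cps, sps = np.cos(psi), np.sin(psi)
    R_phi = np.array([[1, 0, 0], [0, cph, sph], [0, -sph, cph]])
    R_theta = np.array([[cth, 0, -sth], [0, 1, 0], [sth, 0, cth]])
    R_psi = np.array([[cps, sps, 0], [-sps, cps, 0], [0, 0, 1]])
    return R_phi @ R_theta @ R_psi

R = dcm_inertial_to_body(0.1, -0.2, 0.3)
print(np.allclose(R.T @ R, np.eye(3)))      # True: R is orthogonal
print(np.allclose(np.linalg.inv(R), R.T))   # True: R_b^i = (R_i^b)^T, as in (6.250)
```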

The relation between the angular velocities (p, q, r) in the body frame and the time derivatives of the Euler angles (ϕ, θ, ψ) is the following:
\[
\nu := \begin{bmatrix} p \\ q \\ r \end{bmatrix}
= \begin{bmatrix} \dot{\phi} \\ 0 \\ 0 \end{bmatrix}
+ R_\phi \begin{bmatrix} 0 \\ \dot{\theta} \\ 0 \end{bmatrix}
+ R_\phi R_\theta \begin{bmatrix} 0 \\ 0 \\ \dot{\psi} \end{bmatrix} \tag{6.251}
\]
We finally get:
\[
\begin{bmatrix} p \\ q \\ r \end{bmatrix}
= \begin{bmatrix} 1 & 0 & -\sin(\theta) \\ 0 & \cos(\phi) & \sin(\phi)\cos(\theta) \\ 0 & -\sin(\phi) & \cos(\phi)\cos(\theta) \end{bmatrix}
\begin{bmatrix} \dot{\phi} \\ \dot{\theta} \\ \dot{\psi} \end{bmatrix} \tag{6.252}
\]

That is:
\[
\nu = W(\eta)\,\dot{\eta} \tag{6.253}
\]
where:
\[
\eta := \begin{bmatrix} \phi \\ \theta \\ \psi \end{bmatrix} \tag{6.254}
\]
and:
\[
W(\eta) = \begin{bmatrix} 1 & 0 & -\sin(\theta) \\ 0 & \cos(\phi) & \sin(\phi)\cos(\theta) \\ 0 & -\sin(\phi) & \cos(\phi)\cos(\theta) \end{bmatrix} \tag{6.255}
\]
It is worth noticing that the preceding relation can be obtained from the following equality, which simply states that the time derivative of matrix R_b^i(η) is obtained by multiplying R_b^i(η) by the skew-symmetric matrix Ω(ν) built from the angular velocities in the body frame:
\[
\frac{d}{dt} R_b^i(\eta) = R_b^i(\eta)\,\Omega(\nu)
\quad \text{where} \quad
\Omega(\nu) = -\Omega(\nu)^T = \begin{bmatrix} 0 & -r & q \\ r & 0 & -p \\ -q & p & 0 \end{bmatrix} \tag{6.256}
\]

Conversely we have:
\[
\dot{\eta} = W(\eta)^{-1}\,\nu \tag{6.257}
\]
where:
\[
W(\eta)^{-1} = \begin{bmatrix} 1 & \sin(\phi)\tan(\theta) & \cos(\phi)\tan(\theta) \\ 0 & \cos(\phi) & -\sin(\phi) \\ 0 & \frac{\sin(\phi)}{\cos(\theta)} & \frac{\cos(\phi)}{\cos(\theta)} \end{bmatrix} \tag{6.258}
\]
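A direct sketch of (6.257)-(6.258) follows (hypothetical function name); note the singularity at θ = ±π/2, where tan(θ) and 1/cos(θ) blow up:

```python
import numpy as np

def euler_rates(phi, theta, p, q, r):
    """eta_dot = W(eta)^-1 nu, relation (6.257)-(6.258)."""
    cph, sph, tth = np.cos(phi), np.sin(phi), np.tan(theta)
    W_inv = np.array([
        [1.0, sph * tth, cph * tth],
        [0.0, cph, -sph],
        [0.0, sph / np.cos(theta), cph / np.cos(theta)],
    ])
    return W_inv @ np.array([p, q, r])

print(euler_rates(0.0, 0.0, 0.01, 0.02, 0.03))   # level attitude: eta_dot = nu
```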

6.12.4 Roll and pitch angles estimation from accelerometer measurements

Let g be the gravity vector expressed in the inertial frame, a_i the acceleration in the inertial frame, a_b the acceleration in the body frame and R_i^b the rotation matrix from the inertial frame to the body frame. The accelerometer provides the following measurement, called the specific acceleration:
\[
a_m = R_i^b \left(a_i - g\right) \tag{6.259}
\]


Denoting v_i the velocity in the inertial frame and v_b the velocity in the body frame, we have the following relation, where R_b^i is the rotation matrix from the body frame to the inertial frame:
\[
v_i = R_b^i\,v_b \tag{6.260}
\]
Thus, after differentiation:
\[
a_i := \frac{d}{dt} v_i = R_b^i\,\dot{v}_b + \dot{R}_b^i\,v_b \tag{6.261}
\]

Thus the specific acceleration a_m in (6.259) reads:
\[
\begin{aligned}
a_m &= R_i^b\left(R_b^i\,\dot{v}_b + \dot{R}_b^i\,v_b\right) - R_i^b\,g \\
&= \dot{v}_b + R_i^b\,\dot{R}_b^i\,v_b - R_i^b\,g
\end{aligned} \tag{6.262}
\]
Once the computation is achieved, we get the following expression for the measurements provided by a 3-axis accelerometer⁸:
\[
a_m = \begin{bmatrix} a_x \\ a_y \\ a_z \end{bmatrix}
= \begin{bmatrix} \dot{u} \\ \dot{v} \\ \dot{w} \end{bmatrix}
+ \begin{bmatrix} 0 & w & -v \\ -w & 0 & u \\ v & -u & 0 \end{bmatrix}
\begin{bmatrix} p \\ q \\ r \end{bmatrix}
- g \begin{bmatrix} -\sin(\theta) \\ \cos(\theta)\sin(\phi) \\ \cos(\theta)\cos(\phi) \end{bmatrix} \tag{6.263}
\]
The last term of equation (6.263) can be used to approximate the roll angle ϕ and the pitch angle θ as follows:
\[
\begin{cases}
\phi \approx \arctan\left(\dfrac{a_y}{a_z}\right) \\[2ex]
\theta \approx \arctan\left(\dfrac{a_x}{\sqrt{a_y^2 + a_z^2}}\right)
\end{cases} \tag{6.264}
\]
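A direct implementation sketch of (6.264) follows, with a static consistency check against the gravity term of (6.263); the numerical values are illustrative:

```python
import numpy as np

def roll_pitch_from_accel(ax, ay, az):
    """Quasi-static tilt estimate (6.264), valid for |phi|, |theta| < pi/2."""
    phi = np.arctan(ay / az)
    theta = np.arctan(ax / np.sqrt(ay**2 + az**2))
    return phi, theta

# Static case: a_m reduces to -g [-sin(theta), cos(theta)sin(phi), cos(theta)cos(phi)]^T
g, phi0, theta0 = 9.81, 0.1, -0.2
ax = g * np.sin(theta0)
ay = -g * np.cos(theta0) * np.sin(phi0)
az = -g * np.cos(theta0) * np.cos(phi0)
print(roll_pitch_from_accel(ax, ay, az))   # ~ (0.1, -0.2)
```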

Note that if the Inertial Measurement Unit (IMU) is not situated at the center of mass, then the accelerometer coordinates (lx, ly, lz) along each axis in the body frame, with its origin at the center of gravity, shall be taken into account, and the measurements (6.263) provided by a 3-axis accelerometer become⁸:
\[
a_m = \begin{bmatrix} a_x \\ a_y \\ a_z \end{bmatrix}
= \begin{bmatrix} \dot{u} \\ \dot{v} \\ \dot{w} \end{bmatrix}
+ \begin{bmatrix} 0 & w & -v \\ -w & 0 & u \\ v & -u & 0 \end{bmatrix}
\begin{bmatrix} p \\ q \\ r \end{bmatrix}
- g \begin{bmatrix} -\sin(\theta) \\ \cos(\theta)\sin(\phi) \\ \cos(\theta)\cos(\phi) \end{bmatrix}
+ \begin{bmatrix} -r^2 - q^2 & p\,q - \dot{r} & p\,r + \dot{q} \\ p\,q + \dot{r} & -p^2 - r^2 & r\,q - \dot{p} \\ p\,r - \dot{q} & r\,q + \dot{p} & -q^2 - p^2 \end{bmatrix}
\begin{bmatrix} l_x \\ l_y \\ l_z \end{bmatrix} \tag{6.265}
\]
⁸ Marian J. Blachuta, Rafal T. Grygiel, Roman Czyba and Grzegorz Szafranski, Attitude and heading reference system based on 3D complementary filter, 2014 19th International Conference on Methods and Models in Automation and Robotics (MMAR)

6.12.5 Yaw angle estimation from magnetometer measurements

Let [m_x^i 0 m_z^i]ᵀ be the geomagnetic field in the inertial frame and [m_x m_y m_z]ᵀ be the geomagnetic field measurements in the body frame. These vectors are related as follows:
\[
\begin{aligned}
\begin{bmatrix} m_x \\ m_y \\ m_z \end{bmatrix}
&= R_i^b \begin{bmatrix} m_x^i \\ 0 \\ m_z^i \end{bmatrix} \\
&= \begin{bmatrix}
m_x^i \cos(\psi)\cos(\theta) - m_z^i \sin(\theta) \\
m_z^i \cos(\theta)\sin(\phi) - m_x^i\left(\cos(\phi)\sin(\psi) - \cos(\psi)\sin(\phi)\sin(\theta)\right) \\
m_x^i\left(\sin(\phi)\sin(\psi) + \cos(\phi)\cos(\psi)\sin(\theta)\right) + m_z^i \cos(\phi)\cos(\theta)
\end{bmatrix}
\end{aligned} \tag{6.266}
\]
Then it is worth noticing that the following relations hold:
\[
\begin{cases}
\sin(\phi)\,m_z - \cos(\phi)\,m_y = m_x^i \sin(\psi) \\
\cos(\theta)\,m_x + \sin(\phi)\sin(\theta)\,m_y + \cos(\phi)\sin(\theta)\,m_z = m_x^i \cos(\psi)
\end{cases} \tag{6.267}
\]

These two equations can be used to approximate the yaw angle ψ as follows:
\[
\psi = \arctan\left(\frac{\sin(\phi)\,m_z - \cos(\phi)\,m_y}{\cos(\theta)\,m_x + \sin(\phi)\sin(\theta)\,m_y + \cos(\phi)\sin(\theta)\,m_z}\right) \tag{6.268}
\]
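Relation (6.268) can be sketched and cross-checked against (6.266) as follows; the field components and angles are illustrative values:

```python
import numpy as np

def yaw_from_mag(mx, my, mz, phi, theta):
    """Tilt-compensated heading (6.268); arctan2 handles the quadrant."""
    num = np.sin(phi) * mz - np.cos(phi) * my
    den = (np.cos(theta) * mx
           + np.sin(phi) * np.sin(theta) * my
           + np.cos(phi) * np.sin(theta) * mz)
    return np.arctan2(num, den)

# Cross-check with (6.266): rotate the inertial field [mix, 0, miz] to the
# body frame with the DCM (6.249), rebuilt inline, and recover psi.
phi, theta, psi = 0.05, -0.1, 0.7
cph, sph = np.cos(phi), np.sin(phi)
cth, sth = np.cos(theta), np.sin(theta)
cps, sps = np.cos(psi), np.sin(psi)
R_i_to_b = (np.array([[1, 0, 0], [0, cph, sph], [0, -sph, cph]])
            @ np.array([[cth, 0, -sth], [0, 1, 0], [sth, 0, cth]])
            @ np.array([[cps, sps, 0], [-sps, cps, 0], [0, 0, 1]]))
mx, my, mz = R_i_to_b @ np.array([0.48, 0.0, 0.43])
print(yaw_from_mag(mx, my, mz, phi, theta))   # ~ 0.7
```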

6.12.6 Angular velocity from gyroscope measurements

The gyroscope provides the roll, pitch and yaw rates p, q and r, respectively, with respect to its body axis system. The relationship between the rate gyro outputs and the angular velocity of the Euler angles is similar to (6.258). Nevertheless, because the actual values of the Euler angles ϕ, θ and ψ are not known, it is their estimated values ϕ̂, θ̂ and ψ̂ which are used in matrix W(η̂):
\[
\frac{d}{dt}\begin{bmatrix} \phi \\ \theta \\ \psi \end{bmatrix}
\approx W(\hat{\eta})^{-1}\,\nu
= \begin{bmatrix}
1 & \sin(\hat{\phi})\tan(\hat{\theta}) & \cos(\hat{\phi})\tan(\hat{\theta}) \\
0 & \cos(\hat{\phi}) & -\sin(\hat{\phi}) \\
0 & \frac{\sin(\hat{\phi})}{\cos(\hat{\theta})} & \frac{\cos(\hat{\phi})}{\cos(\hat{\theta})}
\end{bmatrix}
\begin{bmatrix} p \\ q \\ r \end{bmatrix} \tag{6.269}
\]

6.12.7 Attitude and Heading Reference System (AHRS) based on complementary filter

The purpose of the Attitude and Heading Reference System (AHRS) is to compute the best estimate of the Euler angles ϕ, θ and ψ from the roll and pitch estimates (6.264) provided by the accelerometers, from the yaw estimate (6.268) provided by the magnetometer and from the angular velocities (6.269) provided by the gyroscopes.
Because each channel is decoupled, the best estimate can be achieved by 3 independent complementary filters of the form (6.241) for the continuous time estimate, or (6.247) for the discrete time estimate. In those equations, y1 stands for the measurements provided by the accelerometers or the magnetometer, and y2 stands for the measurements provided by the gyroscopes, as sketched below.
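A minimal sketch of such an AHRS update follows; it is an illustration under the conventions of (6.264), (6.268) and (6.269), not the notes' reference implementation, and the function name and gain are assumptions:

```python
import numpy as np

def ahrs_step(est, accel, mag, gyro, Ts, k0=1.0):
    """One discrete complementary-filter update (6.247) per Euler angle."""
    phi_h, theta_h, psi_h = est          # previous estimates (rad)
    ax, ay, az = accel                   # specific acceleration (6.263)
    mx, my, mz = mag                     # body-frame magnetic field
    p, q, r = gyro                       # body rates (rad/s)

    # y1: angles from the accelerometer (6.264) and the magnetometer (6.268)
    phi_a = np.arctan(ay / az)
    theta_a = np.arctan(ax / np.sqrt(ay**2 + az**2))
    psi_m = np.arctan2(
        np.sin(phi_h) * mz - np.cos(phi_h) * my,
        np.cos(theta_h) * mx + np.sin(phi_h) * np.sin(theta_h) * my
        + np.cos(phi_h) * np.sin(theta_h) * mz)

    # y2: Euler angle rates from the gyros through W(eta_hat)^-1 (6.269)
    tth = np.tan(theta_h)
    eta_dot = np.array([
        p + np.sin(phi_h) * tth * q + np.cos(phi_h) * tth * r,
        np.cos(phi_h) * q - np.sin(phi_h) * r,
        (np.sin(phi_h) * q + np.cos(phi_h) * r) / np.cos(theta_h),
    ])

    # three independent complementary filters (6.247)
    alpha = 1.0 / (1.0 + k0 * Ts)
    y1 = np.array([phi_a, theta_a, psi_m])
    return alpha * (np.asarray(est) + Ts * eta_dot) + (1.0 - alpha) * y1
```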
