
ECE 517:

Nonlinear and Adaptive Control


Fall 2013 Lecture Notes

Daniel Liberzon

November 20, 2013



Disclaimers
Don’t print future lectures in advance as the material is always in the process of being
updated. You can consider the material here stable 2 days after it was presented in class.
These lecture notes are posted for class use only.
This is a very rough draft which contains many errors.
I don’t always give proper references to sources from which results are taken. A lack of
reference does not mean that the result is original. In fact, all results presented in these
notes (with possible exception of some simple examples) were borrowed from the literature
and are not mine.
Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1 Motivating example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Course logistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2 Weak Lyapunov functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1 LaSalle and Barbalat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Connection with observability . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Back to the adaptive control example . . . . . . . . . . . . . . . . . . . . . . 15
3 Minimum-phase systems and universal regulators . . . . . . . . . . . . . . . . . . . . 17
3.1 Universal regulators for scalar plants . . . . . . . . . . . . . . . . . . . . . . . 18
3.1.1 The case b > 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.1.2 General case: non-existence results . . . . . . . . . . . . . . . . . . . 20
3.1.3 Nussbaum gains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2 Relative degree and minimum phase . . . . . . . . . . . . . . . . . . . . . . . 23
3.2.1 Stabilization of nonlinear minimum-phase systems . . . . . . . . . . 27
3.3 Universal regulators for higher-dimensional plants . . . . . . . . . . . . . . . 29
4 Lyapunov-based design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.1 Control Lyapunov functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.1.1 Sontag’s universal formula . . . . . . . . . . . . . . . . . . . . . . . 35
4.2 Back to the adaptive control example . . . . . . . . . . . . . . . . . . . . . . 38
5 Backstepping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.1 Integrator backstepping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.2 Adaptive integrator backstepping . . . . . . . . . . . . . . . . . . . . . . . . . 44
6 Parameter estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.1 Gradient method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47


6.2 Parameter estimation: stable case . . . . . . . . . . . . . . . . . . . . . . . . 49


6.3 Unstable case: adaptive laws with normalization . . . . . . . . . . . . . . . . 54
6.3.1 Linear plant parameterizations (parametric models) . . . . . . . . . 57
6.3.2 Gradient method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.3.3 Least squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.3.4 Projection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.4 Sufficiently rich signals and parameter identification . . . . . . . . . . . . . . 66
6.5 Case study: model reference adaptive control . . . . . . . . . . . . . . . . . . 70
6.5.1 Direct MRAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.5.2 Indirect MRAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
7 Input-to-state stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
7.1 Weakness of certainty equivalence . . . . . . . . . . . . . . . . . . . . . . . . 78
7.2 Input-to-state stability and stabilization . . . . . . . . . . . . . . . . . . . . . 80
7.2.1 ISS backstepping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
7.3 Adaptive ISS controller design . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.3.1 Adaptive ISS backstepping . . . . . . . . . . . . . . . . . . . . . . . 90
7.3.2 Modular design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
8 Stability of slowly time-varying systems . . . . . . . . . . . . . . . . . . . . . . . . . 94
8.1 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
8.2 Application to adaptive stabilization . . . . . . . . . . . . . . . . . . . . . . . 98
8.3 Detectability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
9 Switching adaptive control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
9.1 The supervisor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
9.1.1 Multi-estimator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
9.1.2 Monitoring signal generator . . . . . . . . . . . . . . . . . . . . . . . 107
9.1.3 Switching logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
9.2 Example: linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
9.3 Modular design objectives and analysis steps . . . . . . . . . . . . . . . . . . 112
9.3.1 Achieving detectability . . . . . . . . . . . . . . . . . . . . . . . . . 115
9.3.2 Achieving bounded error gain and non-destabilization . . . . . . . . 117
10 Singular perturbations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

10.1 Unmodeled dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120


10.2 Singular perturbations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
10.3 Direct MRAC with unmodeled dynamics . . . . . . . . . . . . . . . . . . . . 123
11 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

1 Introduction
The meaning of “nonlinear” should be clear (by exclusion), even if you have only studied linear systems so far.
The meaning of “adaptive” is less clear and takes longer to explain.
From www.webster.com:
Adaptive: showing or having a capacity for or tendency toward adaptation.
Adaptation: the act or process of adapting.
Adapt: to become adapted.
Perhaps it’s easier to first explain the class of problems adaptive control studies: modeling uncertainty. This
includes (but is not limited to) the presence of unknown parameters in the model of the plant.
There are many specialized techniques in adaptive control, and details of analysis and design
tend to be challenging. We’ll try to extract fundamental concepts and ideas, of interest not only
in adaptive control. The presentation of adaptive control results will mostly be at the level of
examples, not general theory.
The pattern will be: general concept in nonlinear systems/control, followed by its application
in adaptive control. Or, even better: a motivating example/problem in adaptive control, then
the general treatment of the concept or technique, then back to its adaptive application. Overall,
the course is designed to provide an introduction to further studies both in nonlinear systems and
control and in adaptive control.

1.1 Motivating example


Example 1 Consider the scalar system

ẋ = θx + u

where x is the state, u is the control, and θ is an unknown fixed parameter.


A word on notation: There’s no consistent notation in adaptive control literature for the true
value of the unknown parameters. When there is only one parameter, θ is a fairly standard symbol.
Sometimes it’s denoted as θ∗ (to further emphasize that it is the actual value of θ). In other sources,
p∗ is used. When there are several unknown parameters, they are either combined into a vector
(θ, θ∗ , p∗ , etc.) or written individually using different letters such as a, b, and so on. Estimates of
the unknown parameters commonly have hats over them, e.g., θ̂, and estimation errors commonly
have tildes over them, e.g., θ̃ = θ̂ − θ.
Goal: regulation, i.e., make x(t) → 0 as t → ∞.
If θ < 0, then u ≡ 0 works.

If θ > 0 but is known, then the feedback law

u = −(θ + 1)x

gives ẋ = −x ⇒ x → 0. (Instead of +1 we can use any other positive number.)


But if (as is the case of interest) θ is unknown, this u is not implementable.
−→ Even in this simplest possible example, it’s not obvious what to do.
Adaptive control law:

θ̂˙ = x2 (1)
u = −(θ̂ + 1)x (2)

Here (1) is the tuning law; it “tunes” the feedback gain.


Closed-loop system:

ẋ = (θ − θ̂ − 1)x
θ̂˙ = x2

Intuition: the growth of θ̂ dominates the linear growth of x, and eventually the feedback gain
θ̂ + 1 becomes large enough to overcome the uncertainty and stabilize the system.
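To make the intuition concrete, here is a small simulation sketch (my own illustration, not part of the original notes; the value θ = 2 and the initial conditions are arbitrary choices):

    # Simulation sketch for Example 1 (assumed values: theta = 2, x(0) = 3, thetahat(0) = 0).
    # Plant: xdot = theta*x + u; adaptive law: thetahat_dot = x^2, u = -(thetahat + 1)*x.
    from scipy.integrate import solve_ivp

    theta = 2.0  # the "unknown" parameter; used only to simulate the plant

    def closed_loop(t, s):
        x, thetahat = s
        u = -(thetahat + 1.0) * x          # control law (2)
        return [theta * x + u,             # plant
                x**2]                      # tuning law (1)

    sol = solve_ivp(closed_loop, [0.0, 20.0], [3.0, 0.0], max_step=0.01)
    print("final x        =", sol.y[0, -1])   # decays to (essentially) 0
    print("final thetahat =", sol.y[1, -1])   # settles, but need not equal theta

Typically x(t) decays to zero while θ̂(t) settles at some finite value that need not equal θ, which is exactly the point made in the analysis below.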
Analysis: let’s try to find a Lyapunov function.
If we take
V := x2/2
then its derivative along the closed-loop system is

V̇ = (θ − θ̂ − 1)x2

and this is not guaranteed to be negative.


Besides, V should be a function of both states of the closed-loop system, x and θ̂.
Actually, with the above V we can still prove stability, although analysis is more intricate. We’ll
see this later.
Let’s take
V (x, θ̂) := x2/2 + (θ̂ − θ)2/2 (3)
The choice of the second term reflects the fact that in principle, we want to have an asymptotically
stable equilibrium at x = 0, θ̂ = θ. In other words, we can think of θ̂ as an estimate of θ. However,
the control objective doesn’t explicitly require that θ̂ → θ.

With V given by (3), we get

V̇ = (θ − θ̂ − 1)x2 + (θ̂ − θ)x2 = −x2 (4)

Is this enough to prove that x(t) → 0?


Recall:

Theorem 1 (Lyapunov) Let V be a positive definite C 1 function. If its derivative along solutions
satisfies
V̇ ≤ 0 (5)
everywhere, then the system is stable. If
V̇ < 0 (6)
everywhere (except at the equilibrium being studied), then the system is asymptotically stable. If
in the latter case V is also radially unbounded (i.e., V → ∞ as the state approaches ∞ along any
direction), then the system is globally asymptotically stable.

From (4) we certainly have (5), hence we have stability (in the sense of Lyapunov). In particular,
both x and θ̂ remain bounded for all time (by a constant depending on initial conditions).
On the other hand, we don’t have (6) because V̇ = 0 for (x, θ̂) with x = 0 and θ̂ arbitrary.
It seems plausible that at least convergence of x to 0 should follow from (4). This is indeed true,
but proving this requires knowing a precise result about weak (nonstrictly decreasing) Lyapunov
functions. We will learn/review such results and will then finish the example.

Some observations from the above example:

• Even though the plant is linear, the control is nonlinear (because of the square terms). To
analyze the closed-loop system, we need nonlinear analysis tools.

• The control law is dynamic as it incorporates the tuning equation for θ̂. Intuitively, this
equation “learns” the unknown value of θ, providing estimates of θ.

• The standard Lyapunov stability theorem is not enough, and we need to work with a weak
Lyapunov function. As we will see, this is typical in adaptive control, because the esti-
mates (θ̂) might not converge to the actual parameter values (θ). We will discuss techniques
for making the parameter estimates converge. However, even without this we can achieve
regulation of the state x to 0 (i.e., have convergence of those variables that we care about).

So, what is adaptive control?


We can think of the tuning law in the above example as an adaptation block in the overall
system—see figure. (The diagram is a bit more general than what we had in the example.)

[Figure: block diagram with an Adaptation block on top of a Controller–Plant feedback loop; the Controller applies the input u to the Plant, and the Plant output y is fed back to the Controller and the Adaptation block]

A priori, this is no different from any other (non-adaptive) dynamic feedback. But the main
feature of the adaptive controller is that it achieves the control objective (regulation) despite large-
scale uncertainty associated with θ (which can be any real number). This is to be contrasted with
robust control where uncertainty range is usually bounded and small.
So, one may call a controller adaptive if it can handle such systems, i.e., if it can “adapt”
to large uncertainties. Ideally we want an adaptive controller to handle more than just constant
unknown parameters: parameters may vary with time, there may be unmodeled dynamics, noise
and disturbance inputs, etc.
Certainty equivalence principle:
The way we designed the controller in Example 1 was:

• If θ were known, we would use u = −(θ + 1)x.

• Since θ was unknown, we introduced its estimate, θ̂.

• Then we used the control u = −(θ̂ + 1)x. I.e., we used the estimate for control purposes,
pretending that it is correct, even though it may not be (and it works, at least for this
example).

Adaptive controllers designed using the above logic are called certainty equivalence controllers.
This means essentially that estimator design and controller design were decoupled. A vast majority
of adaptive control techniques are based on certainty equivalence principle, although in some situa-
tions it may be desirable to rethink this and to design the controller by explicitly taking parameter
estimation into account. (More on this later.)

1.2 Course logistics


−→ On those Wednesdays when there is a Decision and Control seminar, we will start about 5
minutes late.

Prerequisite is ECE 515 (Linear Systems). Some knowledge of nonlinear systems (such as
Lyapunov stability, actually covered to some extent in ECE 515) is a plus. If everything in the
above discussion was familiar to you, you should be OK.
There is no single textbook for this course. Lecture notes will be posted on the class website.
Material for the lecture notes is drawn from several sources.

• Adaptive control of linear plants:


Ioannou and Sun, Robust Adaptive Control, Prentice-Hall, 1996. Out of print, available
on-line (see class website).
Ioannou and Fidan, Adaptive Control Tutorial, SIAM, 2006. This is an updated (and some-
what simplified) version of the first book.

• Nonlinear systems:
Khalil, Nonlinear Systems, Prentice-Hall, 2002 (third edition). This is a standard and very
good text, which is also the main text for ECE 528 (Nonlinear Systems). It has some adaptive
control examples too (see index). I recommend that you get this book as it is quite useful
for this course and you’ll need it for ECE 528 anyway.

• Adaptive control of nonlinear plants:


Krstic, Kanellakopoulos and Kokotović, Nonlinear and Adaptive Control Design, Wiley, 1995.
(Referred to as “KKK book” below.) Contains some more advanced adaptive control mate-
rial, and covers some nonlinear systems and control theory concepts as well.

• Some material is also drawn from lecture notes on adaptive control by A. S. Morse (com-
municated privately to the instructor). These are in turn based on research articles; I can
provide the references upon request.

For an outline of the topics, see the table of contents. On average, 1–1.5 weeks will be spent on
each topic (except parameter estimation which is longer).
−→ Since everything discussed in class will be in the lecture notes posted on the web (well, except
perhaps some pictures I might draw on the board to explain something better), you don’t have to
worry about copying things down. Instead, try to understand and participate in the discussion as
much as possible.
Grading scheme: Homework—50%, Final Project—50%.
Homework: There will be about 4 problem sets. Some MATLAB simulations, to validate the
designs discussed in class and study their robustness properties. Theoretical questions as well.
Project: Topic to be defined by you, based on reading papers or your own research. Should
have a nonlinear systems/control or adaptive control component related to the course material.
Application-oriented projects are especially welcome. The project will consist of an oral presenta-
tion and/or a written report, details to be announced later.

−→ Come to discuss your project with me during office hours, so I can confirm that it is appropriate
and doesn’t overlap with projects of other students.

2 Weak Lyapunov functions


2.1 LaSalle and Barbalat
Consider the general system
ẋ = f (x) (7)
where x ∈ Rn and we assume, here and later, that f : Rn → Rn is sufficiently regular (at least
locally Lipschitz) so that solutions of the system exist and are unique. We will typically also assume
that f (0) = 0, to have an equilibrium at the origin.
Let V be a C 1 (continuously differentiable) function from Rn to [0, ∞). Such functions are
usually called candidate Lyapunov functions. Its derivative along solutions of (7) is defined as

V̇ (x) := (∂V /∂x) · f (x)
Theorem 2 Suppose that for some nonnegative definite continuous function W we have

V̇ (x) ≤ −W (x) ≤ 0 ∀x (8)

Then, for every solution x(t) that remains bounded for all t ≥ 0, it is true that

W (x(t)) → 0 as t → ∞ (9)

Comments:

• It is very important to remember that the claim (9) is made only for bounded solutions.
Boundedness of x(t) will be used in the proof. So if, for example, |x(t)| → ∞ then it does
not necessarily satisfy (9). (There is a famous paper by Monopoli which makes this mistake
while trying to prove stability of an adaptive control system, and for many years many people
have been trying to determine whether his claim is still correct or not.)

• Boundedness of all solutions of (7) follows from (8) if V is radially unbounded. As we will
see, sometimes boundedness of solutions can be shown by a separate argument, so we don’t
require it in Theorem 2.

• Note that, unlike in Theorem 1, here V is not required to be positive definite, only positive
semidefinite. This is useful in adaptive control, where one often works with functions V that
depend on some but not all states. Unless V is positive definite, we can’t use Theorem 1 to
conclude stability in the sense of Lyapunov, but we’ll rarely need this.

• LaSalle’s theorem makes a more precise claim, namely, that every bounded solution ap-
proaches the largest positive-invariant set inside the set

{x : V̇ (x) = 0}

which in turn belongs to the set {x : W (x) = 0} in view of (8). As a corollary, we have
the following stability result, known as Barbashin-Krasovskii-LaSalle theorem: If V is as in
Theorem 2 and V̇ does not stay identically 0 along any nonzero trajectory, then all bounded
solutions converge to 0. The simpler but weaker claim of Theorem 2 remains true for time-
varying systems¹ ẋ = f (t, x), while LaSalle’s theorem in general does not. For our purposes,
just convergence of W (x(t)) to 0 will usually be enough.

Proof of Theorem 2. Integrate (8) from time 0 to current time t, to get


V (x(t)) − V (x(0)) ≤ − ∫_0^t W (x(s)) ds

or, rearranging terms,


∫_0^t W (x(s)) ds ≤ V (x(0)) − V (x(t)) ≤ V (x(0)) < ∞

where the second inequality follows from the fact that V (x) ≥ 0 ∀ x. Since the above calculation
is true for every t, and V (x(0)) is a constant (fixed by the choice of initial condition), we can take
the limit as t → ∞ and conclude that
∫_0^∞ W (x(s)) ds < ∞ (10)

The integral is of course also bounded from below by 0 because W is nonnegative definite.
We now need to show that the finiteness of the above improper integral, together with the
hypotheses of Theorem 2 that we haven’t used yet (which ones are these?) implies (9). We
formulate this remaining step as a lemma, because it is often useful as an independent result.

Lemma 3 (“Barbalat’s lemma”) Let x(t) be bounded and let its derivative ẋ(t) also be bounded.
Let W be a continuous function of x such that the integral
∫_0^∞ W (x(s)) ds

is well defined and finite. Then (9) is true.

Theorem 2 follows from Barbalat’s lemma because ẋ = f (x) and this is bounded if x is bounded.
Barbalat’s lemma seems intuitively obvious. In fact, how can a function whose integral over
[0, ∞) is finite not converge to 0? Well, here’s one possibility:

[Figure: plot of W (x(t)) versus t, showing a train of pulses of fixed height whose widths shrink over time]

If the areas under the pulses decrease as a geometric series, then the total area will be finite, even
though the function does not converge to 0. Of course, we can smooth out all corners and make
this function C 1 , no problem.
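For concreteness, here is one explicit construction of such a function (an added illustration, not from the original notes): take
g(t) = Σ_{k≥1} max{0, 1 − 2^k |t − k|}
The k-th pulse is a triangle of height 1 and base 2 · 2^{−k} centered at t = k, so ∫_0^∞ g(t) dt = Σ_{k≥1} 2^{−k} = 1 < ∞, yet g(k) = 1 for every integer k, so g(t) does not converge to 0. The slope on the k-th pulse is ±2^k, which grows without bound; this is exactly the loophole discussed next.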
So, what’s the catch? Why can’t such behavior come from a bounded solution of the system (7)?
Reason: this requires unbounded derivative. Indeed, to come up to the same height and back
down over shorter and shorter time intervals, the pulses would need to get steeper and steeper. So,
if we invoke the condition that ẋ is bounded, then we can indeed show rigorously that we should
have convergence.
Proof of Barbalat’s lemma. Note: in this proof we are still assuming for simplicity that
W ≥ 0, as in the Theorem, but with a little extra care the argument still goes through if W is not
sign definite.
Suppose that W (x(t)) does not converge to 0 as t → ∞. Then, there would have to exist a
number ε > 0 and a sequence of times {tk } converging to ∞ at which we have

W (x(tk )) ≥ ε ∀k

(To see why, just recall the definition of convergence: ∀ ε > 0 ∃ T > 0 such that |W (x(t))| ≤ ε
∀ t ≥ T .)
This situation is similar to the one in the previous figure, ε being the pulse height.
Since W is a continuous function of x, and since x(t) is bounded for all t, there exists a constant
∆x > 0 with the following property:

W (x(t)) ≥ ε/2 whenever |x(tk ) − x(t)| ≤ ∆x

This is because for any choice of time t such that x(t) is close enough to some x(tk ), the correspond-
ing values of W will be within ε/2 of each other. (This is uniform continuity of W as a function
of x. Boundedness of x(·) is crucial here, because otherwise, a uniform constant ∆x with the above
property may not exist; think of the function W (x) = x2 over an infinite range of x.)
¹ Under appropriate Lipschitzness and boundedness conditions on f ; see [Khalil, Theorem 8.4].

We have a similar property for x as a function of t: there exists a constant ∆t > 0 such that

|x(tk ) − x(t)| ≤ ∆x whenever |tk − t| ≤ ∆t

This is true because the derivative ẋ is bounded, so the time it takes for x to change by ∆x is bounded
from below (away from zero).
The function W (x(t)) is a composition of W (x) and x(t). Combining the above two properties,
we see that for t ∈ [tk , tk + ∆t], k = 1, 2, . . . we have W (x(t)) ≥ ε/2. Hence, each such interval
contributes an area of at least ∆t · ε/2 to the total integral in (10). Since there are infinitely many
such intervals, the total integral cannot be finite, which is a contradiction.

2.2 Connection with observability


(This will be useful later for understanding the persistency of excitation property.)
We would like to know under what circumstances Theorem 2 allows us to conclude asymptotic
stability of the system (7), i.e., when does a weak Lyapunov function (V̇ ≤ 0) let us show the same
property as does a strong Lyapunov function (V̇ < 0)?
LaSalle’s theorem (stated earlier) says that the largest invariant set inside the set {x : V̇ (x) = 0}
should be {0}. In other words, there should be no nonzero trajectories of (7) along which we have
V̇ ≡ 0. For linear systems, there is a simpler way to understand this result, in terms of observability.
This relies on Theorem 2 and so, compared to LaSalle, has the advantage that it automatically
applies to time-varying systems.
Consider first a linear time-invariant (LTI) system

ẋ = Ax

and a quadratic candidate Lyapunov function

V (x) = xT P x

where P is a symmetric positive definite matrix. Its derivative along solutions is given by

V̇ (x) = xT (P A + AT P )x

This is nonpositive (as required in (8)) if for some (not necessarily square) matrix C we have

P A + AT P ≤ −C T C ≤ 0 (11)

(the second inequality is of course automatic). This gives

V̇ (x) ≤ −xT C T Cx = −y T y

where we defined
y := Cx

It is convenient to think of y as a fictitious output.


All conditions of Theorem 2 are satisfied; in particular, boundedness of solutions follows because
V is radially unbounded. Therefore, y = Cx → 0 as t → ∞.
Now, let us assume that (A, C) is an observable pair. This is well-known to imply that x → 0
(y → 0 ⇒ x → 0 for observable systems, can see this from observability Gramian inversion or from
output injection [ECE 515]). So, the closed-loop system must be asymptotically stable (in fact,
exponentially stable since it’s linear).
−→ Detectability is enough to get the above implication, but we work with observability here
because it is somewhat easier to define and check. We will come across detectability later, though.
The above result also extends to linear time-varying (LTV) systems, i.e., we can allow all
matrices (A, P , C) to depend on time. The time-varying version of (11) is

Ṗ (t) + P (t)A(t) + AT (t)P (t) ≤ −C T (t)C(t) ≤ 0 (12)

We can get an asymptotic stability result from this, provided we have the correct observability
property of the time-varying pair (A(t), C(t)). This goes as follows. The observability Gramian is
defined as
M (t0 , t0 + T ) := ∫_{t0}^{t0+T} ΦT (t, t0 )C T (t)C(t)Φ(t, t0 ) dt
where Φ(·, ·) is the system transition matrix [ECE 515]. The system is said to be uniformly com-
pletely observable (UCO) [Kalman, 1960] if for some positive constants T, β1 , β2 we have

β1 I ≤ M (t0 , t0 + T ) ≤ β2 I ∀ t0

For UCO systems, the implication y → 0 ⇒ x → 0 is still true; this follows from the identity
∫_{t0}^{t0+T} ΦT (t, t0 )C T (t)y(t) dt = M (t0 , t0 + T )x(t0 )

For LTI systems, the dependence on t0 disappears and UCO simply says that the observability
Gramian (which is now defined more explicitly in terms of matrix exponentials) is positive definite
for some choice of T . This is the usual observability notion, which is equivalent to the well-known
rank condition.
We can now easily see that for LTV systems, (12) plus UCO give asymptotic stability. It can
be shown that asymptotic stability is uniform (with respect to initial time), which implies that it is
in fact exponential (since the system is linear). See [Khalil, Example 8.11] for an argument slightly
different from the one given above, which proves the exponential stability claim rigorously. This
result is also stated in [Ioannou-Sun, Theorem 3.4.8] and it will be useful for us in the sequel.

2.3 Back to the adaptive control example


(Modulo different notation, the same example is analyzed in Khalil, p. 130.)

In Example 1 we had the closed-loop system

ẋ = (θ − θ̂ − 1)x (13)
θ̂˙ = x2 (14)

and the candidate Lyapunov function

V (x, θ̂) = x2/2 + (θ̂ − θ)2/2
with derivative
V̇ (x, θ̂) = (θ − θ̂ − 1)x2 + (θ̂ − θ)x2 = −x2 ≤ 0

Let’s check carefully that Theorem 2 indeed applies. V is radially unbounded, hence all trajec-
tories are bounded (as we said earlier). Define

W (x, θ̂) := x2

which is nonnegative definite and continuous. (Remember that the system state includes both x
and θ̂.) Thus by Theorem 2 we have W (x(t)) → 0, hence x(t) converges to 0 as needed.

On the other hand, Theorem 2 tells us nothing about θ̂(t). It may not converge to θ, or to
anything else. Indeed, the line in the (x, θ̂) space given by

{(x, θ̂) : x = 0}

consists entirely of equilibria (this is obvious from the system equations). This line is exactly the
set
{(x, θ̂) : V̇ (x, θ̂) = 0}
This set is invariant, so even the stronger LaSalle’s theorem gives nothing further than convergence
to this line. For example, it is clear that if we start on this line, with any value of θ̂, then we’ll stay
there and θ̂ won’t change.
−→ Since the value of θ is unknown, an interesting feature of the above Lyapunov function is that
it’s not completely known, but is instead an abstract function whose existence is guaranteed. This
is typical in adaptive control.
−→ Problem Set 1 is assigned.
When we first studied the example, we also tried the candidate Lyapunov function

V (x, θ̂) = x2/2 (15)

whose derivative is
V̇ = (θ − θ̂ − 1)x2 (16)
This can be both positive or negative, so Theorem 2 does not apply. In fact, we can’t even use this
V to show boundedness of solutions. However, with a bit of clever analysis and by using Barbalat’s
lemma, we can still use this V to show that x(t) → 0 as follows.
First, use the θ̂˙ equation (14) to rewrite (16) as
V̇ = (θ − θ̂ − 1)θ̂˙ = (θ − 1)θ̂˙ − θ̂θ̂˙

Integrate this to get


x2 (t)/2 = (θ − 1)θ̂(t) − θ̂2 (t)/2 + C (17)
where C is a constant determined by the initial conditions (of x and θ̂).
We know from (14) that θ̂ is monotonically nondecreasing. Thus it must either approach a
finite limit or grow without bound. But if it were unbounded, then the right-hand side of (17)
would become negative, while the left-hand side cannot be negative—a contradiction. This shows
that θ̂(t) is bounded. Looking at (17) again, we see that x(t) must then also be bounded.
Integrating (14), we have
∫_0^t x2 (s) ds = θ̂(t) − θ̂(0)
We just showed that this is bounded for all t, so in the limit as t → ∞ we have
∫_0^∞ x2 (s) ds < ∞

We already know that x is bounded, and ẋ is also bounded because of (13). So, we can apply
Barbalat’s lemma and conclude that x(t) → 0 as t → ∞, and we’re done.
This argument is certainly more complicated than the previous one based on Theorem 2. How-
ever, it is often difficult to find a weak Lyapunov function satisfying the hypotheses of Theorem 2,
and so it is useful to have as many tools as possible at our disposal. The above steps—integrating
differential equations, showing boundedness, and invoking Barbalat’s lemma—are used very often
in adaptive control.

3 Minimum-phase systems and universal regulators


The control law in the above example is capable of stabilizing any plant from the class of scalar
plants
ẋ = θx + u
parameterized by θ ∈ R.

A controller is called a universal regulator for a given class of plants if, when connected to any
plant from this class, it guarantees that:

• All signals in the closed-loop system remain bounded, and

• The plant’s state converges to 0 as t → ∞

under arbitrary initial conditions (of both the plant and the controller).
Note: even though we used the notation θ̂, we are not really trying to estimate the value of θ.
We have no reason to believe that θ̂ − θ becomes small in any sense whatsoever. We can think of a
universal regulator as doing some kind of exhaustive search through the space of possible controller
gains, so that eventually it will find the gains that will stabilize the given unknown plant, whatever
it is (as long as it is in a prescribed class). Later, we will study different controllers which explicitly
rely on parameter estimation and certainty equivalence. (See Section 6.5 and Section 8.2.)
−→ It is hard to make a rigorous distinction between universal regulators and estimation-based
adaptive controllers. This is because in both cases the controller has dynamics, and it’s impossible
to define formally when these dynamics are estimating anything and when they are not. The
difference is not so much in the appearance of the controller, but in the design philosophy behind
it. With some experience, it is usually easy to tell one from another.
Exhaustive search means that transient behavior might be poor (and this is true). However,
from the theoretical point of view the possibility of designing universal regulators for large classes
of plants is quite interesting.
Next, we want to see if universal regulators exist for larger classes of plants.

3.1 Universal regulators for scalar plants


We study the class of scalar plants
ẏ = ay + bu (18)
parameterized by a and b, where a is completely unknown and b ≠ 0.
We write y, not x, because later we’ll generalize to higher-dimensional plants with scalar out-
puts.

3.1.1 The case b > 0

We first consider the simpler case where the sign of the control gain b (the “high-frequency gain”)
is known, and with no loss of generality we take it to be positive.
It turns out that this makes a relatively minor difference compared with Example 1 (where b was equal to
1). In fact, the same controller works, and almost the same analysis applies.

Controller:

k̇ = y 2 (19)
u = −ky

This notation better reflects the fact that we’re not estimating a and b, but just searching the
space of controller gains for a suitable one. The differential equation for k is a tuning law, or
tuner. (k here plays the role of θ̂ + 1 before.)
Closed-loop plant:
ẏ = (a − bk)y (20)

Candidate Lyapunov function—?


Recall that in Example 1 we examined two different choices. One was (3), with which the
analysis was simple via Theorem 2. The other choice was (15), which itself is simpler but the
analysis was a bit more complicated. For the present case of b > 0, both choices would still work.
However, for the general case of b with arbitrary sign, only the second one will be helpful. So, we
go with
V := y2/2

Its derivative is
V̇ = (a − bk)y 2 = (a − bk)k̇

Integrate this to get


y 2 (t)/2 = ak(t) − bk 2 (t)/2 + C (21)
where C is a constant determined by the initial conditions.
We know from the tuning law (19) that k is monotonically nondecreasing. Thus it must either
approach a finite limit or grow without bound. But if it were unbounded, then the right-hand side
of (21) would become negative (since b > 0; that’s where we’re using this assumption), while the
left-hand side cannot be negative—a contradiction. Hence k is bounded, and so y is also bounded.
Integrating (19), we have
∫_0^t y 2 (s) ds = k(t) − k(0)

Since k is bounded, this means y ∈ L2 .


We already know that y is bounded, and ẏ is also bounded because of (20). By Barbalat,
y(t) → 0 as t → ∞.
Our conclusion so far is that the above controller is a universal regulator for the class of plants
given by (18) for all a ∈ R and b > 0.

3.1.2 General case: non-existence results

We now go back to the general case where the sign of b in (18) can be arbitrary (as long as b ≠ 0).
It turns out that this extra uncertainty makes the problem of designing a universal regulator
significantly more challenging.
Let us consider a controller of the form
ż = f (z, y)
u = h(z, y)        (22)

where z ∈ R (i.e., the controller dynamics are scalar) and f and h are continuous rational functions
(i.e., ratios of polynomials with no real poles).
−→ Our previous controller is of this form—it’s polynomial.
Claim: No such 1-D rational controller can be a universal stabilizer for (18) with unknown
sign of b.
Closed-loop system:

ẏ = ay + bh(z, y)
ż = f (z, y)        (23)

−→ Remember the definition of a universal regulator: it must globally stabilize all plants in the
class. So, to show that the above controller cannot be a universal regulator, we must show that for
every choice of f and h, there exists a choice of values for a and b such that y ↛ 0 (at least for
some special bad choices of initial conditions).
First, we show that f cannot be an identically zero function. If it were, then the value of z
would be constant: z ≡ z0 , and the plant would be

ẏ = ay + bh(z0 , y)

(the controller would then be static).


Suppose first that h(z0 , y) ≠ 0 ∀ y > 0. Then it maintains the same sign. Assume that
h(z0 , y) > 0 ∀ y > 0 (the other case is completely analogous). Choose, for example,

a = 1, b=1

Then we have
ẏ = y + h(z0 , y) > 0 ∀y > 0
This means that for a positive initial condition, the value of y can only increase, so it cannot go to
0.
Now suppose that h(z0 , y0 ) = 0 for some y0 > 0. Then take, for example,

a = 0, b arbitrary

This gives
ẏ = bh(z0 , y)
for which y = y0 is an equilibrium. Again, y cannot go to 0 from all initial conditions.
So, f is not identically zero, i.e., the controller has nontrivial dynamics.
−→ Note: in the above argument, rationality of h was in fact not used. In other words, no static
controller can be a universal regulator. (Rationality will become important when f is not identically
zero, see below.)
Therefore, there must exist a z0 for which f (z0 , y) is a nonzero function of y. Since it is rational,
it has finitely many zeros. Hence, there exists a y0 > 0 such that f (z0 , y) has the same sign for all
y ≥ y0 . Assume that this sign is positive (the other case is similar):

f (z0 , y) > 0 ∀ y ≥ y0 (24)

Now consider h(z, y0 ) as a function of z. It is also rational, hence there exists a z1 ≥ z0 such
that h(z, y0 ) has the same sign for all z ≥ z1 . Again, assume for concreteness that this sign is
positive.
By continuity, h(z, y0 ) is then bounded from below for z ≥ z0 (by some possibly negative but
finite number).
Now, pick a > 0, and then pick b > 0 small enough so that

ay0 + bh(z, y0 ) > 0 ∀z ≥ z0 (25)

Let’s now look at the closed-loop system (23) and the inequalities (24) and (25). We see that solutions cannot
leave the region
{(y, z) : y ≥ y0 , z ≥ z0 }
because everywhere on the boundary of this region they are directed inside. In other words, it is
an invariant region for the closed-loop system.
Therefore, convergence of y to 0 (i.e., convergence to the z-axis) from initial conditions inside
this region is not achieved.
Working out the remaining cases is left as an exercise.

3.1.3 Nussbaum gains

Does a universal regulator exist at all in the general case of unknown sign of b?
R. Nussbaum showed in 1983 that it does. (It is interesting that he’s a pure mathematician
at Rutgers and doesn’t work in adaptive control; he learned about this problem from a paper by
Morse and came up with the idea. His paper is in Systems and Control Letters, vol. 3, pp. 243-246;
the solution we give here is taken from Morse’s notes and is a bit simpler than what Nussbaum
had. The previous non-existence result is also established by Nussbaum in the same paper.)

We leave the tuner as before:


k̇ = y 2
but let the control law be
u = −N (k)ky
(whether or not to put a minus sign in front of N above is an arbitrary choice), where N (·) is some
function satisfying
sup_{k>0} (1/k) ∫_0^k N (s)s ds = ∞,
inf_{k>0} (1/k) ∫_0^k N (s)s ds = −∞

In other words, this function keeps crossing back and forth between positive and negative values
while giving higher and higher absolute value of the above integral.
Example: N (k) = k cos k.

−→ Since N must clearly have an infinite number of zeros, it is not rational, so the previous
non-existence result does not apply to this controller.
With this modified controller, the analysis goes through much like before:

V = y2/2
gives
V̇ = (a − bN (k)k)y 2 = (a − bN (k)k)k̇
Integrating, we get
y 2 (t)/2 = ak(t) − b ∫_0^{k(t)} N (s)s ds + C (26)
where C is a constant determined by the initial conditions.
We have as before that k is monotonically nondecreasing, so it either has a finite limit or grows
to infinity. If it were unbounded, then the right-hand side of (26) would eventually become negative
(no matter what the sign of b is!) Indeed, this will happen when k reaches a value k̄ for which
b (1/k̄) ∫_0^k̄ N (s)s ds > a + C/k̄

which exists by the defining properties of N (·).


But the left-hand side cannot be negative, and we reach a contradiction. Hence k is bounded,
and so y is also bounded.

After this, the analysis is exactly the same as before: y ∈ L2 (since k is bounded), y is bounded,
ẏ is bounded, and by Barbalat y → 0.
An important observation is that in the above construction, N (·) doesn’t need to be continuous.
We do need it to be locally bounded (to ensure that boundedness of k implies boundedness of N (k)
and hence boundedness of ẏ) and nice enough so that solutions of the closed-loop system are well
defined. Piecewise continuous N (·) will do.
In fact, piecewise constant N (·) is probably the easiest to design.

The values of N don’t actually need to grow in magnitude; they can just be maintained on
longer and longer intervals to satisfy the condition. This is an example of a switching logic.
This design was proposed by Willems and Byrnes in 1984. Similar ideas were presented earlier in
the Russian literature (work of Fradkov). It was probably the first example of a switching adaptive
control algorithm, which motivated a lot of subsequent work. We’ll talk more about switching
adaptive control later in the course (although in a different context: estimator-based). The best
way to design the switching is to use feedback, rather than a predefined switching pattern.

3.2 Relative degree and minimum phase


To move beyond scalar plants, we need a few new concepts.
Consider a linear system

ẋ = Ax + Bu
y = Cx

Assume that it is SISO, i.e., u and y are scalar (and so B and C are vectors, not matrices).
We’ll briefly mention MIMO case later.
Relative degree is, basically, the number of times that we need to differentiate the output until
the input appears.
ẏ = CAx + CBu
where CB is actually a scalar since the system is SISO.
If CB ≠ 0, then relative degree equals 1. If CB = 0, then we differentiate again:

ÿ = CA2 x + CABu

and so on. Relative degree is the positive integer r such that

CB = CAB = · · · = CAr−2 B = 0, CAr−1 B ≠ 0

which means that y (r) is the first derivative that depends on u because

y (r) = CAr x + CAr−1 Bu



The terminology “relative degree” is motivated by the following. Consider the transfer function
of the system:
G(s) = C(sI − A)−1 B = q(s)/p(s) (27)
Then
r = deg(p) − deg(q)
i.e., relative degree is the difference between the degrees of the denominator and numerator poly-
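For a concrete linear SISO system, the definition above is straightforward to check numerically. A small sketch (an addition of mine; the helper relative_degree and the double-integrator example are my own illustrative choices):

    # Relative degree of a SISO system (A, B, C): the smallest r with C A^(r-1) B != 0.
    # (An added sketch; the double integrator below is an arbitrary example with r = 2.)
    import numpy as np

    def relative_degree(A, B, C, tol=1e-12):
        AkB = B.copy()                       # holds A^(r-1) B, starting with r = 1
        for r in range(1, A.shape[0] + 1):
            if abs((C @ AkB).item()) > tol:  # C A^(r-1) B nonzero => relative degree is r
                return r
            AkB = A @ AkB
        return None                          # no finite relative degree detected

    # Double integrator with position output: relative degree 2.
    A = np.array([[0.0, 1.0], [0.0, 0.0]])
    B = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])
    print(relative_degree(A, B, C))          # -> 2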
nomials.
To define relative degree for MIMO systems, we need to differentiate each output until some
input appears. We’ll have (assuming both y and u have the same dimension m)
[y1^(r1), y2^(r2), . . . , ym^(rm)]T = Lx + M u

The matrix M generalizes CAr−1 B we had earlier. If M is nonsingular, then (r1 , . . . , rm ) is called
the vector relative degree. It may not exist (the matrix M may be singular).
The concept of relative degree also extends quite easily to nonlinear systems. We’ll only deal
with nonlinear systems that are SISO and affine in controls.

Example 2

ẋ1 = x3 − x2³
ẋ2 = −x2 − u
ẋ3 = x1² − x3 + u
y = x1

Differentiate y:
ẏ = x3 − x2³
and this doesn’t depend on u. Differentiate again:

ÿ = x1² − x3 + u − 3x2²(−x2 − u) = x1² − x3 + 3x2³ + (1 + 3x2²)u

Since
1 + 3x2² ≠ 0
relative degree is r = 2.
−→ In this example, r = 2 globally in the state space. In general for nonlinear systems, relative
degree is a local concept, because the term multiplying u may vanish for some x. (For linear systems
this doesn’t happen because it is a constant.)
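The computation in Example 2 is easy to verify symbolically; here is a sketch of such a check (an addition of mine, using sympy, not part of the original notes):

    # Symbolic check of Example 2 (an added sketch): y = x1, ydot does not involve u,
    # yddot does, with coefficient 1 + 3*x2^2 multiplying u, so the relative degree is 2.
    import sympy as sp

    x1, x2, x3, u = sp.symbols('x1 x2 x3 u')
    x = sp.Matrix([x1, x2, x3])
    f = sp.Matrix([x3 - x2**3, -x2 - u, x1**2 - x3 + u])    # right-hand side of Example 2

    y = sp.Matrix([x1])
    ydot = y.jacobian(x) * f                                # dy/dt along solutions
    yddot = sp.expand((ydot.jacobian(x) * f)[0])

    print(sp.diff(ydot[0], u))    # 0: ydot does not depend on u
    print(sp.diff(yddot, u))      # 3*x2**2 + 1: nonzero everywhere, so r = 2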

Defining new coordinates


z1 := x1 = y, z2 := x3 − x2³
we get

ż1 = z2
ż2 = x21 − x3 + 3x32 + (1 + 3x22 )u

We want to complete a coordinate transformation and write everything in terms of z. For this we
need z3 . We can always find z3 whose differential equation doesn’t depend on u. In this example,
we can take
z3 := x2 + x3
for which
ż3 = x1² − x2 − x3 = z1² − z3
and this doesn’t depend on u.
Note that the Jacobian of the map from x to z is
  [ 1     0     0
    0   −3x2²   1
    0     1     1 ]

which is nonsingular, so the coordinate transformation is well defined (in fact, globally).
Can also check that this transformation preserves the origin.
The z-dynamics are of the form

ż1 = z2
ż2 = b(z1 , z2 , z3 ) + a(z1 , z2 , z3 )u
(28)
ż3 = z12 − z3
y = z1

where we know that


a(z1 , z2 , z3 ) ≠ 0 ∀ z
This system is in so-called normal form. It has a chain of integrators of length r − 1 (where r is
relative degree), followed by a state-dependent affine function of u with the term multiplying u
being non-zero, and completed by the rest of the dynamics that do not involve u.

The next concept we need is that of a minimum-phase system. A linear SISO system is called
minimum-phase if its zeros have negative real parts. These are the roots of the numerator polyno-
mial q(s) in the transfer function (27).
(The term “minimum-phase” comes from the fact that among transfer functions with the same
magnitude Bode plot, the one with stable zeros has the smallest variation of the phase Bode plot.)

The interpretation of this notion is that the “inverse system” is asymptotically stable. For
example, the inverse of
y = [(s + 2)/(s + 3)²] u
is
u = [(s + 3)²/(s + 2)] y
The problem is that this is not proper. But we can fix this if we work with derivatives of y instead.
For example, we can multiply both sides of the original equation by s + 1:

ẏ + y = [(s + 1)(s + 2)/(s + 3)²] u

This has relative degree 0 and so we can invert it without losing properness:

u = [(s + 3)²/((s + 1)(s + 2))] (ẏ + y)

Since the original system has a stable zero (at s = −2), and since we were careful not to add an
unstable zero, the inverse system has stable poles (which are the same as the zeros of the initial
system). Asymptotic stability of this inverse system implies, in particular:

y ≡ 0 ⇒ x, u → 0

where x is the internal state in any minimal (controllable and observable) state-space realization.
This last implication is a good way of thinking about minimum-phase. Suppose we chose a
control u which maintains the output to be identically zero (y ≡ 0). The resulting dynamics—
called zero dynamics—should then be asymptotically stable (x → 0, and consequently u → 0
because it is an output of the inverse system).
The minimum-phase property (as well as the relative degree) is coordinate-independent, but
they are easier to study using a normal form. Consider again the normal form (28). We want
y = z1 ≡ 0. Then we should also have ẏ = z2 ≡ 0. This in turn requires ż2 ≡ 0. Since the
differential equation for z2 is controllable (a ≠ 0), we can apply the feedback control law

u = −b(z1 , z2 , z3 )/a(z1 , z2 , z3 )

to enforce this.
The zero dynamics are the remaining dynamics of the system, constrained by the condition
y ≡ 0. I.e., zero dynamics describe the remaining freedom of motion after we fix the initial
conditions (in the present case, z1 (0) = z2 (0) = 0) and select the control as above.
Here, the zero dynamics are
ż3 = −z3

So they are 1-dimensional and asymptotically stable. Hence, this system is minimum-phase. (For
nonlinear systems one defines minimum-phase via asymptotic stability of zero dynamics, since there
is no transfer function.)
Can check that u → 0 because b is 0 at 0.

3.2.1 Stabilization of nonlinear minimum-phase systems

This is an appropriate place to have a brief discussion about stabilization of nonlinear (SISO)
systems in normal form with asymptotically stable zero dynamics (minimum phase).
Consider a system in normal form

ξ̇1 = ξ2
ξ̇2 = ξ3
. . .
ξ̇r = b(ξ, η) + a(ξ, η)u
η̇ = q(ξ, η)

We assume that a(ξ, η) ≠ 0 for all ξ, η at least near the origin, which means that the system has
relative degree r with respect to the output y = ξ1 .
Assume also that the system is minimum-phase, in the sense that its zero dynamics (as defined
above) are locally asymptotically stable.
What are the zero dynamics? y = ξ1 ≡ 0 ⇒ ẏ = ξ2 ≡ 0 ⇒ . . . ⇒ ξr ≡ 0. Maintaining the last
property requires ξ̇r ≡ 0, which can be achieved by choice of u since a is nonzero. So, the zero dynamics are

η̇ = q(0, η) (29)

and this system is by assumption locally asymptotically stable around 0.


Local asymptotic stabilization problem: find a state feedback u = k(ξ, η) which makes the
closed-loop system locally asymptotically stable.
Note that this is not going to be the same feedback that gives the zero dynamics, because we
need additional damping to ensure that ξ → 0 from ξ0 ≠ 0. But we are almost there. Let’s try
u = [1/a(ξ, η)] (−b(ξ, η) − k1 ξ1 − k2 ξ2 − · · · − kr ξr ) (30)

The closed-loop system is

ξ̇ = Acl ξ
η̇ = q(ξ, η)

where
Acl =
  [  0     1     0    · · ·   0
     0     0     1    · · ·   0
                · · ·
     0     0     0    · · ·   1
    −k1   −k2   −k3   · · ·  −kr ]
is a Hurwitz matrix in controllable canonical form for appropriately chosen values of the gains ki .
Claim: the closed loop is asymptotically stable.
If the zero dynamics (29) have asymptotically stable linearization, i.e., if the matrix

(∂q/∂η)(0, 0)

is Hurwitz, then the claim easily follows from Lyapunov’s first (indirect) method. But asymptotic
stability of (29) is a weaker assumption than asymptotic stability of its linearization. However, the
claim is still true even in the critical case when the linearization test fails. One way to show this is
by Lyapunov analysis, as follows (see [Khalil, Lemma 13.1, p. 531]).

• Since (29) is asymptotically stable, there exists (by a converse Lyapunov theorem) a function
V1 (η) such that
(∂V1 /∂η) q(0, η) < 0    ∀ η ≠ 0

• Since Acl is Hurwitz, there exists a matrix P = P T > 0 such that

P Acl + AclT P = −I

Now consider
V (ξ, η) := V1 (η) + k √(ξT P ξ)
where k > 0. Its derivative along closed-loop solutions is
V̇ = (∂V1 /∂η) q(ξ, η) + [k/(2√(ξT P ξ))] ξT (P Acl + AclT P )ξ
  = (∂V1 /∂η) q(0, η) + (∂V1 /∂η) [q(ξ, η) − q(0, η)] − kξT ξ/(2√(ξT P ξ))

The first term is negative definite in η. The second term is upper-bounded, on any neighborhood
of 0, by C|ξ|, where the constant C comes from a bound on |∂V1 /∂η| and a Lipschitz constant for q on
this neighborhood. The last term is negative definite in ξ and scales linearly with k. Therefore, by
this neighborhood. The last term is negative definite in ξ and scales linearly with k. Therefore, by
choosing k large enough we can dominate the second term and get V̇ < 0. This implies the claim.
What about global stabilization?

It is reasonable to expect that if we strengthen the minimum-phase property to mean global
asymptotic stability of the zero dynamics, then the same feedback law (30) should be globally
asymptotically stabilizing.
To try to reason more precisely: if we choose the controller gains k1 , . . . , kr so that the eigen-
values of Acl will have very large negative real parts, then ξ will converge to 0 very fast, hence η
will be very close to solving η̇ = q(0, η) and will also converge to 0.
Is this true?
Not always! We need to be more careful. The difficulty is that high-gain feedback gives
|ξ(t)| ≤ c e^{−λt}
where the decay rate λ is large but the overshoot c can also get large. In other words, fast
convergence is preceded by large transient; this is known as the “peaking phenomenon”. But this
can in turn make η̇ = q(ξ, η) unstable, basically by pushing η so far away from 0 that it cannot
come back to 0 even after ξ starts decaying.
See [Khalil, Examples 13.16 and 13.17] for two different instances of this behavior, also to be
explored more in HW.
Later we will offer two ways to resolve this issue: one in Section 5.1 relying on backstepping,
and the other in Section 7.2 relying on input-to-state stability.

3.3 Universal regulators for higher-dimensional plants


Back to linear plants again.
So far we’ve discussed universal regulators for the scalar plant (18), which was
ẏ = ay + bu
We now want to consider higher-dimensional linear plants. From now on y will be an output; it’ll
still be scalar.
To get a class of systems which is large enough but for which the universal regulator problem
is still manageable, we assume that:

• The plant is SISO.


• The plant has relative degree 1.
• The plant is minimum-phase.

From the earlier discussion about normal forms it is clear that, up to a coordinate transforma-
tion, we can represent all such plants in the form
ẏ = ay + bu + cT z
ż = Az + dy

where A is a Hurwitz matrix (because it gives the zero dynamics), c and d are vectors, a and b ≠ 0 are
scalars, and z is a vector which together with y gives the plant state: x = (y, z). The entries of
A, a, b, c, d are all unknown.
The goal is to have y, z → 0 as t → ∞ while keeping all closed-loop signals bounded. And we want
to achieve this with output feedback (using only y, not z).
This class of plants includes our previous scalar case, but it’s obviously much larger. Neverthe-
less, it turns out that the same controller as in Section 3.1.3: the tuner

k̇ = y 2

and the controller


u = −N (k)ky
(where N is a Nussbaum gain function) solves the regulation problem for this class of plants!
The intuition here is that in some sense, minimum-phase systems with relative degree 1 behave
essentially like scalar systems: if you stabilize y, the rest of the state will also be automatically
stabilized. The analysis needed to show this becomes more challenging, though.
The closed-loop system is

ẏ = (a − bN (k)k)y + cT z
ż = Az + dy
k̇ = y 2

As in the scalar case, let’s try the Lyapunov function

V = y2/2
It gives
V̇ = (a − bN (k)k)y 2 + ycT z = (a − bN (k)k)k̇ + ycT z
Integrating, we get
y 2 (t)/2 = ak(t) − b ∫_0^{k(t)} N (s)s ds + ∫_0^t y(s)cT z(s) ds + C (31)
(the second integral is the new term compared to (26))

where C is a constant determined by the initial conditions.


We can view cT z as an output of the exponentially stable linear system

ż = Az + dy (32)

In what follows, we’ll need some well-known facts about linear systems. Proofs of these facts
will either be assigned as homework or are variations on those to be assigned in homework.

Fact 1: Exponentially stable linear systems have finite L2 induced norms.


This means, in our case, that there exist constants c1 , c2 such that
∫_0^t (cT z(s))2 ds ≤ c1 |z(0)|2 + c2 ∫_0^t y 2 (s) ds

From this, the Cauchy-Schwarz inequality, and simple square completion we have
∫_0^t y(s)cT z(s) ds ≤ √(∫_0^t y 2 (s) ds) · √(∫_0^t (cT z(s))2 ds)
  ≤ (1/2) ∫_0^t y 2 (s) ds + (1/2) ∫_0^t (cT z(s))2 ds
  ≤ c3 |z(0)|2 + c4 ∫_0^t y 2 (s) ds = c3 |z(0)|2 + c4 (k(t) − k(0))

where
c3 = c1 /2, c4 = c2 /2 + 1/2
Substituting this bound into (31), we get
y 2 (t)/2 ≤ (a + c4 )k(t) − b ∫_0^{k(t)} N (s)s ds + C̄

Now that we were able to handle the extra term, we’re almost done. Exactly as before, we show
that k and y are bounded, and y ∈ L2 .
How do we show that ẏ is bounded? Need to know that z is bounded. (Or at least that cT z is
bounded. So the next fact can be stated for the output instead of the state.)
Fact 2: Given an exponentially stable linear system, if the input is bounded, then the state is
bounded.
Applying this to the system (32) and using boundedness of y, we have that z is bounded.
y ∈ L2 , ẏ is bounded =⇒ y → 0 (Barbalat).
Fact 3: Given an exponentially stable linear system, if the input is in L2 or converges to 0, then
the state converges to 0.
From this we conclude that z → 0. Thus the regulation objective is fulfilled.
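A simulation sketch of this output-feedback design on a two-dimensional plant of the above form (my own illustration; the values a = 1, b = −1, c = d = 1 and A = −1 are arbitrary choices with A Hurwitz):

    # Nussbaum-gain output-feedback regulator on a relative-degree-1 minimum-phase plant:
    # ydot = a*y + b*u + c*z, zdot = A*z + d*y (an added sketch with arbitrary example values).
    import numpy as np
    from scipy.integrate import solve_ivp

    a, b, c, d, A = 1.0, -1.0, 1.0, 1.0, -1.0   # scalar A < 0 plays the role of the Hurwitz matrix

    def N(k):
        return k * np.cos(k)

    def closed_loop(t, s):
        y, z, k = s
        u = -N(k) * k * y                # same controller as in the scalar case
        return [a * y + b * u + c * z,   # output dynamics
                A * z + d * y,           # exponentially stable internal dynamics
                y**2]                    # tuner

    sol = solve_ivp(closed_loop, [0.0, 30.0], [1.0, -1.0, 0.0], max_step=1e-3)
    print("final (y, z, k):", sol.y[0, -1], sol.y[1, -1], sol.y[2, -1])   # y, z -> 0; k bounded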
Designing universal regulators for systems with relative degree 2 or higher is more challenging.
In Section 5 we will study backstepping, which is a tool for overcoming the relative degree obstacle—
but we won’t be working in the context of universal regulators any more.
Additional reading on universal regulators: A. Ilchmann, “Non-identifier based adaptive control
of dynamical systems: A survey,” IMA J. Math. Contr. Inform., vol. 8, pp. 321–366, 1991.

4 Lyapunov-based design
In Example 1, we started with the system

ẋ = θx + u

How did we come up with the adaptive control design?


The feedback law
u = −(θ̂ + 1)x (33)
was motivated by the certainty equivalence principle. This makes sense, but since parameter
estimates don’t converge, it is not a rigorous justification why this controller is stabilizing.
Then we had the tuning law
θ̂˙ = x2 (34)
for which we didn’t really have any justification, except to say that the quadratic dependence of
the right-hand side on x should dominate linear growth of the plant.
Then we took the candidate Lyapunov function

V (x, θ̂) = x2/2 + (θ̂ − θ)2/2 (35)
and showed that its derivative satisfies

V̇ (x, θ̂) = −x2 (36)

from which, by Theorem 2, we have x(t) → 0. The choice of V was, again, not systematic and
involved trial and error.
If we’re going to base stability analysis on a (weak) Lyapunov function, then an alternative
approach is to start by picking V as in (35), and then design the control law and tuning law so
as to get (36). The step of choosing V still involves guessing, but then V provides the basis for
controller design and we get stability by construction.

For the above V , but keeping u and θ̂˙ unspecified for now, we get
V̇ = x(θx + u) + (θ̂ − θ)θ̂˙ = xu + θ̂θ̂˙ + θ(x2 − θ̂˙)

The last term is not going to give us anything useful, because θ is the unknown parameter and we
don’t know its sign. So, it makes sense to cancel it. This immediately suggests the tuning law (34).
We now have
V̇ = xu + θ̂x2
and so, if we want (36), we need to pick u such that

xu + θ̂x2 = −x2

It is clear that the control law (33) is, in fact, the unique control law that gives this.
So, Lyapunov-based design allows us to reconstruct our original choices in a more methodical
way. Also note that it gives us more flexibility. For example, we see that any u for which

xu + θ̂x2 ≤ −x2

would give us the same conclusion (since strict equality is not required in Theorem 2). So, for
example, any control of the form

u = −(θ̂ + k)x, k>1

works just as well. This may be somewhat obvious for this example, but in more complicated
situations Lyapunov-based design can make it easier to see which controllers are stabilizing.
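To make this concrete, here is a minimal numerical sketch (not part of the original notes; the value of θ, the initial conditions, and the forward-Euler integration are illustrative assumptions) that simulates the closed loop for the plant ẋ = θx + u with the tuning law (34) and controllers of the form u = −(θ̂ + k)x:

import numpy as np

# Minimal sketch: plant xdot = theta*x + u with theta unknown to the controller,
# tuning law thetahat_dot = x^2 (34), and feedback u = -(thetahat + k)*x, k >= 1.
theta = 2.0                      # "true" parameter, used only inside the plant
dt, T = 1e-3, 20.0
for k in (1.0, 3.0):             # any k >= 1 should give V_dot <= -x^2
    x, thetahat = 1.0, 0.0       # illustrative initial conditions
    for _ in range(int(T / dt)):
        u = -(thetahat + k) * x
        x, thetahat = x + dt * (theta * x + u), thetahat + dt * x**2
    print(f"k={k}: x(T)={x:.2e}, thetahat(T)={thetahat:.3f}")
# Expected: x(T) is (numerically) zero for both k, while thetahat settles at some
# constant value that need not equal theta.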

4.1 Control Lyapunov functions


A general concept that formalizes the idea of Lyapunov-based control design is that of a control
Lyapunov function, or CLF. It was introduced by Artstein in 1983. Among the texts for this course,
it is discussed in KKK book, Section 2.1.2.
Consider a general control system
ẋ = f (x, u)
where x ∈ Rn is the state and u ∈ Rm is the control input. Let V be a function having the usual
properties of a candidate Lyapunov function, i.e., it is C 1 and at least nonnegative definite.
−→ In fact, here for simplicity we will work with the more common situation of a strong Lyapunov
function described by the condition (6) in Theorem 1, and not a weak Lyapunov function as in
Theorem 2. So, we assume that V is strictly positive definite, and that our goal is to make V̇
strictly negative (by choice of u).
The scenario with nonstrict inequalities (weak CLF) can be developed similarly. We will see
an example of this.
Let’s say that V is a CLF for our system if for each x ≠ 0 there is a value of u for which

V̇(x, u) := ∂V/∂x · f(x, u) < 0

Somewhat more formally, but equivalently:

inf_u { ∂V/∂x · f(x, u) } < 0     ∀ x ≠ 0

Now, suppose that we’re interested in finding a continuous feedback law

u = k(x)

which makes the closed-loop system globally asymptotically stable. And we want our V to be a
Lyapunov function for the closed loop, i.e., we want

∂V/∂x · f(x, k(x)) < 0     ∀ x ≠ 0
In fact, for the closed-loop system to be well-posed, we probably need more regularity from the
feedback law—at least local Lipschitzness—but continuity is the absolute minimum we should ask
for.2 It is often also required that k(0) = 0, to preserve the equilibrium at the origin (assuming
also that f (0, 0) = 0).
Does the existence of a CLF imply the existence of such a stabilizing feedback law?
One is tempted to say yes. However, being able to find a value of u for each x does not
automatically imply being able to glue them together into a continuous function k(x). This is
known as the continuous selection problem: from a given collection of sets parameterized by x (in
our case, the sets of “good” values of u) select a continuous function of x.
Counterexample: [Sontag and Sussmann, 1980]

ẋ = x [(u − 1)^2 − (x − 1)] [(u + 1)^2 + (x − 2)]

Let

V(x) := x^2/2

then

V̇ = x^2 [(u − 1)^2 − (x − 1)] [(u + 1)^2 + (x − 2)]
For this to be negative, one (and only one) of the expressions in square brackets must be negative.
It is easy to see that the points in the (x, u) plane where this happens are given by the interiors of
the two parabolas in the picture.

[Figure: the two non-intersecting parabolas in the (x, u) plane whose interiors are the sets where V̇ < 0; a dashed curve indicates a would-be continuous feedback law passing from one parabola to the other.]

² Continuity of the right-hand side of an ODE is enough for existence of solutions, but not enough for their uniqueness (see [Khalil, p. 88] for a reference).

The projection of the union of the two parabolas onto the x-axis covers the whole axis. This
means that V is a CLF (directly from the definition). On the other hand, the parabolas do not
intersect, which means that no continuous feedback law exists that makes V̇ negative. Any such
feedback law would have to pass somehow from one parabola to the other, see the dashed curve in
the picture.
−→ There are several possibilities for overcoming this difficulty. One is to consider discontinuous
feedback laws (or time-varying ones, or switching ones). Another is to look for classes of systems
for which continuous feedback laws can be constructed.
For now, let’s stick with continuous feedback laws. For systems affine in controls, all is well.
Not only do continuous stabilizers exist [Artstein, 1983], but they can be generated from a CLF
by an explicit formula [Sontag, 1989]. However, it is clear from the preceding discussion that it
will take some effort to establish this result (it is far from trivial, and the affine structure must
somehow play a role).
−→ Problem Set 2 is assigned.

4.1.1 Sontag’s universal formula

Consider the system affine in control

ẋ = f(x) + G(x)u = f(x) + Σ_{i=1}^m gi(x) ui

Here f and gi are n-vectors, and G is a matrix whose columns are the gi ’s. We assume that
f (0) = 0.
The definition of CLF in this case becomes

inf_u { ∂V/∂x · f(x) + Σ_{i=1}^m ∂V/∂x · gi(x) ui } < 0     ∀ x ≠ 0

It is easy to see that this is equivalent to the condition that for all x ≠ 0,

∂V/∂x · gi(x) = 0 ∀ i   =⇒   ∂V/∂x · f(x) < 0
Indeed, since the controls are unbounded, we can always pick u to get

∂V/∂x · f(x) + Σ_{i=1}^m ∂V/∂x · gi(x) ui < 0     (37)

except for those x at which the terms with u are all 0, where we lose control authority and need
the first term to be negative by itself.

To simplify notation, define

a(x) := ∂V/∂x · f(x),     b(x) := (∂V/∂x · g1(x), . . . , ∂V/∂x · gm(x))^T

In particular, the CLF property can be written as

|b(x)| = 0 =⇒ a(x) < 0 (38)

for all x 6= 0.
Consider the feedback law

k(x) = K(a(x), b(x)) :=  −[(a + √(a^2 + |b|^4)) / |b|^2] b   if b ≠ 0,
                          0                                   if b = 0     (39)

It can be shown that the apparent singularity at b = 0 is removable. In fact, K is analytic as a
function of a and b, except at x = 0 (where (38), which is used for showing this, does not hold).
This means that the above control law does not lead to any loss of smoothness in closed loop; for
example, it is smooth (away from x = 0) if f, G, and V are smooth.
It is also not hard to show that this feedback stabilizes the closed-loop system, with V as
Lyapunov function. We do this now. Write

∂V/∂x · (f(x) + Σ_{i=1}^m gi(x) ui) = a(x) + Σ_{i=1}^m bi(x) ui
    = a − [(a + √(a^2 + |b|^4)) / |b|^2] Σ_{i=1}^m bi^2
    = −√(a^2 + |b|^4) < 0     ∀ x ≠ 0

where the very last inequality follows from (38). The claim follows from Theorem 1.
The reason to put |b|4 and not, e.g., |b|2 inside the square root is to ensure the above smoothness
property of the control law as well as its continuity at 0 under an additional hypothesis (as discussed
below).
Note that formally, to talk about global asymptotic stability of the zero equilibrium we need to
make sure that x = 0 is indeed an equilibrium. This is why we need f (0) = 0 (since k(0) = 0 by
construction).
As for x = 0, the feedback law (39) is automatically continuous there if V has the property
that for small x, the values of u that give (37) can also be chosen small (small control property).
This is not always possible. For example, the scalar system

ẋ = x + x2 u

cannot be stabilized with a feedback that is continuous at 0, because x2 is small near 0 compared
to x so u needs to be large there, and u needs to be negative for x > 0 and positive for x < 0. The
function V (x) = x2 /2 is a CLF for this system, but it doesn’t have the small control property. If
x and x2 were flipped, then all would be well.
Anyway, continuity of k(x) at 0 is not so crucial: if it is continuous away from 0, then the
closed-loop system is well-posed for x(0) 6= 0 and all solutions go to 0, which is what we want.
The formula (39) is known as Sontag’s formula, or universal formula. Similar formulas exist
for control spaces different from Rm (e.g., bounded controls).

Example 3
ẋ = −x3 + u, x, u ∈ R
Even without using Sontag’s formula, there are several rather obvious stabilizing feedback laws we
can apply. For example:
u = x3 − x =⇒ ẋ = −x
This is an example of feedback linearization design: cancel nonlinearities and get a stable linear
closed-loop system.
However, the previous feedback law requires very large control effort for large x, while −x3 is
actually a “friendly” nonlinearity which we don’t need to cancel. Indeed, consider

u = −x =⇒ ẋ = −x3 − x

This is globally asymptotically stable:

V(x) = x^2/2

gives

V̇ = −x^4 − x^2 < 0     ∀ x ≠ 0

A third option is to do nothing:

u=0 =⇒ ẋ = −x3

and the same V shows stability because


V̇ = −x4
However, this “lazy” control design gives very slow convergence near x = 0. It is better to add
linear damping as in the previous control law, which gives better convergence and the control effort
is reasonable.
What does Sontag’s universal formula give?

a(x) = −x^4,     b(x) = x

so we get

u = −[(−x^4 + √(x^8 + x^4)) / x^2] x = −(−x^4 + x^2 √(x^4 + 1)) / x = x^3 − x √(x^4 + 1)

(initially we should have defined u separately to be 0 when x = 0 but the final formula captures
this). The closed-loop system is

ẋ = −x √(x^4 + 1)

This control law has a slightly more complicated expression than the previous ones, but it has
the following interesting properties. First, we have u → 0 as |x| → ∞, i.e., for large x we do
nothing and let the −x3 term do all the work. On the other hand, for small x we have ẋ ≈ −x,
which ensures nice convergence to 0. So, this control is a good compromise between the previous
designs!
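
As a sanity check, here is a minimal numerical sketch (not from the notes; the test points, initial condition, and step size are illustrative assumptions) that implements formula (39) for this scalar example and confirms that it reproduces the closed-form expression above:

import numpy as np

def sontag_u(a, b):
    # Formula (39) for scalar b: K(a, b) = -((a + sqrt(a^2 + b^4)) / b^2) * b, and 0 at b = 0.
    if b == 0.0:
        return 0.0
    return -(a + np.sqrt(a**2 + b**4)) / b**2 * b

# Example 3: xdot = -x^3 + u, V = x^2/2, so a(x) = -x^4 and b(x) = x.
for x in (-2.0, -0.5, 0.3, 1.5):
    print(x, sontag_u(-x**4, x), x**3 - x * np.sqrt(x**4 + 1))  # the two should agree

# Quick forward-Euler check that the closed loop xdot = -x*sqrt(x^4 + 1) drives x to 0.
x, dt = 2.0, 1e-3
for _ in range(int(10 / dt)):
    x += dt * (-x**3 + sontag_u(-x**4, x))
print("x(10) =", x)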

4.2 Back to the adaptive control example


Back to Example 1: consider the plant and the tuning law but open-loop (control yet to be designed):

ẋ = θx + u
θ̂˙ = x^2

Our Lyapunov function candidate was

V(x, θ̂) = x^2/2 + (θ̂ − θ)^2/2
Is this a CLF? Differentiate it:

V̇ = θx2 + xu + (θ̂ − θ)x2 = θ̂x2 + xu (40)

Does it satisfy the definition of CLF? We have

a = θ̂x2 , b=x

We have to be careful: the answer is no! This is because the state of the system is (x, θ̂), so for
V to be a CLF we would need the implication

b(x, θ̂) = 0 =⇒ a(x, θ̂) < 0

to hold for all (x, θ̂) ≠ (0, 0). But it only holds when x ≠ 0. In other words, the subset where it
doesn’t hold is not just the origin, but the whole line {x = 0}. This is not surprising because we
know it is the line of equilibria.
But since our objective is to stabilize the system only to this line, and not to the origin, the
universal formula will still hopefully solve our problem. Actually, the above V is a “weak CLF” in
the sense that
b(x, θ̂) = 0 =⇒ a(x, θ̂) ≤ 0

everywhere (note the nonstrict inequality). And since b = x, we do have control authority to make
V̇ negative where it matters, i.e., for x 6= 0.
The universal formula gives

k(x) =  −[(θ̂x^2 + √(θ̂^2 x^4 + x^4)) / x^2] x   if x ≠ 0,
         0                                       if x = 0

which simplifies to

k(x) = −(θ̂x^2 + x^2 √(θ̂^2 + 1)) / x = −(θ̂ + √(θ̂^2 + 1)) x
Plugging this into the expression (40) for V̇, we get

V̇ = −√(θ̂^2 + 1) x^2

Now Theorem 2 implies x → 0 as before.


The important thing was that a and b did not depend on the unknown parameter θ, so the
above control is implementable. Contrast this with

V = x^2/2
which gives
V̇ = θx2 + xu
and we can’t do anything unless we know θ.
We see that the CLF concept does provide useful insight into the adaptive example, but it needs
to be tweaked. First, we only have a weak CLF. Second, we have to make sure that V̇ does not
involve θ. This kind of situation is typical in adaptive control.
We’ll see Lyapunov-based design again in the next two sections—backstepping and parameter
estimation. The universal formula itself is not used very often in the actual control design, because
usually simpler control laws can be found on a case by case basis (as in the above examples). But
it is nice to know that a universal formula exists.

5 Backstepping
Reference: [KKK book, Chapters 2 and 3]
We now discuss one of very few available tools for systematically generating Lyapunov functions
for certain classes of systems. This is also a nonlinear design technique, and it’ll allow us to venture
into adaptive control of nonlinear plants. Up to now, the only nonlinear control design tool we’ve

discussed is Lyapunov-based design (and Sontag’s universal formula), and backstepping is in a way
a continuation of that. Plants in all adaptive control scenarios have been linear so far.
In the non-adaptive context, the idea of backstepping first appeared in the Russian literature:
paper by Meilakhs, 1978. Independently and around the same time, it was investigated in the
adaptive control context (MRAC problem) by Feuer and Morse, and then by Morse and the authors
of the KKK book (who subsequently coined the term “backstepping” as a general technique for not
necessarily adaptive control).

5.1 Integrator backstepping

Start with the affine control system

ẋ = f (x) + G(x)u, x ∈ Rn , u ∈ Rm (41)

and assume that we have a CLF V0 (x) and a stabilizing control law

u = k0 (x), k0 (0) = 0

for which

∂V0/∂x · f(x) + ∂V0/∂x · G(x)k0(x) ≤ −W(x) < 0     ∀ x ≠ 0

Assume that all data (V0 , f , G, k0 ) are smooth.


Now, suppose that our system is augmented with an integrator, increasing relative degree (think
of x as output) from 1 to 2:

ẋ = f (x) + G(x)ξ
ξ˙ = u

(Note that ξ is a vector in Rm , so this is an m-dimensional integrator.) We want to find a CLF


and a stabilizing feedback law for this new system.
We can view ξ = k0 (x) as a “virtual” control law, which is no longer implementable (because ξ
is a state and not a control). Motivated by this, define an “augmented” candidate CLF

V1(x, ξ) := V0(x) + (1/2)|ξ − k0(x)|^2     (42)

Its derivative along the (x, ξ)-system is (k0′ stands for the Jacobian matrix ∂k0/∂x)

V̇1 = ∂V0/∂x · f + ∂V0/∂x · Gξ + (ξ − k0)^T (u − k0′ f − k0′ Gξ)
    = ∂V0/∂x · f + ∂V0/∂x · Gk0 + ∂V0/∂x · G(ξ − k0) + (ξ − k0)^T (u − k0′ f − k0′ Gξ)
    = [∂V0/∂x · f + ∂V0/∂x · Gk0] + (ξ − k0)^T (u − k0′ f − k0′ Gξ + G^T ∂V0/∂x)
      (the bracketed term is the "old" V̇0, for (41))
    ≤ −W(x) + (ξ − k0)^T (u − k0′ f − k0′ Gξ + G^T ∂V0/∂x)

Claim: V1 is a CLF. (See the characterization (38) of CLF for affine systems.)
Indeed, the term multiplying u is (ξ − k0 (x))T . Suppose that it is 0. Then what remains
is ≤ −W (x), hence it is negative unless x = 0 (since W is positive definite). If x = 0, then
k0 (x) = k0 (0) = 0 and so to have ξ − k0 (x) = 0 we must have ξ = 0. We showed that away from
(x, ξ) = (0, 0), we can make V̇1 < 0 by a proper choice of u, which proves the claim.
It is also not hard to see how a stabilizing feedback law can be designed. For example, we can
simply cancel the terms in large parentheses and add a negative square to V̇1:

u = k1(x, ξ) := −(ξ − k0) + k0′ f + k0′ Gξ − G^T ∂V0/∂x

gives

V̇1 ≤ −W(x) − |ξ − k0|^2 < 0     ∀ (x, ξ) ≠ (0, 0)
where the last inequality is proved as in the proof of the previous Claim.
Note that since k1 involves the derivative of the nominal control law k0 , we lose one degree of
smoothness in the control law.
−→ The key idea of backstepping is not the actual formula for the control law, but the procedure
of constructing the augmented Lyapunov function V1 as in (42). We usually have some flexibility in
the choice of the control law, which is common in Lyapunov-based design as we already discussed
before. The next example illustrates the procedure and this last point.

Example 4 Consider the 2-D system

ẋ = −x3 + ξ
(43)
ξ˙ = u

Rather than just applying the general formula derived above, let’s follow the procedure to see better
how it works. We first need a CLF and a stabilizing control law u = k0 (x) for the scalar system

ẋ = −x3 + u

We already considered this system in Example 3 (page 37), where we had the CLF

V0(x) = x^2/2

In fact, when the x-system is scalar, a CLF (if one exists) can always be this one. One choice of
the control law was
k0 (x) = −x

which gives
V̇0 = −x4 − x2

Next, consider

V1(x, ξ) = V0(x) + (1/2)(ξ − k0(x))^2 = x^2/2 + (1/2)(ξ + x)^2
We compute its derivative along the (x, ξ)-system:

V̇1 = −x4 + xξ + (ξ + x)(u − x3 + ξ)


= −x4 − x2 + x(ξ + x) + (ξ + x)(u − x3 + ξ)
= −x4 − x2 + (ξ + x)(u + x − x3 + ξ)

hence we can define


u = k1 (x, ξ) := −2(ξ + x) + x3

to get
V̇1 = −x^4 − x^2 − (ξ + x)^2 < 0     ∀ (x, ξ) ≠ (0, 0)

and we are done.


The above feedback law cancels the x3 term in V̇1 , which is not really necessary. With the CLF
V1 in hand, we have flexibility in selecting another feedback as long as it gives V̇1 < 0. An example
of another choice of feedback is
u = −2(ξ + x) − x2 (ξ + x)

This is still stabilizing because it gives

V̇1 = −x^4 − x^2 − (ξ + x)^2 − x^2 (ξ + x)^2 − x^3 (ξ + x)
    = −(3/4)x^4 − x^2 − (ξ + x)^2 − [x^2 (ξ + x)^2 + x^3 (ξ + x) + (1/4)x^4]
    = −(3/4)x^4 − x^2 − (ξ + x)^2 − (x(ξ + x) + (1/2)x^2)^2 < 0     ∀ (x, ξ) ≠ (0, 0)

This control law perhaps makes a little bit more sense because it is close to 0 when ξ ≈ −x, i.e.,
when the behavior of ξ is consistent with k0 (x).
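A minimal numerical sketch of this example (not from the notes; the initial conditions, step size, and simulation horizon are illustrative assumptions) simulating (43) under both feedback laws:

import numpy as np

# System (43): xdot = -x^3 + xi, xidot = u, with the two backstepping feedback
# laws constructed above. Both should drive (x, xi) to (0, 0).
def simulate(u_law, x0=2.0, xi0=-1.0, dt=1e-3, T=15.0):
    x, xi = x0, xi0
    for _ in range(int(T / dt)):
        u = u_law(x, xi)
        x, xi = x + dt * (-x**3 + xi), xi + dt * u
    return x, xi

u1 = lambda x, xi: -2 * (xi + x) + x**3              # cancels the x^3 term
u2 = lambda x, xi: -2 * (xi + x) - x**2 * (xi + x)   # keeps the "friendly" nonlinearity
for name, law in (("u1", u1), ("u2", u2)):
    print(name, simulate(law))   # both should return values near (0, 0)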

We presented only one step of integrator backstepping. However, it is a recursive procedure
which can be applied in the obvious way to a chain of k integrators:

ẋ = f(x) + G(x)ξ1
ξ˙1 = ξ2
  ...
ξ˙k−1 = ξk
ξ˙k = u

We assume that a CLF V0 for the x-system is given (if x is scalar or has a low dimension, we can
hope to find one easily by hand.) Then, we generate a sequence of CLFs V1 , . . . , Vk , and the last
one is a CLF for the entire system. As before, we lose one degree of smoothness of the feedback
law at each step, so we need to make sure that k0 is at least C k .
It is useful to compare backstepping with feedback linearization on the above example. The
system (43) is feedback linearizable. To better explain what we mean by feedback linearization
(which was only briefly mentioned in an earlier example illustrating Sontag’s formula), define

z := −x3 + ξ

and compute its derivative:

ż = −3x2 (−x3 + ξ) + u = 3x5 − 3x2 ξ + u

If we apply the feedback law


u = −3x5 + 3x2 ξ − x − z
then, if we convert everything to the (x, z)-coordinates, we get the closed-loop system

ẋ = z
ż = −x − z

which is linear and asymptotically stable. (We can check that the Jacobian of the map from (x, ξ)
to (x, z) is nonsingular and hence the coordinate transformation is well defined.)
This particular feedback linearizing controller involves terms of higher degrees than the back-
stepping one, because it cancels all nonlinearities while the backstepping controller tries to preserve
“friendly” nonlinearities. On the other hand, backstepping requires more structure from the sys-
tem: all “virtual controls” ξi must enter affinely on the right-hand sides. Feedback linearization
doesn’t require this; for example, we can apply it to

ẋ = f (x, ξ)
ξ˙ = u

as long as ∂f/∂ξ ≠ 0. For backstepping, the first equation must be of the form

ẋ = f (x) + g(x)ξ

Note, however, that we don’t need to assume g(x) ≠ 0 to use the Lyapunov-based design.
In other words, the two techniques—backstepping and feedback linearization—are complemen-
tary as they apply to different classes of systems (although in both cases, relative degree must equal
state space dimension).
Recall that in Section 3.2.1 we considered the system in normal form

ξ˙1 = ξ2
ξ˙2 = ξ3
  ...                                              (44)
ξ˙r = b(ξ, η) + a(ξ, η)u
η̇ = q(ξ, η)

with a(ξ, η) ≠ 0 for all ξ, η. We saw there that it cannot always be globally asymptotically stabilized
by the partially linearizing high-gain feedback of the form (30), even if the zero dynamics η̇ = q(0, η)
are globally asymptotically stable. If we assume, additionally, that the η-dynamics are affine in ξ1
and don’t depend on ξ2 , . . . , ξr :
η̇ = f (η) + G(η)ξ1
then backstepping provides a method to design a globally stabilizing feedback. Start by noting
that the system
η̇ = f (η) + G(η)u
is, by the minimum-phase assumption, stabilized by u ≡ 0. Then add an integrator:

η̇ = f (η) + G(η)ξ1
ξ˙1 = u

and use backstepping to find a stabilizing u. Proceeding in this way, we eventually obtain a
stabilizing feedback for (44). To handle the ξr -equation we need to go beyond pure integrator
backstepping; this is covered in HW. Unlike the feedback (30) that we tried earlier, the feedback
constructed by backstepping doesn’t have linear gains k1 , k2 , . . . and is purely nonlinear in general.

5.2 Adaptive integrator backstepping


In our old Example 1 we had: the plant

ẋ = θx + u

the controller
u = −(θ̂ + 1)x =: k0 (x, θ̂)
the tuning law

θ̂˙ = x^2 =: τ0(x)

and the Lyapunov function

V0(x, θ̂) := (1/2)(x^2 + (θ̂ − θ)^2)
For convenience let us introduce the parameter error

θ̃ := θ̂ − θ

then we get

V0(x, θ̂) := (1/2)(x^2 + θ̃^2)

and

V̇0 = ∂V0/∂x (θx + k0) + ∂V0/∂θ̂ τ0(x) = x(θx − (θ̂ + 1)x) + θ̃x^2 = −x^2
Let us now add an integrator:

ẋ = θx + ξ
ξ˙ = u

One complication is that the above “virtual” control law is dynamic. However, we can still apply
the same idea and consider the augmented candidate CLF

V1(x, θ̂, ξ) := (1/2)(x^2 + θ̃^2) + (1/2)(ξ − k0(x, θ̂))^2 = (1/2)(x^2 + θ̃^2 + (ξ + (θ̂ + 1)x)^2)
Let's write its derivative, keeping θ̂˙ open for now since we're not yet sure if the same tuning law
will work.

V̇1 = x(θx + ξ) + θ̃θ̂˙ + (ξ + (θ̂ + 1)x)(u + θ̂˙x + (θ̂ + 1)(θx + ξ))
    = x(θx − (θ̂ + 1)x) + θ̃θ̂˙ + (ξ + (θ̂ + 1)x)(u + x + θ̂˙x + (θ̂ + 1)(θx + ξ))

where the first two terms equal −x^2 + θ̃(θ̂˙ − x^2).

Difficulty: if we define θ̂˙ = x^2 as before, then, to get V̇1 < 0, we need to define u to cancel the
terms in the last parentheses and add a damping term −(ξ + (θ̂ + 1)x). This is what we did in the
non-adaptive case. But the terms in the parentheses depend on the unknown parameter θ!
Solution: Replace θ by θ̂ in the last parentheses, and then carry the difference—which depends
on θ̃—outside the parentheses and combine it with the θ̃θ̂˙ term.
V̇1 = −x^2 + θ̃(θ̂˙ − x^2 − (ξ + (θ̂ + 1)x)(θ̂ + 1)x) + (ξ + (θ̂ + 1)x)(u + x + θ̂˙x + (θ̂ + 1)(θ̂x + ξ))

Now the choice of the tuning law and the control law is clear: first set

θ̂˙ = x^2 + (ξ + (θ̂ + 1)x)(θ̂ + 1)x

which cancels the θ̃ term. Note that this takes the form

θ̂˙ = τ0 + τ1 =: τ

where τ0 = x2 is what we had for the scalar plant and τ1 is a new term. Next, set

u = −x − τ x − (θ̂ + 1)(θ̂x + ξ) − (ξ + (θ̂ + 1)x) =: k1 (x, θ̂, ξ)

This gives

V̇1 = −x^2 − (ξ + (θ̂ + 1)x)^2 =: −W1(x, θ̂, ξ) < 0   when (x, ξ) ≠ (0, 0)

and we are in good shape because this implies that x, θ̂, ξ are bounded and x, ξ → 0 (Theorem 2).
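
Here is a minimal numerical sketch of the design just obtained (not from the notes; the value of θ, the initial conditions, and the forward-Euler integration are illustrative assumptions):

import numpy as np

# Adaptive backstepping for xdot = theta*x + xi, xidot = u. The controller only
# uses the estimate thetahat; theta appears only inside the plant equation.
theta = 1.5
x, xi, th = 1.0, 0.0, 0.0                 # state and estimate thetahat
dt, T = 1e-4, 30.0
for _ in range(int(T / dt)):
    s = xi + (th + 1) * x                 # s = xi - k0(x, thetahat)
    tau = x**2 + s * (th + 1) * x         # tuning law thetahat_dot = tau
    u = -x - tau * x - (th + 1) * (th * x + xi) - s
    x, xi, th = (x + dt * (theta * x + xi),
                 xi + dt * u,
                 th + dt * tau)
print("x =", x, " xi =", xi, " thetahat =", th)
# Expected: x and xi approach 0, while thetahat settles at some constant.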
−→ Note that the above controller is not really a certainty equivalence controller any more: it
incorporates explicitly a correction term (τ x) coming from the tuning law. (Compare this with the
discussion on page 9.)
We see—and this is not surprising—that adaptive backstepping proceeds along the same lines
as non-adaptive backstepping but it is more challenging.
The next step would be to try to stabilize

ẋ = θx + ξ1
ξ˙1 = ξ2
ξ˙2 = u

The above example suggests considering the candidate CLF

V2(x, θ̂, ξ1, ξ2) := V1(x, θ̂, ξ1) + (1/2)(ξ2 − k1(x, θ̂, ξ1))^2

and a tuning law of the form

θ̂˙ = τ0 + τ1 + τ2

The functions τ0 , τ1 , τ2 , . . . are called tuning functions. For a general procedure of designing
them, see [KKK book, Chapter 4].

6 Parameter estimation
There are no nonlinear system theory concepts here, but instead some ideas from optimization
algorithms and the important concept of a persistently exciting signal.
General points:

• Unknown parameters are often difficult to estimate off-line, hence the need for on-line es-
timation. Examples: camera calibration; calibration of mechanical system models (such as
helicopter in flight).
• We will see that convergence of parameter estimates to their true values requires some special
properties of the input signal u.
• Sometimes parameter convergence is not crucial, and u is chosen to meet behavior specs of the
system. (For example, to fly a helicopter, it is not necessary to know all its model parameters
exactly.) To analyze stability of the resulting adaptive control system, other properties of
the parameter estimation scheme (in particular, slow adaptation speed) will be important.

Choice of control—later. Here—on-line parameter estimation. Reference: [Ioannou-Sun, Chap. 4, 5]. Most of this material is also in [Ioannou-Fidan].
A word on terminology: “Parameter estimation” is some procedure for generating parameter
estimates, whether or not they converge to the true values of the parameters. “Parameter iden-
tification” is a stronger term, which assumes that parameter estimates do converge to the true
values. (This is the difference between Chapters 4 and 5 in [Ioannou-Sun]; in [Ioannou-Fidan] this
is combined in one chapter.) For control purposes, we will always use some form of parameter
estimation, but will not necessarily require parameter identification. Of course, some other nice
properties of the parameter estimation scheme will be needed in such cases. By the end of the
course, we will understand why adaptive control works even without parameter identification (and
will thus justify control design based on the certainty equivalence principle).

6.1 Gradient method


Parameter estimation algorithms usually aim at minimizing some cost function that reflects the
quality of estimation. Let us recall some basic facts about minimizing a given function f : Rn → R.
We say that x∗ is a (global) minimum of f if
f (x∗ ) ≤ f (x) ∀x
If f ∈ C^1 then its gradient is the vector

∇f(x) := (∂f/∂x1 (x), . . . , ∂f/∂xn (x))^T
A necessary condition for x∗ to be a global minimum is
∇f (x∗ ) = 0 (45)
This condition is in general not sufficient, because it also holds at local minima, as well as at
maxima and saddle points. However, if f is a convex function, then the above 1st-order condition
is sufficient (for convex functions, a stationary point is automatically a global minimum).
The most basic algorithm for computing the minimum is the gradient method (or the method
of steepest descent); there are of course others, such as Newton's method. Its continuous version is given by the differential equation

ẋ = −∇f (x) (46)

(with some initial condition x0 ). More generally, one considers

ẋ = −Γ∇f (x) (47)

where
Γ = ΓT > 0
is a scaling matrix. This is exactly what we get from the original (unscaled) gradient method by
changing coordinates in Rn according to

x = Γ1 x̄, Γ1 ΓT1 = Γ (48)

(See, e.g., [Ioannou-Sun, Section B.2, p. 785].)


The idea behind the gradient method is that f decreases along trajectories of (47), and even-
tually we approach an equilibrium where (45) holds. Convergence of the gradient method is a
standard topic in optimization textbooks, although results are more often stated for discrete iter-
ations (with an appropriate choice of stepsize). For (46), we can consider the candidate Lyapunov
function
V (x) := f (x) − f (x∗ )
where x∗ is a global minimum (assumed to exist). Its derivative is

V̇ (x) = −|∇f (x)|2

hence we have
∇f (x) → 0
along any bounded solution (cf. Theorem 2). For (47), the same V gives

V̇(x) = −∇f(x)^T Γ ∇f(x)     (49)

and the same conclusion holds since Γ is positive definite.


In view of earlier remarks, (49) does not necessarily imply that x(t) converges to the global
minimum x∗ . But if f is a convex function, then we do in fact have convergence to a global
minimum.
If we know that f is convex, then we can use another Lyapunov function:

V (x) := |x − x∗ |2

which can be shown to decrease along solutions of (46). (Reason: if f (x∗ ) < f (x) and f is convex,
then f decreases as we start moving along the line from x to x∗ . Hence, the inner product with
the gradient is negative.) To handle the scaled gradient law (47), we need to consider

V (x) := (x − x∗ )T Γ−1 (x − x∗ ) (50)



which is the same as |x̄ − x̄∗ |2 for x̄ given by (48).


−→ We will see Lyapunov functions of this form several times below when analyzing convergence
of parameter estimation schemes based on gradient laws. However, since the gradient law will be
coupled with plant dynamics, we will need to do the proof from scratch.

6.2 Parameter estimation: stable case


We first discuss stable plants with bounded inputs. Afterwards, we’ll discuss how to lift these
assumptions.

Example 5 Let us start with the static scalar example

y(t) = θu(t)

where θ ∈ R is the unknown parameter and u(·), y(·) are known continuous signals. For now we
assume that u is bounded (hence so is y). We’ll sometimes write this as u ∈ L∞ , y ∈ L∞ , etc.
Problem: estimate θ.
A naive approach is just to use

θ = y/u
However, this has problems. First, it’s ill-defined if u = 0. Second, this is very sensitive to noise.
But most importantly, we want a method that will work for the dynamic case (when u and y are
the input and output of a dynamical system containing uncertain parameter θ).

So, instead we want to design a dynamical system which will generate a time-varying estimate
θ̂(t) of θ. This will lead also to
ŷ(t) := θ̂(t)u(t)

Note: ŷ is not really an estimate of y, since we can measure y directly. But ŷ will provide us
feedback on the quality of the estimate θ̂. To this end, we define the output estimation (or output
prediction) error
e(t) := ŷ(t) − y(t) = (θ̂(t) − θ)u(t)

(in Ioannou-Sun, e is defined with the opposite sign). Based on this error e, we define the cost
function
J(θ̂(t), t) := e^2(t)/2 = (θ̂ − θ)^2 u^2 / 2
For simplicity, we will omit the argument t and just write J(θ̂).
Idea: update θ̂ so as to minimize J(θ̂).

The motivation is that J ≥ 0 always and J = 0 when θ̂ = θ. Note, however, that it’s possible
to have J = 0 when θ̂ ≠ θ if u = 0. To avoid this, we make the (temporary) assumption that

u2 ≥ c > 0 (51)

We want to use the gradient method.

∇J(θ̂) = (θ̂ − θ)u^2 = eu

Thus the gradient law is

θ̂˙ = −γeu     (52)
with some initial condition θ̂0 , where γ > 0 is the scaling factor (or “adaptive gain”).
Note that J is convex as a function of θ̂, thus there’s hope for convergence. But J also depends
on u(t), so this is not quite as easy as the standard case discussed earlier. Candidate Lyapunov
function
V(θ̂) := (θ̂ − θ)^2 / 2     (53)

gives

V̇ = (θ̂ − θ)θ̂˙ = −(θ̂ − θ)γeu = −γ(θ̂ − θ)^2 u^2 ≤ −γ(θ̂ − θ)^2 c < 0     ∀ θ̂ ≠ θ
where c comes from (51). This implies that θ̂ = θ is an asymptotically stable equilibrium. (In fact,
it is exponentially stable since V̇ ≤ −2γcV and V is quadratic.)
Now, suppose we don’t want to assume (51), but allow instead any bounded u with bounded
derivative. The previous formulas for V̇ can be rewritten as

V̇ = −γe2

We can now use the same trick as in the proof of Theorem 2: integrate this and reduce to Barbalat’s
lemma.
V(t) − V(0) = −γ ∫_0^t e^2(s) ds   =⇒   γ ∫_0^t e^2(s) ds = V(0) − V(t) ≤ V(0)   ∀ t

which implies that ∫_0^∞ e^2 is finite, i.e., e ∈ L2.
To apply Barbalat, we need to show that e, ė are bounded.
Since V is nonincreasing, from the definition (53) of V it is clear that θ̂ is bounded.
Thus e = (θ̂ − θ)u is also bounded.
In view of (52), this shows that θ̂˙ is bounded, and belongs to L2 as well⁴.
We then have that ė = θ̂˙u + (θ̂ − θ)u̇ is also bounded.
⁴ Such "slow adaptation" properties will be useful later when we discuss stability of slowly time-varying systems.

By Barbalat's lemma, e → 0, i.e., ŷ → y, and θ̂˙ → 0.
Note: we didn’t use Theorem 2 directly because of the presence of the external input u, but we
essentially repeated the steps of its proof.
What does the above analysis imply about convergence of θ̂?
Since V is nonincreasing and bounded from below by 0, it has a limit as t → ∞. Hence, θ̂ must
converge to a constant. (Note that this doesn't follow just from the fact that θ̂˙ → 0.)
Can we conclude from the above analysis that θ̂ → θ?
Not necessarily. Define the parameter estimation error

θ̃ := θ̂ − θ

It satisfies the DE

θ̃˙ = θ̂˙ = −γeu = −γu^2 θ̃

which we can solve to get

θ̃(t) = e^{−γ ∫_0^t u^2(s) ds} θ̃(0)

It is now clear that convergence of θ̃ to 0 is affected by the choice of u. For example, if u


satisfies (51), then clearly θ̃ converges exponentially fast (at the rate at least γc), as we already
proved earlier. On the other hand, if u ∈ L2 , then the integral in the exponent is bounded and θ̃
does not converge.
Let us try to make precise the convergence property we want from θ̃ and the corresponding
condition that should be required from u. We say that θ̃ is uniformly exponentially convergent
(UEC) if for some c, λ > 0 we have

|θ̃(t)| ≤ c e^{−λ(t−t0)} |θ̃(t0)|

for all t ≥ t0 ≥ 0. We say that u is persistently exciting (PE) if for some α0, T0 > 0 we have

∫_t^{t+T0} u^2(s) ds ≥ α0 T0     ∀ t     (54)

This is an important concept (we’ll extend it later to vector signals). Constant signals, or more
generally signals satisfying (51) are PE, while L2 signals are not. The constant α0 is called the
level of excitation.

Lemma 4 For the above gradient law, θ̃ is UEC if and only if u is PE.

Proof—next HW.
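While the proof is left as homework, a minimal numerical sketch (not from the notes; γ, the initial error, and the particular inputs are illustrative assumptions) already shows the dichotomy, by integrating the error equation θ̃˙ = −γu^2 θ̃ for a PE input and for an L2 input:

import numpy as np

gamma, dt, T = 1.0, 1e-3, 40.0
t = np.arange(0.0, T, dt)
for name, u in (("PE input u = sin(t)", np.sin(t)),
                ("L2 input u = exp(-t)", np.exp(-t))):
    tt = 1.0                          # thetatilde(0)
    for uk in u:                      # forward Euler for thetatilde_dot = -gamma*u^2*thetatilde
        tt += dt * (-gamma * uk**2 * tt)
    print(name, "->  |thetatilde(T)| =", abs(tt))
# The PE input gives exponential decay (essentially exp(-gamma*t/2) here);
# the L2 input leaves the error bounded away from zero.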

Example 6 Consider now the one-dimensional plant

ẋ = −ax + bu

where we assume that a > 0 and u is bounded (hence x is also bounded). We want to estimate the
unknown parameters a and b.
Estimator⁵:
x̂˙ = −am (x̂ − x) − âx + b̂u
where am > 0 is a design constant which determines the damping rate of the estimator. To see
this, define the estimation errors

e := x̂ − x, ã := â − a, b̃ := b̂ − b

(Again, note that calling e an “estimation error” is not really accurate because both x̂ and x are
measured signals; sometimes the term “state prediction error” is used instead.) Then e satisfies
the DE
ė = −am e − ãx + b̃u
(this DE is not actually implemented, but only used for analysis purposes).
Observe that the above equation is stable with respect to x, u and has the autonomous contrac-
tion rate am . In particular, if ã, b̃ are 0 or converge to 0, then e → 0 (recall that x, u are bounded).
However, the converse is not true: e → 0 does not necessarily imply ã, b̃ → 0 unless the signals
x, x̂, u are PE in some sense (we will see this later).
Update laws for the estimates â, b̂ will be driven by e, and will take the form

â˙ = f1(e, x̂, x, u),     b̂˙ = f2(e, x̂, x, u)

Goal: make ã, b̃, e → 0 along solutions of the resulting 3-D system.
Lyapunov-based design: Consider the candidate Lyapunov function

V(e, ã, b̃) := (1/2)(e^2 + ã^2 + b̃^2)     (55)

(has all desired properties: C^1, positive definite, radially unbounded).

V̇ = eė + ãâ˙ + b̃b̂˙ = e(−am e − ãx + b̃u) + ãf1 + b̃f2 = −am e^2 − ãxe + b̃ue + ãf1 + b̃f2

where the terms −ãxe + b̃ue + ãf1 + b̃f2 are not helpful, and we want to cancel them.

So, a natural choice is

f1 := xe,     f2 := −ue     =⇒     V̇ = −am e^2 ≤ 0


⁵ This is not the only possible estimator; see [Ioannou-Sun, Sect. 4.2.2] for another estimator and comparison.

What conclusions can we draw from this?


V is bounded =⇒ e, â, b̂ are bounded.
u, x, x̂(= x + e) are bounded =⇒ all signals are bounded.
Integrating V̇ = −am e2 as before we get e ∈ L2 .
ė is bounded =⇒ by Barbalat’s lemma, e → 0. (Can also appeal directly to Theorem 2.)
Thus â˙, b̂˙ → 0, and they are also in L2.
Since V is nonincreasing and bounded from below by 0, it has a limit as t → ∞. And, since
we know that e → 0, we get that ã2 + b̃2 also has a limit. However, this doesn’t mean that ã and b̃
individually converge to any values, let alone to 0.
So, the above parameter estimation scheme has a number of useful properties but it does not
guarantee convergence of parameter estimates â, b̂ to their true values a, b. This is actually not
surprising. For example, if u ≡ 0 and x0 = 0 then x stays at 0 and we cannot learn a, b no matter
what we do. In other words, some PE-like assumptions will have to be imposed on the input u to
guarantee convergence of parameter estimates (we already saw this in the previous example).
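A minimal numerical sketch of this estimator (not from the notes; the true values of a and b, the constant am, the input, and the integration scheme are illustrative assumptions); a single sinusoid is used since, as discussed in Section 6.4, it is rich enough for two parameters:

import numpy as np

# Estimator of Example 6: xhat_dot = -am*(xhat - x) - ahat*x + bhat*u,
# with update laws ahat_dot = x*e, bhat_dot = -u*e, where e = xhat - x.
a, b, am = 2.0, 3.0, 1.0          # true plant parameters and design constant
x, xhat, ahat, bhat = 0.0, 0.0, 0.0, 0.0
dt, T = 1e-3, 200.0
for k in range(int(T / dt)):
    u = np.sin(k * dt)
    e = xhat - x
    dx, dxhat = -a * x + b * u, -am * e - ahat * x + bhat * u
    dahat, dbhat = x * e, -u * e
    x, xhat = x + dt * dx, xhat + dt * dxhat
    ahat, bhat = ahat + dt * dahat, bhat + dt * dbhat
print("ahat =", ahat, " bhat =", bhat, " e =", xhat - x)
# With this (persistently exciting) input, ahat and bhat should approach a and b;
# with u = 0 they would stay at their initial values.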

−→ These examples are just to build intuition. We’ll make things more rigorous and more general
a little later.
We now consider the general vector case, but still assuming stability. Namely, the plant is

ẋ = Ax + Bu

where x ∈ Rn , u ∈ Rm is bounded, and A, B are unknown matrices with A Hurwitz (hence we


know that x is bounded as well). We are assuming here that x is available for measurement (i.e.,
y = x).
Estimator is an extension of the one from the previous scalar example:

x̂˙ = Am (x̂ − x) + Âx + B̂u

where Am is a Hurwitz matrix (chosen by the designer) and Â, B̂ are estimates of A, B (to be
generated). Defining as before

e := x̂ − x, Ã := Â − A, B̃ := B̂ − B

we have
ė = Am e + Ãx + B̃u
Update laws:
Â˙ = F1(e, x̂, x, u),     B̂˙ = F2(e, x̂, x, u)
Candidate Lyapunov function:

V (e, Ã, B̃) := eT P e + tr(ÃT Ã) + tr(B̃ T B̃)



where P comes from the Lyapunov equation

Am^T P + P Am = −I

and tr stands for trace of a matrix. Note that tr(ÃT Ã) is nothing but the sum of squares of all
elements of Ã.
−→ In class we skipped the calculation that follows.
We have

V̇ = (e^T Am^T + x^T Ã^T + u^T B̃^T) P e + e^T P (Am e + Ãx + B̃u)
      + tr(Ã˙^T Ã + Ã^T Ã˙) + tr(B̃˙^T B̃ + B̃^T B̃˙)
    = e^T (Am^T P + P Am) e + x^T Ã^T P e + u^T B̃^T P e + e^T P Ãx + e^T P B̃u
      + tr(F1^T Ã + Ã^T F1) + tr(F2^T B̃ + B̃^T F2)
    = −e^T e + 2 x^T Ã^T P e + 2 u^T B̃^T P e + 2 tr(Ã^T F1) + 2 tr(B̃^T F2)

Useful property of trace: for two vectors k, l ∈ R^n we have tr(k l^T) = k^T l = l^T k (quick exercise:
prove this). Hence

x^T Ã^T P e = tr(Ã^T P e x^T)

and similarly

u^T B̃^T P e = tr(B̃^T P e u^T)

So we get

V̇ = −e^T e + 2 tr(Ã^T P e x^T + Ã^T F1) + 2 tr(B̃^T P e u^T + B̃^T F2)

This makes the choice of F1, F2 obvious:

F1 := −P e x^T,     F2 := −P e u^T     =⇒     V̇ = −|e|^2

Boundedness of all signals follows right away.


As before, we have e ∈ L2 , ė ∈ L∞ =⇒ e → 0.
Â˙, B̂˙ → 0, and they are also in L2.
To have  → A, B̂ → B we will need to place extra assumptions on u to guarantee that it
sufficiently excites the plant. Will come back to this later.

6.3 Unstable case: adaptive laws with normalization


Up to now: stable plants, bounded inputs. Not adequate for adaptive control. Want to handle
unstable plants, unbounded inputs.

Let us revisit Example 5:


y(t) = θu(t)
θ ∈ R unknown, u, y scalar, not necessarily bounded any more.
The basic idea is to normalize u and y to get bounded signals. Namely, consider

ū := u/m,     ȳ := y/m

where m is some signal that guarantees boundedness of ū (and consequently that of ȳ). A simple
choice is

m := √(1 + u^2)
The normalized signals satisfy the same relation:

ȳ = θū

and we can proceed as before to define the output estimate/predictor

ȳˆ := θ̂ū

the output estimation (prediction) error

ē := ȳˆ − ȳ = e/m
(recall that e = ŷ − y) and the cost

J(θ̂) := ē^2/2 = (θ̂ − θ)^2 u^2 / (2m^2)

We want to use the gradient method to minimize J(θ̂).

∇J(θ̂) = (θ̂ − θ)u^2 / m^2 = eu/m^2

Let us define

en := e/m^2
as this quantity will appear frequently in calculations below. It is called the normalized output
estimation error. (In [Ioannou-Sun] it is denoted by ǫ.)
Note: at this point we don’t yet really need en , could also write everything in terms of ē, ū
(which would look closer to the unnormalized case; exercise: do this) but we’ll be relying on this
notation later.
The gradient law is

θ̂˙ = −γ en u

where γ > 0. Introduce the parameter estimation error

θ̃ := θ̂ − θ

to have

θ̃˙ = θ̂˙ = −γ en u = −γ (θ̂ − θ)u^2 / m^2 = −γ ū^2 θ̃
Let's try the candidate Lyapunov function

V(θ̃) := θ̃^2/2

We have

V̇ = θ̃θ̃˙ = −γ ū^2 θ̃^2 = −γ e^2/m^2 = −γ en^2 m^2 ≤ 0
From this we know how to quickly deduce the following:
θ̃ ∈ L∞ and en m ∈ L2. Since en m = ūθ̃, it is also in L∞.
θ̂˙ = −γ en m ū ∈ L2 ∩ L∞. I.e., speed of adaptation is bounded in the L2 and L∞ sense.
d/dt (en m) = d/dt (θ̃ū) = θ̃˙ū + θ̃ū˙. By Barbalat, en m → 0 under the additional assumption that
ū˙ ∈ L∞ (which we also needed in the stable static case, but not in the stable dynamic case). This
would in turn imply θ̃˙ → 0.
We see that basically, normalization lets us recover the main properties of the unnormalized
scheme, namely:

• bounded parameter estimates (or, what is the same, bounded parameter estimation errors)

• bounded speed of adaptation in L2 and L∞ sense

even though the input is no longer bounded. (But we no longer have convergence of the output
estimation error to 0.)
If m = 1 (no normalization), then the first item still holds (V̇ is still ≤ 0) but the second item
doesn’t. And we will see later that slow adaptation is important for stability of adaptive control
(when estimation is combined with certainty equivalence control).
Of course, without additional assumptions we cannot guarantee that θ̃ converges to 0 (or to
anything else).
In Example 6, we had
ẋ = −ax + bu
but we no longer want to assume a > 0 or u ∈ L∞ . Could try to use normalization, but it’s not as
straightforward: if we define
ū := u/m,     x̄ := x/m

where m is chosen so that ū ∈ L∞ , then we don’t have any simple relationship between ū and x̄.
We certainly don’t have
x̄˙ = −ax̄ + bū (WRONG)
because m is time-varying (it cannot be constant unless u ∈ L∞ in the first place) and so ṁ will
appear in x̄˙.
In [Ioannou-Sun, Sect. 4.3.2] an approach to this problem is proposed which relies on working
with a different state estimation error produced by a special system. Rather than trying to fix this
example, we now discuss how the general case can be made to resemble the static case of Example 5,
so that the previous design can be extended to it.
−→ Later we will see another approach, which does rely on estimator design (see indirect MRAC).
We will return to it again in switching adaptive control. What will happen is that estimator design
will be combined with control design, and stabilizing properties of the controller will let us show
boundedness of signals, something we cannot do here for arbitrary control inputs.

−→ Problem Set 3 is assigned.

6.3.1 Linear plant parameterizations (parametric models)

We now want to study a general SISO plant

ẋ = Ax + bu
y = cT x

where x ∈ Rn , the measured signals are u, y ∈ R, A is an unknown matrix, and b, c are unknown
vectors. It is more convenient for us here to represent it in the form

y (n) + an−1 y (n−1) + · · · + a0 y = bn−1 u(n−1) + · · · + b0 u (56)

(ignoring initial conditions). The numbers ai , bi are the coefficients of the denominator and the
numerator of the plant transfer function, respectively:

y(s) = [(bn−1 s^{n−1} + · · · + b1 s + b0) / (s^n + an−1 s^{n−1} + · · · + a1 s + a0)] u(s) = c^T (Is − A)^{−1} b u(s)

We can define

θ := (bn−1 , . . . , b0 , an−1 , . . . , a0 )T , Y := (u(n−1) , . . . , u, −y (n−1) , . . . , −y)T

and solve (56) for the highest derivative to get

y (n) = θT Y (57)

This already looks familiar, but the problem is that y^(n) and the vector Y cannot be obtained
without differentiation. To fix this, filter both sides of (57) with an n-th order stable filter

1/Λ(s) = 1/(s^n + λn−1 s^{n−1} + · · · + λ0)

In other words, consider

z := (1/Λ(s)) y^(n) = (s^n/Λ(s)) y

and

φ := (1/Λ(s)) Y = ((s^{n−1}/Λ(s)) u, . . . , (1/Λ(s)) u, −(s^{n−1}/Λ(s)) y, . . . , −(1/Λ(s)) y)^T     (58)
The vector φ is called the regressor vector. It is clear that both the scalar z and the vector φ can be
generated as outputs of suitable causal linear systems with inputs u and y. (Realizations of proper
transfer functions are covered in ECE 515.) They are related by

z = θT φ (59)

which is the parameterization we were looking for. It is called a linear parametric model because
the unknown parameters enter it linearly (this is unfortunately crucial for most adaptive laws to
work6 ).
−→ The number of parameters in the vector θ is at most 2n, whereas the number of original
parameters (entries of A, b, c) was n^2 + 2n. This difference is due to the fact that we are only
measuring the input/output data, not the state. Different plants with the same transfer function
are not distinguished by the parametric model (59).
−→ The initial condition x0 is ignored by (59). However, since Λ(s) is stable, the contribution
of x0 decays exponentially with time and doesn’t affect any of the results below. For details, see
[Ioannou-Sun, Sect. 4.3.7].

6.3.2 Gradient method

Estimate/prediction of z:
ẑ := θ̂T φ
where θ̂ is the estimate of θ as usual. Normalized estimation error:

en := (ẑ − z)/m^2

where

m^2 = 1 + ns^2     (60)
⁶ This restriction is lifted in switching adaptive control, to be discussed towards the end of the course.

and ns is a normalizing signal such that φ/m ∈ L∞. For example: ns = √(φ^T φ), or ns = √(φ^T P φ) for
some P = P^T > 0. Parameter estimation error:

θ̃ := θ̂ − θ

and we have

en = θ̃^T φ / m^2     (61)
Define the cost function

J(θ̂) := (en m)^2/2 = (θ̃^T φ)^2/(2m^2) = ((θ̂ − θ)^T φ)^2/(2m^2)

Intuition: en m = θ̃^T (φ/m), and the factor φ/m is bounded, so the cost is big when θ̃ is big.
Also note that J is convex in θ̂.

∇J(θ̂) = (θ̃^T φ / m^2) φ = en φ

Gradient adaptive law:

θ̂˙ = −Γ en φ

where Γ = Γ^T > 0 is a scaling matrix (adaptive gain).
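
To illustrate how the pieces fit together, here is a minimal numerical sketch (not from the notes; the first-order plant, the filter Λ(s) = s + λ, the adaptive gain, and the input are all illustrative assumptions) that builds z and φ by filtering as in (58) and runs the gradient adaptive law:

import numpy as np

# First-order plant ydot = -a*y + b*u, so theta = [b, a]^T in the parametric
# model (59). With Lambda(s) = s + lam:
#   phi = [ (1/(s+lam)) u , -(1/(s+lam)) y ],   z = (s/(s+lam)) y = y - lam*w,
# where w = (1/(s+lam)) y. All signals come from simple first-order filters.
a, b, lam, gamma = 2.0, 3.0, 1.0, 5.0
y, w = 0.0, 0.0
phi = np.zeros(2)
th = np.zeros(2)                          # estimate of theta = [b, a]
dt, T = 1e-3, 300.0
for k in range(int(T / dt)):
    t = k * dt
    u = np.sin(t) + np.sin(0.5 * t)       # two sinusoids: sufficiently rich here
    z = y - lam * w
    m2 = 1.0 + phi @ phi                  # m^2 = 1 + ns^2 with ns^2 = phi^T phi
    en = (th @ phi - z) / m2              # normalized estimation error
    # forward Euler for plant, filters, and adaptive law thetahat_dot = -gamma*en*phi
    dy, dw = -a * y + b * u, -lam * w + y
    dphi = np.array([-lam * phi[0] + u, -lam * phi[1] - y])
    dth = -gamma * en * phi
    y, w, phi, th = y + dt * dy, w + dt * dw, phi + dt * dphi, th + dt * dth
print("thetahat =", th, " (true theta =", [b, a], ")")
# With a PE regressor, thetahat should approach theta (cf. Theorem 5(ii) below).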

Theorem 5 (Ioannou-Sun, Theorem 4.3.2⁷)

(i) θ̂ ∈ L∞ and en, en ns, θ̂˙ ∈ L2 ∩ L∞.

(ii) If φ/m is PE, then θ̂ → θ exponentially fast.

For item (ii), the definition of a persistently exciting (PE) signal needs to be extended to vector-
valued signals. This is a direct extension of the earlier scalar definition (54) but instead of squaring
the signal we form a matrix by multiplying the vector signal on the right by its transpose, and the
inequality becomes a matrix one. In other words, φ/m is PE if for some α0, T0 > 0 we have

∫_t^{t+T0} φ(s)φ^T(s)/m^2 ds ≥ α0 T0 I     ∀ t

Note that φφT is a singular matrix at each time instant (it has rank 1). PE means that over time,
it varies in such a way that its integral is uniformly positive definite (φ “generates energy in all
directions”). When defining PE, one usually requires an upper bound as well as a lower bound:
α1 T0 I ≥ ∫_t^{t+T0} φ(s)φ^T(s)/m^2 ds ≥ α0 T0 I
⁷ The theorem statement there is different, but the same proof works. See also [Ioannou-Fidan, Theorem 3.6.1].

but since φ/m is bounded, the first inequality is automatic.
Proof of Theorem 5.
(i) Take the candidate Lyapunov function (cf. (50))

V(θ̃) := (1/2) θ̃^T Γ^{−1} θ̃     (62)

whose derivative is (recall that Γ is symmetric)

V̇ = θ̃^T Γ^{−1} θ̃˙ = −θ̃^T en φ = −en^2 m^2     (63)

where the last step follows from (61). Now the usual steps:
θ̃ ∈ L∞, hence by (61) en m ∈ L∞. By (60), m^2 = 1 + ns^2 and so en, en ns ∈ L∞.
Next, en m ∈ L2. Recalling (60) again, we get en, en ns ∈ L2.
Write

θ̂˙ = −Γ (en m) (φ/m)

and use en m ∈ L2 ∩ L∞ and φ/m ∈ L∞ to conclude that θ̂˙ ∈ L2 ∩ L∞.
(ii) Using (61), we can write

θ̃˙ = −Γ (φφ^T/m^2) θ̃ =: A(t)θ̃
en m = (φ^T/m) θ̃ =: c^T(t)θ̃     (64)
View this as an LTV system with state θ̃ and output en m. The right-hand side of (63) is minus
the output squared, which is exactly the situation considered in Section 2.2 (the slight difference
in notation is because the output in (64) is scalar: cT here corresponds to C there). By the
result stated there, the system (64) will be exponentially stable if we can show that this system is
uniformly completely observable (UCO).
To check UCO, we need to analyze the observability Gramian of (64). This looks complicated.
However, there is a trick:

Lemma 6 (Ioannou-Sun, Lemma 4.8.1) A pair (A(t), C(t)) is UCO if the pair (A(t)+L(t)C(t), C(t))
is UCO for some bounded “output injection” matrix L(t).

For LTI systems, this is clear from the rank condition for observability. For LTV systems, the
proof is harder. See [Ioannou-Sun] for details (the proof is also in the book by Sastry and Bodson).
L(t) doesn’t actually need to be bounded, but for us this is good enough.
Continuing with the proof of the Theorem: take

L(t) := Γ φ/m

This gives

A(t) + L(t)c^T(t) = 0

The observability Gramian of the new pair (0, φ^T/m) is

M(t0, t0 + T) = ∫_{t0}^{t0+T} φ(s)φ^T(s)/m^2 ds

and we see that the UCO condition is equivalent to the PE condition.


From the above proof we see another interpretation of the PE property: for a single-output
system

ẋ = 0
y = cT (t)x

the PE property of c(t) makes the system observable, i.e., we can recover x from y, even though x is
stationary. At any time t, cT (t)x gives information only about the component of x in the direction
of c(t); however, c(t) is moving around in Rn so that over a finite period of time, we get complete
information about x.
−→ Slow adaptation properties of parameter estimation schemes (adaptive laws) will be useful later
for stability analysis of adaptive control systems—this is why we care about things like θ̂˙ ∈ L2.
Also will be relevant for ISS modular design.
Let us take a step back to the static example

y(t) = θu(t)

and recap the gradient method. We have been working with the instantaneous cost, initially defined
as

J(θ̂(t), t) := e^2(t)/2 = (θ̂(t)u(t) − y(t))^2/2
whose gradient is
∇J(θ̂(t), t) = (θ̂(t)u(t) − y(t))u(t)
The graph of J as a function of θ̂ is a parabola which is moving with time (due to the changing
input and output). At each t, the minimum is given by

θ̂(t) = y(t)/u(t)

Ideally, we’d like to jump to this minimum right away and follow it as it evolves with time. However,
we can’t really do this. First, it is defined only when u(t) 6= 0. Second, this is really sensitive to
noise. And in the vector case
z(t) = θT φ(t) (65)

we can never define θ̂(t) by a simple division. So, instead we slide down along the gradient. This
takes the data into account over time. It’s also clear from this why we need signal boundedness (so
that the graph doesn’t expand infinitely), and need to work with normalized data if we don’t have
boundedness.
Now, it would be nice to have a different cost for which we could actually compute the minimum
in a well-defined way for each t and follow it as time evolves. Since this doesn’t work for the
instantaneous cost, we need a cost which, at each time t, takes into account past data up to time
t. We can think of this data as a curve (or a set of points) in the u, y plane, and what we’re trying
to do is really just find a line that best fits this data. Assuming there is no noise, the data would
be exactly on a line, and we need to find the slope of this line.
How do we do this?

6.3.3 Least squares

Collect the data into an integral cost, e.g.,

J(θ̂(t), t) := (1/2) ∫_0^t (θ̂(t)u(s) − y(s))^2 ds

(Note that the argument of θ̂ inside the integral is t, not s!) This is still convex in θ̂, and its
minimum is where the gradient vanishes:

∇J(θ̂) = ∫_0^t (θ̂(t)u(s) − y(s)) u(s) ds = θ̂(t) ∫_0^t u^2(s) ds − ∫_0^t u(s)y(s) ds

which equals 0 when

θ̂(t) = [∫_0^t u(s)y(s) ds] / [∫_0^t u^2(s) ds]

We see that this is well-defined even if u occasionally equals 0, as long as it is not identically 0 on
[0, t]. As new data comes, this minimum changes, and it is easy to derive a DE to update it. We’ll
do this below for the general case.
With a slight modification, we can make the least squares method work for the general case
of (65) as well:
J(θ̂) := (1/2) ∫_0^t [(θ̂^T(t)φ(s) − z(s))^2 / m^2(s)] ds + (1/2)(θ̂ − θ̂0)^T Q0 (θ̂ − θ̂0)

(the numerator inside the integral is (en m)^2 evaluated with the current estimate θ̂(t))

where Q0 = QT0 > 0. Here m2 = 1 + n2s as before. The last term penalizes deviations from the
initial estimate, and ensures that the least-squares estimate is well defined as we’ll see in a minute.

The gradient is

∇J(θ̂) = ∫_0^t [(θ̂^T(t)φ(s) − z(s))/m^2(s)] φ(s) ds + Q0 (θ̂ − θ̂0)
       = (∫_0^t φ(s)φ^T(s)/m^2(s) ds + Q0) θ̂ − (∫_0^t z(s)φ(s)/m^2(s) ds + Q0 θ̂0)

so the minimum is at

θ̂(t) = (∫_0^t φ(s)φ^T(s)/m^2(s) ds + Q0)^{−1} (∫_0^t z(s)φ(s)/m^2(s) ds + Q0 θ̂0)

where the matrix inverse in the first factor is denoted by P(t).

The matrix P(t) is well defined because φφ^T ≥ 0 and Q0 > 0 (that's why we needed it).
We want to express this recursively, to avoid computing the inverse and run a DE instead. We
have

Ṗ = −P (φφ^T/m^2) P

(derivation: 0 = d/dt (P P^{−1}) = Ṗ P^{−1} + P φφ^T/m^2). The differential equation for θ̂ is

θ̂˙ = −P (φφ^T/m^2) P (∫_0^t z(s)φ(s)/m^2(s) ds + Q0 θ̂0) + P zφ/m^2 = P [(z − θ̂^T φ)/m^2] φ = −P en φ

(here we used the fact that P(∫_0^t z(s)φ(s)/m^2(s) ds + Q0 θ̂0) is just θ̂).

So, the continuous-time recursive least-squares algorithm is:

θ̂˙ = −P en φ
Ṗ = −P (φφ^T/m^2) P
This is very similar to the Kalman filter. P is the “covariance matrix” (this is all deterministic,
though).
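A minimal numerical sketch of this algorithm (not from the notes; the regressor trajectory, P(0), and the true θ are illustrative assumptions), run on synthetic data z = θ^T φ:

import numpy as np

# Continuous-time recursive least squares: thetahat_dot = -P*en*phi,
# Pdot = -P*(phi*phi^T/m^2)*P, integrated by forward Euler on synthetic data.
theta = np.array([1.0, -2.0])             # true parameters (unknown to the algorithm)
th = np.zeros(2)                          # thetahat(0)
P = 10.0 * np.eye(2)                      # P(0) = Q0^{-1} with Q0 = 0.1*I
dt, T = 1e-3, 50.0
for k in range(int(T / dt)):
    t = k * dt
    phi = np.array([np.sin(t), np.cos(2 * t)])    # a persistently exciting regressor
    z = theta @ phi                               # parametric model z = theta^T phi
    m2 = 1.0 + phi @ phi
    en = (th @ phi - z) / m2                      # normalized estimation error
    th = th + dt * (-P @ phi * en)
    P = P - dt * (P @ np.outer(phi, phi) @ P) / m2
print("thetahat =", th, " (true =", theta, ")")
# Consistent with Theorem 7 below: P decreases, thetahat converges, and with this
# PE regressor the limit is theta.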

Theorem 7 (Ioannou-Sun, Theorem 4.3.4; Ioannou-Fidan, Theorem 3.7.2)

(i) θ̂, P ∈ L∞ and en, en ns, θ̂˙ ∈ L2 ∩ L∞.

(ii) θ̂(t) → θ̄ where θ̄ is some constant vector.

(iii) If φ/m is PE, then θ̄ = θ.

Item (ii) is a unique property of least squares. On the other hand, note that convergence in
(iii) is not claimed to be exponential.
Proof.

(i), (ii) P is positive definite and Ṗ ≤ 0 =⇒ P is bounded and has a limit. With the usual
definition θ̃ := θ̂ − θ we have

d/dt (P^{−1} θ̃) = (φφ^T/m^2) θ̃ − P^{−1} P en φ = 0     (66)

(using en = φ^T θ̃/m^2)

which means that

P^{−1}(t)θ̃(t) ≡ P^{−1}(0)θ̃(0)     =⇒     θ̃(t) = P(t)P^{−1}(0)θ̃(0)

Hence θ̃ also is bounded and has a limit, and thus so does θ̂. The other claims in (i) are shown
using the Lyapunov function
V(θ̃) := (1/2) θ̃^T P^{−1} θ̃
in the same way as in the proof of Theorem 5 (do this yourself and check [Ioannou-Sun]).
(iii) We want to show that P^{−1} → ∞ when φ/m is PE; this would force θ̃ to go to 0 by (66) and
we would be done. Integrating

d/dt (P^{−1}) = φφ^T/m^2

and using the definition of PE, we have

P^{−1}(T0) − P^{−1}(0) = ∫_0^{T0} φφ^T/m^2 ds ≥ α0 T0 I

Similarly,
P^{−1}(2T0) − P^{−1}(T0) ≥ α0 T0 I

and we see that P^{−1} indeed increases to ∞.
There are various modifications to the above scheme. For example, one could introduce a
“forgetting factor”:
J(θ̂) := (1/2) ∫_0^t e^{−λ(t−s)} [(θ̂^T(t)φ(s) − z(s))^2 / m^2(s)] ds + (1/2) e^{−λt} (θ̂ − θ̂0)^T Q0 (θ̂ − θ̂0)

where λ > 0. Then one can show that convergence of θ̂ to θ in (iii) becomes exponential. The DE
for P changes to
Ṗ = λP − P (φφ^T/m^2) P
which actually prevents P from becoming arbitrarily small and slowing down adaptation too much,
something that happens with pure least squares where P −1 grows without bound. However, items
(i) and (ii) will no longer be true.

The above integral cost with forgetting factor (and Q0 = 0) could also be used in the gradient
method instead of the instantaneous cost. Although the gradient expression for it is more com-
plicated, the resulting gradient adaptive law has the same properties as the previous one, and in
addition one can prove that θ̂˙ → 0. See [Ioannou-Sun, Sect. 4.3.5] for details.
−→ So, the main difference between the gradient and least squares methods is not so much in the
cost used, but in the method itself (moving along the negative gradient of a time-varying function,
versus following its exact minimum).

6.3.4 Projection

In the above, we were assuming that θ ∈ Rm is arbitrary, i.e., completely unknown. In practice, we
usually have some set S which we know contains θ, and we can take this knowledge into account
when designing an adaptive law. The basic idea is to use projection to ensure that θ̂(t) ∈ S ∀ t.
This is a good idea for two reasons:

• Improves convergence and transient behavior of the estimation scheme.

• Helps avoid estimated plant models which are not suitable for control purposes; e.g, for some
values of θ̂ the model may not be stabilizable (loss of stabilizability problem) and certainty
equivalence-based design breaks down. (More on this later.)

For example, suppose that we have some cost J(θ̂) and want to implement the gradient law

θ̂˙ = −∇J(θ̂)

but want to use the fact that θ ∈ S where S is given by

S = {θ : g(θ) = 0}

for some (nice) scalar-valued function g. This captures cases where θ belongs to some known
subspace. Given a θ̂ ∈ S, we can split −∇J(θ̂) into a sum of two terms, one normal to S and the
other tangent to S:
−∇J(θ̂) = α∇g(θ̂) + Pr(θ̂)     (67)

where α∇g(θ̂) is the normal component and Pr(θ̂) is the tangent component.

Then we will discard the normal direction and just keep the tangent direction. This will ensure
that θ̂ travels along S.
But how do we calculate Pr(θ̂)? For this we need to know α. Multiply both sides of (67) by
the normal, i.e., (∇g(θ̂))T :

−∇g^T ∇J = α ∇g^T ∇g     =⇒     α = −(∇g^T ∇J)/(∇g^T ∇g)

and so

Pr = −∇J + [(∇g^T ∇J)/(∇g^T ∇g)] ∇g = −(I − ∇g∇g^T/(∇g^T ∇g)) ∇J
This gives us the projected gradient law.
More generally, if we started with the scaled gradient law

θ̂˙ = −Γ∇J(θ̂)

(which is a coordinate change away from the previous one) then its projected version can be shown
to be

θ̂˙ = −(I − Γ ∇g∇g^T/(∇g^T Γ ∇g)) Γ∇J(θ̂)

It is easy to extend this idea to the case when the set S is given by an inequality constraint:

S = {θ : g(θ) ≤ 0}

For example, S could be a ball. Start with θ̂0 ∈ S. If θ̂(t) is in the interior of S, or it’s on the
boundary but −Γ∇J(θ̂) points inside S, then follow the usual gradient method. If θ̂(t) is on the
boundary and −Γ∇J(θ̂) points outside, then apply the projection.
−→ If S is a convex set, then the projected gradient adaptive law retains all properties of the
unprojected gradient adaptive law established in Theorem 5 (and in addition maintains θ̂ in S by
construction).
This is true because when we subtract a term proportional to ∇g(θ̂) from θ̂˙, this can only
make the Lyapunov function (62) decrease more. When the sublevel sets of g are convex, the
contribution of −∇g(θ̂) to V̇ is negative basically for the same reason that the contribution of
−∇J(θ̂) was negative for the convex cost function J. See [Ioannou-Sun, Sect. 4.4] for details.
The projection modification can also be applied to the least squares method.
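
As an illustration, here is a minimal sketch (not from the notes; the constraint set S = {θ : |θ| ≤ R}, i.e. g(θ) = θ^T θ − R^2, and the test values are illustrative assumptions) of the projected update direction described above:

import numpy as np

# Projection modification for S = {theta : |theta| <= R}. Given the unconstrained
# update direction v = -Gamma*grad(J), the projected direction discards the
# component pointing out of S at the boundary.
def projected_direction(thetahat, v, Gamma, R):
    g = thetahat @ thetahat - R**2
    grad_g = 2.0 * thetahat
    if g < 0.0 or v @ grad_g <= 0.0:      # interior of S, or already pointing inside
        return v
    # on the boundary and pointing outward: remove the outward component
    return v - Gamma @ np.outer(grad_g, grad_g) @ v / (grad_g @ Gamma @ grad_g)

Gamma, R = np.eye(2), 1.0
thetahat = np.array([1.0, 0.0])           # on the boundary of S
v_out = np.array([0.5, 0.3])              # points out of the ball
v_in = np.array([-0.5, 0.3])              # points into the ball
print(projected_direction(thetahat, v_out, Gamma, R))   # tangential: [0, 0.3]
print(projected_direction(thetahat, v_in, Gamma, R))    # unchanged:  [-0.5, 0.3]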

6.4 Sufficiently rich signals and parameter identification


We’ve been studying the parametric model

z = θT φ (68)

and have developed parameter estimation schemes which guarantee parameter convergence (θ̂ → θ,
exponentially except for pure least squares, need forgetting factor) if φ is PE. The regressor vector
φ contains (filtered version of) u, y, and their derivatives, see (58). We can write this as

φ = H(s)u

But the only signal we have direct control over is u.



Basic idea:
u sufficiently rich =⇒ φ is PE =⇒ θ̂(t) → θ (69)
We want to understand what “sufficiently rich” means in various cases.

Example 7 (we already had this earlier) Scalar plant:

ẏ = −ay + bu

Assume a > 0 and u ∈ L∞ . In the parametric model for this example, we have
     
z = ẏ ,     θ = (b, a)^T ,     φ = (u, −y)^T = (1, −b/(s + a))^T u

(we ignore the filtering).


What would be a good input to inject?
Let’s try a constant input u = c:

ẏ = −ay + bc = −a(y − bc/a)

hence

y(t) = bc/a + e^{−at}(y0 − bc/a)
Can we recover both parameters a and b by observing y?
No. The easiest way to see this is to consider the case when y0 = bc/a; then y(t) = bc/a and this only carries information about the ratio b/a.
For other initial conditions, we do have information on a itself but it disappears exponentially fast and in steady state y only carries information about b/a.
Accordingly, we can show that
φ = ( c ,  −bc/a − e^{−at}(y0 − bc/a) )^T

is not PE. Think about it: when we form the matrix φ(t)φT (t), it is the sum of two terms. One
is constant and singular. The other decays exponentially to 0. So the definition of PE cannot be
satisfied.
What would be a better input? How does one in general identify the frequency response of a system with transfer function g(s)? (In our case, g(s) = b/(s + a).)
Let’s try
u(t) = sin ωt

where ω is some constant frequency. In steady state (i.e., modulo an exponentially decaying tran-
sient), the corresponding output is

y(t) = Re g(jω) sin ωt + Im g(jω) cos ωt = A sin(ωt + α)

where

A := |g(jω)| = |b/(jω + a)| = |b|/√(ω² + a²) ,     α := tan⁻¹( Im g(jω) / Re g(jω) ) = −tan⁻¹(ω/a)     (70)

where the second formula is valid if b > 0; otherwise add 180◦ to α.


By observing y we can get A and α from which, using algebraic operations, we can recover a
and b. (Note that now the information doesn’t disappear with time).
Accordingly,

φ = ( sin ωt ,  −A sin(ωt + α) )^T
can be shown to be PE; see homework.
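−→ (Added illustration, not part of the original notes.) A quick Python/NumPy/SciPy check, with made-up values of a, b, ω, that a single sinusoid lets us recover both parameters from the steady-state amplitude and phase in (70), and that the corresponding regressor satisfies the PE inequality on a window:

    import numpy as np
    from scipy.integrate import solve_ivp

    a, b, w = 2.0, 3.0, 1.0
    sol = solve_ivp(lambda t, y: [-a*y[0] + b*np.sin(w*t)], [0, 60], [0.0],
                    dense_output=True, max_step=0.01)

    # invert the formulas (70): A = |b|/sqrt(w^2 + a^2), alpha = -atan(w/a)
    A = abs(b)/np.sqrt(w**2 + a**2)
    alpha = -np.arctan(w/a)
    a_rec = w/np.tan(-alpha)
    b_rec = A*np.sqrt(w**2 + a_rec**2)
    print(a_rec, b_rec)                       # recovers a = 2, b = 3

    # PE check: smallest eigenvalue of the matrix  int_{t0}^{t0+T} phi(s) phi(s)^T ds
    T, t0 = 20.0, 30.0                        # window taken late enough to be in steady state
    ts = np.linspace(t0, t0 + T, 4000)
    phi = np.vstack([np.sin(w*ts), -sol.sol(ts)[0]])
    M = (phi @ phi.T)*(ts[1] - ts[0])
    print(np.linalg.eigvalsh(M).min())        # strictly positive, consistent with phi being PE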

Example 8 Second-order plant:

y(s) = ( (b1 s + b0)/(s² + a1 s + a0) ) u(s)

where a0 , a1 > 0 to ensure stability. There are 4 unknown parameters in the transfer function now,
so a single sinusoid is not enough. But

u(t) = sin ω1 t + sin ω2 t,     ω1 ≠ ω2

gives
y(t) = A1 sin(ω1 t + α1 ) + A2 sin(ω2 t + α2 )
We now have 4 unknowns and 4 equations =⇒ OK.

Back to the general parametric model (68) where θ and φ both have dimension m.
Generically, u is sufficiently rich (in the sense of guaranteeing that φ is PE) if it contains at least m/2 distinct frequencies. This matches the above examples.
The above definition assumes that u is sinusoidal. We can make this more general, by saying
that the power spectrum contains at least m distinct points (power spectrum is symmetric).
“Generically” means modulo some degenerate situations, e.g., when frequencies ωi coincide with
zeros of H(s).

The proof (which we will not give) relies on frequency domain arguments. The basic idea is that
the integral appearing in the definition of PE is the autocovariance of φ, and it is related to the
power spectrum (or spectral measure) of φ via Fourier transform. See [Ioannou-Sun, Sect. 5.2.1]
for details.
Note: if the plant is partially known, then we need fewer distinct frequencies. Or, with fewer
frequencies we can partially identify the unknown plant.
Earlier, we described parameter estimation schemes of two kinds: using state-space models, and
using parametric models. Combining them with the discussion we just had, we can formally state
results on parameter convergence. Note that there are two steps in (69). The above discussion
was primarily about the first step (we did talk about recovering the system parameters but this
was only for simple examples, in general we need adaptive laws to do that). The second step was
addressed—for parametric models only—in Theorems 5 and 7.
For full-state measurements, we had

ẋ = Ax + Bu

where A is Hurwitz, x ∈ Rn , u ∈ L∞ . For simplicity assume that u is scalar (the multi-input


case is in [Ioannou-Sun]). The adaptive law design was given on pages 53–54, but we didn’t study
parameter convergence. We can now state the following property of that scheme.

Theorem 8 (Ioannou-Sun, Theorem 5.2.2) If (A, B) is a controllable pair and u contains at least (n + 1)/2 distinct frequencies, then the adaptive law discussed in class gives Â → A, B̂ → B (exponentially fast).

We don’t give a proof, but note that the number n + 1 comes from the fact that state dimension
is n plus we have 1 input. If there are uncontrollable modes, then they are not affected by u and
decay to 0, so the corresponding parameters cannot be identified.
For partial state measurements, we had

ẋ = Ax + bu
y = cT x

where A is Hurwitz, x ∈ Rn , and u, y ∈ R are bounded. We already said that, even though the
total number of parameters in the state model is n2 + 2n, the dimension of the parametric model
(the number of unknown coefficients in the transfer function) is at most m = 2n. We can only hope
to identify these 2n parameters. We have described adaptive laws based on the gradient method
and least-square method. Since we’re in the stable case here, we don’t need normalization.

Theorem 9 (Ioannou-Sun, Theorem 5.2.4) If the transfer function g(s) has no pole-zero can-
cellations and u contains at least n distinct frequencies, then the adaptive laws discussed in class
give θ̂ → θ (exponentially fast, except for the case of pure least squares).

This result matches the examples we had earlier. As for the previous result, we need fewer
frequencies there because we are observing more information (the whole state instead of output).
Some remarks on parameter estimation/identification:

• As we emphasized several times, parameter identification is not necessary for satisfactory


control (already saw in the first example).
• When considering control objectives such as stabilization, PE is actually in direct conflict
with control, since we don’t want to use inputs such as sinusoids.
• Even without PE assumptions, we know that parameter estimation schemes have useful
properties. Look, for example, at statement (i) of Theorem 5, and similar statements in
other results we had. These properties are directly useful in adaptive control, as we will see.
PE conditions, on the other hand, are not of primary importance for this course.
• There are, however, other control objectives such as reference tracking or model matching.
For these, satisfying PE conditions may sometimes be feasible.
• Adaptive control schemes based on estimating unknown plant parameters, and then using
them for control, are called indirect. In contrast, in direct adaptive control one works directly
with controller parameters: first, reparameterize the problem in terms of unknown desired
controller parameters, and then design an adaptive law for estimating them. (See the block
diagrams in [Ioannou-Sun, pp. 10–11].)

The next example, on model reference adaptive control, is taken from [Ioannou-Sun, pp. 320
and 397] and [Khalil, pp. 16 and 327]. It illustrates the last two points above. It also reinforces the
concept of PE and other constructions we saw above. This example will be used again later when we
study singular perturbations. See [Ioannou-Sun, Chap. 6] for more information on model-reference
adaptive control.
−→ This will also be the first time we combine parameter estimation with control (not counting
our early example which wasn’t really using an estimation scheme but an ad-hoc tuning law).

6.5 Case study: model reference adaptive control


Model reference adaptive control (MRAC): want the closed-loop system to reproduce the in-
put/output behavior of a given reference model, driven by the reference signal r.

Consider the scalar plant


ẏ = ay + bu
Here a is completely unknown, i.e., we don’t assume that we know its sign—hence we no longer
use the minus sign we had earlier. The control gain b is also unknown but we assume that we know
its sign; for concreteness, suppose b > 0.

[Block diagram: the reference r drives both the Controller–Plant loop (with control u and plant output y) and the Reference Model (with output ym); the error e = ym − y drives the Adaptation block, which adjusts the Controller.]

Reference model:
ẏm = −am ym + bm r (71)
We assume that am > 0 (reference model is stable) and r ∈ L∞ .
For control, we use a combination of feedforward and feedback of the form

u = −ky + lr

which gives
ẏ = (a − bk)y + blr
Hence to match the model, we need
a − bk = −am ,   bl = bm   ⇔   k = (a + am)/b ,   l = bm/b

These controller gains k, l are not implementable, since we don’t know a and b. Instead, we will
use
u = −k̂y + ˆlr (72)
where k̂, ˆl are estimates of k, l. There are two approaches to generating such an adaptive control
law: direct and indirect.

6.5.1 Direct MRAC

Let us rewrite the plant as


ẏ = −am y + bm r + b(u + ky − lr) (73)

This is a reparameterization which directly displays control parameters; note that it is bilinear in
the parameters, not linear. To check (73), just plug in the formulas for k and l.
Plugging the control law (72) into (73), we have

ẏ = −am y + bm r + b(−k̃y + ˜lr) (74)

where we defined the parameter errors

k̃ := k̂ − k, ˜l := ˆl − l

Define also the output tracking error


e := ym − y
Using (71) and (74), we compute:

ė = −am e + bk̃y − b˜lr

Let’s try the candidate Lyapunov function


 
V (e, k̃, l̃) := (1/2) ( e²/b + (1/γ)(k̃² + l̃²) )
where γ > 0 is arbitrary.
−→ The reason for division by b will be clear soon; recall that we assumed b > 0.
−→ Division by γ is just to have an extra degree of freedom; γ will play the same role in the
adaptive law as the scaling factor in the gradient method. (We didn’t use it in Lyapunov-based
design earlier, e.g., in (55), but we could have used it there.) We could be even more general and
divide k̃² by γ1 and l̃² by γ2 ≠ γ1 .
The derivative of V is
V̇ = −(am/b) e² + k̃ey − l̃er + (1/γ) k̃ k̃˙ + (1/γ) l̃ l̃˙

where k̃˙ = k̂˙ and l̃˙ = l̂˙ are yet to be specified. But now their choice is clear:

k̂˙ := −γey ,     l̂˙ := γer

which gives
V̇ = −(am/b) e²     (75)
Now the analysis follows the familiar steps:
e, k̃, ˜l are bounded.
e ∈ L2 .
We assumed that r is bounded and the reference model is stable. Hence ym is bounded, which
since e is bounded implies that y = ym − e is bounded. Hence, ė is bounded.

By Barbalat, e → 0. Also, k̂˙, l̂˙ are in L2 and converge to 0.
We showed that, for any bounded input, the plant output y asymptotically tracks the reference
model output ym , which was the original objective.
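−→ (Added illustration, not part of the original notes.) A short simulation sketch of this direct MRAC scheme in Python/NumPy/SciPy; the values of a, b, am, bm, γ and the reference are made up:

    import numpy as np
    from scipy.integrate import solve_ivp

    a, b = 1.0, 2.0                  # unknown to the controller (b > 0)
    am, bm, gamma = 3.0, 3.0, 5.0
    r = lambda t: np.sin(2.0*t)      # bounded reference

    def rhs(t, s):
        y, ym, khat, lhat = s
        u = -khat*y + lhat*r(t)      # control law (72)
        e = ym - y
        return [a*y + b*u,           # plant
                -am*ym + bm*r(t),    # reference model (71)
                -gamma*e*y,          # khat-dot
                gamma*e*r(t)]        # lhat-dot

    sol = solve_ivp(rhs, [0, 40], [1.0, 0.0, 0.0, 0.0], max_step=0.01)
    y, ym = sol.y[0], sol.y[1]
    print(np.abs(ym - y)[sol.t > 30].max())   # tracking error -> 0
    print(sol.y[2, -1], sol.y[3, -1])         # compare with k = (a+am)/b = 2, l = bm/b = 1.5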
Do we know that k̂, ˆl converge to k, l?
Need a PE condition. More precisely, we need the reference signal r to be sufficiently rich.
Since the plant is scalar, we guess that one sinusoid will do:

r = sin ωt

which in steady state gives the output⁸

y = A sin(ωt + α)

where A and α are given in (70). Collecting the above equations, we can write the resulting system
in (e, k̃, ˜l) coordinates as an LTV system:
      
(ė, k̃˙, l̃˙)^T = [ −am , bA sin(ωt + α) , −b sin ωt ;  −γA sin(ωt + α) , 0 , 0 ;  γ sin ωt , 0 , 0 ] (e, k̃, l̃)^T =: A(t) (e, k̃, l̃)^T

Looking at (75), we see that we’ll have exponential stability of this system—and thus exponential
convergence of the controller parameter estimates k̂ and ˆl to their true values—if we can show that
the system is UCO with respect to the output

y := e = [ 1  0  0 ] (e, k̃, l̃)^T =: C (e, k̃, l̃)^T

We also know the trick of passing from (A(t), C) to (A(t) + L(t)C, C), see Lemma 6. It is easy to
see that by proper choice of L(t), we can get
 
A(t) + L(t)C = [ −am , bA sin(ωt + α) , −b sin ωt ;  0 , 0 , 0 ;  0 , 0 , 0 ]

(we could also kill −am if we wanted, but it helps to keep it). Still, showing UCO (defined in
Section 2.2) looks complicated because we need to compute the transition matrix of A(t) + L(t)C,
and it’s not clear how to do that.
⁸Note that, first, the reference model is stable so the effects of its initial conditions vanish with time, and second, the steady-state responses of the plant and of the reference model are the same because ym − y = e → 0. So, what we’re doing is ignoring some terms that go to 0 with time, and it can be shown that the presence of these terms doesn’t affect the stability which we’re about to establish without these terms (cf. [Khalil, Example 9.6]).

−→ By the way, it is straightforward to check that in case r were constant instead of sinusoidal,
and hence y were also constant in steady state, the above (LTI) pair would not be observable.
Consider the system ẋ = (A + LC)x, y = Cx (note: this y is not to be confused with the
original plant output y, we are considering an abstract system here). To investigate its UCO, note
that the observability Gramian M (t0 , t0 + T ) by definition satisfies
∫_{t0}^{t0+T} |y(t)|² dt = x(t0)^T M (t0 , t0 + T ) x(t0)

hence all we need to show is that


∫_{t0}^{t0+T} |y(t)|² dt ≥ α1 |x(t0)|²     (76)

for some α1 > 0. But we have y = x1 and

ẏ = −am y + bA sin(ωt + α)x2 − b sin ωtx3

and also
ẋ2 = ẋ3 = 0 =⇒ x2 , x3 = const
(note that x2 , x3 no longer correspond to k̃, ˜l because we’re working with an auxiliary system
obtained by output injection).
We can now easily solve for y:
y(t) = e^{−am(t−t0)} y(t0) + ( ∫_{t0}^{t} e^{−am(t−s)} bA sin(ωs + α) ds ) x2 − ( ∫_{t0}^{t} e^{−am(t−s)} b sin(ωs) ds ) x3

We know that the vector

( A sin(ωt + α) ,  sin ωt )^T
is PE. The integral terms are outputs of the stable linear system ẏ = −am y + u driven (componen-
twise) by this PE input vector, and thus can be shown to form a PE vector as well. With some
calculations, the desired property (76) follows. For details, see [Ioannou-Sun, Lemmas 4.8.3 and
4.8.4].
So, the conclusion is that output tracking is achieved for any bounded reference, and in addition
for sinusoidal references the controller parameters are identified.

6.5.2 Indirect MRAC

We still want to use a control law of the form

u = −k̂y + ˆlr (77)



but instead of defining k̂, ˆl directly as estimates of k and l, we’ll now define them indirectly via
estimates â, b̂ of the plant parameters a, b. Earlier we derived the ideal control parameters to be
k = (a + am)/b ,     l = bm/b

so we set

k̂ := (â + am)/b̂ ,     l̂ := bm/b̂     (78)
To generate â and b̂, we can follow a familiar scheme discussed earlier—cf. Example 6 in Section 6.2.
Comparing with Example 6, we know that we would like to run the estimator

ŷ˙ = −am (ŷ − y) + ây + b̂u

Note that we choose the damping rate of the estimator to be the same as in the reference model.
As a consequence, it turns out that we don’t actually need to run the estimator, and can just use
the state ym of the reference model instead of ŷ. Indeed, let’s rewrite the reference model like this:

ẏm = −am ym + bm r = −am (ym − y) − am y + bm r = −am (ym − y) + ây − (â + am)y + bm r

    = −am (ym − y) + ây + b̂( −((â + am)/b̂) y + (bm/b̂) r ) = −am (ym − y) + ây + b̂u     (79)

where the expression in the parentheses multiplying b̂ is exactly u.

(this is valid for the closed-loop system).


The tracking error is
e := ym − y
and its derivative is (using the final equation for ẏm above)

ė = ẏm − ẏ = −am e + ãy + b̃u (80)

where, as usual,
ã := â − a, b̃ := b̂ − b
The next step is Lyapunov analysis.
Candidate Lyapunov function:
 
V (e, ã, b̃) := (1/2) ( e² + (1/γ)(ã² + b̃²) )
where γ > 0 is arbitrary. This is the same V as we’ve been using earlier, with the parameter γ for
extra flexibility. Note that we don’t need the division by b as in the direct scheme, since we’re no
longer using the bilinear parameterization we used there and so b doesn’t multiply the parameter
errors in the ė equation. (In fact, we’re not seeing the need for assuming b > 0 yet, but we’ll see it
later.)
V̇ = −am e² + ãey + b̃eu + (1/γ) ã ã˙ + (1/γ) b̃ b̃˙

which makes the choice of adaptive law, i.e., the choice of ã˙ = â˙ and b̃˙ = b̂˙, obvious:

â˙ := −γey ,     b̂˙ := −γeu

This is very similar to what we had in the direct scheme, and it gives

V̇ = −am e2

Following the standard steps, we get e, â, b̂ ∈ L∞ , e ∈ L2 .


Need to look at ė, given by (80). If we can show that it’s bounded, then we’ll have e → 0 by
Barbalat, meaning that asymptotic tracking is achieved.
As before, since we assumed that r is bounded the reference model is stable, ym is bounded,
hence y = ym − e is bounded.
But we also need to know that u is bounded. It is given by (77), (78).
−→ Here comes the major difference with the direct case: unlike in the direct case, the Lyapunov
analysis doesn’t tell us anything about boundedness of k̂ and ˆl.
Is boundedness of u guaranteed in the present scheme?
In general, no, because there’s nothing to prevent b̂ from becoming arbitrarily close to 0, or
even hit 0. Then u will become unbounded, or will not even be defined.
To understand why b̂ = 0 is a problem, think about the plant model

ẏ = ây + b̂u

on which our certainty equivalence controller is based. When b̂ = 0, the model is not stabilizable,
and the procedure breaks down (u doesn’t exist). This is a well-known issue in indirect adaptive
control, known as the loss of stabilizability problem.
There are several approaches for dealing with loss of stabilizability; see [Ioannou-Sun, Sect.
7.6.2]. The simplest is based on projection and works if we have a bit more information on the
plant, as follows. We assumed that b > 0, i.e., we know that the actual plant is stabilizable.
Suppose that we know a constant b0 > 0 such that

b ≥ b0 (known)

Then, we can modify the adaptive law for b̂ by projecting it onto the interval [b0 , ∞):
b̂˙ = −γeu   if b̂ > b0 , or if b̂ = b0 and eu ≤ 0 ;          b̂˙ = 0   otherwise

(of course, we initialize b̂ at a value higher than b0 ).


The analysis should be slightly modified, because when b̂ stops, from the V̇ equation we have

V̇ = −am e2 + b̃eu

However, when we stop, we know by construction that b̂ = b0 and eu > 0. Since b ≥ b0 , at this
time we have b̃ = b̂ − b ≤ 0. Hence, the extra term in V̇ is nonpositive, and we’re still OK. And
now b̂ is bounded away from 0, hence k̂, ˆl are bounded, u is bounded, ė is bounded, and we can
apply Barbalat to conclude that e → 0 and tracking is achieved.
Also, â˙, b̂˙ are in L2 and converge to 0.
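−→ (Added illustration, not part of the original notes.) A simulation sketch of the indirect scheme with the projection of b̂ onto [b0, ∞), in Python/NumPy/SciPy with made-up numerical values:

    import numpy as np
    from scipy.integrate import solve_ivp

    a, b = 1.0, 2.0                    # unknown plant parameters (b > 0)
    am, bm, gamma, b0 = 3.0, 3.0, 2.0, 0.5
    r = lambda t: np.sin(2.0*t)

    def rhs(t, s):
        y, ym, ahat, bhat = s
        u = (-(ahat + am)*y + bm*r(t))/bhat     # khat = (ahat+am)/bhat, lhat = bm/bhat, cf. (77), (78)
        e = ym - y
        bdot = -gamma*e*u
        if bhat <= b0 and bdot < 0.0:           # projection: do not let bhat cross b0
            bdot = 0.0
        return [a*y + b*u, -am*ym + bm*r(t), -gamma*e*y, bdot]

    sol = solve_ivp(rhs, [0, 40], [1.0, 0.0, 0.0, 1.0], max_step=0.005)   # bhat(0) = 1 > b0
    print(np.abs(sol.y[1] - sol.y[0])[sol.t > 30].max())   # tracking error -> 0
    print(sol.y[2, -1], sol.y[3, -1])                      # estimates of a and b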
−→ Note that b̂ may or may not approach 0 in practice, we just can’t rule it out theoretically in the
absence of projection. In this example the loss of stabilizability issue is not really severe because
we know the sign of b. In problems where the sign of b is unknown (as we had earlier) and thus
b̂ might be initialized with the wrong sign, zero crossings for gradient adaptive laws are in fact
almost unavoidable. But in higher dimensions things are different.
−→ What we proved about exponential parameter convergence under sinusoidal reference signals
for direct MRAC is still true for indirect MRAC.
From this example, it may appear that direct MRAC is better because it doesn’t suffer from the
loss of stabilizability issue. However, in general this is not the case. The reason is that direct
MRAC requires us to come up with a controller reparameterization which can then be used to
design an adaptive law. In the above simple example this worked, although the direct controller
reparameterization was more complicated than the original plant parameterization (it was bilinear).
In general, direct schemes tend to be more complicated and apply to narrower classes of systems
than do indirect schemes. (This is true not only for MRAC.) See [Ioannou-Sun, Sect. 1.2.3] for a
detailed discussion. In this course, we’re primarily dealing with indirect adaptive control.
Note that the estimation scheme we used for indirect MRAC is quite different from the ones
we developed in Section 6.3, in several key aspects:

• It does not use normalization.

• It relies on an estimator/reference model (if we don’t ask for parameter identification, then
we can set r = 0 if we want) instead of plant parametric model.

• It guarantees that e → 0, something we didn’t have with the normalized adaptive laws based
on parametric models.

• Since the design relies on the matching between the estimator and the reference model, given
by (79), it is difficult to extend it to higher-order plants (actually, the issue is not so much the
plant order but its relative degree), while the earlier schemes relying on parametric models
work for general plants.

See [Ioannou-Sun, Chapter 6] for more information.



7 Input-to-state stability
7.1 Weakness of certainty equivalence
Reference: [KKK book, Sect. 5.1]
When using the certainty equivalence principle for control design, we substitute θ̂ for θ in the
controller. The resulting closed-loop system thus differs from the “ideal” one by terms involving
the parameter estimation error θ̃. We then hope that these terms do not have too much negative
effect on system stability. However, we haven’t really formally addressed this issue yet.
Ideally, we’d like to have modular design, i.e., formulate some design objectives for the controller
and the tuning law (tuner) separately so that, when combined using certainty equivalence, they
give stability. For example:
Controller is stabilizing when θ̃ = 0, and tuner guarantees bounded θ̃   =⇒   x is bounded     (81)

Another conjecture might be:


Controller is stabilizing when θ̃ = 0, and tuner guarantees θ̃ → 0   =⇒   x → 0     (82)

In our old Example 1, we had the plant

ẋ = θx + u

and the controller


u = −x − θ̂x
which give the closed-loop dynamics
ẋ = −x − θ̃x

Let’s view the error θ̃ as a (disturbance) input to this system. We’ve shown earlier that it is
bounded (but doesn’t necessarily converge to 0).
Is it true that a bounded θ̃ always leads to a bounded x for the above system?
No. For example, if θ̃ ≡ −2 then we have ẋ = x and x → ∞.
The tuning law that we used for Example 1, θ̂˙ = x², provides more than just boundedness of
θ̃. In fact, the Lyapunov analysis we had before implies that x → 0. Some other tuning law which
guarantees boundedness of θ̃ but doesn’t ensure its correct sign may not work, so we have to be
careful.
So the first property above, (81), is not true in general, as the above example shows.

What about the second one, (82)?


In the above example, it is clear that if θ̃ → 0 then x → 0 (regardless of the sign of θ̃). But
the above plant is linear. The next example shows that for nonlinear plants, much more drastic
behavior is possible: x may not converge to 0 and may not even be bounded despite the fact that
θ̃ → 0!

Example 9 Consider the scalar plant

ẋ = −θx2 + u

An obvious choice for the controller is

u = −x + θ̂x2

and the closed-loop system is


ẋ = −x + θ̃x2

Suppose that we were able to design a parameter identifier which gives

θ̃(t) = e−t θ̃0

i.e., θ̃ → 0 exponentially fast (we take the decay rate to be equal to 1 just for simplicity, this is not
important). This seems to be as good as it gets.
−→ We haven’t discussed parameter identification for nonlinear plants, so we just take such an
identifier as given and don’t worry about actually designing it. One example of such identifier
design is discussed in [KKK book, p. 187], see also the references cited there.
The closed-loop system is now
ẋ = −x + e−t θ̃0 x2 (83)

What can we then say about the closed-loop system (83)? Is x bounded? Does it converge to
0?
Claim: For some initial conditions, x escapes to ∞ in finite time!
One system that’s known to do that is

ẋ = x2

whose solution is
x(t) = x0 / (1 − x0 t)
and for x0 > 0 this is only defined on the finite time interval [0, 1/x0 ) and approaches ∞ as
t → 1/x0 . This behavior is due to the rapid nonlinear growth at infinity of the function f (x) = x2 .
It is clear that the system

ẋ = −x + 2x2 = −x + x2 + x2 (84)

also has finite escape time, for x0 large enough, because for large positive x the term −x is dominated
by one of the x2 terms. (This is despite the fact that locally around 0 this system is asymptotically
stable.)
But now we can also see that our closed-loop system (83) has finite escape time. The argument
is like this: let T be the time that it takes for the solution of (84) to escape to ∞ for a given x0 > 0.
Choose θ̃0 large enough so that
e−t θ̃0 ≥ 2 ∀ t ∈ [0, T ]
Then the corresponding solution of (83) is no smaller than that of (84) and hence it escapes to ∞
in time ≤ T .
In fact, one can check that the solution of (83) is given by the formula
x(t) = 2x0 / ( x0 θ̃0 e^{−t} + (2 − x0 θ̃0) e^{t} )

and we see that for x0 θ̃0 > 2 we have

x(t) → ∞   as   t → (1/2) log( x0 θ̃0 / (x0 θ̃0 − 2) )
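−→ (Added illustration, not part of the original notes.) A quick numerical confirmation in Python/NumPy/SciPy, with made-up values of x0 and θ̃0:

    import numpy as np
    from scipy.integrate import solve_ivp

    x0, theta0 = 2.0, 3.0                         # x0*theta0 = 6 > 2, so escape is predicted
    t_esc = 0.5*np.log(x0*theta0/(x0*theta0 - 2.0))
    print("predicted escape time:", t_esc)        # about 0.2027

    sol = solve_ivp(lambda t, x: [-x[0] + np.exp(-t)*theta0*x[0]**2],
                    [0, 1.0], [x0], max_step=1e-4)
    print("integration stalls near t =", sol.t[-1], "with x =", sol.y[0, -1])
    # the solution blows up near t_esc, even though theta_tilde(t) = exp(-t)*theta0 -> 0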

The above example highlights the challenge of nonlinear adaptive control and weakness of
certainty equivalence approach: bounded/converging estimation error does not guarantee bound-
edness/convergence of the closed-loop system.
For a modular design to work, we need to demand more from the controller. Namely, we
need the controller to possess some robustness with respect to disturbance inputs which in our
case correspond to the parameter estimation error θ̃. Such robustness properties are captured in
nonlinear system theory by the general concept of input-to-state stability (ISS) which is discussed
next.

7.2 Input-to-state stability and stabilization


References: [Khalil, Sect. 4.9]; [KKK book, Appendix C]; papers by Eduardo Sontag, in particular
his recent survey paper “Input to state stability: basic concepts and results” downloadable from his
website, as well as the paper “Universal construction of feedback laws achieving ISS and integral-
ISS disturbance attenuation” by Sontag, Wang and myself, Systems and Control Letters, vol. 4,
pp. 111-127, 2002.
First, we need to define a few useful function classes.
A function γ : [0, ∞) → [0, ∞) is said to be of class K if it is continuous, strictly increasing,
and γ(0) = 0.

If γ is also unbounded, then it is said to be of class K∞ . Example: γ(r) = cr for some c > 0.
A function β : [0, ∞) × [0, ∞) → [0, ∞) is said to be of class KL if β(·, t) is of class K for each
fixed t ≥ 0 and β(r, t) is decreasing to zero as t → ∞ for each fixed r ≥ 0. Example: β(r, t) = ce−λt r
for some c, λ > 0.
We will write β ∈ KL, γ ∈ K∞ to indicate that β is a class KL function and γ is a class K∞
function, respectively.
Now, consider a general nonlinear system
ẋ = f (x, d) (85)
where x ∈ Rn is the state and d ∈ Rℓ is the (disturbance) input. To ensure existence and uniqueness
of solutions, we assume that f is sufficiently nice (e.g., locally Lipschitz) and d is also sufficiently
nice (e.g., piecewise continuous).
According to [Sontag, 1989] the system (85) is called input-to-state stable (ISS) with respect
to d if for some functions β ∈ KL and γ ∈ K∞ , for every initial state x0 , and every input d the
corresponding solution of (85) satisfies the inequality

|x(t)| ≤ β(|x0 |, t) + γ( sup_{0≤s≤t} |d(s)| )     ∀ t ≥ 0
The above formula assumes that the initial time is t0 = 0. But since the system (85) is time-
invariant, it wouldn’t make any difference if we kept the initial time t0 general and wrote

|x(t)| ≤ β(|x0 |, t − t0) + γ( sup_{t0≤s≤t} |d(s)| )     ∀ t ≥ t0 ≥ 0.     (86)
−→ Also note that by causality, it makes no difference if we take sup over t0 ≤ s < ∞.
Let’s try to decipher what the ISS definition says. When there is no input, i.e., when d ≡ 0, it
reduces to
|x(t)| ≤ β(|x0 |, t)
This means that the state is upper-bounded by β(|x0 |, 0) at all times and converges to 0 as t → ∞
(because β is decreasing to 0 in the second argument). It turns out that this is exactly equivalent
to global asymptotic stability (GAS) of the unforced system
ẋ = f (x, 0)
In this case one says that (85) is 0-GAS.
A more standard definition of GAS is not in terms of class KL functions, but the two are
equivalent; see [Khalil, Definition 4.4 and Lemma 4.5]. In particular, global exponential stability
(GES) is defined as
|x(t)| ≤ ce−λt |x0 |
and the right-hand side is an example of a class KL function.
However, GAS and GES are internal stability notions while ISS is an external stability notion.
It says that in the presence of d, there’s another term in the upper bound for the state, and this
term depends on the size of the disturbance. The implications of ISS are:

• If d is bounded, then x is bounded.

• If d → 0, then x → 0.

The first one is obvious from the definition of ISS. The second one holds because we can always
“restart” the system after d becomes small, and use (86). We already used this trick earlier to
prove Fact 3 in Section 3.3, see an earlier homework problem.
Some examples:
The linear system
ẋ = Ax + Bd
is ISS if (and only if) A is a Hurwitz matrix. In other words, for linear systems internal stability
(stability of ẋ = Ax) automatically implies external stability (ISS). We have

|x(t)| ≤ ce−λt |x0 | + b · supt0 ≤s≤t |d(s)|

for suitable constants b, c, λ > 0. We actually already discussed this before (cf. Fact 2 in Section 3.3,
also homework).
For nonlinear systems, it’s no longer true that internal stability (GAS for d ≡ 0) implies ISS.
We already saw this in the previous subsection. The system

ẋ = −x + xd

fails the first bullet item above (just set d ≡ 2). The system

ẋ = −x + x2 d

fails both bullet items above—in fact, it fails miserably: not only does x not converge, not only is
it unbounded, but it escapes to infinity in finite time. And all this for exponentially converging d,
and despite the fact that for d ≡ 0 we have the nice stable linear system ẋ = −x.
It is possible to construct examples with less drastic behavior which fail the two bullet items
above.
From the above discussion it is clear that a new theory is needed to study ISS. We will not
pursue this in detail here, but we will mention a few basic results.
The ISS property admits the following Lyapunov-like equivalent characterization: the sys-
tem (85) is ISS if and only if there exists a positive definite radially unbounded C 1 function
V : Rn → R such that for some class K∞ function ρ we have
|x| ≥ ρ(|d|)   =⇒   (∂V/∂x) · f (x, d) < 0     ∀ x ≠ 0, ∀ d     (87)
(to be read like this: for all x 6= 0 and all d, the implication holds). Such a function V is called an
ISS-Lyapunov function.
Idea of the proof that the existence of an ISS-Lyapunov function implies ISS:

• As long as |x| ≥ ρ(|d|), we have V̇ < 0 and the system behaves like a usual GAS system, i.e.,
it converges towards the origin. During this period of time, we have an estimate of the form

|x(t)| ≤ β(|x0 |, t)

• Assume that d is bounded (for unbounded d the ISS estimate gives no finite bound and so
there’s nothing to prove). Once we enter a level set of V superscribed around the ball of
radius

ρ( sup_{t0≤s&lt;∞} |d(s)| )

we may no longer have V̇ < 0, but we know that we cannot exit this level set because outside
it, V̇ < 0 and we are pushed back in. So, from this time onward, we satisfy the bound of
the form

|x(t)| ≤ γ( sup_{t0≤s&lt;∞} |d(s)| )
Here γ is obtained from ρ and V (geometrically, the ball of radius γ is superscribed around
the level set of V which is in turn superscribed around the ball of radius ρ).

• Combining the two bounds—the one before we enter the level set of V and the one after—we
obtain ISS.

The proof of the converse implication is much more difficult.


Begin optional material
As we said, 0-GAS (GAS under zero input) does not imply ISS. However, 0-GAS does imply ISS-like
properties for sufficiently small inputs.
To see why, let V be a Lyapunov function for ẋ = f (x, 0), and write

(∂V/∂x) · f (x, d) = (∂V/∂x) · f (x, 0) + (∂V/∂x) · (f (x, d) − f (x, 0)) ≤ −W (x) + |∂V/∂x| L|d|

where W is positive definite and L is the Lipschitz constant for f on a region containing the initial
condition x0 . It is now easy to see that for d small enough, the negative term dominates. We can
now establish the ISS estimate as in the previous argument.
So, the key thing about ISS is that the estimate (86) holds for arbitrary inputs, no matter how
large. This is a much stronger property and it doesn’t follow from 0-GAS.
End optional material
So, to check ISS, we need to look for a pair (V, ρ) satisfying (87). We’ll see examples of how to
do this very soon.
Suppose now that we have a system with both disturbance inputs and control inputs:

ẋ = f (x, d, u)

Then a natural problem to consider is to design a feedback law u = k(x) such that the closed-loop
system is ISS with respect to the disturbance d. Such a control law is called input-to-state stabilizing
(it attenuates the disturbance in the ISS sense).
Combining the notion of control Lyapunov function (CLF) discussed in Section 4.1 with the
above notion of ISS-Lyapunov function, we arrive at the following definition of ISS control Lyapunov
function (ISS-CLF):
 
|x| ≥ ρ(|d|)   =⇒   inf_u { (∂V/∂x) · f (x, d, u) } &lt; 0     ∀ x ≠ 0, ∀ d
where ρ ∈ K∞ .
Given an ISS-CLF, we want to have a systematic procedure—in fact, a universal formula—for
designing an input-to-state stabilizing controller, with V serving as an ISS-Lyapunov function for
the closed loop.
We remember from the earlier case of no disturbances that to get this, we need to impose an
affine structure on the system. Namely, let us assume that the right-hand side is affine in both⁹ u
and d:
ẋ = f (x) + L(x)d + G(x)u (88)
Then V is an ISS-CLF if
 
|x| ≥ ρ(|d|)   =⇒   inf_u { (∂V/∂x) · f (x) + (∂V/∂x) · L(x)d + (∂V/∂x) · G(x)u } &lt; 0     ∀ x ≠ 0, ∀ d

It is still not clear how to apply Sontag’s universal formula to this. First, the conditions are
stated differently than before (in terms of the implication). Second, the expression inside the inf
involves d, while we want the controller to be independent of d (which is usually not measured¹⁰).
(Recall the CLF setting: want to have inf_u {a(x) + b(x) · u} &lt; 0. Cannot define a in the obvious
way because it’ll depend on d.)
The trick is to realize that
(∂V/∂x) · L(x)d ≤ |L^T(x) ∂V/∂x| |d|
and that under the condition |x| ≥ ρ(|d|), the worst-case value of the above expression is
|L^T(x) ∂V/∂x| ρ⁻¹(|x|)
This is well-defined because ρ, being a class K∞ function, is invertible on [0, ∞) (since it’s strictly
increasing from 0 to ∞). Also, there does in fact exist an admissible disturbance for which the
above upper bound is achieved: just align d with the vector L^T(x) ∂V/∂x.
⁹Affine dependence on d is not necessary, but without it the construction is more complicated [L-Sontag-Wang]. Affine dependence on u is essential.
¹⁰One exception arises in switching adaptive control, to be discussed later, where the disturbance corresponds to the output estimation error which is available for control.

We conclude that the following is an equivalent characterization of an ISS-CLF for the affine
system (88):
 
inf_u { (∂V/∂x) · f (x) + |L^T(x) ∂V/∂x| ρ⁻¹(|x|) + (∂V/∂x) · G(x)u } &lt; 0     ∀ x ≠ 0

This looks more familiar. Defining


a(x) := (∂V/∂x) · f (x) + |L^T(x) ∂V/∂x| ρ⁻¹(|x|) ,     b(x) := G^T(x) ∂V/∂x

we can rewrite the property of being an ISS-CLF as

inf_u {a(x) + b(x) · u} &lt; 0     ∀ x ≠ 0

or, equivalently, as
|b(x)| = 0 =⇒ a(x) < 0
for all x ≠ 0, which is exactly (38). And neither a nor b depends on d.
Now, the desired input-to-state stabilizing feedback law u = k(x) is given by the universal
formula (39). (If one wants this feedback law to be smooth, then one needs to replace the second
term in a(x), which is in general just continuous, by a smooth approximation.)
Claim: The closed-loop system is ISS.
Indeed, the derivative of V along the closed-loop system is
V̇ = (∂V/∂x) · (f (x) + L(x)d + G(x)k(x))

and, whenever |x| ≥ ρ(|d|), we have

V̇ ≤ (∂V/∂x) · f (x) + |L^T(x) ∂V/∂x| ρ⁻¹(|x|) + (∂V/∂x) · G(x)k(x) = a(x) + b(x) · k(x) &lt; 0

for all x ≠ 0, where the last inequality (&lt; 0) is guaranteed by the universal formula. Thus V is an
ISS-Lyapunov function for the closed-loop system, which implies ISS by a result given earlier.
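−→ (Added illustration, not part of the original notes.) A scalar Python/NumPy/SciPy sketch of this construction for the made-up system ẋ = x² + xd + u, with V = x²/2 and ρ(r) = r, so that a(x) = x³ + |x|³ and b(x) = x, and u = k(x) given by the universal formula:

    import numpy as np
    from scipy.integrate import solve_ivp

    def k(x):
        a = x**3 + abs(x)**3          # a(x) = (dV/dx) f(x) + |L^T(x) dV/dx| rho^{-1}(|x|)
        b = x                         # b(x) = G^T(x) dV/dx
        if b == 0.0:
            return 0.0
        return -(a + np.sqrt(a**2 + b**4))/b**2 * b    # Sontag's universal formula

    d = lambda t: 2.0*np.sin(t)       # bounded disturbance
    sol = solve_ivp(lambda t, x: [x[0]**2 + x[0]*d(t) + k(x[0])],
                    [0, 20], [3.0], max_step=0.01)
    print(np.abs(sol.y[0]).max(), abs(sol.y[0][-1]))   # trajectory stays bounded (ISS)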
As a quick application of the ISS concept, consider again the system in normal form from
Section 3.2.1:
ξ̇1 = ξ2
ξ̇2 = ξ3
⋮
ξ̇r = b(ξ, η) + a(ξ, η)u
η̇ = q(ξ, η)

with a(ξ, η) ≠ 0 for all ξ, η and with globally asymptotically stable zero dynamics η̇ = q(0, η)
(minimum-phase property). We saw that the control (30) globally stabilizes the ξ-dynamics but
this is in general not enough to globally stabilize the whole system (“peaking phenomenon”).
However, if we strengthen the minimum-phase property by assuming that the η-dynamics are ISS
with respect to ξ, then all is well: the fact that ξ → 0 is then enough, as we know, to conclude that
η → 0 too. So, ISS of the η-dynamics is the right property to guarantee that any feedback that
globally stabilizes the ξ-subsystem also automatically globally stabilizes the entire system, and no
“peaking” occurs. We call systems with this ISS property strongly minimum phase.

7.2.1 ISS backstepping

Now we’d like to design an ISS controller for the system

ẋ = f (x) + L(x)d + G(x)ξ


ξ˙ = u

Before we do this, we need one more fact about ISS-Lyapunov functions. We defined ISS-
Lyapunov functions via
|x| ≥ ρ(|d|)   =⇒   (∂V/∂x) · f (x, d) &lt; 0     ∀ x ≠ 0, ∀ d

where ρ ∈ K∞ . We could also rewrite this more precisely as

|x| ≥ ρ(|d|)   =⇒   (∂V/∂x) · f (x, d) ≤ −α(|x|)     ∀ x, d     (89)
where α ∈ K∞ (the fact that we can take α to be of class K∞ , and not just positive definite, is not obvious, but this can always be achieved by modifying V ). Another equivalent characterization of ISS-Lyapunov functions is as follows:

(∂V/∂x) · f (x, d) ≤ −α(|x|) + χ(|d|)     ∀ x, d     (90)
where α, χ ∈ K∞ .
Begin optional material
Proof of the equivalence (sketch). It is not hard to obtain χ from ρ and vice versa:
Suppose that (90) holds. Rewrite it as
(∂V/∂x) · f (x, d) ≤ −(1/2)α(|x|) − (1/2)α(|x|) + χ(|d|)

from which we see that

|x| ≥ α⁻¹(2χ(|d|))   =⇒   (∂V/∂x) · f (x, d) ≤ −(1/2)α(|x|)

(In this case we get not α but (1/2)α on the right-hand side, but the constant 1/2 is arbitrary and it could be arbitrarily close to 1.)
Conversely, suppose that (89) holds. We only need to show (90) when

|x| ≤ ρ(|d|)

which can be done because

max_{|x|≤ρ(|d|)} { (∂V/∂x) · f (x, d) + α(|x|) }

is a function of d only and hence

χ(r) := max_{|d|≤r, |x|≤ρ(r)} { (∂V/∂x) · f (x, d) + α(|x|) }

is well defined and gives (90).


End optional material
Accordingly, we could have equivalently defined ISS-CLFs via
 
inf_u { (∂V/∂x) · f (x, d, u) } ≤ −α(|x|) + χ(|d|)     ∀ x, d
For applying the universal formula, this would not have been as convenient (the term χ(|d|) outside
the inf complicates things, it needs to be moved inside first). But for backstepping, this alterna-
tive formulation will be more convenient. More precisely, we will see that it’s not useful for the
initialization step of backstepping but useful for the recursion.
We’re now ready for ISS backstepping. Suppose that we’re given an ISS-CLF V0 (x) and a
corresponding control law k0 (x), smooth and satisfying k0 (0) = 0, for which we have
(∂V0/∂x) · (f (x) + L(x)d + G(x)k0 (x)) ≤ −α0 (|x|) + χ0 (|d|)
We claim that the “augmented” function defined in the usual way:
V1 (x, ξ) := V0 (x) + (1/2)|ξ − k0 (x)|²
is an ISS-CLF for the augmented system, and we’ll show this by explicitly constructing the new
input-to-state stabilizing control law k1 (x, ξ).
The derivative of V1 is given by (k0′ stands for the Jacobian matrix ∂k0/∂x)

V̇1 = (∂V0/∂x) · (f + Ld + Gk0) + (∂V0/∂x) · G(ξ − k0) + (ξ − k0)^T (u − k0′ f − k0′ Ld − k0′ Gξ)

where the first term is the “old” V̇0 , so

V̇1 ≤ −α0 (|x|) + χ0 (|d|) + (ξ − k0)^T ( u − k0′ f − k0′ Ld − k0′ Gξ + G^T ∂V0/∂x )

We can cancel all terms inside the parentheses, except k0′ Ld. But we can dominate this term
using square completion: define
k1 (x, ξ) := −(ξ − k0) + k0′ f + k0′ Gξ − G^T ∂V0/∂x − (k0′ L)(k0′ L)^T (ξ − k0)
Then we get
V̇1 ≤ −α0 (|x|) − |ξ − k0 (x)|² + χ0 (|d|) − (ξ − k0)^T (k0′ L)(k0′ L)^T (ξ − k0) − (ξ − k0)^T k0′ Ld

   = −α0 (|x|) − |ξ − k0 (x)|² + χ0 (|d|) − | (k0′ L)^T (ξ − k0) + (1/2)d |² + (1/4) d^T d

   ≤ −α0 (|x|) − |ξ − k0 (x)|² + χ0 (|d|) + (1/4) d^T d ≤ −α1 (|(x, ξ)|) + χ1 (|d|)

where

χ1 (|d|) := χ0 (|d|) + (1/4)|d|²
and α1 ∈ K∞ can be suitably defined since α0 (|x|) + |ξ − k0 (x)|2 is positive definite and radially
unbounded as a function of (x, ξ).
As before, we can apply the above backstepping procedure recursively to handle a chain of
integrators.
−→ For initializing the backstepping procedure, it is useful to know that any scalar affine system
ẋ = f (x) + ℓ(x)d + g(x)u
with g(x) ≠ 0 is input-to-state stabilized by the feedback

u = (1/g(x)) ( −f (x) − x − |ℓ(x)|x )
Indeed, the closed-loop system is
ẋ = −x + ℓ(x)d − |ℓ(x)|x
and for
V (x) = x²/2
we have
V̇ = −x2 + ℓ(x)dx − |ℓ(x)|x2 ≤ −x2 + |ℓ(x)||d||x| − |ℓ(x)|x2 = −x2 − |ℓ(x)||x|(|x| − |d|)
≤ −x2 if |x| ≥ |d|
hence V is an ISS-Lyapunov function.
(We also know that in general, the assumption g(x) 6= 0 is not necessary for V to be an
ISS-CLF.)
As before, this can be generalized to strict feedback systems (cf. earlier homework).

7.3 Adaptive ISS controller design


Reference: [KKK book, Section 5.2]
We now go back to adaptive control and continue the discussion we started in Section 7.1.
There, the disturbance was the parameter estimation error. So, we’ll now try to adopt the above
general theory of ISS controller design to this more concrete scenario.
−→ We will confine ourselves to a specific example. The general formulas are quite similar to those
we derived for the non-adaptive case. The example will be enough to bring out the differences
between the non-adaptive and adaptive cases.
Consider again the scalar plant from Example 9:

ẋ = −θx2 + u

We saw that if we use the controller


u = −x + θ̂x2
then the closed-loop system
ẋ = −x + θ̃x2
is not ISS with respect to the parameter estimation error θ̃. As a result, if θ̃ is bounded or even
converges to 0, convergence or even boundedness of x is not guaranteed. We recognize this as lack
of ISS property with respect to θ̃.
Can we find a different controller that does provide ISS with respect to θ̃?
We need to add some term that will dominate x2 for large x. With this intuition, one obvious
choice is
u = −x + θ̂x2 − x3
The closed-loop system is
ẋ = −x + θ̃x2 − x3
It is quite clear that solutions will no longer escape to ∞. But we need to prove ISS.
Since the system is scalar, we take the candidate ISS-Lyapunov function

V (x) = x²/2
for which
V̇ = −x2 + θ̃x3 − x4 ≤ −x2 + |θ̃||x3 | − x4 = −x2 − |x3 |(|x| − |θ̃|)
and we get
|x| ≥ |θ̃|   =⇒   V̇ ≤ −x² &lt; 0     ∀ x ≠ 0
This by definition shows that V is an ISS-Lyapunov function, hence the system is indeed ISS.
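−→ (Added illustration, not part of the original notes.) A Python/NumPy/SciPy sketch contrasting the two controllers for ẋ = −θx² + u, driven by the exponentially decaying estimation error of Example 9 (made-up values of θ̃0 and x0):

    import numpy as np
    from scipy.integrate import solve_ivp

    theta_tilde0, x0 = 3.0, 2.0
    th_tilde = lambda t: np.exp(-t)*theta_tilde0      # theta_hat(t) - theta

    def closed_loop(cubic_damping):
        # u = -x + theta_hat*x^2 (minus x^3 if cubic_damping), so xdot = -x + th_tilde*x^2 (- x^3)
        f = lambda t, x: [-x[0] + th_tilde(t)*x[0]**2 - (x[0]**3 if cubic_damping else 0.0)]
        return solve_ivp(f, [0, 5], [x0], max_step=1e-3)

    bad = closed_loop(False)    # certainty-equivalence controller: finite escape (integration stalls)
    good = closed_loop(True)    # ISS redesign: bounded and converging to 0
    print(bad.t[-1], bad.y[0, -1])
    print(good.t[-1], good.y[0, -1])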

Remark 1 We’re viewing θ̃ as an external input. However, once the tuning law which generates
θ̂ is in place, it is more accurate to think of θ̃ as an output of the overall adaptive system, and not
as an input. This is because θ̃ is determined by θ and θ̂, which are the internal system parameters.
Then, a more accurate term is output-to-state stability (OSS), which is a form of detectability (small
output implies small state, etc.) This doesn’t change much as far as the theory goes, but it’s helpful
for achieving conceptual clarity. Detectability is very important in the adaptive control context,
we’ll see more on this in Section 8.3 and Section 9.

7.3.1 Adaptive ISS backstepping

To go beyond scalar systems, we can again apply backstepping. More precisely, we need to de-
velop an adaptive version of ISS backstepping discussed above. As in the case of usual (not ISS)
backstepping, discussed in Section 5, the basic idea will carry over but there will be some differences.
Let us see how it works by adding an integrator to the previous system:

ẋ = −θx2 + ξ
ξ˙ = u

We already did the initial step, which we reproduce below with appropriate notation:

u = −x + θ̂x2 − x3 =: k0 (x, θ̂)


V0 (x) = x²/2

(∂V0/∂x)(−θx² + k0) = x(−x + θ̃x² − x³) ≤ −x² − |x³|(|x| − |θ̃|) ≤ −x²   if |x| ≥ |θ̃|
But we know that this “gain-margin” characterization of ISS-Lyapunov function is not the right
one to use for backstepping. We want one of the form (90). This is easy to get: when

|x| ≤ |θ̃|

we can rewrite the above expression for V̇0 as

−x² + |x³|(|θ̃| − |x|) ≤ −x² + |θ̃|³|θ̃| = −x² + θ̃⁴

Combining the two cases, we arrive at


(∂V0/∂x)(−θx² + k0) ≤ −x² + θ̃⁴
Now, the usual candidate ISS-CLF for the augmented system:
V1 (x, θ̂) := V0 (x) + (1/2)|ξ − k0 (x, θ̂)|² = x²/2 + (1/2)(ξ + x − θ̂x² + x³)²

−→ Note: this Lyapunov function depends on θ̂, because so does the control law k0 . As we’ll see
in a moment, this introduces additional complications compared to the non-adaptive case.
V̇1 = (∂V0/∂x)(−θx² + ξ) + (ξ + x − θ̂x² + x³) ( u + (1 − 2θ̂x + 3x²)(−θx² + ξ) − x²θ̂˙ )

   = x(−x + θ̃x² − x³) + (ξ + x − θ̂x² + x³) ( u + x + (1 − 2θ̂x + 3x²)(−θ̂x² + ξ) + (1 − 2θ̂x + 3x²)x²θ̃ − x²θ̂˙ )

where, as shown above, the first term x(−x + θ̃x² − x³) is ≤ −x² + θ̃⁴.

(As earlier on page 45, we used θ = θ̂ − θ̃ to split the θ-dependent term.) We know that we can
cancel the terms that come after u on the second line of the above formula, and add damping:

k1 (x, θ̂, ξ) := −x − (1 − 2θ̂x + 3x2 )(−θ̂x2 + ξ) − (ξ + x − θ̂x2 + x3 ) . . . (not yet complete)

What should we do about the term multiplied by θ̃?


This is the disturbance-dependent term, which in the non-adaptive case was k0′ Ld. As we did
there, we can dominate it by square completion:

k1 (x, θ̂, ξ) := − x − (1 − 2θ̂x + 3x2 )(−θ̂x2 + ξ) − (ξ + x − θ̂x2 + x3 )


− (ξ + x − θ̂x2 + x3 )(1 − 2θ̂x + 3x2 )2 x4 . . . (not yet complete)

and this will contribute the term


(1/4)θ̃²
to the bound for V̇1 (because we have to subtract and add this term to complete the square).
What should we do about the term −x²θ̂˙?
Actually, there’s nothing we can really do about this term! In our previous adaptive backstep-
ping design (Section 5.2), we allowed u to depend on this term, i.e., we allowed u to depend on θ̂˙
and thus be coupled to the tuning law. But here we are aiming for modular design, i.e., we want
the control law and tuning law designs to be decoupled.
−→ To achieve this, we need to reformulate our ISS objective. In addition to the parameter
estimation error θ̃, we will view its derivative θ̃˙ = θ̂˙ as an additional input and ask for ISS with respect to the vector input (θ̃, θ̃˙)^T.

Thus the ISS notion we’re now asking for is weaker than ISS with respect to θ̃ only. (Just recall
the definition of ISS; clearly, ISS with respect to a part of the input vector implies ISS with respect
to the whole input vector, but not vice versa.)

Now that we’re treating θ̂˙ as another input, we can handle the last term in exactly the same
way as the previous term, i.e., dominate it using square completion. The complete formula for the
control law stands as

k1 (x, θ̂, ξ) := − x − (1 − 2θ̂x + 3x2 )(−θ̂x2 + ξ) − (ξ + x − θ̂x2 + x3 )


− (ξ + x − θ̂x2 + x3 )(1 − 2θ̂x + 3x2 )2 x4 − (ξ + x − θ̂x2 + x3 )x4

This will contribute the term


(1/4)θ̃˙²
to the bound for V̇1 , which in the end will be

V̇1 ≤ −x² − (ξ + x − θ̂x² + x³)² + θ̃⁴ + (1/4)θ̃² + (1/4)θ̃˙² ≤ −α1 (|(x, ξ)|) + |(θ̃, θ̃˙)|⁴ + (1/4)|(θ̃, θ̃˙)|²

for a suitable α1 ∈ K∞ , and ISS is established.


Since the control law k1 doesn’t depend on θ̂˙, we can see that the procedure can be repeated
if another integrator is added. So, we have a recursive procedure which can handle chains of
integrators. And this procedure is cleaner than what we had in Section 5.2 using tuning functions.
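−→ (Added illustration, not part of the original notes.) A Python/NumPy/SciPy sketch of the controller k1 above, driven by a made-up bounded estimate θ̂(t) (so that θ̃ and θ̃˙ are bounded but do not converge to 0):

    import numpy as np
    from scipy.integrate import solve_ivp

    theta = 1.0
    th_hat = lambda t: theta + np.sin(t)     # bounded estimate with bounded derivative

    def k1(x, th, xi):
        z = xi + x - th*x**2 + x**3          # xi - k0(x, theta_hat)
        w = 1.0 - 2.0*th*x + 3.0*x**2        # the (1 - 2*theta_hat*x + 3*x^2) factor in k1
        return -x - w*(-th*x**2 + xi) - z - z*w**2*x**4 - z*x**4

    def rhs(t, s):
        x, xi = s
        return [-theta*x**2 + xi, k1(x, th_hat(t), xi)]

    sol = solve_ivp(rhs, [0, 30], [2.0, 0.0], max_step=1e-3)
    print(np.abs(sol.y).max(axis=1))         # (x, xi) stay bounded despite the non-vanishing theta_tilde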

7.3.2 Modular design

Now, suppose that our plant is


ẋ = f (x) + L(x)θ + G(x)u
and we have an adaptive control law which guarantees ISS of the closed-loop system with respect
to (θ̃, θ̃˙). Then, we know that the following is true:

• If θ̃ and θ̃˙ are bounded, then x is bounded

• If θ̃ and θ̃˙ converge to 0, then x converges to 0

Thus to guarantee the desired properties of the state x, we need to find an adaptive law which
ensures the corresponding properties of the parameter estimation error θ̃. This is what we meant
by “modular design”: here are the design objectives for the controller and the estimation law, and
we can go ahead and design them separately. It doesn’t matter exactly how we’ll design them; as
long as the above properties are satisfied, the problem will be solved. (Go back again to Section 5.2:
do you see that we didn’t have modularity there?)
We’re not going to discuss the design of parameter estimation laws for nonlinear systems.
But for linear systems, we studied this in detail in Chapter 6. In particular, we saw how to get
boundedness of θ̃ and θ̃˙, as well as (in some cases, at least without normalization) convergence of θ̃˙ to 0. We also had such properties in indirect MRAC. So, while we may not always know how
to achieve these properties for general systems, they are not surprising and we’re comfortable with
them.
We’re less comfortable, however, with the requirement

θ̃ → 0

We know that we do not have this unless we have PE conditions (and we don’t want to impose
PE because it typically conflicts with the control design objective). In this context, it is helpful to
know that ISS with respect to (θ̃, θ̃˙) also implies the following.

• If θ̃ and θ̃˙ are bounded and L(x)θ̃ and θ̃˙ converge to 0, then x converges to 0

In other words, θ̃ → 0 is replaced by the less demanding condition that L(x)θ̃ → 0. The proof
of this fact relies on the result mentioned in the optional material in the ISS subsection. For more
details, see [KKK book, Lemma 5.3 on p. 193].
Just to get an idea how we might get this latter convergence property, consider the estimator

x̂˙ = Am (x̂ − x) + f (x) + L(x)θ̂ + G(x)u

where Am is a Hurwitz matrix. The state estimation error

e = x̂ − x

satisfies
ė = Am e + L(x)θ̃
Suppose that we were able to show that all signals are bounded and e → 0 (again, for linear plants
we were often able to get such properties in Chapter 6). Then ė, ë are bounded (θ̃˙ is bounded because it’s some function of the other signals, which are bounded). Also,

∫₀^∞ ė(s) ds = lim_{t→∞} ∫₀^t ė(s) ds = lim_{t→∞} e(t) − e(0) = −e(0)

is bounded. Applying Barbalat’s lemma (this means applying it twice, because one first already
applies it once to show that e → 0), we conclude that

ė → 0

In view of the expression for ė and the fact that e → 0, we have

L(x)θ̃ → 0

as needed.
For details on designing parameter estimation schemes providing the above properties, read
[KKK book, Chapters 5 and 6].

8 Stability of slowly time-varying systems


We already had to deal with stability of linear time-varying systems (using observability). Need to
understand this subject better.

8.1 Stability
Suppose that we’re given an LTV system

ẋ = A(t)x (91)

and we want to know when it is uniformly asymptotically stable (this stability is then automatically
global and exponential by linearity).
Is it enough to assume that A(t) is a Hurwitz matrix for each fixed t?
No! (Not even if eigenvalues of A(t) are bounded away from the imaginary axis.)
Can we come up with a counterexample?
To understand this, the easiest case to consider is when A(t) is a piecewise-constant function
of time, i.e., it switches between several fixed matrices. Then instead of a more usual time-varying
system we have a switched system. The two are closely related, and later we’ll study switched
systems more explicitly when we discuss switching adaptive control.
Counterexample:
Suppose that we are switching between two systems in the plane. Suppose that the two indi-
vidual systems are asymptotically stable, with trajectories as shown in the first figure (the solid
curve and the dotted curve).

For different choices of the switching sequence, the switched system might be asymptotically
stable or unstable; these two possibilities are shown in the second figure.
From this example, the instability mechanism is also quite clear: even though each system is
stable, we catch it at the “peak” of its transient and switch to the other system, without giving it
a chance to decay.

It is easy to modify this example to make A(t) continuously time-varying (just imagine a family
of systems homotopically connecting these two, and sliding fast through them would mimic the
effect of a switch).
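−→ (Added illustration, not part of the original notes.) A Python/NumPy/SciPy check of this mechanism with two made-up Hurwitz matrices: the spectral radius of the transition matrix over one switching cycle exceeds 1, so the periodically switched system is unstable.

    import numpy as np
    from scipy.linalg import expm

    A1 = np.array([[-0.1, 1.0], [-10.0, -0.1]])
    A2 = np.array([[-0.1, 10.0], [-1.0, -0.1]])
    print(np.linalg.eigvals(A1), np.linalg.eigvals(A2))   # both pairs have real part -0.1 < 0

    tau = np.pi/(2.0*np.sqrt(10.0))          # dwell on each system for roughly a quarter "rotation"
    Phi = expm(A2*tau) @ expm(A1*tau)        # transition matrix over one switching cycle
    print(np.abs(np.linalg.eigvals(Phi)).max())   # spectral radius ~ 9 > 1: unstable switching
    # with a much longer dwell time the -0.1 damping wins and stability is recovered,
    # consistent with the slow-variation results below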
There are many results dealing with stability of switched and time-varying systems. In fact, sta-
bility of the constituent systems for frozen t is neither sufficient nor necessary (we could turn things
around in the above example and have unstable systems but with switching having a stabilizing
effect).
The direction we pursue here is suggested by the above example, and is also the one that will
be relevant for adaptive control. As the section title suggests, we will restrict attention to time-
variations that are sufficiently slow (assuming still that A(t) is stable for each t). In the above
example, it is clear that if we waited longer, we’d be OK. Here’s a general result that characterizes
such slow variations by imposing suitable conditions on the derivative Ȧ(·).

Theorem 10 (Ioannou-Sun, Theorem 3.4.11) Consider the LTV system (91) and assume that:

• A(t) is Hurwitz for each fixed t, and there exist constants c, λ0 > 0 such that for all t and s
we have¹¹

‖e^{A(t)s}‖ ≤ c e^{−λ0 s}     (92)

• A(·) is C¹ and uniformly bounded: there exists an L > 0 such that ‖A(t)‖ ≤ L ∀ t.

• For all t and all T we have either


a) ∫_{t}^{t+T} ‖Ȧ(s)‖ ds ≤ µT + α

or

b) ∫_{t}^{t+T} ‖Ȧ(s)‖² ds ≤ µT + α
where µ, α > 0 and µ is sufficiently small.

Then the system (91) is exponentially stable.


¹¹Here ‖ · ‖ stands for the induced matrix norm corresponding to the Euclidean norm.

The condition (92) means that all matrices A(t) have a common stability margin λ0 and a
common overshoot constant c.
−→ Note that A(·) satisfies one of the hypotheses a), b) if ‖Ȧ(·)‖ is in L∞ and sufficiently small, or if it is in L1 or L2 .
When Ȧ(·) satisfies a), it is sometimes called nondestabilizing with growth rate µ. If ‖Ȧ(·)‖ is in L1 then the growth rate is 0.
−→ In the proof we’ll get a formula for µ (an upper bound). However, we view this result more as
a qualitative one (which is how it is stated).
Proof of Theorem 10.
Let’s prove a).
For each fixed t, let P (t) be the symmetric positive-definite solution of the Lyapunov equation

P (t)A(t) + AT (t)P (t) = −I (93)

−→ This is pointwise in t (“system snapshot”). Don’t confuse this with the Lyapunov equation
for general LTV systems, which would be Ṗ (t) + P (t)A(t) + AT (t)P (t) = −I. Of course Ṗ (t) will
eventually arise here too, but we’ll use the slow-varying conditions to bound it. Basically, we first
treat the system as if it were LTI and then use perturbation arguments.
Candidate Lyapunov function:

V (t, x) := xT (t)P (t)x(t)

Its derivative:
V̇ = −|x|2 + xT Ṗ x

We need to show some properties for P (·) and Ṗ (·).


First, it is well known (ECE 515) that since each P (t) is the solution of the corresponding
Lyapunov equation (93), it’s given by the formula
P (t) = ∫₀^∞ e^{A^T(t)s} e^{A(t)s} ds     (94)

so from (92) we have


‖P (t)‖ ≤ ∫₀^∞ c² e^{−2λ0 s} ds =: β2

so P (t) is uniformly bounded (L∞ ).


Second, since ‖A(t)‖ ≤ L ∀ t, we also have the lower bound

‖e^{A(t)s}‖ ≥ e^{−Ls}     ∀ t, s

(Proof: for every x we can write |x| = |e^{−A(t)s} e^{A(t)s} x| ≤ ‖e^{−A(t)s}‖ |e^{A(t)s} x| ≤ e^{‖A(t)‖s} |e^{A(t)s} x| ≤ e^{Ls} |e^{A(t)s} x|, hence |e^{A(t)s} x| ≥ e^{−Ls} |x| and the claim follows.) Then, using (94) again, we similarly get

‖P (t)‖ ≥ β1 > 0
so P (t) is uniformly bounded away from 0. (Cf. [Khalil, p. 159 or 371].)
Now we differentiate (93):
Ṗ (t)A(t) + AT (t)Ṗ (t) = −P (t)Ȧ(t) − ȦT (t)P (t) =: −Q(t)
which implies
Ṗ (t) = ∫₀^∞ e^{A^T(t)s} Q(t) e^{A(t)s} ds   =⇒   ‖Ṗ (t)‖ ≤ ‖Q(t)‖ ∫₀^∞ c² e^{−2λ0 s} ds ≤ β2 ‖Q(t)‖

From definition of Q,
‖Q(t)‖ ≤ 2‖P (t)‖ ‖Ȧ(t)‖ ≤ 2β2 ‖Ȧ(t)‖
Combining the two:
‖Ṗ (t)‖ ≤ 2β2² ‖Ȧ(t)‖
Plugging this into the earlier formula for V̇ :
V̇ ≤ −|x|² + 2β2² ‖Ȧ(t)‖ |x|²
From the definition of V and the above bounds we have the following:
β1 |x|2 ≤ V (t, x) ≤ β2 |x|2
This gives

V̇ ≤ −β2⁻¹ V + 2β2² β1⁻¹ ‖Ȧ(t)‖ V = −( β2⁻¹ − 2β2² β1⁻¹ ‖Ȧ(t)‖ ) V
By the standard comparison principle and hypothesis a),

V (t) ≤ e^{ −∫_{t0}^{t} ( β2⁻¹ − 2β2² β1⁻¹ ‖Ȧ(s)‖ ) ds } V (t0) ≤ e^{ 2β2² β1⁻¹ α } e^{ −( β2⁻¹ − 2β2² β1⁻¹ µ )(t − t0) } V (t0)

Therefore, V (t) decays to 0 exponentially if

µ &lt; β1 / (2β2³)

To prove b), one uses the Cauchy–Schwarz inequality to write

∫_{t0}^{t} ‖Ȧ(s)‖ ds ≤ √( ∫_{t0}^{t} ‖Ȧ(s)‖² ds ) · √(t − t0) ≤ √( µ(t − t0)² + α(t − t0) ) ≤ √µ (t − t0) + √α √(t − t0)

and notes that the resulting terms inside the exponential are dominated by −β2⁻¹(t − t0). See [Ioannou-Sun] for details.

8.2 Application to adaptive stabilization


In the context of adaptive control, time-variation comes from on-line adjustments of the parameter
estimates θ̂(t), and slow time-variation in the sense of Theorem 10 corresponds to slow speed of
adaptation.
Recall that adaptive laws we studied earlier (see, e.g., Theorem 5 in Section 6.3) provide slow
adaptation in the sense that θ̂˙(t) is in L∞ and L2. This makes Theorem 10 applicable.
In adaptive control, our basic idea has been to design the control law based on certainty equiva-
lence and combine it with an adaptive law. Then, for each frozen value of the parameter estimates
the corresponding controller by construction provides some form of stability, and these parameter
estimates don’t change too fast.
We need to be a little more careful because for each fixed value of the parameter estimates, the
controller stabilizes the corresponding plant model, but not the actual unknown plant. So, typically
Theorem 10 is applied to a plant model and then a perturbation argument is used to show stability
of the real closed-loop plant.
Consider the two-dimensional, single-input example

ẋ = [a  1 ; 0  0] x + [b ; b] u

where b ≠ 0 and both a and b are unknown. (We don't assume that a < 0.)
The objective is to asymptotically stabilize this system (x → 0) using state feedback.
−→ A similar system arises in the context of adaptive tracking in [Ioannou-Sun], see examples in
Sect. 7.4.3 and 7.4.4. (There is a preliminary step in converting the tracking problem to stabilization
of the error system, which we skipped. The error system in [Ioannou-Sun] is similar to the above
system, but it has more equations because there is also an observer part there, while here we assume
that both components of x are available for control.)

To apply the certainty equivalence principle of control design, we first need to select a controller
that stabilizes our system for the case when a, b are known.
It is easy to check that the system is controllable (hence stabilizable). We can stabilize it with
a state feedback controller 
v = −[k1  k2] x = −k1 x1 − k2 x2        (95)
where k1 and k2 are selected by any standard linear control design method discussed in ECE 515.
One option is to place closed-loop eigenvalues at some chosen locations in the left half of the
complex plane (pole placement). The closed-loop matrix is
     
Acl = [a  1 ; 0  0] + [b ; b][−k1  −k2] = [a − bk1   1 − bk2 ; −bk1   −bk2]

Choosing, e.g.,

k1 = (a + 1)/b,        k2 = 1/b

we get

Acl = [−1  0 ; −a − 1  −1]
whose eigenvalues are −1, −1. (There is a systematic procedure for designing pole-placement feed-
back laws, via transforming to controllable canonical form, but here we can just find the gains by
trial and error.)
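As a quick sanity check (not in the notes; SymPy assumed available), one can verify symbolically that the gains above place both closed-loop eigenvalues at −1 for arbitrary a and b ≠ 0:

import sympy as sp

a, b = sp.symbols('a b', nonzero=True)
A = sp.Matrix([[a, 1], [0, 0]])
B = sp.Matrix([b, b])
k1, k2 = (a + 1) / b, 1 / b
Acl = A - B * sp.Matrix([[k1, k2]])
print(sp.simplify(Acl))      # Matrix([[-1, 0], [-a - 1, -1]])
print(Acl.eigenvals())       # {-1: 2}, i.e. a double eigenvalue at -1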
Begin optional material
Another option is, instead of arbitrarily selecting closed-loop poles, to consider an LQR problem

J = ∫_0^∞ ( xᵀ(s)Qx(s) + Ru²(s) ) ds  −→  min over u

where the matrix Q = Qᵀ ≥ 0 and the scalar R > 0 are design parameters (they weight stability
against control effort). The solution is

u = −(1/R) Bᵀ P x

where P = Pᵀ > 0 satisfies the algebraic Riccati equation

P A + Aᵀ P − (1/R) P B Bᵀ P + Q = 0

End optional material


For simplicity, we choose the first option and go with the above formulas for k1 and k2 .
Now, back to the case when a and b are unknown, and we need to work with their estimates â,
b̂ and the corresponding estimation errors

ã := â − a, b̃ := b̂ − b

Rewrite the system as

ẋ = [â − ã  1 ; 0  0] x + [b̂ − b̃ ; b̂ − b̃] u = [â  1 ; 0  0] x + [b̂ ; b̂] u − [ã ; 0] x1 − [b̃ ; b̃] u

where the first two terms on the right constitute the design model and the last two terms are the
perturbations.

The design model has the same form as the original system but instead of the unknown parameters
we have their estimates which are available for control. We need a control law

u = −k̂1 x1 − k̂2 x2

such that the matrix

Âcl := [â  1 ; 0  0] + [b̂ ; b̂][−k̂1  −k̂2] = [â − b̂k̂1   1 − b̂k̂2 ; −b̂k̂1   −b̂k̂2]

is Hurwitz. Repeating the above control design with â, b̂ in place of a, b, we get

k̂1 = (â + 1)/b̂,        k̂2 = 1/b̂

and

Âcl = [−1  0 ; −â − 1  −1]        (96)

whose eigenvalues are −1, −1.
Now, we need an adaptive law for updating the parameter estimates â and b̂. We already devel-
oped parameter estimation schemes that will cover this example. We can use, e.g., the normalized
gradient law of Section 6.3.2, whose properties are listed in Theorem 5.¹² That theorem guarantees,
in particular:

â, b̂ ∈ L∞,        â˙, b̂˙ ∈ L2 ∩ L∞        (97)
(only item (i) of Theorem 5 applies since we don’t have persistency of excitation here).
−→ The design of the control law and the parameter estimation law are completely independent
of one another (“modular design”).
Combining the two, we get the closed-loop system

ẋ = Âcl(t) x − [ã ; 0] x1 − [b̃ ; b̃] u        (98)

where we write the argument t in Âcl to emphasize that it is time-varying because â evolves with
time. (So does b̂, but Âcl doesn't depend on it.)
The analysis now proceeds in two steps:

Step 1  By construction, Âcl(t) is Hurwitz for each t. Using (96) and (97), we see that

(d/dt) Âcl(t) = [0  0 ; −â˙  0],   so   ‖(d/dt) Âcl(t)‖ = |â˙| ∈ L2

Hence, Theorem 10 applies and tells us that the part of (98) coming from the design model,
i.e.,

ẋ = Âcl(t) x
¹² There we had a single output, while here we have full state measurements. But this only makes things
easier; we can always choose a scalar output and use it to design the adaptive law. For example, y = x1
gives the transfer function (bs + b)/(s² − as). Note that there are only two parameters to estimate.

is exponentially stable.
−→ What if b̂(t) = 0 at some time t? This is loss of stabilizability again. The fix is similar
to the case of indirect MRAC: if we know the sign of b and if

|b| ≥ b0 (known)

then we project the gradient law onto (−∞, −b0 ] or [b0 , ∞) (depending on the sign of b),
which we know doesn’t affect the properties (97).

Step 2 Using (97) and well-known facts about various induced gains of exponentially stable linear
systems being finite, it is possible (although not easy) to show that the perturbed system (98)
is still asymptotically stable, thus x → 0 as needed.
We skip Step 2. It can be found (modulo different notation and presence of extra dynamics)
in [Ioannou-Sun, pp. 481–482]. (Relies on a swapping lemma.)

We see that the basic idea behind the analysis is rather simple:

• The controller stabilizes the design model for each frozen value of the parameters

• The adaptive law guarantees slow adaptation

• Theorem 10 implies stability of the time-varying closed-loop design model

• The real closed-loop system has extra terms coming from parameter estimation errors, which
are small in an appropriate sense by the properties of the adaptive law

• A perturbation argument finishes the stability proof

We also learn the following lessons from this: the control design should really be based on the
design model, and it should be robust with respect to the errors between the design model and
the real model. This suggests a departure from the certainty equivalence principle, which doesn’t
address such robustness. We already saw this idea in the nonlinear context (ISS).
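To see the whole scheme in action, here is a simulation sketch (not part of the notes; all numerical values are made up). It uses the certainty-equivalence feedback designed above, but in place of the normalized gradient law of Section 6.3.2 it uses a simple unnormalized series-parallel estimator with a gradient-type update and a projection keeping b̂ away from 0.

import numpy as np

a, b   = 0.5, 1.0        # true (unknown) plant parameters; open loop is unstable
b0     = 0.2             # known lower bound on b (sign of b assumed known, b > 0)
am, g  = 1.0, 2.0        # estimator pole and adaptation gain (made up)
dt, T  = 1e-3, 40.0

x      = np.array([1.0, -0.5])   # plant state
xh     = x.copy()                # estimator state
ah, bh = 0.0, 2.0                # parameter estimates a_hat, b_hat

for k in range(int(T / dt)):
    # certainty-equivalence feedback: design-model poles placed at -1, -1
    k1, k2 = (ah + 1.0) / bh, 1.0 / bh
    u = -k1 * x[0] - k2 * x[1]

    # plant:  x1' = a x1 + x2 + b u,  x2' = b u
    dx  = np.array([a * x[0] + x[1] + b * u, b * u])
    # series-parallel estimator built from the current parameter estimates
    dxh = np.array([-am * (xh[0] - x[0]) + ah * x[0] + x[1] + bh * u,
                    -am * (xh[1] - x[1]) + bh * u])
    e = xh - x                       # state estimation error

    # gradient-type update: chosen so that |e|^2/2 + (a~^2 + b~^2)/(2g)
    # is nonincreasing along the continuous-time dynamics
    ah += dt * (-g * e[0] * x[0])
    bh += dt * (-g * (e[0] + e[1]) * u)
    bh  = max(bh, b0)                # projection: keep b_hat >= b0 > 0

    x  += dt * dx
    xh += dt * dxh

print("|x(T)| =", np.linalg.norm(x), " a_hat =", ah, " b_hat =", bh)
# with these made-up values one expects |x(T)| to be close to 0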

8.3 Detectability

The above design is not completely satisfactory. First, Step 2 is messy (we didn’t give it).
Second, adaptive laws based on normalization are somewhat difficult to analyze and do not extend
well to nonlinear plants.
Recall that in indirect MRAC we had a different, estimator-based adaptive law with no normal-
ization, and the estimator coincided with the reference model in closed loop. (We had a reference
signal there but it’s not important, we can set it to 0.) It also guaranteed slow adaptation speed

in the sense of (97), and in addition we had â˙, b̂˙ → 0 and e = y − ym → 0, where ym was the state
of the estimator/model, something we don't get when we use normalization.
This suggests the following alternative strategy:

• Use an unnormalized estimator-based adaptive law

• Show stability of the estimator in closed loop using Theorem 10

• Use e → 0 to prove stability of the actual plant

We don’t pursue this alternative approach in detail here (mainly because the design we used
for indirect MRAC doesn’t generalize very well beyond plants of relative degree 1). This reasoning
will be taken up later in the course (see switching adaptive control).
However, we want to plant the seed now and discuss the last step above. In the full measurement
case (y = x), which includes the scalar case we studied in indirect MRAC, convergence of the
estimator state x̂ plus convergence of e immediately give convergence of the plant state x = x̂ − e.
In general, however, e is an output of lower dimension than x and x̂. The relevant concept then
is detectability with respect to this output. This is a refinement of our earlier reasoning based on
observability. For now we’ll just make a brief general remark on detectability, and will later revisit
it and will see its importance more clearly.
Suppose we write the overall closed-loop adaptive control system as

ẋ = Aθ̂(t) x
e = Cθ̂(t) x

Here x is the combined state of plant, controller, and estimator. As usual, θ̂ is the vector of
parameter estimates and e is the output estimation error. The above form is completely general,
as long as the plant, controller, and estimator dynamics are linear.
−→ The adaptive/tuning law, which is typically nonlinear, is not a part of the system dynamics.
Assume that the above system is detectable for each fixed value of θ̂. (This property is sometimes
also called tunability [Morse].)
Recall: detectability means stability of unobservable modes, meaning that if the output equals
0 (or even just converges to 0) then the state converges to 0. Note that this is strictly weaker than
observability. For now we’re talking about detectability of LTI systems, i.e., detectability of each
fixed pair (Aθ̂(t) , Cθ̂(t) ).

An equivalent characterization of detectability is the existence (for each frozen value of θ̂) of an
output injection matrix, Lθ̂ , such that Aθ̂ − Lθ̂ Cθ̂ is Hurwitz.
We can take A(·) and C(·) to be C¹ in θ̂. Then one can show that L(·) is also C¹ in θ̂.
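For a frozen value of θ̂, an output injection matrix can be computed by any observer design method; the following sketch (not from the notes, with made-up matrices standing in for Aθ̂ and Cθ̂ at one frozen θ̂) does it by pole placement on the dual pair.

import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0, 1.0],
              [2.0, 0.0]])             # unstable but observable from y = x1
C = np.array([[1.0, 0.0]])

# output injection = observer design: place the eigenvalues of A - L C by
# pole placement on the dual pair (A^T, C^T)
res = place_poles(A.T, C.T, [-2.0, -3.0])
L = res.gain_matrix.T
print(np.linalg.eigvals(A - L @ C))    # approximately [-2., -3.]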
Suppose that the adaptive law ensures that:

1) θ̂ is bounded
2) θ̂˙ is in L2, or converges to 0, or is small in some other sense so that the hypotheses of
Theorem 10 are fulfilled

3) e is in L2 , or converges to 0, or is some other zeroing signal, i.e., signal which, when injected
into an exponentially stable linear system, produces a state converging to 0

As we said, one example of an adaptive law with these properties is the one we used for indirect
MRAC.
Combining this with the detectability assumption, we get stability immediately:
Rewrite the x-dynamics as

ẋ = (Aθ̂(t) − Lθ̂(t) Cθ̂(t) )x + Lθ̂(t) e

The homogeneous part is exponentially stable by Theorem 10. (Here we are using the facts
that dA/dθ̂, dC/dθ̂, dL/dθ̂ are continuous as functions of θ̂, and θ̂ is bounded.)
e is a zeroing signal, hence so is Lθ̂(t) e (because L is continuous with respect to θ̂ and θ̂(t) is
bounded).
Therefore, x → 0 and we are done.
This argument follows the same logic as in the previous subsection, but is much cleaner (espe-
cially the last step).
−→ The above argument shows that to get stability, we should design the adaptive system so that
it is detectable with respect to the output estimation error, for each frozen value of the parameter
estimates.
Detectability is not only sufficient for stability, but also necessary, in the following sense. Sup-
pose that for some value θ̂, the system is not detectable. This means that we can have e ≡ 0 but
x ↛ 0. Of course, if θ̂ changes, we don't necessarily have a problem. But if θ̂ is an equilibrium
value of the adaptive law, i.e., if e ≡ 0 makes θ̂ get stuck at this value, then x will never converge.
So, we must have detectability for all equilibrium values of the adaptive law.

9 Switching adaptive control


References: see papers on my website under the category “Supervisory Control” which build on
previous work by Morse, Hespanha, and others. In particular, the paper “Overcoming the limi-
tations of adaptive control by means of logic-based switching” (2003) contains direct comparison
with traditional adaptive control. Also Chapter 6 of my book “Switching in Systems and Control”
contains a detailed tutorial treatment of the subject.

Switched systems and switching control were already mentioned a couple of times earlier in
the course. In particular, we looked at an example of a switched system when discussing stability
of time-varying systems (Section 8.1) and we briefly discussed a switching control law when we
studied universal regulators (at the end of Section 3.1.3).
Switching adaptive control aims to solve the same problems as traditional adaptive control,
discussed so far in this course. It also uses many of the same concepts and ideas. The primary
difference is that instead of using continuous tuning laws to define the control gains, it relies on
logic-based switching among a family of candidate controllers.
We saw that continuous tuning/estimation has some limitations. To begin, we need a nice pa-
rameterization of the plant model. In fact, we only considered cases where the unknown parameters
enter linearly. One reason we needed this is that for the gradient law, this guaranteed that the
cost function was convex in θ̂. Without this property, the gradient law would not give any useful
convergence results.
Another issue that we encountered was loss of stabilizability. Continuous tuning can take us to
places in the parameter space where no stabilizing control gains exist. To overcome this, we had
to use some form of projection (which requires a priori information about unknown parameters).
Switching adaptive control aims to lift these restrictions by abandoning continuous tuning
altogether, and instead updating the controller gains in a discrete (switched) fashion. This gives us
greater design flexibility.

[Figure: supervisory control architecture. A supervisor generates the switching signal that selects, from a bank of candidate controllers (Controller 1 producing u1, Controller 2 producing u2, ...), which control signal is applied as the input u to the plant with output y.]

−→ The above figure suggests that the set of controllers is discrete (even finite). This is an option
that was not possible in continuous adaptive control. However, we can still have a continuum of
controllers and pick controllers from this continuous family one at a time.
−→ The unknown parameters in the plant may take values in a discrete or a continuous set. Intu-
itively speaking, we should have one controller for each possible value of the unknown parameter,

and we’ll usually assume that this is the case. However, we don’t need to have exact correspondence
between the two parameter sets (plant parameters and controller indices). For example, we can
try to “cover” a continuous set of plant parameter values (such as a ball) with a finite number of
controllers (each of which is robust enough to stabilize nearby plants).
Notation: We will write P for the set in which the unknown parameters take values. We assume
that this is a compact set. The vector of unknown parameters itself will be denoted by p∗ , and its
estimates will be denoted by p. This is the same as what we earlier denoted as θ and θ̂, respectively,
but this new notation will be more convenient (because we will frequently use p as subscripts) and
it is more consistent with the switching adaptive control literature.
As in Section 7, we want to have a modular design, i.e., formulate separate design objectives
on the controllers and on the supervisor.

9.1 The supervisor


We now assume for the moment that the controllers are given (we’ll specify later what properties
they should have) and discuss the architecture of the decision-making block, which we call the
supervisor. The basic components of the supervisor are given in the following figure.

[Figure: supervisor architecture. The multi-estimator, driven by u and y, produces the estimates yp and estimation errors ep = yp − y, p ∈ P; the monitoring signal generator converts these into monitoring signals µp, p ∈ P; the switching logic maps the µp into the switching signal σ.]

9.1.1 Multi-estimator

The scheme we discuss here is estimator-based, and is similar to the estimator equations we
saw earlier. The difference is that we will no longer design a continuous tuning law to drive the
output estimation error to 0, but instead we will generate a family of estimates

yp , p ∈ P

and the corresponding output estimation errors

ep := yp − y, p∈P

and will pick the smallest one at each time instant (roughly speaking).

−→ The basic idea is to design the multi-estimator so that ep∗ , i.e., the error corresponding to the
true parameter value, is small. Usually we cannot guarantee anything about the other, “wrong”
estimation errors (there is no a priori reason for them to be small). Thus, the smallness of ep
indicates the likelihood that p = p∗ . In other words, it seems intuitively reasonable (although not
yet justified formally in any way) to pick as a current estimate of p∗ the index of the smallest
estimation error.
This property is basically the same as what we had for the estimators we had earlier: in case
when the parameter estimates match the true values, e is small—e.g., converges to 0 exponentially
at a prescribed rate. (See the estimator for Example 6 in Section 6.2, which we saw again in indirect
MRAC.)

Example 10 Consider the scalar example

ẏ = y² + p∗ u

where p∗ is an unknown element of some set P ⊂ R containing both positive and negative values.
We can let the estimator equations be

ẏp = −am (yp − y) + y² + pu,        p ∈ P        (99)

where am > 0. Then the estimation error ep∗ = yp∗ − y satisfies

ėp∗ = −am ep∗

and hence converges to 0 exponentially fast, for an arbitrary control u.


(It is useful to compare this with the estimator

ŷ˙ = −am (ŷ − y) + y² + p̂u

which we would have used in conventional adaptive control.)


Note that if we add a disturbance to the plant:

ẏ = y² + p∗ u − d

then we have
ėp∗ = −am ep∗ + d
hence ep∗ exponentially converges to d/am , not to 0. In this case, the steady-state value of the
error ep∗ is determined by the size of the disturbance.
One concern is that realizing the multi-estimator simply as a parallel connection of individual
estimator equations for p ∈ P is not efficient and actually impossible if P is an infinite set. The
estimator equations (99) can be implemented differently as follows. Consider the system

ż1 = −am z1 + am y + y²
ż2 = −am z2 + u        (100)

together with the outputs


yp := z1 + pz2 , p∈P (101)
The two-dimensional system (100) produces the same signals as does the (possibly infinite-dimensional)
system (99):

ẏp = ż1 + pż2 = −am z1 + am y + y² − pam z2 + pu = −am yp + am y + y² + pu

where the last equality uses (101).

This idea is known as state sharing. The family of signals (101) is of course still infinite, but at
each particular time we can look any one of them up or perform mathematical operations—such
as computing the minimum—with the entire family.
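A minimal simulation sketch of the state-shared multi-estimator (100)-(101) (not part of the notes; p∗, P, the test input, and all gains are made up) shows ep∗ decaying regardless of the applied input, while the other errors need not be small.

import numpy as np

am, pstar = 1.0, 2.0
P  = [-1.0, 1.0, 2.0, 3.0]
dt, T = 1e-3, 8.0

y  = 0.5                        # plant state
z1, z2 = 0.0, 0.0               # shared estimator states, cf. (100)

for k in range(int(T / dt)):
    t = k * dt
    # a bounded test input; it happens to use pstar only to keep y bounded,
    # which is legitimate here since we are only exercising the estimator
    u = (np.sin(t) - y**2 - y) / pstar
    dy  = y**2 + pstar * u      # plant of Example 10
    dz1 = -am * z1 + am * y + y**2
    dz2 = -am * z2 + u
    y, z1, z2 = y + dt * dy, z1 + dt * dz1, z2 + dt * dz2

errors = {p: (z1 + p * z2) - y for p in P}      # e_p = y_p - y, with y_p from (101)
print({p: round(e, 4) for p, e in errors.items()})
# e_p for p = pstar should be near 0; the other errors generically are not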

9.1.2 Monitoring signal generator

Rather than basing decisions on the instantaneous values of the output estimation errors, we would
like to take their past behavior into account. Thus we need to implement an appropriate filter,
which we call the monitoring signal generator. This is a dynamical system whose inputs are the
estimation errors and whose outputs
µp , p ∈ P
are suitably defined integral norms of the estimation errors, called monitoring signals. For example,
we can simply work with the squared L2 norm:
µp(t) := ∫_0^t |ep(s)|² ds,        p ∈ P        (102)

These monitoring signals can be generated by the differential equations

µ̇p = |ep|²,        µp(0) = 0,        p ∈ P        (103)

If we don’t want the signals µp to grow unbounded, we can introduce a “forgetting factor”:
µp(t) := ∫_0^t e^{−λ(t−s)} |ep(s)|² ds

with λ > 0, which can be implemented via

µ̇p = −λµp + |ep|²,        µp(0) = 0,        p ∈ P        (104)

Again, we do not want to generate each monitoring signal individually. The idea of state sharing
can be applied here as well. To see how this works, let us revisit the multi-estimator for Example 10,
given by (100) and (101). Each estimation error can be equivalently expressed as

ep = z1 + pz2 − y

so that we have

ep² = (z1 − y)² + 2pz2(z1 − y) + p²z2²,        p ∈ P

If we now define the monitoring signal generator via

η̇1 = (z1 − y)²
η̇2 = 2z2(z1 − y)
η̇3 = z2²
µp = η1 + pη2 + p²η3,        p ∈ P        (105)

then the equations (103) still hold. By now you should be able to see why this works and how it
can be extended to (104).
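Continuing the previous sketch (again not part of the notes; all values are made up), the monitoring signals (102) for the same example can be generated through the three shared states of (105).

import numpy as np

am, pstar = 1.0, 2.0
dt, T = 1e-3, 8.0
y, z1, z2 = 0.5, 0.0, 0.0
eta = np.zeros(3)                       # eta1, eta2, eta3 from (105)

for k in range(int(T / dt)):
    t = k * dt
    u = (np.sin(t) - y**2 - y) / pstar  # same bounded test input as before
    e1 = z1 - y                         # so that e_p = e1 + p z2
    eta += dt * np.array([e1**2, 2.0 * z2 * e1, z2**2])
    dy, dz1, dz2 = y**2 + pstar * u, -am * z1 + am * y + y**2, -am * z2 + u
    y, z1, z2 = y + dt * dy, z1 + dt * dz1, z2 + dt * dz2

mu = lambda p: eta[0] + p * eta[1] + p**2 * eta[2]     # mu_p, cf. (105)
print({p: round(mu(p), 3) for p in [-1.0, 1.0, 2.0, 3.0]})
# mu_p is typically minimized near the true value p = 2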

9.1.3 Switching logic

This is a dynamical system whose inputs are the monitoring signals µp , p ∈ P and whose output is
a piecewise constant switching signal σ. The switching signal determines the actual control law

u = uσ

applied to the plant, where up , p ∈ P are the control signals generated by the candidate controllers.
−→ We don’t actually need to physically generate the out-of-the-loop control signals up, p ≠ σ(t).
The diagram with a bank of controllers is just for illustration. What we’re really implementing is
a switching controller.
−→ The controllers can be parameterized by a different set Q, which can for example be a subset
of P. This can be done by defining a controller assignment map from P to Q. To keep things
simple, we will assume that Q = P. But we’ll usually write q and not p for controller indices, to
distinguish them from the plant parameters and to remind us of the more general situation. We’ll
also sometimes denote the controllers themselves as Cq , q ∈ P.
Basic idea of the switching logic:

σ(t) := arg min_{p∈P} µp(t)

This is essentially what we’ll do, but we’ll only update the value of σ from time to time, rather
than continuously. This discrete update strategy for σ can be either time-based (use a fixed time
interval between updates—dwell time) or event-based (update when the difference between the old
minimum and the new one gets large enough—hysteresis). More details on this later.
Justification(?):

µq small  =⇒  eq small  =⇒  q = p∗  =⇒  q-th controller stabilizes the plant

This is tempting, but the second implication is WRONG: it is not known to hold. In fact, what we have is its
converse.

We saw this issue before in continuous adaptive control: the estimation scheme drives e to 0,
and we plug the estimate θ̂ into the controller as if it were θ (certainty equivalence). However, even
if e → 0, we don’t know that θ̂ → θ, so a separate justification is required why this works. Before,
we were using Lyapunov functions to prove stability. Here the reasoning will be different, based on
detectability.
In fact, let’s drop the wrong implication and state directly the property we want:

eq small  =⇒  q-th controller stabilizes the plant

This is precisely the detectability property, which we already discussed in Section 8.3. And it must
be true for all q since we don’t know the behavior of σ a priori.
More precisely, this is detectability of the supervisory control system with the q-th controller
in the loop, with respect to the q-th output estimation error. Assume that this system is linear,
and write it as

ẋ = Aq x
eq = C q x

Recall (we’ll need this later) that detectability is equivalent to the existence (for each frozen value
of q) of an output injection matrix, Lq , such that Aq − Lq Cq is Hurwitz. Use this to rewrite the
system as
ẋ = (Aq − Lq Cq )x + Lq eq
It is then clear that small eq (bounded, convergent to 0) implies the same property for x.

9.2 Example: linear systems


Consider a linear plant

ẋ = Ap∗ x + Bp∗ u
y = Cp∗ x        (106)

where x ∈ Rn , u ∈ Rm , y ∈ Rk , {Ap , Bp , Cp : p ∈ P} is a given finite family of matrices, and p∗ ∈ P


is unknown. We assume that each system in this family is controllable and observable (or at least
stabilizable and detectable).
Problem: regulate x to 0 using output feedback.
We haven’t discussed estimator design for the case of output measurements (we’ve only con-
sidered full state measurements or scalar systems before). However, we can use the standard
Luenberger observer to design the multi-estimator:

ẋp = (Ap − Lp Cp)xp + Bp u + Lp y
yp = Cp xp,        p ∈ P

where each Lp is an output injection matrix such that Ap − Lp Cp is Hurwitz. It follows that the
estimation error ep∗ = yp∗ − y converges to zero exponentially fast, regardless of the control u that
is applied (just subtract the two right-hand sides when p = p∗ ).
We can also use the above observers to design the candidate control laws:

up = −Kp xp , p∈P (107)

where the matrices Kp are such that Ap − Bp Kp are Hurwitz for each p ∈ P.
Let the monitoring signals be the L2 norms of the output estimation errors as in (102), i.e.,
generate them by the differential equations (103).
It remains to define the switching signal σ : [0, ∞) → P, which will give us the switching
controller u(t) = −Kσ(t) xσ(t) . One way to do this is by means of the so-called hysteresis switching
logic, see figure.

[Flowchart: hysteresis switching logic. Initialize σ; repeatedly check whether there exists p with µp + h ≤ µσ; if not, keep the current σ; if yes, set σ = p.]

This switching logic works as follows. Fix a positive number h called the hysteresis constant.
Set σ(0) = arg minp∈P µp (0). Now, suppose that at a certain time σ has just switched to some
q ∈ P. The value of σ is then held fixed until we have minp∈P µp (t) + h ≤ µq (t). If and when that
happens, we set σ equal to arg minp∈P µp (t). When the indicated arg min is not unique, break the
tie arbitrarily.
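In code, the hysteresis logic of the figure amounts to a few lines; the following sketch (not from the notes) is one possible implementation for a finite index set, with made-up values in the usage example.

def hysteresis_switch(sigma, mu, P, h):
    """One evaluation of the hysteresis logic: mu maps each p in P to the
    current value of mu_p, and sigma is the currently active index."""
    best = min(P, key=lambda p: mu[p])
    # switch only if the best index beats the current one by at least h
    if mu[best] + h <= mu[sigma]:
        return best
    return sigma

# toy usage with made-up monitoring values
mu = {1: 5.0, 2: 3.9, 3: 4.2}
print(hysteresis_switch(sigma=1, mu=mu, P=[1, 2, 3], h=1.5))   # 1: improvement < h, no switch
print(hysteresis_switch(sigma=1, mu=mu, P=[1, 2, 3], h=1.0))   # 2: improvement >= h, switch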
−→ In Example 10, for each fixed time t, µp is a quadratic polynomial in p given by the for-
mula (105), hence the value of p that minimizes µp(t) is either a real root of ∂µp/∂p (t) or a boundary
point of P. Thus for that example, the on-line minimization procedure required for implementing
the switching logic is relatively straightforward.

Lemma 11 The switching stops in finite time, i.e., there exists a time T ∗ and an index q ∗ ∈ P
such that σ(t) = q ∗ ∈ P for all t ≥ T ∗ . Moreover, eq∗ ∈ L2 .

Proof. Since ep∗ converges to 0 exponentially, the formula (102) implies that µp∗ (t) is bounded
from above by some number K for all t ≥ 0. In addition, all monitoring signals µp are nondecreasing
by construction. Using these two facts and the definition of the hysteresis switching logic, we now
prove that the switching must stop in finite time. Indeed, each µp has a limit (possibly ∞) as

t → ∞. Since P is finite, there exists a time T such that for each p ∈ P we either have µp (T ) > K
or µp (t2 ) − µp (t1 ) < h for all t2 > t1 ≥ T . Then for t ≥ T at most one more switch can occur. We
conclude that there exists a time T ∗ such that σ(t) = q ∗ ∈ P for all t ≥ T ∗ , and µq∗ is bounded.
Finally, eq∗ ∈ L2 by (102).
After the switching stops, the closed-loop system (excluding out-of-the-loop signals) can be
written as
   
d/dt [x ; xq∗] = A [x ; xq∗]
eq∗ = C [x ; xq∗]        (108)

where

A := [ Ap∗   −Bp∗ Kq∗ ;  Lq∗ Cp∗   Aq∗ − Lq∗ Cq∗ − Bq∗ Kq∗ ]

and

C := [ −Cp∗   Cq∗ ].

If we let

L := [ −Lp∗ ; −Lq∗ ]

then it is straightforward to check that

A − L C = [ Ap∗ − Lp∗ Cp∗   −Bp∗ Kq∗ + Lp∗ Cq∗ ;  0   Aq∗ − Bq∗ Kq∗ ].

The matrix on the right-hand side is Hurwitz, which shows that the system (108) is detectable with
respect to eq∗.
It now remains to apply the standard output injection argument. Namely, write

d/dt [x ; xq∗] = (A − L C) [x ; xq∗] + L eq∗

and observe that x and xq∗ converge to zero in view of stability of A − L C and the fact that
eq∗ ∈ L2.
This reasoning is similar to what we had for slowly time-varying systems, but there's no time
variation here after the switching has stopped, so this is easier.
The above problem is rather special, and the solution and the method of proof have several
drawbacks:

• The Luenberger-based multi-estimator that we gave only works when P is a finite set. It is
not suitable for state-shared implementation. However, there are standard results in linear
identification that can be used to design state-shared multi-estimators. (We will not discuss
this.)

• Our controller design and multi-estimator design were coupled because both relied on the
Luenberger observer. However, the particular choice of candidate control laws (107) is just
one example, and can be easily changed. Assume, for example, that every system in the
family to which the plant (106) belongs is stabilizable by a static linear output feedback. In
other words, assume that for each p ∈ P there exists a matrix Kp such that the eigenvalues
of Ap − Bp Kp Cp have negative real parts. A straightforward modification of the above
argument shows that if we keep the estimators as they are but replace the control laws (107)
by up = −Kp y, we still achieve state regulation.

• Detectability of the feedback connection of the plant with the q ∗ -th controller seems to come
out of the blue. Actually, it is a natural consequence of the fact that this controller stabilizes
the corresponding plant model (for p∗ = q ∗ ). Here is another proof of detectability which
makes this clearer. Consider first the dynamics of xq∗ after switching stops, i.e., under the
action of the q ∗ -th controller:

ẋq∗ = (Aq∗ − Lq∗ Cq∗)xq∗ − Bq∗ Kq∗ xq∗ + Lq∗ Cp∗ x = (Aq∗ − Bq∗ Kq∗)xq∗ − Lq∗ eq∗

where we used Cp∗ x = y = Cq∗ xq∗ − eq∗.

Suppose that eq∗ → 0 (this is just for showing detectability, it might not actually be true).
Then xq∗ → 0 because Aq∗ − Bq∗ Kq∗ is Hurwitz. This implies that y = Cq∗ xq∗ − eq∗ → 0,
and also u = −Kq∗ xq∗ → 0. Since the plant is assumed to be detectable and u, y → 0, we
have x → 0 (apply an output injection argument again, this time to the plant only). We have
just shown that the (x, xq∗ )-system, driven by the q ∗ -th controller, is detectable with respect
to eq∗ as desired. Thus we know that the output injection matrix L, which we so magically
found earlier, must exist.

In any case, the above example was a useful illustration of the main ideas, and we will now take
another pass through the approach to refine and generalize it.

9.3 Modular design objectives and analysis steps


We see that detectability is a key concept here.
Up to now, we discussed detectability of the closed-loop system for each fixed controller with
respect to the corresponding estimation error. In the above example, we applied this to the q ∗ -th
controller, where q ∗ is the index at which the switching stops.
However, if the switching doesn’t stop, we need switched detectability. In the linear case, this
means that if we write the switched closed-loop system as

ẋ = Aσ x
eσ = Cσ x        (109)

then we want it to be detectable with respect to eσ . In other words, eσ being small (→ 0) should
imply that x is small (→ 0). Here x is the state of the plant, the multi-estimator, and the active
controller (at each time).
Does this switched detectability property follow from detectability for each frozen value of σ
(which we assumed earlier)?
No. This is similar to the stability issue, which for time-varying or switched systems doesn’t
follow from stability of individual fixed subsystems among which we are switching.
In fact, the connection with stability is quite direct. Recall that detectability for each fixed
index q is equivalent to the existence of a matrix Lq such that Aq − Lq Cq is Hurwitz. Use this to
rewrite the switched system as
ẋ = (Aσ − Lσ Cσ )x + Lσ eσ (110)
We know that the switched system
ẋ = (Aσ − Lσ Cσ )x
may not be stable even if each Aq − Lq Cq is Hurwitz. Some further properties of σ are required.
We saw this in stability of time-varying systems. One option is to require that σ be slow enough.
It no longer makes sense to bound its derivative (as in Chapter 8) because σ is not differentiable.
But there are several possible slow-switching conditions that work:

• Switching stops in finite time (as in the above example). Then stability is obvious. But this
condition is quite strong.
• There exists a sufficiently large dwell time τD , i.e., the time between any two consecutive
switches is lower-bounded by τD .
• There exists a sufficiently large average dwell time τAD . This means that the number of
switches on any interval (t, T ], which we denote by Nσ (T, t), satisfies
Nσ(T, t) ≤ N0 + (T − t)/τAD
Here N0 ≥ 1 is an arbitrary number (it cannot depend on the choice of the time interval).
For example, if N0 = 1, then σ cannot switch twice on any interval of length smaller than
τAD . This is exactly the dwell time property. Note also that N0 = 0 corresponds to the case
of no switching, since σ cannot switch at all on any interval of length smaller than τAD . In
general, if we discard the N0 “extra” switches, then the average time between consecutive
switches is at least τAD . Average dwell time is more general than dwell time, because it
allows us to switch fast when necessary and then compensate for it by switching sufficiently
slowly later.

There is a large literature on stability of switched systems. It is known that any of the above
conditions guarantees that stability is preserved under switching. The last result (on average dwell
time) is due to Hespanha.

Now, suppose σ does satisfy one of the above conditions. Then we can conclude detectability
of (109), i.e., we know that x → 0 if eσ → 0, in view of (110).
But for this to be useful, we need to know that eσ is small. We hope that the switching logic
will somehow guarantee this, even if the switching doesn’t stop.
We are now ready to state, at least qualitatively, four main design objectives placed on the
individual components of the supervisory control system:

Matching At least one of the estimation errors is small

Detectability For each fixed controller, the closed-loop system is detectable through the corre-
sponding estimation error

Bounded Error Gain The signal eσ is bounded in terms of the smallest of the estimation errors

Non-Destabilization The switched closed-loop system is detectable through eσ provided that


detectability holds for every frozen value of σ

The Matching property is a requirement imposed on the multi-estimator design. The estimation
error that we can hope to be small is ep∗ , where p∗ is the true value of the unknown parameter.
The above somewhat more vague statement is a bit more general.
In Example 10, we could get ep∗ → 0 when there is no disturbance, and we could get ep∗
bounded when there is a bounded disturbance (regardless of the control law being applied).
For linear systems, there exists a theory for designing multi-estimators with such properties. For
some classes of nonlinear systems, this can be done along the lines of what we saw in Example 10
and earlier: put terms on the right-hand side of the multi-estimator that match the right-hand side
of the plant, and add linear damping.
We will not discuss the Matching property further.
The Detectability property is a requirement imposed on the candidate controllers. It is inter-
esting and we’ll discuss it further below.
The last two properties are requirements placed on the switching logic. We’ll also discuss them
a little bit below.
It is not difficult to see now, at least conceptually, how the above properties of the various
blocks of the supervisory control system can be put together to analyze its behavior.
Analysis:

Matching + Bounded Error Gain  =⇒  eσ is small
Detectability + Non-Destabilization  =⇒  detectability with respect to eσ
and these two conclusions together  =⇒  x is small

9.3.1 Achieving detectability

The first figure shows the closed-loop system which we want to be detectable, with eq viewed as
output.

[Figure: the closed-loop system consisting of the plant (output y), the controller Cq (output uq), and the multi-estimator (output yq), with eq = yq − y viewed as the output of the overall system.]

The second figure shows an equivalent but more convenient representation of the same system.

[Figure: equivalent representation of the same system. The boxed subsystem is the connection of the controller Cq with the multi-estimator, driven by the signal eq, with y = yq − eq fed back to the plant.]

The system inside the box is called the injected system. It is the connection of the q-th controller
with the multi-estimator, and eq is an input injected into it.
−→ The reason this transformation is useful is because, as we already discussed earlier, the control
design should really be based on the design model (multi-estimator in the present setting) and not
on the unknown plant itself.
We now state, in very informal terms, a representative “theorem” which gives sufficient condi-
tions for detectability.

Theorem 12 Let xP , xC , xE be the states of the plant, the controller, and the multi-estimator, re-
spectively. Assume that:

1) The injected system satisfies


eq small =⇒ xC , xE small

2) The plant satisfies


uq , y small =⇒ xP small

Then the overall closed-loop system is detectable, i.e.,

eq small =⇒ xP , xC , xE small

“Proof”—almost immediate:
If eq is small, then 1) guarantees that xC and xE are small.
uq and yq are then small since they are functions of xC and xE .
Hence y = yq − eq is also small.
Finally, 2) guarantees that xP is small.
The above result can be made completely rigorous. Property 1) of the injected system can
be formalized using the ISS notion, i.e., by saying that the controller input-to-state stabilizes the
multi-estimator. Thus the material of Section 7.2 is directly relevant here.
−→ Actually, here things are even simpler because the estimation error eq , with respect to which
we want to achieve ISS, is known and can be used by the controller (while in Section 7.2 the
disturbance d was assumed to be unknown).
−→ If the dynamics are linear, then as we know this ISS property is automatic from the internal
asymptotic stabilization.
Property 2) is detectability of the plant, and can be formalized in a way similar to ISS (input-
output-to-state stability). A formal proof then requires some manipulations with class K∞ and KL
functions.
Actually, ISS is an overkill. Can use a weaker property, called integral-ISS, instead. This is
suitable because estimation errors are typically small in an integral sense, since we integrate them
to construct monitoring signals. The injected system we get for Example 10 with the obvious choice
of the control laws
up = −(y² + y)/p,        p ∈ P
is not ISS, but it can be shown to be integral ISS.
Here is another representative result:

Theorem 13 Assume that:

1) The injected system satisfies


eq small =⇒ yq small

2) The plant satisfies


y small =⇒ xP , uq small

3) The controller satisfies


y, uq small =⇒ xC small

4) The multi-estimator satisfies

y, uq , yq small =⇒ xE small

Then the overall closed-loop system is detectable, i.e.,

eq small =⇒ xP , xC , xE small

1) is input-to-output stability of the injected system. This is weaker than 1) in the previous
theorem.
2) is essentially¹³ the minimum-phase property of the plant. This is stronger than 2) in the
previous theorem. The requirement that the plant be minimum-phase is actually quite common in
adaptive control; see [Ioannou-Sun, pp. 25, 332, 412, and elsewhere].
3), 4) are detectability of the controller and multi-estimator. These are usually reasonable
assumptions.
Can “prove” this result by a simple signal-chasing argument along the lines of the previous one.
Try it!

9.3.2 Achieving bounded error gain and non-destabilization

We now turn to the Bounded Error Gain and Non-Destabilization properties, which need to be
enforced by the switching logic.
The Bounded Error Gain property is about making eσ small. Its counterpart in continuous
adaptive control was the smallness of the output estimation error e (in the sense of L2 or convergence
to 0). Here, the bound for eσ is usually stated in terms of suitable integral expressions for the
estimation errors, which are related to the monitoring signals.
The Non-Destabilization property, as we discussed, is about σ switching slowly enough (in the
sense of dwell time, average dwell time, or termination of switching if we’re lucky). Its counterpart
in continuous adaptive control was slow adaptation speed (θ̂˙ being in L2 or converging to 0).
It’s not hard to see that these two properties are actually conflicting. To enforce bounded error
gain, we want to switch to arg minp µp (t) as quickly as possible. But for non-destabilization, we
want to switch slowly enough (or even stop). So, need to find a compromise.
One option is to use the dwell-time switching logic, which enforces dwell time by construction.
This switching logic ensures non-destabilization, and is easy to implement. Bounded error gain
is harder to get, because if the switching is only allowed at event times separated by τD , then µσ is
not always guaranteed to be small compared to the other monitoring signals. Although there are
ways to handle this problem, we see that dwell-time switching has significant shortcomings. With
a prespecified dwell time, the performance of the currently active controller might deteriorate to an
unacceptable level before the next switch is permitted. If the system is nonlinear, the trajectories
may even escape to infinity in finite time.

[Flowchart: dwell-time switching logic. Initialize σ and reset a timer τ = 0 with τ̇ = 1; wait until τ ≥ τD; then, if there exists p with µp ≤ µσ, set σ = p and reset the timer; otherwise keep checking.]

¹³ The definition of minimum phase is stated in terms of y ≡ 0, while here we say “y is small.” For linear
plants this makes no difference. For nonlinear plants, we need the notion of a “strongly minimum phase”
system as defined on page 86.
Another option (more suitable for nonlinear plants) is to use the hysteresis switching logic.
Hysteresis means that we do not switch every time minp∈P µp (t) becomes smaller than µσ (t), but
switch only when it becomes “significantly” smaller. The threshold of tolerance is determined by a
hysteresis constant h > 0.
We already used this idea in Section 9.2. The only difference is that here we are using multi-
plicative and not additive hysteresis. (This is better for technical reasons; in particular, σ doesn’t
change if we scale all monitoring signals by some, possibly time-varying, factor.)
We already saw that when ep∗ is exponentially converging to 0 (the case of no disturbance),
switching stops in finite time.
It turns out that when ep∗ is just bounded (the case of bounded disturbance), the switching
doesn’t stop but we do get average dwell time, which can be made large by increasing the hysteresis
constant h. This is quite a remarkable result¹⁴ due to Hespanha and Morse.
¹⁴ In fact, the concept of average dwell time originally arose out of the study of hysteresis-based switching
logics for adaptive control.

[Flowchart: hysteresis switching logic, scale-independent version. Initialize σ; if there exists p with (1 + h)µp ≤ µσ, set σ = p; otherwise keep the current σ.]

This gives the Non-Destabilization property.


The Bounded Error Gain property can also be obtained, in the form
∫_0^t |eσ(τ)|² dτ  ≤  |P| (1 + h) min_{p∈P} ∫_0^t |ep(τ)|² dτ

where |P| is the cardinality of P.


−→ This is only valid when the index set P is finite. When P is infinite, we need to consider its
finite partition, and the switching logic needs to be modified.

Summary of switching adaptive control Putting together the above design ingredients:

• Multi-estimator design giving the Matching property

• Input-to-state stabilizing controller design (via Theorem 12)

• Hysteresis switching logic (giving average dwell time and bounded error gain)

one obtains quite general results for linear systems as well as some classes of nonlinear systems.
The analysis proceeds along the lines outlined on page 114. It is of course not quite as simple, but
arguably simpler than analysis of traditional, continuously-tuned adaptive control algorithms. The
observed performance is also typically quite good. See the references given at the beginning of this
chapter.
Note that the first two bullets above are reiterations of issues we’ve already seen in continuous
adaptive control. First, we need to be able to design an estimation scheme—this places constraints
on what types of systems we can deal with (and in particular, on the way in which uncertain
parameters can enter the dynamics). In continuous adaptive control we typically required the
parameters to enter linearly, but this is not necessary here as long as state-sharing is possible.
Second, the controllers need to provide robustness to parameter estimation errors, beyond the
usual certainty equivalence stabilization assumption. We saw this type of property earlier already.

The last bullet is specific to switching adaptive control. Switching in fact allows us to overcome
some difficulties associated with continuous tuning, as already discussed earlier.
Optional exercise: simulate Example 10. (The ISS condition is not easily enforceable, but the
controller works anyway.)
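Here is one possible such simulation (a sketch, not part of the notes): it puts together the state-shared multi-estimator (100)-(101), the monitoring signals (105), the scale-independent hysteresis logic, and the candidate controllers up = −(y² + y)/p. All numerical values (p∗, P, gains, h) are made up, and the run is deliberately started with a wrong-sign controller in the loop.

import numpy as np

pstar = 1.0                       # true (unknown) parameter
P     = [-1.0, -0.5, 0.5, 1.0]    # candidate controller indices
am, h = 5.0, 0.2                  # estimator pole and hysteresis constant
dt, T = 1e-3, 6.0

y, z1, z2 = 0.2, 0.0, 0.0         # plant and shared estimator states
eta = np.zeros(3)                 # monitoring-signal states, cf. (105)
sigma = P[0]                      # deliberately start with a destabilizing controller
switches = []

def mu(p):                        # mu_p = eta1 + p eta2 + p^2 eta3
    return eta[0] + p * eta[1] + p**2 * eta[2]

for k in range(int(T / dt)):
    u = -(y**2 + y) / sigma                   # active candidate controller

    e1 = z1 - y                               # so that e_p = e1 + p z2
    eta += dt * np.array([e1**2, 2 * z2 * e1, z2**2])

    dy, dz1, dz2 = y**2 + pstar * u, -am * z1 + am * y + y**2, -am * z2 + u
    y, z1, z2 = y + dt * dy, z1 + dt * dz1, z2 + dt * dz2

    best = min(P, key=mu)                     # scale-independent hysteresis logic
    if best != sigma and (1 + h) * mu(best) <= mu(sigma):
        switches.append((round(k * dt, 3), best))
        sigma = best

print("switches (time, new index):", switches)   # expect an early switch to p = 1
print("final |y| =", abs(y), " final sigma =", sigma)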

10 Singular perturbations
Reference: [Khalil, Chapter 11]

10.1 Unmodeled dynamics


In this course, the modeling uncertainty in the plant has primarily been described by unknown
parameters (parametric uncertainty). But this is not the only form of modeling uncertainty.
A more general formulation should allow for the presence of dynamic uncertainties, i.e., dy-
namics which are neglected at the control design stage (because they are hard to model and/or are
insignificant in some sense). Constant unknown parameters are then a special case (θ̇ = 0).

Example 11 Suppose that the plant is given by the transfer function

g(s) = b / ((s − a)(εs + 1))

where a, b, ε are unknown parameters, b > 0, and ε > 0 is small.


We can write this as a cascade of two systems:

y = (b/(s − a)) z,        z = (1/(εs + 1)) u

This suggests the following state-space representation:

ẏ = ay + bz
εż = −z + u        (111)

Here we think of ε as a small “parasitic” time constant, corresponding to a very fast extra pole
at −1/ε. It gives fast unmodeled dynamics. In the limit as ε → 0 the z-dynamics disappear, and
for ε = 0 the system reduces to
y = (b/(s − a)) u
or
ẏ = ay + bu (112)

while the differential equation for z degenerates into the hard constraint

z=u

One says that the system (111) is a singular perturbation of (112). The perturbation is “singu-
lar” because for ε > 0 the dimension jumps from 1 to 2, which is very different from perturbations
which simply contribute extra terms on the right-hand side of the nominal system.

There are other tools for handling unmodeled dynamics, based on robust control theory (and
small-gain theorems). We will not discuss those in this course (see [Ioannou-Sun, Sect. 8.2]).

10.2 Singular perturbations


The general singular perturbation model is

ẋ = f (t, x, z, ε)
εż = g(t, x, z, ε)

where x ∈ Rn and z ∈ Rm . We think of the x-system as the slow system and of the z-system as the
fast system, since for ε close to 0, ż becomes very large (imagine dividing the second ODE by ε).
For ε = 0, the second ODE becomes the algebraic constraint

g(t, x, z, 0) = 0

Suppose that we can solve this equation for z:

z = h(t, x)

Plugging this into the x-system, and keeping ε = 0, we obtain the reduced system

ẋ = f (t, x, h(t, x), 0) (113)

Denote the solution of this system (starting from a given initial condition) as x̄(t).
We would like to know to what extent x̄(t) serves as a good approximation to the actual solution
x(t), i.e., to the x-component of the solution of our overall singularly perturbed system, when ε is
very small.
Even if ε is small, initially z(t) may be quite far from h(t, x), and the reduced system will not
be a valid approximation. However, since the dynamics of z are fast, we can expect that z(t) will
converge to h(t, x) very fast, and after some (small) time t∗ , z will be close to h(t, x). Then, if we
have initialized x̄(t) correctly, this initial fast transient of z will not have significant effect and x̄(t)
will be close to x(t) for all time.
But, to make sure that z becomes close to its equilibrium value h(t, x), we need to have some
stability property for the z-system.

To analyze the z-system, it is convenient to make the following two transformations. First,
introduce
z̄ := z − h(t, x)
which has the effect of shifting the equilibrium to the origin. We have

εz̄˙ = εż − ε (d/dt)h(t, x) = g(t, x, z̄ + h(t, x), ε) − ε (d/dt)h(t, x)

Second, introduce the “stretched time”

τ := t/ε

We get

dz̄/dτ = ε dz̄/dt = g(t, x, z̄ + h(t, x), ε) − ε (d/dt)h(t, x)        (114)
Setting ε = 0 in the above equation, we obtain the auxiliary system

dz̄/dτ = g(t, x, z̄ + h(t, x), 0)        (115)

It is this system that we want to be stable. It has an equilibrium at z̄ = 0 by construction. In
this system, t and x are actually fixed parameters. Indeed, we have

t = ετ

and so by setting ε = 0 we are freezing the original time t (and hence x(t) also).
−→ Here t and τ give two completely different time scales.
Assumption 1: The system (115) is exponentially stable, uniformly in (t, x). This means that we
have an exponential estimate
|z̄(τ)| ≤ c e^{−λτ} |z̄(0)|
which holds for each fixed value of t and x in (115).
When ε is small but positive, t and x will vary—but slowly. In other words, in the τ time scale,
we’ll have a slowly time-varying system. And the ε-dependent terms will enter it. Using results on
stability of slowly-time varying systems (cf. Chapter 8) one can show that the actual z̄, described
by (114) for ε > 0, will eventually become of order ε, and x̄ will indeed be a good approximation
of x (again up to terms of order ε).
More precisely, the statement is as follows. Pick some times T > t∗ > 0. Let Assumption 1
hold, and let the system data satisfy appropriate technical conditions (smoothness, etc.). Then
there exists an ε∗ > 0 such that for all ε ∈ (0, ε∗) we have

x(t) − x̄(t) = O(ε) ∀ t ∈ [0, T ]



and
z(t) − h(t, x̄(t)) = O(ε) ∀ t ∈ [t∗ , T ]

This is known as Tikhonov’s theorem. See [Khalil, Theorem 11.1] for a precise statement (and
a proof).
Geometrically: in the (x, z)-space, consider the surface z = h(x) (assume the system is au-
tonomous). On this surface, the dynamics are defined by ẋ = f (x, h(x)), z = h(x). Trajectories of
the full (x, z)-system approach this surface fast and then stay in an “ε-tube” around it.

The above estimates apply only on a finite interval [0, T ]. The O(ε) terms are valid on a
compact subset of the state space that contains the state up to time T , and are not in general valid
as t → ∞.
We can actually get an approximation on the infinite time interval, if we assume that the
reduced system is stable.
Assumption 2: x = 0 is an exponentially stable equilibrium of the reduced system (113).
With this additional assumption, the previous result is valid for T = ∞. See [Khalil, Theorem
11.2].
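The following numerical sketch (not from the notes) illustrates the statement on Example 11 with made-up values a = −1, b = 1 and the input u(t) = sin t: the output of the full singularly perturbed model (111) is compared against the reduced model (112), for two values of ε.

import numpy as np

a, b, T, dt = -1.0, 1.0, 10.0, 1e-4

def full_model(eps):
    y, z = 0.0, 0.0
    for k in range(int(T / dt)):
        u = np.sin(k * dt)
        dy, dz = a * y + b * z, (-z + u) / eps
        y, z = y + dt * dy, z + dt * dz
    return y

def reduced_model():
    y = 0.0
    for k in range(int(T / dt)):
        y += dt * (a * y + b * np.sin(k * dt))   # plant (112), i.e. z replaced by u
    return y

ybar = reduced_model()
for eps in (0.1, 0.01):
    print("eps =", eps, " |y(T) - ybar(T)| =", abs(full_model(eps) - ybar))
# the mismatch should shrink roughly in proportion to eps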

The above result does not guarantee exponential stability of the origin. In fact, the origin is in
general not an equilibrium point of the full system, for ε > 0; we only know that (x, z̄) = (0, 0) is an
equilibrium for ε = 0. Note that z = h(t, x) is not an equilibrium of the z-subsystem unless ε = 0.
If we make further assumptions to ensure that h(t, 0) = 0 and (x, z) = (0, 0) is an equilibrium of
the full system for ε > 0, then its exponential stability does follow from Assumptions 1 and 2 via
Lyapunov analysis. See [Khalil, Theorem 11.4].

10.3 Direct MRAC with unmodeled dynamics


Go back to the system (111). Suppose that we want to solve the MRAC problem for it, with the
reference model
ẏm = −am ym + bm r
where am > 0 and bm are known and we take the reference signal to be

r = sin ωt

Let us first ignore the z-dynamics and follow the direct control approach of Section 6.5.1 where
we designed the adaptive controller for the scalar plant (112).
The control law we derived for direct MRAC is

u = −k̂y + l̂r        (116)



and the tuning laws are


k̂˙ := −γey,        l̂˙ := γer
where γ > 0.
Now, let’s see what happens if we apply the above adaptive controller to the singularly perturbed
plant (111). We have to repeat some of the derivations we did earlier for direct MRAC, with minor
differences.
First, we need the direct control reparameterization of the y-dynamics in terms of the “nominal”
controller gains
k = (a + am)/b,        l = bm/b
Compared to what we had in direct MRAC before, there is one additional term which gives the
difference between z and u:

ẏ = −am y + bm r + b(z + ky − lr)


= −am y + bm r + b(u + ky − lr) + b(z − u)

Plugging in the expression (116) for u into this, we get

ẏ = −am y + bm r + b(−k̃y + l̃r) + b(z − u)

From this, we see that the output tracking error

e := ym − y

satisfies
ė = −am e + bk̃y − bl̃r − b(z − u)

Next, use y = ym − e to rewrite the full closed-loop system using the (e, k̃, l̃) coordinates¹⁵ (here
A sin(ωt + α) is the steady-state part of ym(t); cf. the footnote):

d/dt [e ; k̃ ; l̃] = [ −am e + bk̃(A sin(ωt + α) − e) − bl̃ sin ωt ;  −γe(A sin(ωt + α) − e) ;  γe sin ωt ] − [b ; 0 ; 0](z − u)

where the first vector on the right-hand side is denoted by F(t, e, k̃, l̃), and

εż = −z + u

Define

x := (e, k̃, l̃)ᵀ
We can write the control law (116) as

u = −(k̃ + k)(ym (t) − e) + (˜l + l)r(t)


¹⁵ Up to terms decaying to 0; see footnote 8 in Section 6.5.1.

which takes the form


u = h(t, x)
where t on the right-hand side takes care of the dependence on r(t) and ym (t) (which are known
functions of time). We obtain the x-dynamics
 
ẋ = F(t, x) − [b ; 0 ; 0] (z − h(t, x))
The reduced model is thus
ẋ = F (t, x)
and we showed in Section 6.5.1 that it is exponentially stable.
We also need to consider
εż = −z + h(t, x)
The system (115) in this case is
dz̄/dτ = −z̄ − h(t, x) + h(t, x) = −z̄

and this is clearly exponentially stable uniformly over t and x.
Assumptions 1 and 2 of the previous subsection are satisfied, and we conclude that there exists
an ε∗ > 0 such that for all ε ∈ (0, ε∗) we have

x(t) − x̄(t) = O(ε) ∀t ≥ 0

In particular, output tracking is achieved up to order ε. This characterizes robustness of our direct
MRAC design to fast unmodeled dynamics. (The general result also provides a similar estimate
for z, but it’s not important for us here.)
Can we try to show that the closed-loop system is exponentially stable for ε small enough?
No, because the origin is not even an equilibrium of the closed-loop system. This is because

h(t, 0) = −kym(t) + lr(t) ≠ 0

(See also the last paragraph of the previous subsection.) The effect of unmodeled dynamics does
not necessarily diminish with time, the displacement is small but persistent.
Optional exercise: simulate the system and investigate its behavior.
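One possible such simulation is sketched below (not part of the notes; all numerical values are made up). It implements the control law (116) and the tuning laws above on the singularly perturbed plant (111); for small ε the tracking error should settle to a small residual value, and increasing ε shows the design degrade.

import numpy as np

a, b   = 1.0, 2.0           # true plant parameters (unknown to the controller)
am, bm = 2.0, 2.0           # reference model
gamma  = 2.0                # adaptation gain
omega  = 1.0
eps    = 0.01               # parasitic time constant of the unmodeled dynamics
dt, T  = 1e-4, 60.0

y, z, ym = 0.0, 0.0, 0.0
kh, lh   = 0.0, 0.0         # k_hat, l_hat

for i in range(int(T / dt)):
    t = i * dt
    r = np.sin(omega * t)
    u = -kh * y + lh * r                   # control law (116)
    e = ym - y                             # tracking error

    kh += dt * (-gamma * e * y)            # tuning laws
    lh += dt * ( gamma * e * r)
    ym += dt * (-am * ym + bm * r)         # reference model
    y  += dt * (a * y + b * z)             # plant (111)
    z  += dt * (-z + u) / eps

print("final |ym - y| =", abs(ym - y), " k_hat =", kh, " l_hat =", lh)
# the tracking error should be small by the end of the run; the nominal gains
# are k = (a + am)/b = 1.5 and l = bm/b = 1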

11 Conclusion
So, can we now define what “adaptive control” means?
Remaining lectures: final project presentations.
