Nonlinear and Adaptive Control
Nonlinear and Adaptive Control
Daniel Liberzon
Disclaimers
Don’t print future lectures in advance as the material is always in the process of being
updated. You can consider the material here stable 2 days after it was presented in class.
These lecture notes are posted for class use only.
This is a very rough draft which contains many errors.
I don’t always give proper references to sources from which results are taken. A lack of
reference does not mean that the result is original. In fact, all results presented in these
notes (with possible exception of some simple examples) were borrowed from the literature
and are not mine.
Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1 Motivating example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Course logistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2 Weak Lyapunov functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.1 LaSalle and Barbalat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Connection with observability . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Back to the adaptive control example . . . . . . . . . . . . . . . . . . . . . . 15
3 Minimum-phase systems and universal regulators . . . . . . . . . . . . . . . . . . . . 17
3.1 Universal regulators for scalar plants . . . . . . . . . . . . . . . . . . . . . . . 18
3.1.1 The case b > 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.1.2 General case: non-existence results . . . . . . . . . . . . . . . . . . . 20
3.1.3 Nussbaum gains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2 Relative degree and minimum phase . . . . . . . . . . . . . . . . . . . . . . . 23
3.2.1 Stabilization of nonlinear minimum-phase systems . . . . . . . . . . 27
3.3 Universal regulators for higher-dimensional plants . . . . . . . . . . . . . . . 29
4 Lyapunov-based design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.1 Control Lyapunov functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.1.1 Sontag’s universal formula . . . . . . . . . . . . . . . . . . . . . . . 35
4.2 Back to the adaptive control example . . . . . . . . . . . . . . . . . . . . . . 38
5 Backstepping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.1 Integrator backstepping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.2 Adaptive integrator backstepping . . . . . . . . . . . . . . . . . . . . . . . . . 44
6 Parameter estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.1 Gradient method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3
4 DANIEL LIBERZON
1 Introduction
The meaning of “nonlinear” should be clear, even if you only studied linear systems so far (by
exclusion).
The meaning of “adaptive” is less clear and takes longer to explain.
From www.webster.com:
Adaptive: showing or having a capacity for or tendency toward adaptation.
Adaptation: the act or process of adapting.
Adapt: to become adapted.
Perhaps it’s easier to first explain the class of problems it studies: modeling uncertainty. This
includes (but is not limited to) the presence of unknown parameters in the model of the plant.
There are many specialized techniques in adaptive control, and details of analysis and design
tend to be challenging. We’ll try to extract fundamental concepts and ideas, of interest not only
in adaptive control. The presentation of adaptive control results will mostly be at the level of
examples, not general theory.
The pattern will be: general concept in nonlinear systems/control, followed by its application
in adaptive control. Or, even better: a motivating example/problem in adaptive control, then
the general treatment of the concept or technique, then back to its adaptive application. Overall,
the course is designed to provide an introduction to further studies both in nonlinear systems and
control and in adaptive control.
ẋ = θx + u
u = −(θ + 1)x
˙
θ̂ = x2 (1)
u = −(θ̂ + 1)x (2)
ẋ = (θ − θ̂ − 1)x
˙
θ̂ = x2
Intuition: the growth of θ̂ dominates the linear growth of x, and eventually the feedback gain
θ̂ + 1 becomes large enough to overcome the uncertainty and stabilize the system.
Analysis: let’s try to find a Lyapunov function.
If we take
x2
V :=
2
then its derivative along the closed-loop system is
V̇ = (θ − θ̂ − 1)x2
Theorem 1 (Lyapunov) Let V be a positive definite C 1 function. If its derivative along solutions
satisfies
V̇ ≤ 0 (5)
everywhere, then the system is stable. If
V̇ < 0 (6)
everywhere (except at the equilibrium being studied), then the system is asymptotically stable. If
in the latter case V is also radially unbounded (i.e., V → ∞ as the state approaches ∞ along any
direction), then the system is globally asymptotically stable.
From (4) we certainly have (5), hence we have stability (in the sense of Lyapunov). In particular,
both x and θ̂ remain bounded for all time (by a constant depending on initial conditions).
On the other hand, we don’t have (6) because V̇ = 0 for (x, θ̂) with x = 0 and θ̂ arbitrary.
It seems plausible that at least convergence of x to 0 should follow from (4). This is indeed true,
but proving this requires knowing a precise result about weak (nonstrictly decreasing) Lyapunov
functions. We will learn/review such results and will then finish the example.
• Even though the plant is linear, the control is nonlinear (because of the square terms). To
analyze the closed-loop system, we need nonlinear analysis tools.
• The control law is dynamic as it incorporates the tuning equation for θ̂. Intuitively, this
equation “learns” the unknown value of θ, providing estimates of θ.
• The standard Lyapunov stability theorem is not enough, and we need to work with a weak
Lyapunov function. As we will see, this is typical in adaptive control, because the esti-
mates (θ̂) might not converge to the actual parameter values (θ). We will discuss techniques
for making the parameter estimates converge. However, even without this we can achieve
regulation of the state x to 0 (i.e., have convergence of those variables that we care about).
Adaptation
u y
Controller Plant
A priori, this is no different from any other (non-adaptive) dynamic feedback. But the main
feature of the adaptive controller is that it achieves the control objective (regulation) despite large-
scale uncertainty associated with θ (which can be any real number). This is to be contrasted with
robust control where uncertainty range is usually bounded and small.
So, one may call a controller adaptive if it can handle such systems, i.e., if it can “adapt”
to large uncertainties. Ideally we want an adaptive controller to handle more than just constant
unknown parameters: parameters may vary with time, there may be unmodeled dynamics, noise
and disturbance inputs, etc.
Certainty equivalence principle:
The way we designed the controller in Example 1 was:
• Then we used the control u = −(θ̂ + 1)x. I.e., we used the estimate for control purposes,
pretending that it is correct, even though it may not be (and it works, at least for this
example).
Adaptive controllers designed using the above logic are called certainty equivalence controllers.
This means essentially that estimator design and controller design were decoupled. A vast majority
of adaptive control techniques are based on certainty equivalence principle, although in some situa-
tions it may be desirable to rethink this and to design the controller by explicitly taking parameter
estimation into account. (More on this later.)
Prerequisite is ECE 515 (Linear Systems). Some knowledge of nonlinear systems (such as
Lyapunov stability, actually covered to some extent in ECE 515) is a plus. If everything in the
above discussion was familiar to you, you should be OK.
There is no single textbook for this course. Lecture notes will be posted on the class website.
Material for the lecture notes is drawn from several sources.
• Nonlinear systems:
Khalil, Nonlinear Systems, Prentice-Hall, 2002 (third edition). This is a standard and very
good text, which is also the main text for ECE 528 (Nonlinear Systems). It has some adaptive
control examples too (see index). I recommend that you get this book as it is quite useful
for this course and you’ll need it for ECE 528 anyway.
• Some material is also drawn from lecture notes on adaptive control by A. S. Morse (com-
municated privately to the instructor). These are in turn based on research articles, I can
provide the references upon request.
For outline of the topics, see the table of contents. On average, 1–1.5 weeks will be spent on
each topic (except parameter estimation which is longer).
−→ Since everything discussed in class will be in the lecture notes posted on the web (well, except
perhaps some pictures I might draw on the board to explain something better), you don’t have to
worry about copying things down. Instead, try to understand and participate in the discussion as
much as possible.
Grading scheme: Homework—50%, Final Project—50%.
Homework: There will be about 4 problem sets. Some MATLAB simulations, to validate the
designs discussed in class and study their robustness properties. Theoretical questions as well.
Project: Topic to be defined by you, based on reading papers or your own research. Should
have a nonlinear systems/control or adaptive control component related to the course material.
Application-oriented projects are especially welcome. The project will consist of an oral presenta-
tion and/or a written report, details to be announced later
NONLINEAR AND ADAPTIVE CONTROL 11
−→ Come to discuss your project with me during office hours, so I can confirm that it is appropriate
and doesn’t overlap with projects of other students.
∂V
V̇ (x) := · f (x)
∂x
Theorem 2 Suppose that for some nonnegative definite continuous function W we have
Then, for every solution x(t) that remains bounded for all t ≥ 0, it is true that
W (x(t)) → 0 as t → ∞ (9)
Comments:
• It is very important to remember that the claim (9) is made only for bounded solutions.
Boundedness of x(t) will be used in the proof. So if, for example, |x(t)| → ∞ then it does
not necessarily satisfy (9). (There is a famous paper by Monopoli which makes this mistake
while trying to prove stability of an adaptive control system, and for many years many people
have been trying to determine whether his claim is still correct or not.)
• Boundedness of all solutions of (7) follows from (8) if V is radially unbounded . As we will
see, sometimes boundedness of solutions can be shown by a separate argument, so we don’t
require it in Theorem 8.
• Note that, unlike in Theorem 1, here V is not required to be positive definite, only positive
semidefinite. This is useful in adaptive control, where one often works with functions V that
depend on some but not all states. Unless V is positive definite, we can’t use Theorem 1 to
conclude stability in the sense of Lyapunov, but we’ll rarely need this.
12 DANIEL LIBERZON
• LaSalle’s theorem makes a more precise claim, namely, that every bounded solution ap-
proaches the largest positive-invariant set inside the set
{x : V̇ (x) = 0}
which in turn belongs to the set {x : W (x) = 0} in view of (8). As a corollary, we have
the following stability result, known as Barbashin-Krasovskii-LaSalle theorem: If V is as in
Theorem 2 and V̇ does not stay identically 0 along any nonzero trajectory, then all bounded
solutions converge to 0. The simpler but weaker claim of Theorem 2 remains true for time-
varying systems1 ẋ = f (t, x), while LaSalle’s theorem in general does not. For our purposes,
just convergence of W (x(t)) to 0 will usually be enough.
where the second inequality follows from the fact that V (x) ≥ 0 ∀ x. Since the above calculation
is true for every t, and V (x(0)) is a constant (fixed by the choice of initial condition), we can take
the limit as t → ∞ and conclude that
Z ∞
W (x(s))ds < ∞ (10)
0
The integral is of course also bounded from below by 0 because W is nonnegative definite.
We now need to show that the finiteness of the above improper integral, together with the
hypotheses of Theorem 2 that we haven’t used yet (which ones are these?) implies (9). We
formulate this remaining step as a lemma, because it is often useful as an independent result.
Lemma 3 (“Barbalat’s lemma”) Let x(t) be bounded and let its derivative ẋ(t) also be bounded.
Let W be a continuous function of x such that the integral
Z ∞
W (x(s))ds
0
Theorem 2 follows from Barbalat’s lemma because ẋ = f (x) and this is bounded if x is bounded.
Barbalat’s lemma seems intuitively obvious. In fact, how can a function whose integral over
[0, ∞) is finite not converge to 0? Well, here’s one possibility:
NONLINEAR AND ADAPTIVE CONTROL 13
W (x(t))
If the areas under the pulses decrease as geometric series, then the total area will be finite, even
though the function does not converge to 0. Of course, we can smooth out all corners and make
this function C 1 , no problem.
So, what’s the catch? Why can’t such behavior come from a bounded solution of the system (7)?
Reason: this requires unbounded derivative. Indeed, to come up to the same height and back
down over shorter and shorter time intervals, the pulses would need to get steeper and steeper. So,
if we invoke the condition that ẋ is bounded, then we can indeed show rigorously that we should
have convergence.
Proof of Barbalat’s lemma. Note: in this proof we are still assuming for simplicity that
W ≥ 0, as in the Theorem, but with a little extra care the argument still goes through if W is not
sign definite.
Suppose that W (x(t)) does not converge to 0 as t → ∞. Then, there would have to exist a
number ε > 0 and a sequence of times {tk } converging to ∞ at which we have
W (x(tk )) ≥ ε ∀k
(To see why, just recall the definition of convergence: ∀ ε > 0 ∃ T > 0 such that |W (x(t))| ≤ ε
∀ t ≥ T .)
This situation is similar to the one in the previous figure, ε being the pulse height.
Since W is a continuous function of x, and since x(t) is bounded for all t, there exists a constant
∆x > 0 with the following property:
This is because for any choice of time t such that x(t) is close enough to some x(tk ), the correspond-
ing values of W will be within ε/2 of each other. (This is uniform continuity of W as a function
of x. Boundedness of x(·) is crucial here, because otherwise, a uniform constant ∆x with the above
property may not exist; think of the function W (x) = x2 over an infinite range of x.)
1
Under appropriate Lipschitzness and boundedness conditions on f ; see [Khalil, Theorem 8.4].
14 DANIEL LIBERZON
We have a similar property for x as a function of t: there exists a constant ∆t > 0 such that
This is true because the derivative ẋ is bounded, so the time it takes for x to grow by ∆x is bounded
from above.
The function W (x(t)) is a composition of W (x) and x(t). Combining the above two properties,
we see that for t ∈ [tk , tk + ∆t], k = 1, 2, . . . we have W (x(t)) ≥ ε/2. Hence, each such interval
contributes an area of at least ∆t · ε/2 to the total integral in (10). Since there are infinitely many
such intervals, the total integral cannot be finite, which is a contradiction.
ẋ = Ax
V (x) = xT P x
where P is a symmetric positive definite matrix. Its derivative along solutions is given by
V̇ (x) = xT (P A + AT P )x
This is nonnegative definite if for some (not necessarily square) matrix C we have
P A + AT P ≤ −C T C ≤ 0 (11)
V̇ (x) ≤ −xT C T Cx = −y T y
where we defined
y := Cx
NONLINEAR AND ADAPTIVE CONTROL 15
We can get an asymptotic stability result from this, provided we have the correct observability
property of the time-varying pair (A(t), C(t)). This goes as follows. The observability Gramian is
defined as Z t0 +T
M (t0 , t0 + T ) := ΦT (t, t0 )C T (t)C(t)Φ(t, t0 )dt
t0
where Φ(·, ·) is the system transition matrix [ECE 515]. The system is said to be uniformly com-
pletely observable (UCO) [Kalman, 1960] if for some positive constants T, β1 , β2 we have
β1 I ≤ M (t0 , t0 + T ) ≤ β2 I ∀ t0
For UCO systems, the implication y → 0 ⇒ x → 0 is still true; this follows from the identity
Z t0 +T
ΦT (t, t0 )C T (t)y(t)dt = M (t0 , t0 + T )x(t0 )
t0
For LTI systems, the dependence on t0 disappears and UCO simply says that the observability
Gramian (which is now defined more explicitly in terms of matrix exponentials) is positive definite
for some choice of T . This is the usual observability notion, which is equivalent to the well-known
rank condition.
We can now easily see that for LTV systems, (12) plus UCO give asymptotic stability. It can
be shown that asymptotic stability is uniform (with respect to initial time), which implies that it is
in fact exponential (since the system is linear). See [Khalil, Example 8.11] for an argument slightly
different from the one given above, which proves the exponential stability claim rigorously. This
result is also stated in [Ioannou-Sun, Theorem 3.4.8] and it will be useful for us in the sequel.
ẋ = (θ − θ̂ − 1)x (13)
˙
θ̂ = x2 (14)
x2 (θ̂ − θ)2
V (x, θ̂) = +
2 2
with derivative
V̇ (x, θ̂) = (θ − θ̂ − 1)x2 + (θ̂ − θ)x2 = −x2 ≤ 0
Let’s check carefully that Theorem 2 indeed applies. V is radially unbounded, hence all trajec-
tories are bounded (as we said earlier). Define
W (x, θ̂) := x2
which is nonnegative definite and continuous. (Remember that the system state includes both x
and θ̂.) Thus by Theorem 2 we have W (x(t)) → 0, hence x(t) converges to 0 as needed.
On the other hand, Theorem 2 tells us nothing about θ̂(t). It may not converge to θ, or to
anything else. Indeed, the line in the (x, θ̂) space given by
{(x, θ̂) : x = 0}
consists entirely of equilibria (this is obvious from the system equations). This line is exactly the
set
{(x, θ̂) : V̇ (x, θ̂) = 0}
This set is invariant, so even the stronger LaSalle’s theorem gives nothing further than convergence
to this line. For example, it is clear that if we start on this line, with any value of θ̂, then we’ll stay
there and θ̂ won’t change.
−→ Since the value of θ is unknown, an interesting feature of the above Lyapunov function is that
it’s not completely known, but is instead an abstract function whose existence is guaranteed. This
is typical in adaptive control.
−→ Problem Set 1 is assigned.
When we first studied the example, we also tried the candidate Lyapunov function
x2
V (x, θ̂) = (15)
2
NONLINEAR AND ADAPTIVE CONTROL 17
whose derivative is
V̇ = (θ − θ̂ − 1)x2 (16)
This can be both positive or negative, so Theorem 2 does not apply. In fact, we can’t even use this
V to show boundedness of solutions. However, with a bit of clever analysis and by using Barbalat’s
lemma, we can still use this V to show that x(t) → 0 as follows.
˙
First, use the θ̂ equation (14) to rewrite (16) as
˙ ˙ ˙
V̇ = (θ − θ̂ − 1)θ̂ = (θ − 1)θ̂ − θ̂θ̂
We already know that x is bounded, and ẋ is also bounded because of (13). So, we can apply
Barbalat’s lemma and conclude that x(t) → 0 as t → ∞, and we’re done.
This argument is certainly more complicated than the previous one based on Theorem 2. How-
ever, it is often difficult to find a weak Lyapunov function satisfying the hypotheses of Theorem 2,
and so it is useful to have as many tools as possible at our disposal. The above steps—integrating
differential equations, showing boundedness, and invoking Barbalat’s lemma—are used very often
in adaptive control.
A controller is called a universal regulator for a given class of plants if, when connected to any
plant from this class, it guarantees that:
under arbitrary initial conditions (of both the plant and the controller).
Note: even though we used the notation θ̂, we are not really trying to estimate the value of θ.
We have no reason to believe that θ̂ − θ becomes small in any sense whatsoever. We can think of a
universal regulator as doing some kind of exhaustive search through the space of possible controller
gains, so that eventually it will find the gains that will stabilize the given unknown plant, whatever
it is (as long as it is in a prescribed class). Later, we will study different controllers which explicitly
rely on parameter estimation and certainty equivalence. (See Section 6.5 and Section 8.2.)
−→ It is hard to make a rigorous distinction between universal regulators and estimation-based
adaptive controllers. This is because in both cases the controller has dynamics, and it’s impossible
to define formally when these dynamics are estimating anything and when they are not. The
difference is not so much in the appearance of the controller, but in the design philosophy behind
it. With some experience, it is usually easy to tell one from another.
Exhaustive search means that transient behavior might be poor (and this is true). However,
from the theoretical point of view the possibility of designing universal regulators for large classes
of plants is quite interesting.
Next, we want to see if universal regulators exist for larger classes of plants.
We first consider the simpler case where the sign of the control gain b (the “high-frequency gain”)
is known, and with no loss of generality we take it to be positive.
It turns out that this makes relatively minor difference with Example 1 (where b was equal to
1). In fact, the same controller works, and almost the same analysis applies.
NONLINEAR AND ADAPTIVE CONTROL 19
Controller:
k̇ = y 2 (19)
u = −ky
This notation better reflects the fact that we’re not estimating a and b, but just searching the
space of controller gains for a suitable one. The differential equation for k is a tuning law, or
tuner. (k here plays the role of θ̂ + 1 before.)
Closed-loop plant:
ẏ = (a − bk)y (20)
Its derivative is
V̇ = (a − bk)y 2 = (a − bk)k̇
We now go back to the general case where the sign of b in (18) can be arbitrary (as long as b 6= 0).
It turns out that this extra uncertainty makes the problem of designing a universal regulator
significantly more challenging.
Let us consider a controller of the form
ż = f (z, y)
(22)
u = h(z, y)
where z ∈ R (i.e., the controller dynamics are scalar) and f and h are continuous rational functions
(i.e., ratios of polynomials with no real poles).
−→ Our previous controller is of this form—it’s polynomial.
Claim: No such 1-D rational controller can be a universal stabilizer for (18) with unknown
sign of b.
Closed-loop system:
ẏ = ay + bh(z, y)
(23)
ż = f (z, y)
−→ Remember the definition of a universal regulator: it must globally stabilize all plants in the
class. So, to show that the above controller cannot be a universal regulator, we must show that for
every choice of f and h, there exists a choice of values for a and b such that y 6→ 0 (at least for
some special bad choices of initial conditions).
First, we show that f cannot be an identically zero function. If it were, then the value of z
would be constant: z ≡ z0 , and the plant would be
ẏ = ay + bh(z0 , y)
a = 1, b=1
Then we have
ẏ = y + h(z0 , y) > 0 ∀y > 0
This means that for a positive initial condition, the value of y can only increase, so it cannot go to
0.
Now suppose that h(z0 , y0 ) = 0 for some y0 > 0. Then take, for example,
a = 0, b arbitrary
NONLINEAR AND ADAPTIVE CONTROL 21
This gives
ẏ = bh(z0 , y)
for which y = y0 is an equilibrium. Again, y cannot go to 0 from all initial conditions.
So, f is not identically zero, i.e., the controller has nontrivial dynamics.
−→ Note: in the above argument, rationality of h was in fact not used. In other words, no static
controller can be a universal regulator. (Rationality will become important when f is not identically
zero, see below.)
Therefore, there must exist a z0 for which f (z0 , y) is a nonzero function of y. Since it is rational,
it has finitely many zeros. Hence, there exists a y0 > 0 such that f (z0 , y) has the same sign for all
y ≥ y0 . Assume that this sign is positive (the other case is similar):
Now consider h(z, y0 ) as a function of z. It is also rational, hence there exists a z1 ≥ z0 such
that h(z, y0 ) has the same sign for all z ≥ z1 . Again, assume for concreteness that this sign is
positive.
By continuity, h(z, y0 ) is then bounded from below for z ≥ z0 (by some possibly negative but
finite number).
Now, pick a > 0, and then pick b > 0 small enough so that
Let’s now look at the plant (23) and the inequalities (24) and (25). We see that solutions cannot
leave the region
{(y, z) : y ≥ y0 , z ≥ z0 }
because everywhere on the boundary of this region they are directed inside. In other words, it is
an invariant region for the closed-loop system.
Therefore, convergence of y to 0 (i.e., convergence to the z-axis) from initial conditions inside
this region is not achieved.
Working out the remaining cases is left as an exercise.
Does a universal regulator exist at all in the general case of unknown sign of b?
R. Nussbaum showed in 1983 that it does. (It is interesting that he’s a pure mathematician
at Rutgers and doesn’t work in adaptive control; he learned about this problem from a paper by
Morse and came up with the idea. His paper is in Systems and Control Letters, vol. 3, pp. 243-246;
the solution we give here is taken from Morse’s notes and is a bit simpler than what Nussbaum
had. The previous non-existence result is also established by Nussbaum in the same paper.)
22 DANIEL LIBERZON
In other words, this function keeps crossing back and forth between positive and negative values
while giving higher and higher absolute value of the above integral.
Example: N (k) = k cos k.
−→ Since N must clearly have an infinite number of zeros, it is not rational, so the previous
non-existence result does not apply to this controller.
With this modified controller, the analysis goes through much like before:
y2
V =
2
gives
V̇ = (a − bN (k)k)y 2 = (a − bN (k)k)k̇
Integrating, we get
Z k(t)
y 2 (t)
= ak(t) − b N (s)sds + C (26)
2 0
where C is a constant determined by the initial conditions.
We have as before that k is monotonically nondecreasing, so it either has a finite limit or grows
to infinity. If it were unbounded, then the right-hand side of (26) would eventually become negative
(no matter what the sign of b is!) Indeed, this will happen when k reaches a value k̄ for which
Z k̄
1 C
b N (s)sds > a +
k̄ 0 k̄
After this, the analysis is exactly the same as before: y ∈ L2 (since k is bounded), y is bounded,
ẏ is bounded, and by Barbalat y → 0.
An important observation is that in the above construction, N (·) doesn’t need to be continuous.
We do need it to be locally bounded (to ensure that boundedness of k implies boundedness of N (k)
and hence boundedness of ẏ) and nice enough so that solutions of the closed-loop system are well
defined. Piecewise continuous N (·) will do.
In fact, piecewise constant N (·) is probably the easiest to design.
The values of N don’t actually need to grow in magnitude, they can just be maintained on
longer and longer intervals to satisfy the condition. This is an example of a switching logic.
This design was proposed by Willems and Byrnes in 1984. Similar ideas were presented earlier in
the Russian literature (work of Fradkov). It was probably the first example of a switching adaptive
control algorithm, which motivated a lot of subsequent work. We’ll talk more about switching
adaptive control later in the course (although in a different context: estimator-based). The best
way to design the switching is to use feedback, rather than a predefined switching pattern.
ẋ = Ax + Bu
y = Cx
Assume that it is SISO, i.e., u and y are scalar (and so B and C are vectors, not matrices).
We’ll briefly mention MIMO case later.
Relative degree is, basically, the number of times that we need to differentiate the output until
the input appears.
ẏ = CAx + CBu
where CB is actually a scalar since the system is SISO.
If CB 6= 0, then relative degree equals 1. If CB = 0, then we differentiate again:
ÿ = CA2 x + CABu
which means that y (r) is the first derivative that depends on u because
The terminology “relative degree” is motivated by the following. Consider the transfer function
of the system:
q(s)
G(s) = C(Is − A)−1 B = (27)
p(s)
Then
r = deg(p) − deg(q)
i.e., relative degree is the difference between the degrees of the denominator and numerator poly-
nomials.
To define relative degree for MIMO systems, we need to differentiate each output until some
input appears. We’ll have (assuming both y and u have the same dimension m)
(r )
y 1
1(r2 )
y2
. = Lx + M u
.
.
(r )
ym m
The matrix M generalizes CAr−1 B we had earlier. If M is nonsingular, then (r1 , . . . , rm ) is called
the vector relative degree. It may not exist (the matrix M may be singular).
The concept of relative degree also extends quite easily to nonlinear systems. We’ll only deal
with nonlinear systems that are SISO and affine in controls.
Example 2
ẋ1 = x3 − x32
ẋ2 = −x2 − u
ẋ3 = x21 − x3 + u
y = x1
Differentiate y:
ẏ = x3 − x32
and this doesn’t depend on u. Differentiate again:
Since
1 + 3x22 6= 0
relative degree is r = 2.
−→ In this example, r = 2 globally in the state space. In general for nonlinear systems, relative
degree is a local concept, because the term multiplying u may vanish for some x. (For linear systems
this doesn’t happen because it is a constant.)
NONLINEAR AND ADAPTIVE CONTROL 25
ż1 = z2
ż2 = x21 − x3 + 3x32 + (1 + 3x22 )u
We want to complete a coordinate transformation and write everything in terms of z. For this we
need z3 . We can always find z3 whose differential equation doesn’t depend on u. In this example,
we can take
z3 := x2 + x3
for which
ż3 = x21 − x2 − x3 = z12 − z3
and this doesn’t depend on u.
Note that the Jacobian of the map from x to z is
1 0 0
0 −3x22 1
0 1 1
which is nonsingular, so the coordinate transformation is well defined (in fact, globally).
Can also check that this transformation preserves the origin.
The z-dynamics are of the form
ż1 = z2
ż2 = b(z1 , z2 , z3 ) + a(z1 , z2 , z3 )u
(28)
ż3 = z12 − z3
y = z1
The next concept we need is that of a minimum-phase system. A linear SISO system is called
minimum-phase if its zeros have negative real parts. These are the roots of the numerator polyno-
mial q(s) in the transfer function (27).
(The term “minimum-phase” comes from the fact that among transfer functions with the same
magnitude Bode plot, the one with stable zeros has the smallest variation of the phase Bode plot.)
26 DANIEL LIBERZON
The interpretation of this notion is that the “inverse system” is asymptotically stable. For
example, the inverse of
s+2
y= u
(s + 3)2
is
(s + 3)2
u= y
s+2
The problem is that this is not proper. But we can fix this if we work with derivatives of y instead.
For example, we can multiply both sides of the original equation by s + 1:
(s + 1)(s + 2)
ẏ + y = u
(s + 3)2
This has relative degree 0 and so we can invert it without losing properness:
(s + 3)2
u= (ẏ + y)
(s + 1)(s + 2)
Since the original system has a stable zero (at s = −2), and since we were careful not to add an
unstable zero, the inverse system has stable poles (which are the same as the zeros of the initial
system). Asymptotic stability of this inverse system implies, in particular:
y ≡ 0 ⇒ x, u → 0
where x is the internal state in any minimal (controllable and observable) state-space realization.
This last implication is a good way of thinking about minimum-phase. Suppose we chose a
control u which maintains the output to be identically zero (y ≡ 0). The resulting dynamics—
called zero dynamics—should then be asymptotically stable (x → 0, and consequently u → 0
because it is an output of the inverse system).
The minimum-phase property (as well as the relative degree) are coordinate-independent, but
they are easier to study using a normal form. Consider again the normal form (28). We want
y = z1 ≡ 0. Then we should also have ẏ = z2 ≡ 0. This in turn requires ż2 ≡ 0. Since the
differential equation for z2 is controllable (a 6= 0), we can apply the feedback control law
b(z1 , z2 , z3 )
u=−
a(z1 , z2 , z3 )
to enforce this.
The zero dynamics are the remaining dynamics of the system, constrained by the condition
y ≡ 0. I.e., zero dynamics describe the remaining freedom of motion after we fix the initial
conditions (in the present case, z1 (0) = z2 (0) = 0) and select the control as above.
Here, the zero dynamics are
ż3 = −z3
NONLINEAR AND ADAPTIVE CONTROL 27
So they are 1-dimensional and asymptotically stable. Hence, this system is minimum-phase. (For
nonlinear systems one defines minimum-phase via asymptotic stability of zero dynamics, since there
is no transfer function.)
Can check that u → 0 because b is 0 at 0.
This is an appropriate place to have a brief discussion about stabilization of nonlinear (SISO)
systems in normal form with asymptotically stable zero dynamics (minimum phase).
Consider a system in normal form
ξ˙1 = ξ2
ξ˙2 = ξ3
..
.
ξ˙r = b(ξ, η) + a(ξ, η)u
η̇ = q(ξ, η)
We assume that a(ξ, η) 6= 0 for all ξ, η at least near the origin, which means that the system has
relative degree r with respect to the output y = ξ1 .
Assume also that the system is minimum-phase, in the sense that its zero dynamics (as defined
above) are locally asymptotically stable.
What are the zero dynamics? y = ξ1 ≡ 0 ⇒ ẏ = ξ2 ≡ 0 ⇒ . . . ⇒ ξr ≡ 0. The last property
can be achieved by choice of u since a is nonzero. So, the zero dynamics are
η̇ = q(0, η) (29)
ξ˙ = Acl ξ
η̇ = q(ξ, η)
28 DANIEL LIBERZON
where
0 1 0 ··· 0
0 0 1 ··· 0
.. .. .. ..
..
Acl = . . . .
.
0 0 0 ··· 1
−k1 −k2 −k3 · · · −kr
is a Hurwitz matrix in controllable canonical form for appropriately chosen values of the gains ki .
Claim: the closed loop is asymptotically stable.
If the zero dynamics (29) have asymptotically stable linearization, i.e., if the matrix
∂q(ξ, η)
(0, 0)
∂η
is Hurwitz, then the claim easily follows from Lyapunov’s first (indirect) method. But asymptotic
stability of (29) is a weaker assumption than asymptotic stability of its linearization. However, the
claim is still true even in the critical case when the linearization test fails. One way to show this is
by Lyapunov analysis, as follows (see [Khalil, Lemma 13.1, p. 531]).
• Since (29) is asymptotically stable, there exists (by a converse Lyapunov theorem) a function
V1 (η) such that
∂V1
q(0, η) < 0 ∀ η 6= 0
∂η
P Acl + ATcl P = −I
Now consider p
V (ξ, η) := V1 (η) + k ξT P ξ
where k > 0. Its derivative along closed-loop solutions is
∂V1 k
V̇ = q(ξ, η) + p ξ T (P Acl + ATcl P )ξ
∂η T
2 ξ Pξ
∂V1 ∂V1 kξ T ξ
= q(0, η) + q(ξ, η) − q(0, η) − p
∂η ∂η 2 ξT P ξ
The first term is negative definite in η. The second term is upper-bounded, on any neighborhood
of 0, by C|ξ| where the constant C comes from a bound on | ∂V ∂η | and a Lipschitz constant for q on
1
this neighborhood. The last term is negative definite in ξ and scales linearly with k. Therefore, by
choosing k large enough we can dominate the second term and get V̇ < 0. This implies the claim.
What about global stabilization?
NONLINEAR AND ADAPTIVE CONTROL 29
From the earlier discussion about normal forms it is clear that, up to a coordinate transforma-
tion, we can represent all such plants in the form
ẏ = ay + bu + cT z
ż = Az + dy
30 DANIEL LIBERZON
where A is a Hurwitz matrix (because it gives zero dynamics), c and d are vectors, a and b 6= 0 are
scalars, and z is a vector which together with y gives the plant state: x = (y, z). The entries of
A, a, b, c, d are all unknown.
The goal is to have y, z → 0 as t → ∞ while keeping all closed-signals bounded. And we want
to achieve this with output feedback (using only y, not z).
This class of plants includes our previous scalar case, but it’s obviously much larger. Neverthe-
less, it turns out that the same controller as in Section 3.1.3: the tuner
k̇ = y 2
ẏ = (a − bN (k)k)y + cT z
ż = Az + dy
k̇ = y 2
y2
V =
2
It gives
V̇ = (a − bN (k)k)y 2 + ycT z = (a − bN (k)k)k̇ + ycT z
Integrating, we get
Z k(t) Z t
y 2 (t)
= ak(t) − b N (s)sds + y(s)cT z(s)ds +C (31)
2 0 0
| {z }
new term
ż = Az + dy (32)
In what follows, we’ll need some well-known facts about linear systems. Proofs of these facts
will either be assigned as homework or are variations on those to be assigned in homework.
NONLINEAR AND ADAPTIVE CONTROL 31
From this, the Cauchy-Schwartz inequality, and simple square completion we have
Z t sZ sZ
t t
y(s)cT z(s)ds ≤ y 2 (s)ds (cT z(s))2 ds
0 0 0
Z t Z
1 1 t T
2
≤ y (s)ds + (c z(s))2 ds
2 0 2 0
Z t
≤ c3 |z(0)|2 + c4 y 2 (s)ds = c3 |z(0)|2 + c4 (k(t) − k(0))
0
where
c1 c2 1
c3 = , c4 = +
2 2 2
Substituting this bound into (31), we get
Z k(t)
y 2 (t)
≤ (a + c4 )k(t) − b N (s)sds + C̄
2 0
Now that we were able to handle the extra term, we’re almost done. Exactly as before, we show
that k and y are bounded, and y ∈ L2 .
How do we show that ẏ is bounded? Need to know that z is bounded. (Or at least that cT z is
bounded. So the next fact can be stated for the output instead of the state.)
Fact 2: Given an exponentially stable linear system, if the input is bounded, then the state is
bounded.
Applying this to the system (32) and using boundedness of y, we have that z is bounded.
y ∈ L2 , ẏ is bounded =⇒ y → 0 (Barbalat).
Fact 3: Given an exponentially stable linear system, if the input is in L2 or converges to 0, then
the state converges to 0.
From this we conclude that z → 0. Thus the regulation objective is fulfilled.
Designing universal regulators for systems with relative degree 2 or higher is more challenging.
In Section 5 we will study backstepping, which a tool for overcoming the relative degree obstacle—
but we won’t be working in the context of universal regulators any more.
Additional reading on universal regulators: A. Ilchmann, “Non-identifier based adaptive control
of dynamical systems: A survey,” IMA J. Math. Contr. Inform., vol. 8, pp. 321–366, 1991.
32 DANIEL LIBERZON
4 Lyapunov-based design
In Example 1, we started with the system
ẋ = θx + u
x2 (θ̂ − θ)2
V (x, θ̂) = + (35)
2 2
and showed that its derivative satisfies
from which, by Theorem 2, we have x(t) → 0. The choice of V was, again, not systematic and
involved trial and error.
If we’re going to base stability analysis on a (weak) Lyapunov function, then an alternative
approach is to start by picking V as in (35), and then design the control law and tuning law so
as to get (36). The step of choosing V still involves guessing, but then V provides the basis for
controller design and we get stability by construction.
˙
For the above V , but keeping u and θ̂ unspecified for now, we get
˙ ˙ ˙
V̇ = x(θx + u) + (θ̂ − θ)θ̂ = xu + θ̂θ̂ + θ(x2 − θ̂)
The last term is not going to give us anything useful, because θ is the unknown parameter and we
don’t know its sign. So, it makes sense to cancel it. This immediately suggests the tuning law (34).
We now have
V̇ = xu + θ̂x2
and so, if we want (36), we need to pick u such that
xu + θ̂x2 = −x2
NONLINEAR AND ADAPTIVE CONTROL 33
It is clear that the control law (33) is, in fact, a unique control law that gives this.
So, Lyapunov-based design allows us to reconstruct our original choices in a more methodical
way. Also note that it gives us more flexibility. For example, we see that any u for which
xu + θ̂x2 ≤ −x2
would give us the same conclusion (since strict equality is not required in Theorem 2). So, for
example, any control of the form
works just as well. This may be somewhat obvious for this example, but in more complicated
situations Lyapunov-based design can make it easier to see which controllers are stabilizing.
u = k(x)
34 DANIEL LIBERZON
which makes the closed-loop system globally asymptotically stable. And we want our V to be a
Lyapunov function for the closed loop, i.e., we want
∂V
· f (x, k(x)) < 0 ∀ x 6= 0
∂x
In fact, for the closed-loop system to be well-posed, we probably need more regularity from the
feedback law—at least local Lipschitzness—but continuity is the absolute minimum we should ask
for.2 It is often also required that k(0) = 0, to preserve the equilibrium at the origin (assuming
also that f (0, 0) = 0).
Does the existence of a CLF imply the existence of such a stabilizing feedback law?
One is tempted to say yes. However, being able to find a value of u for each x does not
automatically imply being able to glue them together into a continuous function k(x). This is
known as the continuous selection problem: from a given collection of sets parameterized by x (in
our case, the sets of “good” values of u) select a continuous function of x.
Counterexample: [Sontag and Sussmann, 1980]
ẋ = x (u − 1)2 − (x − 1) (u + 1)2 + (x − 2)
Let
x2
V (x) :=
2
then
V̇ = x2 (u − 1)2 − (x − 1) (u + 1)2 + (x − 2)
For this to be negative, one (and only one) of the expressions in square brackets must be negative.
It is easy to see that the points in the (x, u) plane where this happens are given by the interiors of
the two parabolas in the picture.
1
1 2
x
−1
2
Continuity of the right-hand side of an ODE is enough for existence of solutions, but not enough for
their uniqueness (see [Khalil, p. 88] for a reference).
NONLINEAR AND ADAPTIVE CONTROL 35
The projection of the union of the two parabolas onto the x-axis covers the whole axis. This
means that V is a CLF (directly from the definition). On the other hand, the parabolas do not
intersect, which means that no continuous feedback law exists that makes V̇ negative. Any such
feedback law would have to pass somehow from one parabola to the other, see the dashed curve in
the picture.
−→ There are several possibilities for overcoming this difficulty. One is to consider discontinuous
feedback laws (or time-varying ones, or switching ones). Another is to look for classes of systems
for which continuous feedback laws can be constructed.
For now, let’s stick with continuous feedback laws. For systems affine in controls, all is well.
Not only do continuous stabilizers exist [Artstein, 1983], but they can be generated from a CLF
by an explicit formula [Sontag, 1989]. However, it is clear from the preceding discussion that it
will take some effort to establish this result (it is far from trivial, and the affine structure must
somehow play a role).
−→ Problem Set 2 is assigned.
Here f and gi are n-vectors, and G is a matrix whose columns are the gi ’s. We assume that
f (0) = 0.
The definition of CLF in this case becomes
( m
)
∂V X ∂V
inf · f (x) + · gi (x)ui < 0 ∀ x 6= 0
u ∂x ∂x
i=1
It is easy to see that this is equivalent to the condition that for all x 6= 0,
∂V ∂V
· gi (x) = 0 ∀ i =⇒ · f (x) < 0
∂x ∂x
Indeed, since the controls are unbounded, we can always pick u to get
X ∂Vm
∂V
· f (x) + · gi (x)ui < 0 (37)
∂x ∂x
i=1
except for those x at which the terms with u are all 0, where we lose control authority and need
the first term to be negative by itself.
36 DANIEL LIBERZON
for all x 6= 0.
Consider the feedback law
p
a + a2 + |b|4
− b, b 6= 0
k(x) = K(a(x), b(x)) := |b|2 (39)
0, b=0
∂V
Xm Xm
· f (x) + gi (x)ui = a(x) + bi (x)ui
∂x
i=1 i=1
m p p m
X a + a2 + |b|4 a + a2 + |b|4 X 2
=a− bi bi = a − bi
|b|2 |b|2
i=1 i=1
p
= − a2 + |b|4 < 0 ∀ x 6= 0
where the very last inequality follows from (38). The claim follows from Theorem 1.
The reason to put |b|4 and not, e.g., |b|2 inside the square root is to ensure the above smoothness
property of the control law as well as its continuity at 0 under an additional hypothesis (as discussed
below).
Note that formally, to talk about global asymptotic stability of the zero equilibrium we need to
make sure that x = 0 is indeed an equilibrium. This is why we need f (0) = 0 (since k(0) = 0 by
construction).
As for x = 0, the feedback law (39) is automatically continuous there if V has the property
that for small x, the values of u that give (37) can also be chosen small (small control property).
This is not always possible. For example, the scalar system
ẋ = x + x2 u
NONLINEAR AND ADAPTIVE CONTROL 37
cannot be stabilized with a feedback that is continuous at 0, because x2 is small near 0 compared
to x so u needs to be large there, and u needs to be negative for x > 0 and positive for x < 0. The
function V (x) = x2 /2 is a CLF for this system, but it doesn’t have the small control property. If
x and x2 were flipped, then all would be well.
Anyway, continuity of k(x) at 0 is not so crucial: if it is continuous away from 0, then the
closed-loop system is well-posed for x(0) 6= 0 and all solutions go to 0, which is what we want.
The formula (39) is known as Sontag’s formula, or universal formula. Similar formulas exist
for control spaces different from Rm (e.g., bounded controls).
Example 3
ẋ = −x3 + u, x, u ∈ R
Even without using Sontag’s formula, there are several rather obvious stabilizing feedback laws we
can apply. For example:
u = x3 − x =⇒ ẋ = −x
This an example of feedback linearization design: cancel nonlinearities and get a stable linear
closed-loop system.
However, the previous feedback law requires very large control effort for large x, while −x3 is
actually a “friendly” nonlinearity which we don’t need to cancel. Indeed, consider
u = −x =⇒ ẋ = −x3 − x
u=0 =⇒ ẋ = −x3
so we get √ √
−x4 + x8 + x4 −x4 + x2 x4 + 1 p
u=− x=− = x3 − x x4 + 1
x2 x
38 DANIEL LIBERZON
(initially we should have defined u separately to be 0 when x = 0 but the final formula captures
this). The closed-loop system is p
ẋ = −x x4 + 1
This control law has a slightly more complicated expression than the previous ones, but it has
the following interesting properties. First, we have u → 0 as |x| → ∞, i.e., for large x we do
nothing and let the −x3 term do all the work. On the other hand, for small x we have ẋ ≈ −x,
which ensures nice convergence to 0. So, this control is a good compromise between the previous
designs!
ẋ = θx + u
˙
θ̂ = x2
x2 (θ̂ − θ)2
V (x, θ̂) = +
2 2
Is this a CLF? Differentiate it:
a = θ̂x2 , b=x
We have to be careful: the answer is no! This is because the state of the system is (x, θ̂), so for
V to be a CLF we would need the implication
to hold for all (x, θ̂) 6= (0, 0). But it only holds when x 6= 0. In other words, the subset where it
doesn’t hold is not just the origin, but the whole line {x = 0}. This is not surprising because we
know it is the line of equilibria.
But since our objective is to stabilize the system only to this line, and not to the origin, the
universal formula will still hopefully solve our problem. Actually, the above V is a “weak CLF” in
the sense that
b(x, θ̂) = 0 =⇒ a(x, θ̂) ≤ 0
NONLINEAR AND ADAPTIVE CONTROL 39
everywhere (note the nonstrict inequality). And since b = x, we do have control authority to make
V̇ negative where it matters, i.e., for x 6= 0.
The universal formula gives
p
θ̂x2 + θ̂2 x4 + x4
k(x) = − x2
x, x 6= 0
0, x=0
which simplifies to p
θ̂x2 + x2 θ̂2 + 1 p
k(x) = − = − θ̂ + θ̂2 + 1 x
x
Plugging this into the expression (40) for V̇ , we get
p
V̇ = − θ̂2 + 1 x2
x2
V =
2
which gives
V̇ = θx2 + xu
and we can’t do anything unless we know θ.
We see that the CLF concept does provide useful insight into the adaptive example, but it needs
to be tweaked. First, we only have a weak CLF. Second, we have to make sure that V̇ does not
involve θ. This kind of situation is typical in adaptive control.
We’ll see Lyapunov-based design again in the next two sections—backstepping and parameter
estimation. The universal formula itself is not used very often in the actual control design, because
usually simpler control laws can be found on a case by case basis (as in the above examples). But
it is nice to know that a universal formula exists.
5 Backstepping
Reference: [KKK book, Chapters 2 and 3]
We now discuss one of very few available tools for systematically generating Lyapunov functions
for certain classes of systems. This is also a nonlinear design technique, and it’ll allow us to venture
into adaptive control of nonlinear plants. Up to now, the only nonlinear control design tool we’ve
40 DANIEL LIBERZON
discussed is Lyapunov-based design (and Sontag’s universal formula), and backstepping is in a way
a continuation of that. Plants in all adaptive control scenarios have been linear so far.
In the non-adaptive context, the idea of backstepping first appeared in the Russian literature:
paper by Meilakhs, 1978. Independently and around the same time, it was investigated in the
adaptive control context (MRAC problem) by Feuer and Morse, and then by Morse and the authors
of the KKK book (who subsequently coined the term “backstepping” as a general technique for not
necessarily adaptive control).
and assume that we have a CLF V0 (x) and a stabilizing control law
u = k0 (x), k0 (0) = 0
for which
∂V0 ∂V0
· f (x) + · G(x)k0 (x) ≤ −W (x) < 0 ∀ x 6= 0
∂x ∂x
ẋ = f (x) + G(x)ξ
ξ˙ = u
1
V1 (x, ξ) := V0 (x) + |ξ − k0 (x)|2 (42)
2
NONLINEAR AND ADAPTIVE CONTROL 41
∂k0
Its derivative along the (x, ξ)-system is (k0′ stands for the Jacobian matrix ∂x )
∂V0 ∂V0
V̇1 = ·f + · Gξ + (ξ − k0 )T (u − k0′ f − k0′ Gξ)
∂x ∂x
∂V0 ∂V0 ∂V0
= ·f + · Gk0 + · G(ξ − k0 ) + (ξ − k0 )T (u − k0′ f − k0′ Gξ)
∂x ∂x ∂x
∂V0 ∂V0 ∂V0
= ·f + · Gk0 +(ξ − k0 )T u − k0′ f − k0′ Gξ + GT
|∂x {z∂x } ∂x
“old” V̇0 , for (41)
∂V0
≤ −W (x) + (ξ − k0 )T u − k0′ f − k0′ Gξ + GT
∂x
Claim: V1 is a CLF. (See the characterization (38) of CLF for affine systems.)
Indeed, the term multiplying u is (ξ − k0 (x))T . Suppose that it is 0. Then what remains
is ≤ −W (x), hence it is negative unless x = 0 (since W is positive definite). If x = 0, then
k0 (x) = k0 (0) = 0 and so to have ξ − k0 (x) = 0 we must have ξ = 0. We showed that away from
(x, ξ) = (0, 0), we can make V̇1 < 0 by a proper choice of u, which proves the claim.
It is also not hard to see how a stabilizing feedback law can be designed. For example, we can
simply cancel the terms in large parentheses and add a negative square to V̇1 :
∂V0
u = k1 (x, ξ) := −(ξ − k0 ) + k0′ f + k0′ Gξ − GT
∂x
gives
V̇1 ≤ −W (x) − |ξ − k0 |2 < 0 ∀ (x, ξ) 6= (0, 0)
where the last inequality is proved as in the proof of the previous Claim.
Note that since k1 involves the derivative of the nominal control law k0 , we lose one degree of
smoothness in the control law.
−→ The key idea of backstepping is not the actual formula for the control law, but the procedure
of constructing the augmented Lyapunov function V1 as in (42). We usually have some flexibility in
the choice of the control law, which is common in Lyapunov-based design as we already discussed
before. The next example illustrates the procedure and this last point.
ẋ = −x3 + ξ
(43)
ξ˙ = u
Rather than just applying the general formula derived above, let’s follow the procedure to see better
how it works. We first need a CLF and a stabilizing control law u = k0 (x) for the scalar system
ẋ = −x3 + u
42 DANIEL LIBERZON
We already considered this system in Example 3 (page 37), where we had the CLF
x2
V0 (x) =
2
In fact, when the x-system is scalar, a CLF (if one exists) can always be this one. One choice of
the control law was
k0 (x) = −x
which gives
V̇0 = −x4 − x2
Next, consider
1 x2 1
V1 (x, ξ) = V0 (x) + (ξ − k0 (x))2 = + (ξ + x)2
2 2 2
We compute its derivative along the (x, ξ)-system:
to get
V̇1 = −x4 − x2 − (ξ + x)2 < 0 ∀ (x, ξ) 6= (0, 0)
This control law perhaps makes a little bit more sense because it is close to 0 when ξ ≈ −x, i.e.,
when the behavior of ξ is consistent with k0 (x).
NONLINEAR AND ADAPTIVE CONTROL 43
ẋ = f (x) + G(x)ξ1
ξ˙1 = ξ2
..
.
ξ˙k−1 = ξk
ξ˙k = u
We assume that a CLF V0 for the x-system is given (if x is scalar or has a low dimension, we can
hope to find one easily by hand.) Then, we generate a sequence of CLFs V1 , . . . , Vk , and the last
one is a CLF for the entire system. As before, we lose one degree of smoothness of the feedback
law at each step, so we need to make sure that k0 is at least C k .
It is useful to compare backstepping with feedback linearization on the above example. The
system (43) is feedback linearizable. To better explain what we mean by feedback linearization
(which was only briefly mentioned in an earlier example illustrating Sontag’s formula), define
z := −x3 + ξ
ẋ = z
ż = −x − z
which is linear and asymptotically stable. (We can check that the Jacobian of the map from (x, ξ)
to (x, z) is nonsingular and hence the coordinate transformation is well defined.)
This particular feedback linearizing controller involves terms of higher degrees than the back-
stepping one, because it cancels all nonlinearities while the backstepping controller tries to preserve
“friendly” nonlinearities. On the other hand, backstepping requires more structure from the sys-
tem: all “virtual controls” ξi must enter affinely on the right-hand sides. Feedback linearization
doesn’t require this; for example, we can apply it to
ẋ = f (x, ξ)
ξ˙ = u
44 DANIEL LIBERZON
∂f
as long as ∂ξ 6= 0. For backstepping, the first equation must be of the form
ẋ = f (x) + g(x)ξ
Note, however, that we don’t need to assume g(x) 6= 0 to use the Lyapunov-based design.
In other words, the two techniques—backstepping and feedback linearization—are complemen-
tary as they apply to different classes of systems (although in both cases, relative degree must equal
state space dimension).
Recall that in Section 3.2.1 we considered the system in normal form
ξ˙1 = ξ2
ξ˙2 = ξ3
.. (44)
.
ξ˙r = b(ξ, η) + a(ξ, η)u
η̇ = q(ξ, η)
with a(ξ, η) 6= 0 for all ξ, η. We saw there that it cannot always be globally asymptotically stabilized
by the partially linearizing high-gain feedback of the form (30), even if the zero dynamics η̇ = q(0, η)
are globally asymptotically stable. If we assume, additionally, that the η-dynamics are affine in ξ1
and don’t depend on ξ2 , . . . , ξr :
η̇ = f (η) + G(η)ξ1
then backstepping provides a method to design a globally stabilizing feedback. Start by noting
that the system
η̇ = f (η) + G(η)u
is, by the minimum-phase assumption, stabilized by u ≡ 0. Then add an integrator:
η̇ = f (η) + G(η)ξ1
ξ˙1 = u
and use backstepping to find a stabilizing u. Proceeding in this way, we eventually obtain a
stabilizing feedback for (44). To handle the ξr -equation we need to go beyond pure integrator
backstepping, this is covered in HW. Unlike the feedback (30) that we tried earlier, the feedback
constructed by backstepping doesn’t have linear gains k1 , k2 , . . . and is purely nonlinear in general.
ẋ = θx + u
NONLINEAR AND ADAPTIVE CONTROL 45
the controller
u = −(θ̂ + 1)x =: k0 (x, θ̂)
the tuning law
˙
θ̂ = x2 =: τ0 (x)
and the Lyapunov function
1 2
x + (θ̂ − θ)2
V0 (x, θ̂) :=
2
For convenience let us introduce the parameter error
θ̃ := θ̂ − θ
then we get
1 2
V0 (x, θ̂) := x + θ̃2
2
and
∂V0 ∂V0
V̇0 = (θx + k0 ) + τ0 (x) = x(θx − (θ̂ + 1)x) + θ̃x2 = −x2
∂x ∂ θ̂
Let us now add an integrator:
ẋ = θx + ξ
ξ˙ = u
One complication is that the above “virtual” control law is dynamic. However, we can still apply
the same idea and consider the augmented candidate CLF
1 2 1
V1 (x, θ̂, ξ) := x + θ̃2 + (ξ − k0 (x, θ̂))2 = x2 + θ̃2 + (ξ + (θ̂ + 1)x)2
2 2
˙
Let’s write its derivative, keeping θ̂ open for now since we’re not yet sure if the same tuning law
will work.
˙ ˙
V̇1 = x(θx + ξ) + θ̃θ̂ + (ξ + (θ̂ + 1)x) u + θ̂x + (θ̂ + 1)(θx + ξ)
˙ ˙
= x(θx − (θ̂ + 1)x) + θ̃θ̂ +(ξ + (θ̂ + 1)x) u + x + θ̂x + (θ̂ + 1)(θx + ξ)
| {z }
˙
=−x2 +θ̃(θ̂−x2 )
˙
Difficulty: if we define θ̂ = x2 as before, then, to get V̇1 < 0, we need to define u to cancel the
terms in the last parentheses and add a damping term −(ξ + (θ̂ + 1)x). This is what we did in the
non-adaptive case. But the terms in the parentheses depend on the unknown parameter θ!
Solution: Replace θ by θ̂ in the last parentheses, and the carry the difference—which depends
˙
on θ̃—outside the parentheses and combine it with the θ̃θ̂ term.
˙ ˙
V̇1 = −x2 + θ̃ θ̂ − x2 − (ξ + (θ̂ + 1)x)(θ̂ + 1)x + (ξ + (θ̂ + 1)x) u + x + θ̂x + (θ̂ + 1)(θ̂x + ξ)
46 DANIEL LIBERZON
Now the choice of the tuning law and the control law is clear: first set
˙
θ̂ = x2 + (ξ + (θ̂ + 1)x)(θ̂ + 1)x
which cancels the θ̃ term. Note that this takes the form
˙
θ̂ = τ0 + τ1 =: τ
where τ0 = x2 is what we had for the scalar plant and τ1 is a new term. Next, set
This gives
V̇1 = −x2 − (ξ + (θ̂ + 1)x)2 =: −W1 (x, θ̂, ξ) < 0 when (x, ξ) 6= (0, 0)
and we are in good shape because this implies that x, θ̂, ξ are bounded and x, ξ → 0 (Theorem 2).
−→ Note that the above controller is not really a certainty equivalence controller any more: it
incorporates explicitly a correction term (τ x) coming from the tuning law. (Compare this with the
discussion on page 9.)
We see—and this is not surprising—that adaptive backstepping proceeds along the same lines
as non-adaptive backstepping but it is more challenging.
The next step would be to try to stabilize
ẋ = θx + ξ1
ξ˙1 = ξ2
ξ˙2 = u
The functions τ0 , τ1 , τ2 , . . . are called tuning functions. For a general procedure of designing
them, see [KKK book, Chapter 4].
6 Parameter estimation
There are no nonlinear system theory concepts here, but instead some ideas from optimization
algorithms and the important concept of a persistently exciting signal.
General points:
• Unknown parameters are often difficult to estimate off-line, hence the need for on-line es-
timation. Examples: camera calibration; calibration of mechanical system models (such as
helicopter in flight).
• We will see that convergence of parameter estimates to their true values requires some special
properties of the input signal u.
• Sometimes parameter convergence is not crucial, and u is chosen to meet behavior specs of the
system. (For example, to fly a helicopter, it is not necessary to know all its model parameters
exactly.) To analyze stability of the resulting adaptive control system, other properties of
the parameter estimation scheme (in particular, slow adaptation speed) will be important.
where
Γ = ΓT > 0
is a scaling matrix. This is exactly what we get from the original (unscaled) gradient method by
changing coordinates in Rn according to
hence we have
∇f (x) → 0
along any bounded solution (cf. Theorem 2). For (47), the same V gives
V (x) := |x − x∗ |2
which can be shown to decrease along solutions of (46). (Reason: if f (x∗ ) < f (x) and f is convex,
then f decreases as we start moving along the line from x to x∗ . Hence, the inner product with
the gradient is negative.) To handle the scaled gradient law (47), we need to consider instead V (x) := (x − x∗ )T Γ−1 (x − x∗ ).
Example 5 Consider the static parametric model
y(t) = θu(t)
where θ ∈ R is the unknown parameter and u(·), y(·) are known continuous signals. For now we
assume that u is bounded (hence so is y). We’ll sometimes write this as u ∈ L∞ , y ∈ L∞ , etc.
Problem: estimate θ.
A naive approach is just to use
θ = y/u
However, this has problems. First, it’s ill-defined if u = 0. Second, this is very sensitive to noise.
But most importantly, we want a method that will work for the dynamic case (when u and y are
the input and output of a dynamical system containing uncertain parameter θ).
So, instead we want to design a dynamical system which will generate a time-varying estimate
θ̂(t) of θ. This will lead also to
ŷ(t) := θ̂(t)u(t)
Note: ŷ is not really an estimate of y, since we can measure y directly. But ŷ will provide us
feedback on the quality of the estimate θ̂. To this end, we define the output estimation (or output
prediction) error
e(t) := ŷ(t) − y(t) = (θ̂(t) − θ)u(t)
(in Ioannou-Sun, e is defined with the opposite sign). Based on this error e, we define the cost
function
J(θ̂(t), t) := e2 (t)/2 = (θ̂ − θ)2 u2 /2
For simplicity, we will omit the argument t and just write J(θ̂).
Idea: update θ̂ so as to minimize J(θ̂).
The motivation is that J ≥ 0 always and J = 0 when θ̂ = θ. Note, however, that it's possible to have J = 0 when θ̂ ≠ θ if u = 0. To avoid this, we make the (temporary) assumption that
u2 ≥ c > 0 (51)
With the gradient adaptive law θ̂˙ = −γ∇J(θ̂) = −γeu, where γ > 0, and the Lyapunov function V (θ̃) := θ̃2 /2, we get
V̇ = −γe2
We can now use the same trick as in the proof of Theorem 2: integrate this and reduce to Barbalat’s
lemma.
V (t) − V (0) = −γ ∫0t e2 (s)ds  =⇒  γ ∫0t e2 (s)ds = V (0) − V (t) ≤ V (0)  ∀t
which implies that ∫0∞ e2 is finite, i.e., e ∈ L2 .
To apply Barbalat, we need to show that e, ė are bounded.
Since V is nonincreasing, from the definition (53) of V it is clear that θ̂ is bounded.
Thus e = (θ̂ − θ)u is also bounded.
In view of (52), this shows that θ̂˙ is bounded, and belongs to L2 as well4 .
We then have that ė = θ̂˙u + (θ̂ − θ)u̇ is also bounded.
4
Such “slow adaptation” properties will be useful later when we discuss stability of slowly time-varying
systems.
By Barbalat's lemma, e → 0, i.e., ŷ → y, and θ̂˙ → 0.
Note: we didn’t use Theorem 2 directly because of the presence of the external input u, but we
essentially repeated the steps of its proof.
What does the above analysis imply about convergence of θ̂?
Since V is nonincreasing and bounded from below by 0, it has a limit as t → ∞. Hence, θ̂ must converge to a constant. (Note that this doesn't follow just from the fact that θ̂˙ → 0.)
Can we conclude from the above analysis that θ̂ → θ?
Not necessarily. Define the parameter estimation error
θ̃ := θ̂ − θ
It satisfies the DE
θ̃˙ = θ̂˙ = −γeu = −γu2 θ̃
which we can solve to get
θ̃(t) = exp(−γ ∫0t u2 (s)ds) θ̃(0)
for all t ≥ t0 ≥ 0. We say that u is persistently exciting (PE) if for some α0 , T0 > 0 we have
∫_t^{t+T0} u2 (s)ds ≥ α0 T0   ∀t        (54)
This is an important concept (we’ll extend it later to vector signals). Constant signals, or more
generally signals satisfying (51) are PE, while L2 signals are not. The constant α0 is called the
level of excitation.
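Since θ̃(t) = exp(−γ ∫0t u2 ) θ̃(0), whether the estimate converges is decided entirely by whether ∫ u2 diverges, and PE guarantees a uniform exponential rate. A small illustrative sketch (Python; the particular signals and numbers are my own choices):

import numpy as np

gamma = 1.0
t = np.linspace(0.0, 20.0, 2001)

def decay(u):
    # |theta_tilde(t)| / |theta_tilde(0)| for the gradient law, via the explicit solution
    integral = np.concatenate(([0.0], np.cumsum(0.5*(u[1:]**2 + u[:-1]**2)*np.diff(t))))
    return np.exp(-gamma*integral)

print(decay(np.sin(t))[-1])    # PE input: decays exponentially (about e^{-10} here)
print(decay(np.exp(-t))[-1])   # L2 input: the integral of u^2 stays below 1/2, no convergence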
Lemma 4 For the above gradient law, θ̃ is UEC if and only if u is PE.
Proof—next HW.
Example 6 Consider the scalar plant
ẋ = −ax + bu
where we assume that a > 0 and u is bounded (hence x is also bounded). We want to estimate the
unknown parameters a and b.
Estimator5 :
x̂˙ = −am (x̂ − x) − âx + b̂u
where am > 0 is a design constant which determines the damping rate of the estimator. To see
this, define the estimation errors
e := x̂ − x, ã := â − a, b̃ := b̂ − b
(Again, note that calling e an “estimation error” is not really accurate because both x̂ and x are
measured signals; sometimes the term “state prediction error” is used instead.) Then e satisfies
the DE
ė = −am e − ãx + b̃u
(this DE is not actually implemented, but only used for analysis purposes).
Observe that the above equation is stable with respect to x, u and has the autonomous contrac-
tion rate am . In particular, if ã, b̃ are 0 or converge to 0, then e → 0 (recall that x, u are bounded).
However, the converse is not true: e → 0 does not necessarily imply ã, b̃ → 0 unless the signals
x, x̂, u are PE in some sense (we will see this later).
Update laws for the estimates â, b̂ will be driven by e, and will take the form
â˙ = f1 (e, x̂, x, u),   b̂˙ = f2 (e, x̂, x, u)
Goal: make ã, b̃, e → 0 along solutions of the resulting 3-D system.
Lyapunov-based design: Consider the candidate Lyapunov function
V (e, ã, b̃) := (1/2)(e2 + ã2 + b̃2 )        (55)
(has all desired properties: C 1 , positive definite, radially unbounded).
V̇ = eė + ãâ˙ + b̃b̂˙ = e(−am e − ãx + b̃u) + ãf1 + b̃f2 = −am e2 + ã(f1 − xe) + b̃(f2 + ue)
The last two terms are not helpful, and we want to cancel them; the choice f1 := xe, f2 := −ue does exactly this and gives V̇ = −am e2 ≤ 0.
−→ These examples are just to build intuition. We’ll make things more rigorous and more general
a little later.
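As a sanity check on this estimator, here is a rough simulation sketch (Python; it uses the cancellation choice f1 = ex, f2 = −eu from the V̇ computation above, and my own illustrative numbers; the sinusoidal input is chosen so that the estimates actually have a chance to converge):

import numpy as np

a, b, am = 1.0, 3.0, 2.0                 # true a, b are unknown to the estimator
dt, T = 1e-3, 300.0
x = xhat = ahat = bhat = 0.0
for k in range(int(T/dt)):
    u = np.sin(0.5*k*dt)                 # bounded input, rich enough for this example
    e = xhat - x
    dx, dxhat = -a*x + b*u, -am*e - ahat*x + bhat*u
    dahat, dbhat = e*x, -e*u             # f1, f2: cancel the unwanted terms in V-dot
    x, xhat = x + dt*dx, xhat + dt*dxhat
    ahat, bhat = ahat + dt*dahat, bhat + dt*dbhat
print(ahat, bhat)                        # should approach a = 1, b = 3 (this needs PE, see later)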
We now consider the general vector case, but still assuming stability. Namely, the plant is
ẋ = Ax + Bu
with A, B unknown (the plant is assumed stable, so A is Hurwitz) and x, u measured. The state estimator is
x̂˙ = Am (x̂ − x) + Âx + B̂u
where Am is a Hurwitz matrix (chosen by the designer) and Â, B̂ are estimates of A, B (to be generated). Defining as before
e := x̂ − x, Ã := Â − A, B̃ := B̂ − B
we have
ė = Am e + Ãx + B̃u
Update laws:
Â˙ = F1 (e, x̂, x, u),   B̂˙ = F2 (e, x̂, x, u)
Candidate Lyapunov function:
V (e, Ã, B̃) := eT P e + tr(ÃT Ã) + tr(B̃ T B̃)
where P = P T > 0 satisfies the Lyapunov equation
ATm P + P Am = −I
and tr stands for trace of a matrix. Note that tr(ÃT Ã) is nothing but the sum of squares of all
elements of Ã.
−→ In class we skipped the calculation that follows.
We have
V̇ = eT (ATm P + P Am )e + 2xT ÃT P e + 2uT B̃ T P e + 2tr(ÃT F1 ) + 2tr(B̃ T F2 ) = −eT e + 2xT ÃT P e + 2uT B̃ T P e + 2tr(ÃT F1 ) + 2tr(B̃ T F2 )
Useful property of trace: for two vectors k, l ∈ Rn we have tr(klT ) = k T l = lT k (quick exercise:
prove this). Hence
xT ÃT P e = tr(ÃT P exT )
(apply the identity with lT = xT and k = ÃT P e)
and similarly
uT B̃ T P e = tr(B̃ T P euT )
So we get
V̇ = −eT e + 2tr(ÃT P exT + ÃT F1 ) + 2tr(B̃ T P euT + B̃ T F2 )
This makes the choice of F1 , F2 obvious: F1 := −P exT , F2 := −P euT , which gives V̇ = −eT e ≤ 0.
Normalizing the static example (ū := u/m, ȳ := y/m, with m chosen so that ū ∈ L∞ ), we have
ȳ = θū,   ȳ̂ := θ̂ū
and the gradient of the corresponding cost is
∇J(θ̂) = (θ̂ − θ)u2 /m2 = eu/m2
Let us define
en := e/m2
as this quantity will appear frequently in calculations below. It is called the normalized output
estimation error. (In [Ioannou-Sun] it is denoted by ǫ.)
Note: at this point we don’t yet really need en , could also write everything in terms of ē, ū
(which would look closer to the unnormalized case; exercise: do this) but we’ll be relying on this
notation later.
The gradient law is
θ̂˙ = −γen u
As before, we define the parameter error
θ̃ := θ̂ − θ
to have
θ̃˙ = θ̂˙ = −γen u = −γ(θ̂ − θ)u2 /m2 = −γ ū2 θ̃
Let’s try the candidate Lyapunov function
V (θ̃) := θ̃2 /2
We have
V̇ = θ̃θ̃˙ = −γ ū2 θ̃2 = −γ e2 /m2 = −γe2n m2 ≤ 0
From this we know how to quickly deduce the following:
θ̃ ∈ L∞ and en m ∈ L2 . Since en m = ūθ̃, it is also in L∞ .
θ̂˙ = −γen mū ∈ L2 ∩ L∞ . I.e., the speed of adaptation is bounded in the L2 and L∞ sense.
d/dt(en m) = d/dt(θ̃ū) = θ̃˙ū + θ̃ū˙. By Barbalat, en m → 0 under the additional assumption that ū˙ ∈ L∞ (which we also needed in the stable static case, but not in the stable dynamic case). This would in turn imply θ̃˙ → 0.
We see that basically, normalization lets us recover the main properties of the unnormalized
scheme, namely:
• bounded parameter estimates (or, what is the same, bounded parameter estimation errors)
• slow adaptation (θ̂˙ ∈ L2 ∩ L∞ )
even though the input is no longer bounded. (But we no longer have convergence of the output estimation error to 0.)
If m = 1 (no normalization), then the first item still holds (V̇ is still ≤ 0) but the second item
doesn’t. And we will see later that slow adaptation is important for stability of adaptive control
(when estimation is combined with certainty equivalence control).
Of course, without additional assumptions we cannot guarantee that θ̃ converges to 0 (or to
anything else).
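A minimal sketch of the normalized scheme for the static example (Python; the specific normalization m2 = 1 + u2 and the growing input are my own illustrative choices; any m with u/m ∈ L∞ would do). The input grows without bound, yet θ̂ stays bounded and the adaptation speed stays tame:

import numpy as np

theta, gamma = 2.0, 1.0
theta_hat, dt, T = 0.0, 1e-4, 20.0
for k in range(int(T/dt)):
    t = k*dt
    u = 1.0 + t**2                     # unbounded input
    y = theta*u
    m2 = 1.0 + u**2                    # normalization (assumed choice)
    e_n = (theta_hat*u - y)/m2         # normalized output estimation error
    theta_hat += dt*(-gamma*e_n*u)     # normalized gradient law
print(theta_hat)                       # bounded; here u-bar is even PE, so theta_hat approaches 2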
In Example 6, we had
ẋ = −ax + bu
but we no longer want to assume a > 0 or u ∈ L∞ . Could try to use normalization, but it’s not as
straightforward: if we define
ū := u/m,   x̄ := x/m
where m is chosen so that ū ∈ L∞ , then we don’t have any simple relationship between ū and x̄.
We certainly don’t have
x̄˙ = −ax̄ + bū (WRONG)
because m is time-varying (it cannot be constant unless u ∈ L∞ in the first place) and so ṁ will appear in x̄˙.
In [Ioannou-Sun, Sect. 4.3.2] an approach to this problem is proposed which relies on working
with a different state estimation error produced by a special system. Rather than trying to fix this
example, we now discuss how the general case can be made to resemble the static case of Example 5,
so that the previous design can be extended to it.
−→ Later we will see another approach, which does rely on estimator design (see indirect MRAC).
We will return to it again in switching adaptive control. What will happen is that estimator design
will be combined with control design, and stabilizing properties of the controller will let us show
boundedness of signals, something we cannot do here for arbitrary control inputs.
ẋ = Ax + bu
y = cT x
where x ∈ Rn , the measured signals are u, y ∈ R, A is an unknown matrix, and b, c are unknown
vectors. It is more convenient for us here to represent it in the input/output form
y (n) + an−1 y (n−1) + · · · + a1 ẏ + a0 y = bn−1 u(n−1) + · · · + b1 u̇ + b0 u
(ignoring initial conditions). The numbers ai , bi are the coefficients of the denominator and the numerator of the plant transfer function, respectively:
numerator of the plant transfer function, respectively:
y(s) = [(bn−1 sn−1 + · · · + b1 s + b0 )/(sn + an−1 sn−1 + · · · + a1 s + a0 )] u(s) = cT (Is − A)−1 b u(s)
We can define
θ := (bn−1 , . . . , b0 , an−1 , . . . , a0 )T ,   Y := (u(n−1) , . . . , u, −y (n−1) , . . . , −y)T
so that
y (n) = θT Y        (57)
This already looks familiar, but the problem is that y (n) and the vector Y cannot be obtained
without differentiation. To fix this, filter both sides of (57) with an n-th order stable filter
1/Λ(s) = 1/(sn + λn−1 sn−1 + · · · + λ0 )
Defining the filtered signals
z := [sn /Λ(s)] y,   φ := [1/Λ(s)] Y        (58)
we obtain
z = θT φ        (59)
which is the parameterization we were looking for. It is called a linear parametric model because
the unknown parameters enter it linearly (this is unfortunately crucial for most adaptive laws to
work6 ).
−→ The number of parameters in the vector θ is at most 2n, whereas the number of original parameters (entries of A, b, c) is n2 + 2n. This difference is due to the fact that we are only
measuring the input/output data, not the state. Different plants with the same transfer function
are not distinguished by the parametric model (59).
−→ The initial condition x0 is ignored by (59). However, since Λ(s) is stable, the contribution
of x0 decays exponentially with time and doesn’t affect any of the results below. For details, see
[Ioannou-Sun, Sect. 4.3.7].
Estimate/prediction of z:
ẑ := θ̂T φ
where θ̂ is the estimate of θ as usual. Normalized estimation error:
en := (ẑ − z)/m2
where
m2 = 1 + n2s (60)
6
This restriction is lifted in switching adaptive control, to be discussed towards the end of the course.
and ns is a normalizing signal such that φ/m ∈ L∞ . For example: n2s = φT φ, or n2s = φT P φ for some P = P T > 0. Parameter estimation error:
θ̃ := θ̂ − θ
and we have
en = θ̃T φ/m2        (61)
Define the cost function
J(θ̂) := (θ̂T φ − z)2 /(2m2 )
whose gradient is
∇J(θ̂) = (θ̃T φ/m2 ) φ = en φ
Gradient adaptive law:
θ̂˙ = −Γen φ
where Γ = ΓT > 0 is a scaling matrix (adaptive gain).
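A rough sketch of this law in simulation (Python). The regressor φ below is a hand-picked hypothetical signal rather than one obtained by filtering plant data, which is enough to exercise the update θ̂˙ = −Γ en φ with m2 = 1 + n2s , n2s = φT φ:

import numpy as np

theta = np.array([1.0, -2.0, 0.5])                 # true parameter vector (unknown)
Gamma = 2.0*np.eye(3)                              # adaptive gain, Gamma = Gamma^T > 0
theta_hat = np.zeros(3)
dt, T = 1e-3, 60.0
for k in range(int(T/dt)):
    t = k*dt
    phi = np.array([np.sin(t), np.cos(2*t), 1.0])  # hypothetical regressor, PE in all 3 directions
    z = theta @ phi
    m2 = 1.0 + phi @ phi                           # m^2 = 1 + n_s^2
    e_n = (theta_hat @ phi - z)/m2                 # normalized estimation error, cf. (61)
    theta_hat = theta_hat - dt*Gamma @ (e_n*phi)   # gradient adaptive law
print(theta_hat)                                   # approaches theta since phi/m is PE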
For item (ii), the definition of a persistently exciting (PE) signal needs to be extended to vector-
valued signals. This is a direct extension of the earlier scalar definition (54) but instead of squaring
the signal we form a matrix by multiplying the vector signal on the right by its transpose, and the
inequality becomes a matrix one. In other words, φ/m is PE if for some α0 , T0 > 0 we have
∫_t^{t+T0} [φ(s)φT (s)/m2 ] ds ≥ α0 T0 I   ∀t
Note that φφT is a singular matrix at each time instant (it has rank 1). PE means that over time,
it varies in such a way that its integral is uniformly positive definite (φ “generates energy in all
directions”). When defining PE, one usually requires an upper bound as well as a lower bound:
α1 T0 I ≥ ∫_t^{t+T0} [φ(s)φT (s)/m2 ] ds ≥ α0 T0 I
7
The theorem statement there is different, but the same proof works. See also [Ioannou-Fidan, Theorem
3.6.1].
but since φ/m is bounded, the first inequality is automatic.
Proof of Theorem 5.
(i) Take the candidate Lyapunov function (cf. (50))
V (θ̃) := (1/2) θ̃T Γ−1 θ̃        (62)
whose derivative is (recall that Γ is symmetric)
V̇ = θ̃T Γ−1 θ̃˙ = −θ̃T en φ = −e2n m2        (63)
where the last step follows from (61). Now the usual steps:
θ̃ ∈ L∞ , hence by (61) en m ∈ L∞ . By (60), m2 = 1 + n2s and so en , en ns ∈ L∞ .
Next, en m ∈ L2 . Recalling (60) again, we get en , en ns ∈ L2 .
Write
θ̂˙ = −Γ(en m)(φ/m)
and use en m ∈ L2 ∩ L∞ and φ/m ∈ L∞ to conclude that θ̂˙ ∈ L2 ∩ L∞ .
(ii) Using (61), we can write
θ̃˙ = −Γ(φφT /m2 ) θ̃ =: A(t)θ̃
en m = (φT /m) θ̃ =: cT (t)θ̃        (64)
View this as an LTV system with state θ̃ and output en m. The right-hand side of (63) is minus
the output squared, which is exactly the situation considered in Section 2.2 (the slight difference
in notation is because the output in (64) is scalar: cT here corresponds to C there). By the
result stated there, the system (64) will be exponentially stable if we can show that this system is
uniformly completely observable (UCO).
To check UCO, we need to analyze the observability Gramian of (64). This looks complicated.
However, there is a trick:
Lemma 6 (Ioannou-Sun, Lemma 4.8.1) A pair (A(t), C(t)) is UCO if the pair (A(t)+L(t)C(t), C(t))
is UCO for some bounded “output injection” matrix L(t).
For LTI systems, this is clear from the rank condition for observability. For LTV systems, the
proof is harder. See [Ioannou-Sun] for details (the proof is also in the book by Sastry and Bodson).
L(t) doesn’t actually need to be bounded, but for us this is good enough.
Continuing with the proof of the Theorem: take
L(t) := Γφ/m
This gives
A(t) + L(t)cT (t) = 0
The observability Gramian of the new pair (0, φT /m) is
M (t0 , t0 + T ) = ∫_{t0}^{t0 +T} [φ(s)φT (s)/m2 ] ds
The PE property of φ/m says precisely that this Gramian is uniformly positive definite, i.e., the pair is UCO, which completes the proof.
Intuitively, for a system of the form
ẋ = 0
y = cT (t)x
the PE property of c(t) makes the system observable, i.e., we can recover x from y, even though x is
stationary. At any time t, cT (t)x gives information only about the component of x in the direction
of c(t); however, c(t) is moving around in Rn so that over a finite period of time, we get complete
information about x.
−→ Slow adaptation properties of parameter estimation schemes (adaptive laws) will be useful later for stability analysis of adaptive control systems—this is why we care about things like θ̂˙ ∈ L2 . Also will be relevant for ISS modular design.
Let us take a step back to the static example
y(t) = θu(t)
and recap the gradient method. We have been working with the instantaneous cost, initially defined
as
J(θ̂(t), t) := e2 (t)/2 = (θ̂(t)u(t) − y(t))2 /2
whose gradient is
∇J(θ̂(t), t) = (θ̂(t)u(t) − y(t))u(t)
The graph of J as a function of θ̂ is a parabola which is moving with time (due to the changing
input and output). At each t, the minimum is given by
θ̂(t) = y(t)/u(t)
Ideally, we’d like to jump to this minimum right away and follow it as it evolves with time. However,
we can’t really do this. First, it is defined only when u(t) 6= 0. Second, this is really sensitive to
noise. And in the vector case
z(t) = θT φ(t) (65)
we can never define θ̂(t) by a simple division. So, instead we slide down along the gradient. This
takes the data into account over time. It’s also clear from this why we need signal boundedness (so
that the graph doesn’t expand infinitely), and need to work with normalized data if we don’t have
boundedness.
Now, it would be nice to have a different cost for which we could actually compute the minimum
in a well-defined way for each t and follow it as time evolves. Since this doesn’t work for the
instantaneous cost, we need a cost which, at each time t, takes into account past data up to time
t. We can think of this data as a curve (or a set of points) in the u, y plane, and what we’re trying
to do is really just find a line that best fits this data. Assuming there is no noise, the data would
be exactly on a line, and we need to find the slope of this line.
How do we do this? Consider the running cost
J(θ̂) := (1/2) ∫0t (θ̂(t)u(s) − y(s))2 ds
(Note that the argument of θ̂ inside the integral is t, not s!) This is still convex in θ̂, and its
minimum is where the gradient vanishes:
∇J(θ̂) = ∫0t (θ̂(t)u(s) − y(s))u(s) ds = θ̂(t) ∫0t u2 (s) ds − ∫0t u(s)y(s) ds
We see that this is well-defined even if u occasionally equals 0, as long as it is not identically 0 on
[0, t]. As new data comes, this minimum changes, and it is easy to derive a DE to update it. We’ll
do this below for the general case.
With a slight modification, we can make the least squares method work for the general case
of (65) as well:
J(θ̂) := (1/2) ∫0t [(θ̂T (t)φ(s) − z(s))2 /m2 (s)] ds + (1/2)(θ̂ − θ̂0 )T Q0 (θ̂ − θ̂0 )
(the integrand is (en m)2 written with the current estimate θ̂(t))
where Q0 = QT0 > 0. Here m2 = 1 + n2s as before. The last term penalizes deviations from the
initial estimate, and ensures that the least-squares estimate is well defined as we’ll see in a minute.
The gradient is
∇J(θ̂) = ∫0t [(θ̂T (t)φ(s) − z(s))/m2 (s)] φ(s) ds + Q0 (θ̂ − θ̂0 )
       = (∫0t [φ(s)φT (s)/m2 (s)] ds + Q0 ) θ̂ − (∫0t [z(s)φ(s)/m2 (s)] ds + Q0 θ̂0 )
so the minimum is at
θ̂(t) = (∫0t [φ(s)φT (s)/m2 (s)] ds + Q0 )−1 (∫0t [z(s)φ(s)/m2 (s)] ds + Q0 θ̂0 )
where we denote the first (inverse) factor by P (t).
The matrix P (t) is well defined because φφT ≥ 0 and Q0 > 0 (that’s why we needed it).
We want to express this recursively, to avoid computing the inverse and run a DE instead. We
have
Ṗ = −P (φφT /m2 )P
(derivation: 0 = d/dt(P P −1 ) = Ṗ P −1 + P φφT /m2 ). The differential equation for θ̂ is
θ̂˙ = −P (φφT /m2 )P (∫0t [z(s)φ(s)/m2 (s)] ds + Q0 θ̂0 ) + P zφ/m2 = P [(z − θ̂T φ)/m2 ] φ = −P en φ
(using the fact that P times the term in parentheses equals θ̂).
θ̂˙ = −P en φ
Ṗ = −P (φφT /m2 )P
This is very similar to the Kalman filter. P is the “covariance matrix” (this is all deterministic,
though).
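A minimal forward-Euler sketch of this least-squares algorithm (Python; the data generation and all numbers are my own illustrative choices). P (0) = Q0−1 and θ̂(0) = θ̂0 initialize the two differential equations:

import numpy as np

theta = np.array([1.0, -2.0])                      # true parameters (unknown)
theta_hat = np.zeros(2)                            # theta_hat_0
P = np.linalg.inv(0.1*np.eye(2))                   # P(0) = Q0^{-1}
dt, T = 1e-3, 30.0
for k in range(int(T/dt)):
    t = k*dt
    phi = np.array([np.sin(t), 1.0])               # hypothetical PE regressor
    z = theta @ phi
    m2 = 1.0 + phi @ phi
    e_n = (theta_hat @ phi - z)/m2
    theta_hat = theta_hat - dt*P @ (e_n*phi)       # theta_hat-dot = -P e_n phi
    P = P - dt*P @ np.outer(phi, phi) @ P / m2     # P-dot = -P (phi phi^T / m^2) P
print(theta_hat)                 # approaches theta, but only at a 1/t rate (not exponentially)
print(np.linalg.eigvalsh(P))     # P shrinks, since P^{-1} grows without bound under PE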
Item (ii) is a unique property of least squares. On the other hand, note that convergence in
(iii) is not claimed to be exponential.
Proof.
(i), (ii) P is positive definite and Ṗ ≤ 0 =⇒ P is bounded and has a limit. With the usual
definition θ̃ := θ̂ − θ we have
d/dt(P −1 θ̃) = (φφT /m2 )θ̃ − P −1 P en φ = 0        (66)
(using en = φT θ̃/m2 ).
Hence θ̃ also is bounded and has a limit, and thus so does θ̂. The other claims in (i) are shown
using the Lyapunov function
V (θ̃) := (1/2) θ̃T P −1 θ̃
in the same way as in the proof of Theorem 5 (do this yourself and check [Ioannou-Sun]).
(iii) We want to show that P −1 → ∞ when φ/m is PE; this would force θ̃ to go to 0 by (66) and we would be done. Integrating
d/dt(P −1 ) = φφT /m2
and using the definition of PE, we have
P −1 (T0 ) − P −1 (0) = ∫0^{T0} [φφT /m2 ] ds ≥ α0 T0 I
Similarly,
P −1 (2T0 ) − P −1 (T0 ) ≥ α0 T0 I
and we see that P −1 indeed increases to ∞.
There are various modifications to the above scheme. For example, one could introduce a
“forgetting factor”:
J(θ̂) := (1/2) ∫0t e−λ(t−s) [(θ̂T (t)φ(s) − z(s))2 /m2 (s)] ds + (1/2) e−λt (θ̂ − θ̂0 )T Q0 (θ̂ − θ̂0 )
where λ > 0. Then one can show that convergence of θ̂ to θ in (iii) becomes exponential. The DE
for P changes to
Ṗ = λP − P (φφT /m2 )P
which actually prevents P from becoming arbitrarily small and slowing down adaptation too much,
something that happens with pure least squares where P −1 grows without bound. However, items
(i) and (ii) will no longer be true.
The above integral cost with forgetting factor (and Q0 = 0) could also be used in the gradient
method instead of the instantaneous cost. Although the gradient expression for it is more com-
plicated, the resulting gradient adaptive law has the same properties as the previous one, and in addition one can prove that θ̂˙ → 0. See [Ioannou-Sun, Sect. 4.3.5] for details.
−→ So, the main difference between the gradient and least squares methods is not so much in the
cost used, but in the method itself (moving along the negative gradient of a time-varying function,
versus following its exact minimum).
6.3.4 Projection
In the above, we were assuming that θ ∈ Rm is arbitrary, i.e., completely unknown. In practice, we
usually have some set S which we know contains θ, and we can take this knowledge into account
when designing an adaptive law. The basic idea is to use projection to ensure that θ̂(t) ∈ S ∀ t.
This is a good idea for two reasons:
• Helps avoid estimated plant models which are not suitable for control purposes; e.g, for some
values of θ̂ the model may not be stabilizable (loss of stabilizability problem) and certainty
equivalence-based design breaks down. (More on this later.)
For example, suppose that we have some cost J(θ̂) and want to implement the gradient law
θ̂˙ = −∇J(θ̂)
while keeping θ̂ in a set of the form
S = {θ : g(θ) = 0}
for some (nice) scalar-valued function g. This captures cases where θ belongs to some known
subspace. Given a θ̂ ∈ S, we can split −∇J(θ̂) into a sum of two terms, one normal to S and the
other tangent to S:
−∇J(θ̂) = α∇g(θ̂) + Pr(θ̂)        (67)
where the first term is normal to S and the second is tangent to S.
Then we will discard the normal direction and just keep the tangent direction. This will ensure
that θ̂ travels along S.
But how do we calculate Pr(θ̂)? For this we need to know α. Multiply both sides of (67) by
the normal, i.e., (∇g(θ̂))T :
−∇g T ∇J = α∇g T ∇g  =⇒  α = −∇g T ∇J/(∇g T ∇g)
and so
Pr = −∇J + [∇g T ∇J/(∇g T ∇g)]∇g = −(I − ∇g∇g T /(∇g T ∇g))∇J
This gives us the projected gradient law.
More generally, if we started with the scaled gradient law
θ̂˙ = −Γ∇J(θ̂)
(which is a coordinate change away from the previous one) then its projected version can be shown to be
θ̂˙ = −(I − Γ∇g∇g T /(∇g T Γ∇g)) Γ∇J(θ̂)
It is easy to extend this idea to the case when the set S is given by an inequality constraint:
S = {θ : g(θ) ≤ 0}
For example, S could be a ball. Start with θ̂0 ∈ S. If θ̂(t) is in the interior of S, or it’s on the
boundary but −Γ∇J(θ̂) points inside S, then follow the usual gradient method. If θ̂(t) is on the
boundary and −Γ∇J(θ̂) points outside, then apply the projection.
−→ If S is a convex set, then the projected gradient adaptive law retains all properties of the
unprojected gradient adaptive law established in Theorem 5 (and in addition maintains θ̂ in S by
construction).
This is true because when we subtract a term proportional to ∇g(θ̂) from θ̂˙, this can only make the Lyapunov function (62) decrease more. When the sublevel sets of g are convex, the
contribution of −∇g(θ̂) to V̇ is negative basically for the same reason that the contribution of
−∇J(θ̂) was negative for the convex cost function J. See [Ioannou-Sun, Sect. 4.4] for details.
The projection modification can also be applied to the least squares method.
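For concreteness, here is a sketch of the inequality-constraint case with S a ball (Python; the set, the cost, and all numbers are hypothetical illustrations, not taken from the notes). With g(θ) = |θ|2 − R2 we have ∇g = 2θ, and the projection is applied only on the boundary when the unconstrained update points outward:

import numpy as np

R, Gamma = 1.0, np.eye(2)                    # S = {theta : |theta| <= R}, adaptive gain

def projected_update(theta_hat, grad_J):
    v = -Gamma @ grad_J                      # unconstrained (scaled) gradient direction
    if np.isclose(theta_hat @ theta_hat, R**2) and theta_hat @ v > 0:
        n = 2.0*theta_hat                    # normal direction: gradient of g
        v = v - (Gamma @ n)*(n @ v)/(n @ Gamma @ n)   # remove the normal component
    return v

theta_star = np.array([2.0, 0.0])            # minimizer of J lies outside S
theta_hat, dt = np.array([0.0, 0.8]), 1e-2
for _ in range(2000):
    grad_J = theta_hat - theta_star          # J(theta_hat) = |theta_hat - theta_star|^2 / 2
    theta_hat = theta_hat + dt*projected_update(theta_hat, grad_J)
    if theta_hat @ theta_hat > R**2:         # guard against discretization overshoot
        theta_hat = R*theta_hat/np.linalg.norm(theta_hat)
print(theta_hat)                             # slides along the boundary toward [1, 0]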
We have been working with the linear parametric model
z = θT φ        (68)
and have developed parameter estimation schemes which guarantee parameter convergence (θ̂ → θ,
exponentially except for pure least squares, need forgetting factor) if φ is PE. The regressor vector
φ contains (filtered version of) u, y, and their derivatives, see (58). We can write this as
φ = H(s)u
Basic idea:
u sufficiently rich =⇒ φ is PE =⇒ θ̂(t) → θ (69)
We want to understand what “sufficiently rich” means in various cases.
ẏ = −ay + bu
Assume a > 0 and u ∈ L∞ . In the parametric model for this example, we have
z = ẏ,   θ = (b, a)T ,   φ = (u, −y)T = (1, −b/(s + a))T u
Suppose, for example, that u is a nonzero constant. Then the corresponding regressor φ
is not PE. Think about it: when we form the matrix φ(t)φT (t), it is the sum of two terms. One
is constant and singular. The other decays exponentially to 0. So the definition of PE cannot be
satisfied.
What would be a better input? How does one in general identify the frequency response of a system with transfer function g(s)? (In our case, g(s) = b/(s + a).)
Let’s try
u(t) = sin ωt
where ω is some constant frequency. In steady state (i.e., modulo an exponentially decaying transient), the corresponding output is
y(t) = A sin(ωt + α)
where
A := |g(jω)| = |b/(jω + a)| = |b|/√(ω2 + a2 ),
α := tan−1 (Im g(jω)/Re g(jω)) = −tan−1 (ω/a)        (70)
Measuring A and α thus gives two equations from which the two unknowns a, b can be determined. Now consider a second-order plant
y(s) = [(b1 s + b0 )/(s2 + a1 s + a0 )] u(s)
where a0 , a1 > 0 to ensure stability. There are 4 unknown parameters in the transfer function now,
so a single sinusoid is not enough. But
u(t) = sin ω1 t + sin ω2 t
gives
y(t) = A1 sin(ω1 t + α1 ) + A2 sin(ω2 t + α2 )
We now have 4 unknowns and 4 equations =⇒ OK.
Back to the general parametric model (68) where θ and φ both have dimension m.
Generically, u is sufficiently rich (in the sense of guaranteeing that φ is PE) if it contains at least m/2 distinct frequencies. This matches the above examples.
The above definition assumes that u is sinusoidal. We can make this more general, by saying
that the power spectrum contains at least m distinct points (power spectrum is symmetric).
“Generically” means modulo some degenerate situations, e.g., when frequencies ωi coincide with
zeros of H(s).
The proof (which we will not give) relies on frequency domain arguments. The basic idea is that
the integral appearing in the definition of PE is the autocovariance of φ, and it is related to the
power spectrum (or spectral measure) of φ via Fourier transform. See [Ioannou-Sun, Sect. 5.2.1]
for details.
Note: if the plant is partially known, then we need fewer distinct frequencies. Or, with fewer
frequencies we can partially identify the unknown plant.
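For the first-order example, the single-sinusoid experiment can be carried out numerically (Python sketch; SciPy and the specific numbers are my own illustrative choices): feed u = sin ωt through g(s) = b/(s + a), extract the steady-state amplitude and phase, and invert the two relations in (70) to recover a and b.

import numpy as np
from scipy.signal import TransferFunction, lsim

a, b, omega = 1.0, 3.0, 2.0
t = np.linspace(0.0, 60.0, 60001)
u = np.sin(omega*t)
_, y, _ = lsim(TransferFunction([b], [1.0, a]), u, t)

tail = t > 40.0                                   # transient has died out by then
c = 2.0*np.mean(y[tail]*np.cos(omega*t[tail]))    # y ~ A sin(omega t + alpha) in steady state
s = 2.0*np.mean(y[tail]*np.sin(omega*t[tail]))
A_est, alpha_est = np.hypot(c, s), np.arctan2(c, s)

a_est = -omega/np.tan(alpha_est)                  # from alpha = -atan(omega/a)
b_est = A_est*np.sqrt(omega**2 + a_est**2)        # from A = |b|/sqrt(omega^2 + a^2), b > 0 assumed
print(a_est, b_est)                               # roughly (1, 3)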
Earlier, we described parameter estimation schemes of two kinds: using state-space models, and
using parametric models. Combining them with the discussion we just had, we can formally state
results on parameter convergence. Note that there are two steps in (69). The above discussion
was primarily about the first step (we did talk about recovering the system parameters but this
was only for simple examples, in general we need adaptive laws to do that). The second step was
addressed—for parametric models only—in Theorems 5 and 7.
For full-state measurements, we had
ẋ = Ax + Bu
We don’t give a proof, but note that the number n + 1 comes from the fact that state dimension
is n plus we have 1 input. If there are uncontrollable modes, then they are not affected by u and
decay to 0, so the corresponding parameters cannot be identified.
For partial state measurements, we had
ẋ = Ax + bu
y = cT x
where A is Hurwitz, x ∈ Rn , and u, y ∈ R are bounded. We already said that, even though the
total number of parameters in the state model is n2 + 2n, the dimension of the parametric model
(the number of unknown coefficients in the transfer function) is at most m = 2n. We can only hope
to identify these 2n parameters. We have described adaptive laws based on the gradient method
and least-square method. Since we’re in the stable case here, we don’t need normalization.
Theorem 9 (Ioannou-Sun, Theorem 5.2.4) If the transfer function g(s) has no pole-zero can-
cellations and u contains at least n distinct frequencies, then the adaptive laws discussed in class
give θ̂ → θ (exponentially fast, except for the case of pure least squares).
This result matches the examples we had earlier. As for the previous result, we need fewer
frequencies there because we are observing more information (the whole state instead of output).
Some remarks on parameter estimation/identification:
The next example, on model reference adaptive control, is taken from [Ioannou-Sun, pp. 320
and 397] and [Khalil, pp. 16 and 327]. It illustrates the last two points above. It also reinforces the
concept of PE and other constructions we saw above. This example will be used again later when we
study singular perturbations. See [Ioannou-Sun, Chap. 6] for more information on model-reference
adaptive control.
−→ This will also be the first time we combine parameter estimation with control (not counting
our early example which wasn’t really using an estimation scheme but an ad-hoc tuning law).
[Block diagram: the reference input r drives both the Controller (which produces u for the Plant, with output y) and the Reference Model (with output ym ); the tracking error e = ym − y is fed to an Adaptation block that adjusts the Controller.]
The plant is
ẏ = ay + bu
where a and b are unknown constants; we assume the sign of b is known, say b > 0. Reference model:
ẏm = −am ym + bm r (71)
We assume that am > 0 (reference model is stable) and r ∈ L∞ .
For control, we use a combination of feedforward and feedback of the form
u = −ky + lr
which gives
ẏ = (a − bk)y + blr
Hence to match the model, we need
a − bk = −am ,   bl = bm   ⇔   k = (a + am )/b,   l = bm /b
These controller gains k, l are not implementable, since we don’t know a and b. Instead, we will
use
u = −k̂y + ˆlr (72)
where k̂, ˆl are estimates of k, l. There are two approaches to generating such an adaptive control
law: direct and indirect.
This is a reparameterization which directly displays control parameters; note that it is bilinear in
the parameters, not linear. To check (73), just plug in the formulas for k and l.
Plugging the control law (72) into (73) and defining the parameter errors
k̃ := k̂ − k,   l̃ := l̂ − l
we obtain the tracking error equation ė = −am e + bk̃y − bl̃r, where e := ym − y. With the candidate Lyapunov function V := e2 /(2b) + (k̃2 + l̃2 )/(2γ) and the adaptive laws k̂˙ := −γey, l̂˙ := γer (γ > 0), this gives
V̇ = −(am /b) e2        (75)
Now the analysis follows the familiar steps:
e, k̃, ˜l are bounded.
e ∈ L2 .
We assumed that r is bounded and the reference model is stable. Hence ym is bounded, which
since e is bounded implies that y = ym − e is bounded. Hence, ė is bounded.
By Barbalat, e → 0. Also, k̂˙, l̂˙ are in L2 and converge to 0.
We showed that, for any bounded input, the plant output y asymptotically tracks the reference
model output ym , which was the original objective.
Do we know that k̂, ˆl converge to k, l?
Need a PE condition. More precisely, we need the reference signal r to be sufficiently rich.
Since the plant is scalar, we guess that one sinusoid will do:
r = sin ωt
y = A sin(ωt + α)
where A and α are given in (70). Collecting the above equations, we can write the resulting system
in (e, k̃, ˜l) coordinates as an LTV system:
ė −am bA sin(ωt + α) −b sin ωt e e
˙
k̃ = −γA sin(ωt + α) 0 0 k̃ =: A(t) k̃
˜l˙ γ sin ωt 0 0 ˜l ˜l
Looking at (75), we see that we’ll have exponential stability of this system—and thus exponential
convergence of the controller parameter estimates k̂ and ˆl to their true values—if we can show that
the system is UCO with respect to the output
y := e = [1 0 0] (e, k̃, l̃)T =: C (e, k̃, l̃)T
We also know the trick of passing from (A(t), C) to (A(t) + L(t)C, C), see Lemma 6. It is easy to
see that by proper choice of L(t), we can get
A(t) + L(t)C = [ −am   bA sin(ωt + α)   −b sin ωt ;  0   0   0 ;  0   0   0 ]
(we could also kill −am if we wanted, but it helps to keep it). Still, showing UCO (defined in
Section 2.2) looks complicated because we need to compute the transition matrix of A(t) + L(t)C,
and it’s not clear how to do that.
8 Note that, first, the reference model is stable so the effects of its initial conditions vanish with time, and second, the steady-state responses of the plant and of the reference model are the same because ym − y = e → 0. So, what we're doing is ignoring some terms that go to 0 with time, and it can be shown that the presence of these terms doesn't affect the stability which we're about to establish without them (cf. [Khalil, Example 9.6]).
−→ By the way, it is straightforward to check that in case r were constant instead of sinusoidal,
and hence y were also constant in steady state, the above (LTI) pair would not be observable.
Consider the system ẋ = (A + LC)x, y = Cx (note: this y is not to be confused with the
original plant output y, we are considering an abstract system here). To investigate its UCO, note
that the observability Gramian M (t0 , t0 + T ) by definition satisfies
∫_{t0}^{t0 +T} |y(t)|2 dt = x(t0 )T M (t0 , t0 + T )x(t0 )
and also
ẋ2 = ẋ3 = 0 =⇒ x2 , x3 = const
(note that x2 , x3 no longer correspond to k̃, ˜l because we’re working with an auxiliary system
obtained by output injection).
We can now easily solve for y:
y(t) = e−am (t−t0 ) y(t0 ) + (∫_{t0}^t e−am (t−s) bA sin(ωs + α)ds) x2 − (∫_{t0}^t e−am (t−s) b sin ωs ds) x3
In the indirect approach, we again use the control law (72), but instead of defining k̂, l̂ directly as estimates of k and l, we'll now define them indirectly via estimates â, b̂ of the plant parameters a, b. Earlier we derived the ideal control parameters to be
k = (a + am )/b,   l = bm /b
so we set
k̂ := (â + am )/b̂,   l̂ := bm /b̂        (78)
To generate â and b̂, we can follow a familiar scheme discussed earlier—cf. Example 6 in Section 6.2.
Comparing with Example 6, we know that we would like to run the estimator
ŷ˙ = −am (ŷ − y) + ây + b̂u
Note that we choose the damping rate of the estimator to be the same as in the reference model. As a consequence, it turns out that we don't actually need to run the estimator, and can just use the state ym of the reference model instead of ŷ. Indeed, using am = b̂k̂ − â and bm = b̂l̂ (which follow from (78)), we can rewrite the reference model like this:
ẏm = −am ym + bm r = −am (ym − y) + ây + b̂(−k̂y + l̂r) = −am (ym − y) + ây + b̂u        (79)
Subtracting the plant equation ẏ = ay + bu and letting e := ym − y, we obtain
ė = −am e + ãy + b̃u
where, as usual,
ã := â − a,   b̃ := b̂ − b
The next step is Lyapunov analysis.
Candidate Lyapunov function:
V (e, ã, b̃) := (1/2)e2 + (1/(2γ))(ã2 + b̃2 )
where γ > 0 is arbitrary. This is the same V as we’ve been using earlier, with the parameter γ for
extra flexibility. Note that we don’t need the division by b as in the direct scheme, since we’re no
longer using the bilinear parameterization we used there and so b doesn’t multiply the parameter
errors in the ė equation. (In fact, we’re not seeing the need for assuming b > 0 yet, but we’ll see it
later.)
V̇ = −am e2 + ãey + b̃eu + (1/γ)ãã˙ + (1/γ)b̃b̃˙
which makes the choice of adaptive law, i.e., the choice of ã˙ = â˙ and b̃˙ = b̂˙, obvious:
â˙ := −γey,   b̂˙ := −γeu
This is very similar to what we had in the direct scheme, and it gives
V̇ = −am e2
There is a problem, however. The control law (78) is based on the estimated plant model
ẏ = ây + b̂u
When b̂ = 0, this model is not stabilizable, and the procedure breaks down (u doesn't exist). This is a well-known issue in indirect adaptive control, known as the loss of stabilizability problem.
There are several approaches for dealing with loss of stabilizability; see [Ioannou-Sun, Sect.
7.6.2]. The simplest is based on projection and works if we have a bit more information on the
plant, as follows. We assumed that b > 0, i.e., we know that the actual plant is stabilizable.
Suppose that we know a constant b0 > 0 such that
b ≥ b0 (known)
Then, we can modify the adaptive law for b̂ by projecting it onto the interval [b0 , ∞):
b̂˙ = −γeu   if b̂ > b0 , or if b̂ = b0 and eu ≤ 0;
b̂˙ = 0       otherwise
When the projection is active (b̂˙ = 0), the term b̃eu in V̇ is no longer canceled, and we get
V̇ = −am e2 + b̃eu
However, when we stop, we know by construction that b̂ = b0 and eu > 0. Since b ≥ b0 , at this
time we have b̃ = b̂ − b ≤ 0. Hence, the extra term in V̇ is nonpositive, and we’re still OK. And
now b̂ is bounded away from 0, hence k̂, ˆl are bounded, u is bounded, ė is bounded, and we can
apply Barbalat to conclude that e → 0 and tracking is achieved.
Also, â˙, b̂˙ are in L2 and converge to 0.
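In code, the projected update for b̂ is just a guard around the nominal law (Python sketch; the values of γ and b0 are hypothetical):

def bhat_dot(bhat, e, u, gamma=1.0, b0=0.1):
    # adaptive law for b-hat projected onto [b0, infinity)
    nominal = -gamma*e*u
    if bhat > b0 or nominal >= 0:   # interior point, or boundary with update pointing inward
        return nominal
    return 0.0                      # on the boundary and pointing outward: freeze

In a discrete-time implementation one would additionally clip b̂ at b0 to guard against overshoot due to the time step.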
−→ Note that b̂ may or may not approach 0 in practice, we just can’t rule it out theoretically in the
absence of projection. In this example the loss of stabilizability issue is not really severe because
we know the sign of b. In problems where the sign of b is unknown (as we had earlier) and thus
b̂ might be initialized with the wrong sign, zero crossings for gradient adaptive laws are in fact
almost unavoidable. But in higher dimensions things are different.
−→ What we proved about exponential parameter convergence under sinusoidal reference signals
for direct MRAC is still true for indirect MRAC.
From this example, it may appear that direct MRAC is better because it doesn’t suffer from the
loss of stabilizability issue. However, in general this is not the case. The reason is that direct
MRAC requires us to come up with a controller reparameterization which can then be used to
design an adaptive law. In the above simple example this worked, although the direct controller
reparameterization was more complicated than the original plant parameterization (it was bilinear).
In general, direct schemes tend to be more complicated and apply to narrower classes of systems
than do indirect schemes. (This is true not only for MRAC.) See [Ioannou-Sun, Sect. 1.2.3] for a
detailed discussion. In this course, we’re primarily dealing with indirect adaptive control.
Note that the estimation scheme we used for indirect MRAC is quite different from the ones
we developed in Section 6.3, in several key aspects:
• It relies on an estimator/reference model (if we don’t ask for parameter identification, then
we can set r = 0 if we want) instead of plant parametric model.
• It guarantees that e → 0, something we didn’t have with the normalized adaptive laws based
on parametric models.
• Since the design relies on the matching between the estimator and the reference model, given
by (79), it is difficult to extend it to higher-order plants (actually, the issue is not so much the
plant order but its relative degree), while the earlier schemes relying on parametric models
work for general plants.
7 Input-to-state stability
7.1 Weakness of certainty equivalence
Reference: [KKK book, Sect. 5.1]
When using the certainty equivalence principle for control design, we substitute θ̂ for θ in the
controller. The resulting closed-loop system thus differs from the “ideal” one by terms involving
the parameter estimation error θ̃. We then hope that these terms do not have too much negative
effect on system stability. However, we haven’t really formally addressed this issue yet.
Ideally, we’d like to have modular design, i.e., formulate some design objectives for the controller
and the tuning law (tuner) separately so that, when combined using certainty equivalence, they
give stability. For example:
)
Controller is stabilizing when θ̃ = 0
=⇒ x is bounded (81)
Tuner guarantees bounded θ̃
ẋ = θx + u
Let’s view the error θ̃ as a (disturbance) input to this system. We’ve shown earlier that it is
bounded (but doesn’t necessarily converge to 0).
Is it true that a bounded θ̃ always leads to a bounded x for the above system?
No. For example, if θ̃ ≡ −2 then we have ẋ = x and x → ∞.
The tuning law that we used for Example 1, θ̂˙ = x2 , provides more than just boundedness of
θ̃. In fact, the Lyapunov analysis we had before implies that x → 0. Some other tuning law which
guarantees boundedness of θ̃ but doesn’t ensure its correct sign may not work, so we have to be
careful.
So the first property above, (81), is not true in general, as the above example shows.
Now consider the plant
ẋ = −θx2 + u
with the certainty equivalence controller
u = −x + θ̂x2
and suppose that the parameter identifier guarantees θ̃(t) = e−t θ̃0 ,
i.e., θ̃ → 0 exponentially fast (we take the decay rate to be equal to 1 just for simplicity, this is not
important). This seems to be as good as it gets.
−→ We haven’t discussed parameter identification for nonlinear plants, so we just take such an
identifier as given and don’t worry about actually designing it. One example of such identifier
design is discussed in [KKK book, p. 187], see also the references cited there.
The closed-loop system is now
ẋ = −x + e−t θ̃0 x2 (83)
What can we then say about the closed-loop system (83)? Is x bounded? Does it converge to
0?
Claim: For some initial conditions, x escapes to ∞ in finite time!
One system that’s known to do that is
ẋ = x2
whose solution is
x(t) = x0 /(1 − x0 t)
and for x0 > 0 this is only defined on the finite time interval [0, 1/x0 ) and approaches ∞ as
t → 1/x0 . This behavior is due to the rapid nonlinear growth at infinity of the function f (x) = x2 .
It is clear that the system
ẋ = −x + 2x2 = −x + x2 + x2 (84)
also has finite escape time, for x0 large enough, because for large positive x the term −x is dominated
by one of the x2 terms. (This is despite the fact that locally around 0 this system is asymptotically
stable.)
But now we can also see that our closed-loop system (83) has finite escape time. The argument
is like this: let T be the time that it takes for the solution of (84) to escape to ∞ for a given x0 > 0.
Choose θ̃0 large enough so that
e−t θ̃0 ≥ 2 ∀ t ∈ [0, T ]
Then the corresponding solution of (83) is no smaller than that of (84) and hence it escapes to ∞
in time ≤ T .
In fact, one can check that the solution of (83) is given by the formula
x(t) = 2x0 /(x0 θ̃0 e−t + (2 − x0 θ̃0 )et )
and we see that for
x0 θ̃0 > 2
we have
x(t) → ∞   as   t → (1/2) log(x0 θ̃0 /(x0 θ̃0 − 2))
The above example highlights the challenge of nonlinear adaptive control and weakness of
certainty equivalence approach: bounded/converging estimation error does not guarantee bound-
edness/convergence of the closed-loop system.
For a modular design to work, we need to demand more from the controller. Namely, we
need the controller to possess some robustness with respect to disturbance inputs which in our
case correspond to the parameter estimation error θ̃. Such robustness properties are captured in
nonlinear system theory by the general concept of input-to-state stability (ISS) which is discussed
next.
A function γ : [0, ∞) → [0, ∞) is said to be of class K if it is continuous, strictly increasing, and satisfies γ(0) = 0. If γ is also unbounded, then it is said to be of class K∞ . Example: γ(r) = cr for some c > 0.
A function β : [0, ∞) × [0, ∞) → [0, ∞) is said to be of class KL if β(·, t) is of class K for each
fixed t ≥ 0 and β(r, t) is decreasing to zero as t → ∞ for each fixed r ≥ 0. Example: β(r, t) = ce−λt r
for some c, λ > 0.
We will write β ∈ KL, γ ∈ K∞ to indicate that β is a class KL function and γ is a class K∞
function, respectively.
Now, consider a general nonlinear system
ẋ = f (x, d) (85)
where x ∈ Rn is the state and d ∈ Rℓ is the (disturbance) input. To ensure existence and uniqueness
of solutions, we assume that f is sufficiently nice (e.g., locally Lipschitz) and d is also sufficiently
nice (e.g., piecewise continuous).
According to [Sontag, 1989] the system (85) is called input-to-state stable (ISS) with respect
to d if for some functions β ∈ KL and γ ∈ K∞ , for every initial state x0 , and every input d the
corresponding solution of (85) satisfies the inequality
|x(t)| ≤ β(|x0 |, t) + γ sup0≤s≤t |d(s)| ∀t ≥ 0
The above formula assumes that the initial time is t0 = 0. But since the system (85) is time-
invariant, it wouldn’t make any difference if we kept the initial time t0 general and wrote
|x(t)| ≤ β(|x0 |, t − t0 ) + γ supt0 ≤s≤t |d(s)| ∀ t ≥ t0 ≥ 0. (86)
−→ Also note that by causality, it makes no difference if we take sup over t0 ≤ s < ∞.
Let’s try to decipher what the ISS definition says. When there is no input, i.e., when d ≡ 0, it
reduces to
|x(t)| ≤ β(|x0 |, t)
This means that the state is upper-bounded by β(|x0 |, 0) at all times and converges to 0 as t → ∞
(because β is decreasing to 0 in the second argument). It turns out that this is exactly equivalent
to global asymptotic stability (GAS) of the unforced system
ẋ = f (x, 0)
In this case one says that (85) is 0-GAS.
A more standard definition of GAS is not in terms of class KL functions, but the two are
equivalent; see [Khalil, Definition 4.4 and Lemma 4.5]. In particular, global exponential stability
(GES) is defined as
|x(t)| ≤ ce−λt |x0 |
and the right-hand side is an example of a class KL function.
However, GAS and GES are internal stability notions while ISS is an external stability notion.
It says that in the presence of d, there’s another term in the upper bound for the state, and this
term depends on the size of the disturbance. The implications of ISS are:
• If d is bounded, then x is bounded.
• If d → 0, then x → 0.
The first one is obvious from the definition of ISS. The second one holds because we can always
“restart” the system after d becomes small, and use (86). We already used this trick earlier to
prove Fact 3 in Section 3.3, see an earlier homework problem.
Some examples:
The linear system
ẋ = Ax + Bd
is ISS if (and only if) A is a Hurwitz matrix. In other words, for linear systems internal stability
(stability of ẋ = Ax) automatically implies external stability (ISS). We have
|x(t)| ≤ ce−λt |x0 | + b sup0≤s≤t |d(s)|
for suitable constants b, c, λ > 0. We actually already discussed this before (cf. Fact 2 in Section 3.3,
also homework).
For nonlinear systems, it’s no longer true that internal stability (GAS for d ≡ 0) implies ISS.
We already saw this in the previous subsection. The system
ẋ = −x + xd
fails the first bullet item above (just set d ≡ 2). The system
ẋ = −x + x2 d
fails both bullet items above—in fact, it fails miserably: not only does x not converge, not only is
it unbounded, but it escapes to infinity in finite time. And all this for exponentially converging d,
and despite the fact that for d ≡ 0 we have the nice stable linear system ẋ = −x.
It is possible to construct examples with less drastic behavior which fail the two bullet items
above.
From the above discussion it is clear that a new theory is needed to study ISS. We will not
pursue this in detail here, but we will mention a few basic results.
The ISS property admits the following Lyapunov-like equivalent characterization: the sys-
tem (85) is ISS if and only if there exists a positive definite radially unbounded C 1 function
V : Rn → R such that for some class K∞ function ρ we have
|x| ≥ ρ(|d|)  =⇒  (∂V /∂x) · f (x, d) < 0   ∀ x ≠ 0, ∀ d        (87)
(to be read like this: for all x 6= 0 and all d, the implication holds). Such a function V is called an
ISS-Lyapunov function.
Idea of the proof that the existence of an ISS-Lyapunov function implies ISS:
• As long as |x| ≥ ρ(|d|), we have V̇ < 0 and the system behaves like a usual GAS system, i.e.,
it converges towards the origin. During this period of time, we have an estimate of the form
|x(t)| ≤ β(|x0 |, t)
• Assume that d is bounded (for unbounded d the ISS estimate gives no finite bound and so
there’s nothing to prove). Once we enter a level set of V superscribed around the ball of
radius
ρ supt0 ≤s<∞ |d(s)|
we may no longer have V̇ < 0, but we know that we cannot exit this level set because outside
it, V̇ < 0 and we are pushed back in. So, from this time onward, we satisfy the bound of
the form
|x(t)| ≤ γ supt0 ≤s<∞ |d(s)|
Here γ is obtained from ρ and V (geometrically, the ball of radius γ is superscribed around
the level set of V which is in turn superscribed around the ball of radius ρ).
• Combining the two bounds—the one before we enter the level set of V and the one after—we
obtain ISS.
(∂V /∂x) · f (x, d) = (∂V /∂x) · f (x, 0) + (∂V /∂x) · (f (x, d) − f (x, 0)) ≤ −W (x) + |∂V /∂x| L|d|
where W is positive definite and L is the Lipschitz constant for f on a region containing the initial
condition x0 . It is now easy to see that for d small enough, the negative term dominates. We can
now establish the ISS estimate as in the previous argument.
So, the key thing about ISS is that the estimate (86) holds for arbitrary inputs, no matter how
large. This is a much stronger property and it doesn’t follow from 0-GAS.
End optional material
So, to check ISS, we need to look for a pair (V, ρ) satisfying (87). We’ll see examples of how to
do this very soon.
Suppose now that we have a system with both disturbance inputs and control inputs:
ẋ = f (x, d, u)
Then a natural problem to consider is to design a feedback law u = k(x) such that the closed-loop
system is ISS with respect to the disturbance d. Such a control law is called input-to-state stabilizing
(it attenuates the disturbance in the ISS sense).
Combining the notion of control Lyapunov function (CLF) discussed in Section 4.1 with the
above notion of ISS-Lyapunov function, we arrive at the following definition of ISS control Lyapunov
function (ISS-CLF):
|x| ≥ ρ(|d|)  =⇒  inf_u (∂V /∂x) · f (x, d, u) < 0   ∀ x ≠ 0, ∀ d
where ρ ∈ K∞ .
Given an ISS-CLF, we want to have a systematic procedure—in fact, a universal formula—for
designing an input-to-state stabilizing controller, with V serving as an ISS-Lyapunov function for
the closed loop.
We remember from the earlier case of no disturbances that to get this, we need to impose an
affine structure on the system. Namely, let us assume that the right-hand side is affine in both9 u
and d:
ẋ = f (x) + L(x)d + G(x)u (88)
Then V is an ISS-CLF if
|x| ≥ ρ(|d|)  =⇒  inf_u {(∂V /∂x) · f (x) + (∂V /∂x) · L(x)d + (∂V /∂x) · G(x)u} < 0   ∀ x ≠ 0, ∀ d
It is still not clear how to apply Sontag’s universal formula to this. First, the conditions are
stated differently than before (in terms of the implication). Second, the expression inside the inf
involves d, while we want the controller to be independent of d (which is usually not measured10 ).
(Recall the CLF setting: want to have inf u {a(x) + b(x) · u} < 0. Cannot define a in the obvious
way because it’ll depend on d.)
The trick is to realize that
(∂V /∂x) · L(x)d ≤ |LT (x)(∂V /∂x)T | |d|
and that under the condition |x| ≥ ρ(|d|), the worst-case value of the above expression is
|LT (x)(∂V /∂x)T | ρ−1 (|x|)
This is well-defined because ρ, being a class K∞ function, is invertible on [0, ∞) (since it's strictly increasing from 0 to ∞). Also, there does in fact exist an admissible disturbance for which the above upper bound is achieved: just align d with the vector LT (x)(∂V /∂x)T .
9
Affine dependence on d is not necessary, but without it the construction is more complicated [L-Sontag-
Wang]. Affine dependence on u is essential.
10
One exception arises in switching adaptive control, to be discussed later, where the disturbance corre-
sponds to the output estimation error which is available for control.
We conclude that the following is an equivalent characterization of an ISS-CLF for the affine
system (88):
inf_u {(∂V /∂x) · f (x) + |LT (x)(∂V /∂x)T | ρ−1 (|x|) + (∂V /∂x) · G(x)u} < 0   ∀ x ≠ 0
or, equivalently, defining
a(x) := (∂V /∂x) · f (x) + |LT (x)(∂V /∂x)T | ρ−1 (|x|),   b(x) := GT (x)(∂V /∂x)T
as
|b(x)| = 0 =⇒ a(x) < 0
for all x ≠ 0, which is exactly (38). And neither a nor b depends on d.
Now, the desired input-to-state stabilizing feedback law u = k(x) is given by the universal
formula (39). (If one wants this feedback law to be smooth, then one needs to replace the second
term in a(x), which is in general just continuous, by a smooth approximation.)
Claim: The closed-loop system is ISS.
Indeed, the derivative of V along the closed-loop system is
V̇ = (∂V /∂x) · (f (x) + L(x)d + G(x)k(x))
and we have
|x| ≥ ρ(|d|)  =⇒  V̇ ≤ (∂V /∂x) · f (x) + |LT (x)(∂V /∂x)T | ρ−1 (|x|) + (∂V /∂x) · G(x)k(x) = a(x) + b(x) · k(x) < 0
for all x ≠ 0, where the last inequality (< 0) is guaranteed by the universal formula. Thus V is an
ISS-Lyapunov function for the closed-loop system, which implies ISS by a result given earlier.
As a quick application of the ISS concept, consider again the system in normal form from
Section 3.2.1:
ξ˙1 = ξ2
ξ˙2 = ξ3
. . .
ξ˙r = b(ξ, η) + a(ξ, η)u
η̇ = q(ξ, η)
with a(ξ, η) ≠ 0 for all ξ, η and with globally asymptotically stable zero dynamics η̇ = q(0, η)
(minimum-phase property). We saw that the control (30) globally stabilizes the ξ-dynamics but
this is in general not enough to globally stabilize the whole system (“peaking phenomenon”).
However, if we strengthen the minimum-phase property by assuming that the η-dynamics are ISS
with respect to ξ, then all is well: the fact that ξ → 0 is then enough, as we know, to conclude that
η → 0 too. So, ISS of the η-dynamics is the right property to guarantee that any feedback that
globally stabilizes the ξ-subsystem also automatically globally stabilizes the entire system, and no
“peaking” occurs. We call systems with this ISS property strongly minimum phase.
Before we do this, we need one more fact about ISS-Lyapunov functions. We defined ISS-
Lyapunov functions via
|x| ≥ ρ(|d|)  =⇒  (∂V /∂x) · f (x, d) < 0   ∀ x ≠ 0, ∀ d
where ρ ∈ K∞ . We could also rewrite this more precisely as
|x| ≥ ρ(|d|)  =⇒  (∂V /∂x) · f (x, d) ≤ −α(|x|)   ∀ x, d        (89)
where α ∈ K∞ (the fact that we can take α to be of class K∞ , and not just positive definite, is not obvious, but this can always be achieved by modifying V ). Another equivalent characterization of ISS-Lyapunov functions is as follows:
(∂V /∂x) · f (x, d) ≤ −α(|x|) + χ(|d|)   ∀ x, d        (90)
where α, χ ∈ K∞ .
Begin optional material
Proof of the equivalence (sketch). It is not hard to obtain χ from ρ and vice versa:
Suppose that (90) holds. Rewrite it as
(∂V /∂x) · f (x, d) ≤ −(1/2)α(|x|) − (1/2)α(|x|) + χ(|d|)
from which we see that
|x| ≥ α−1 (2χ(|d|))  =⇒  (∂V /∂x) · f (x, d) ≤ −(1/2)α(|x|)
(In this case we get not α but (1/2)α on the right-hand side, but the constant 1/2 is arbitrary and could be taken arbitrarily close to 1.)
Conversely, suppose that (89) holds. We only need to show (90) when
|x| ≤ ρ(|d|)
Now consider the ISS backstepping step: suppose the system is ẋ = f (x) + L(x)d + G(x)ξ, ξ˙ = u, and V0 is an ISS-Lyapunov function for the x-subsystem with virtual control law ξ = k0 (x), i.e., (∂V0 /∂x) · (f + Ld + Gk0 ) ≤ −α0 (|x|) + χ0 (|d|). For V1 (x, ξ) := V0 (x) + (1/2)|ξ − k0 (x)|2 we compute
V̇1 = (∂V0 /∂x) · (f + Ld + Gk0 ) + (∂V0 /∂x) · G(ξ − k0 ) + (ξ − k0 )T (u − k0′ f − k0′ Ld − k0′ Gξ)
   ≤ −α0 (|x|) + χ0 (|d|) + (ξ − k0 )T (u − k0′ f − k0′ Ld − k0′ Gξ + GT (∂V0 /∂x)T )
where the first term is the "old" V̇0 .
We can cancel all terms inside the parentheses, except k0′ Ld. But we can dominate this term
using square completion: define
k1 (x, ξ) := −(ξ − k0 ) + k0′ f + k0′ Gξ − GT (∂V0 /∂x)T − (k0′ L)(k0′ L)T (ξ − k0 )
Then we get
V̇1 ≤ −α0 (|x|) − |ξ − k0 (x)|2 + χ0 (|d|) − (ξ − k0 )T (k0′ L)(k0′ L)T (ξ − k0 ) − (ξ − k0 )T k0′ Ld
    = −α0 (|x|) − |ξ − k0 (x)|2 + χ0 (|d|) − |(k0′ L)T (ξ − k0 ) + (1/2)d|2 + (1/4)dT d
    ≤ −α0 (|x|) − |ξ − k0 (x)|2 + χ0 (|d|) + (1/4)dT d ≤ −α1 (|(x, ξ)|) + χ1 (|d|)
where
χ1 (|d|) := χ0 (|d|) + (1/4)|d|2
and α1 ∈ K∞ can be suitably defined since α0 (|x|) + |ξ − k0 (x)|2 is positive definite and radially
unbounded as a function of (x, ξ).
As before, we can apply the above backstepping procedure recursively to handle a chain of
integrators.
−→ For initializing the backstepping procedure, it is useful to know that any scalar affine system
ẋ = f (x) + ℓ(x)d + g(x)u
with g(x) ≠ 0 is input-to-state stabilized by the feedback
u = (1/g(x))(−f (x) − x − |ℓ(x)|x)
Indeed, the closed-loop system is
ẋ = −x + ℓ(x)d − |ℓ(x)|x
and for
V (x) = x2 /2
we have
V̇ = −x2 + ℓ(x)dx − |ℓ(x)|x2 ≤ −x2 + |ℓ(x)||d||x| − |ℓ(x)|x2 = −x2 − |ℓ(x)||x|(|x| − |d|)
≤ −x2 if |x| ≥ |d|
hence V is an ISS-Lyapunov function.
(We also know that in general, the assumption g(x) ≠ 0 is not necessary for V to be an ISS-CLF.)
As before, this can be generalized to strict feedback systems (cf. earlier homework).
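A quick numerical check of this scalar design (Python sketch; the functions f, g, ℓ and the disturbance below are hypothetical choices, not taken from the notes):

import numpy as np

f = lambda x: x**3                 # destabilizing drift
g = lambda x: 1.0 + x**2           # never zero
ell = lambda x: x                  # disturbance gain
d = lambda t: 2.0*np.sin(3.0*t)    # bounded disturbance, sup|d| = 2

def u(x):                          # the input-to-state stabilizing feedback above
    return (1.0/g(x))*(-f(x) - x - abs(ell(x))*x)

x, dt = 5.0, 1e-4
for k in range(int(20.0/dt)):
    x += dt*(f(x) + ell(x)*d(k*dt) + g(x)*u(x))
print(abs(x))                      # bounded, and ultimately no larger than sup|d| = 2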
Consider again the plant
ẋ = −θx2 + u
but now take the controller u = −x + θ̂x2 − x3 , i.e., the earlier certainty equivalence controller with an extra nonlinear damping term −x3 . The closed-loop system is ẋ = −x + θ̃x2 − x3 . For
V (x) = x2 /2
we have
V̇ = −x2 + θ̃x3 − x4 ≤ −x2 + |θ̃||x3 | − x4 = −x2 − |x3 |(|x| − |θ̃|)
and we get
|x| ≥ |θ̃|  =⇒  V̇ ≤ −x2 < 0   ∀ x ≠ 0
This by definition shows that V is an ISS-Lyapunov function, hence the system is indeed ISS.
Remark 1 We’re viewing θ̃ as an external input. However, once the tuning law which generates
θ̂ is in place, it is more accurate to think of θ̃ as an output of the overall adaptive system, and not
as an input. This is because θ̃ is determined by θ and θ̂, which are the internal system parameters.
Then, a more accurate term is output-to-state stability (OSS), which is a form of detectability (small
output implies small state, etc.) This doesn’t change much as far as the theory goes, but it’s helpful
for achieving conceptual clarity. Detectability is very important in the adaptive control context,
we’ll see more on this in Section 8.3 and Section 9.
To go beyond scalar systems, we can again apply backstepping. More precisely, we need to de-
velop an adaptive version of ISS backstepping discussed above. As in the case of usual (not ISS)
backstepping, discussed in Section 5, the basic idea will carry over but there will be some differences.
Let us see how it works by adding an integrator to the previous system:
ẋ = −θx2 + ξ
ξ˙ = u
We already did the initial step, which we reproduce here with appropriate notation: the virtual control law is k0 (x, θ̂) := −x + θ̂x2 − x3 , with V0 (x) = x2 /2 and V̇0 ≤ −x2 + θ̃4 (the θ̃4 term bounds the effect of the disturbance, which matters only when |x| ≤ |θ̃|).
−→ Note: this Lyapunov function depends on θ̂, because so does the control law k0 . As we’ll see
in a moment, this introduces additional complications compared to the non-adaptive case.
V̇1 = (∂V0 /∂x)(−θx2 + ξ) + (ξ + x − θ̂x2 + x3 )(u + (1 − 2θ̂x + 3x2 )(−θx2 + ξ) − x2 θ̂˙)
    = x(−x + θ̃x2 − x3 ) + (ξ + x − θ̂x2 + x3 )(u + x + (1 − 2θ̂x + 3x2 )(−θ̂x2 + ξ) + (1 − 2θ̂x + 3x2 )x2 θ̃ − x2 θ̂˙)
where the first term satisfies x(−x + θ̃x2 − x3 ) ≤ −x2 + θ̃4 .
(As earlier on page 45, we used θ = θ̂ − θ̃ to split the θ-dependent term.) We know that we can
cancel the terms that come after u on the second line of the above formula, and add damping:
Thus the ISS notion we’re now asking for is weaker than ISS with respect to θ̃ only. (Just recall
the definition of ISS; clearly, ISS with respect to a part of the input vector implies ISS with respect
to the whole input vector, but not vice versa.)
Now that we're treating θ̂˙ as another input, we can handle the last term in exactly the same
way as the previous term, i.e., dominate it using square completion. The complete formula for the
control law stands as
˙
• If θ̃ and θ̃ are bounded, then x is bounded
˙
• If θ̃ and θ̃ converge to 0, then x converges to 0
Thus to guarantee the desired properties of the state x, we need to find an adaptive law which
ensures the corresponding properties of the parameter estimation error θ̃. This is what we meant
by “modular design”: here are the design objectives for the controller and the estimation law, and
we can go ahead and design them separately. It doesn’t matter exactly how we’ll design them; as
long as the above properties are satisfied, the problem will be solved. (Go back again to Section 5.2:
do you see that we didn’t have modularity there?)
We’re not going to discuss the design of parameter estimation laws for nonlinear systems.
But for linear systems, we studied this in detail in Chapter 6. In particular, we saw how to get
boundedness of θ̃ and θ̃̇, as well as (in some cases, at least without normalization) convergence of θ̃̇ to 0. We also had such properties in indirect MRAC. So, while we may not always know how
to achieve these properties for general systems, they are not surprising and we’re comfortable with
them.
We’re less comfortable, however, with the requirement
θ̃ → 0
We know that we do not have this unless we have PE conditions (and we don’t want to impose
PE because it typically conflicts with the control design objective). In this context, it is helpful to
know that ISS with respect to (θ̃, θ̃̇) also implies the following.
• If θ̃ and θ̃̇ are bounded and L(x)θ̃ and θ̃̇ converge to 0, then x converges to 0
In other words, θ̃ → 0 is replaced by the less demanding condition that L(x)θ̃ → 0. The proof
of this fact relies on the result mentioned in the optional material in the ISS subsection. For more
details, see [KKK book, Lemma 5.3 on p. 193].
Just to get an idea how we might get this latter convergence property, consider an estimator for which the estimation error
e := x̂ − x
satisfies
ė = Ae + L(x)θ̃
Suppose that we were able to show that all signals are bounded and e → 0 (again, for linear plants we were often able to get such properties in Chapter 6). Then ė, ë are bounded (θ̃̇ is bounded because it's some function of the other signals which are bounded). Also,
∫₀^∞ ė(s) ds = lim_{t→∞} ∫₀^t ė(s) ds = lim_{t→∞} (e(t) − e(0)) = −e(0)
is bounded. Applying Barbalat’s lemma (this means applying it twice, because one first already
applies it once to show that e → 0), we conclude that
ė → 0
L(x)θ̃ → 0
as needed.
For details on designing parameter estimation schemes providing the above properties, read
[KKK book, Chapters 5 and 6].
8.1 Stability
Suppose that we’re given an LTV system
ẋ = A(t)x (91)
and we want to know when it is uniformly asymptotically stable (this stability is then automatically
global and exponential by linearity).
Is it enough to assume that A(t) is a Hurwitz matrix for each fixed t?
No! (Not even if eigenvalues of A(t) are bounded away from the imaginary axis.)
Can we come up with a counterexample?
To understand this, the easiest case to consider is when A(t) is a piecewise-constant function
of time, i.e., it switches between several fixed matrices. Then instead of a more usual time-varying
system we have a switched system. The two are closely related, and later we’ll study switched
systems more explicitly when we discuss switching adaptive control.
Counterexample:
Suppose that we are switching between two systems in the plane. Suppose that the two indi-
vidual systems are asymptotically stable, with trajectories as shown in the first figure (the solid
curve and the dotted curve).
For different choices of the switching sequence, the switched system might be asymptotically
stable or unstable; these two possibilities are shown in the second figure.
From this example, the instability mechanism is also quite clear: even though each system is
stable, we catch it at the “peak” of its transient and switch to the other system, without giving it
a chance to decay.
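A quick numerical illustration of this instability mechanism; the two matrices and the state-dependent switching rule below are illustrative choices, not the exact systems in the figures:

    import numpy as np

    # Two individually Hurwitz matrices: lightly damped rotations whose level sets
    # (ellipses) are elongated along different axes.
    A1 = np.array([[-0.05,   1.0],
                   [-10.0, -0.05]])
    A2 = np.array([[-0.05,  10.0],
                   [ -1.0, -0.05]])
    print(np.linalg.eigvals(A1), np.linalg.eigvals(A2))   # all real parts negative

    # State-dependent switching that always "catches the peak of the transient":
    # use A2 when x1*x2 > 0 and A1 otherwise.
    x = np.array([1.0, 1.0])
    n0 = np.linalg.norm(x)
    dt = 1e-4
    for k in range(int(10.0 / dt)):
        A = A2 if x[0] * x[1] > 0 else A1
        x = x + dt * (A @ x)        # forward Euler step
    print(n0, np.linalg.norm(x))    # the norm grows by several orders of magnitude

Reversing the rule (using A1 when x1 x2 > 0) makes the same trajectory decay faster than under either system alone, illustrating that switching can also have a stabilizing effect.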
It is easy to modify this example to make A(t) continuously time-varying (just imagine a family
of systems homotopically connecting these two, and sliding fast through them would mimic the
effect of a switch).
There are many results dealing with stability of switched and time-varying systems. In fact, sta-
bility of the constituent systems for frozen t is neither sufficient nor necessary (we could turn things
around in the above example and have unstable systems but with switching having a stabilizing
effect).
The direction we pursue here is suggested by the above example, and is also the one that will
be relevant for adaptive control. As the section title suggests, we will restrict attention to time-
variations that are sufficiently slow (assuming still that A(t) is stable for each t). In the above
example, it is clear that if we waited longer, we’d be OK. Here’s a general result that characterizes
such slow variations by imposing suitable conditions on the derivative Ȧ(·).
Theorem 10 (Ioannou-Sun, Theorem 3.4.11) Consider the LTV system (91) and assume that:
• A(t) is Hurwitz for each fixed t, and there exist constants c, λ0 > 0 such that for all t and s ≥ 0 we have¹¹
‖e^{A(t)s}‖ ≤ c e^{−λ0 s}        (92)
• A(·) is C¹ and uniformly bounded: there exists an L > 0 such that ‖A(t)‖ ≤ L ∀ t.
The condition (92) means that all matrices A(t) have a common stability margin λ0 and a
common overshoot constant c.
−→ Note that A(·) satisfies one of the hypotheses a), b) if ‖Ȧ(·)‖ is in L∞ and sufficiently small, or if it is in L1 or L2.
When Ȧ(·) satisfies a), it is sometimes called nondestabilizing with growth rate µ. If ‖Ȧ(·)‖ is in L1 then the growth rate is 0.
−→ In the proof we’ll get a formula for µ (an upper bound). However, we view this result more as
a qualitative one (which is how it is stated).
Proof of Theorem 10.
Let’s prove a).
For each fixed t, let P(t) be the symmetric positive-definite solution of the Lyapunov equation
P(t)A(t) + Aᵀ(t)P(t) = −I        (93)
−→ This is pointwise in t (“system snapshot”). Don’t confuse this with the Lyapunov equation
for general LTV systems, which would be Ṗ (t) + P (t)A(t) + AT (t)P (t) = −I. Of course Ṗ (t) will
eventually arise here too, but we’ll use the slow-varying conditions to bound it. Basically, we first
treat the system as if it were LTI and then use perturbation arguments.
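As a small numerical illustration of this frozen-time construction, one can solve (93) pointwise in t; the particular A(t) below is an illustrative choice, not from the notes:

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    def A(t):
        # an illustrative slowly time-varying family, Hurwitz for every fixed t
        return np.array([[-1.0, 0.5 * np.sin(0.1 * t)],
                         [ 0.0, -2.0]])

    def P(t):
        # frozen-time Lyapunov equation (93): P(t) A(t) + A(t)^T P(t) = -I;
        # solve_continuous_lyapunov(M, Q) solves M X + X M^T = Q, so we pass M = A(t)^T
        return solve_continuous_lyapunov(A(t).T, -np.eye(2))

    for t in (0.0, 10.0, 20.0):
        Pt = P(t)
        print(t, np.linalg.eigvalsh(Pt),                                   # positive for each t
              np.linalg.norm(Pt @ A(t) + A(t).T @ Pt + np.eye(2)))         # residual of (93), ~0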
Candidate Lyapunov function:
V(t, x) := xᵀ P(t) x
Its derivative along solutions of (91), using (93), is
V̇ = −|x|² + xᵀ Ṗ(t) x
|e^{A(t)s} x| ≥ e^{−Ls} |x|   ∀ x, t, s ≥ 0
(Proof: for every x we can write |x| = |e^{−A(t)s} e^{A(t)s} x| ≤ ‖e^{−A(t)s}‖ |e^{A(t)s} x| ≤ e^{‖A(t)‖s} |e^{A(t)s} x| ≤ e^{Ls} |e^{A(t)s} x|, hence |e^{A(t)s} x| ≥ e^{−Ls} |x| and the claim follows.) Then, using (94) again, we similarly get
P(t) ≥ β1 I,   β1 > 0
so P(t) is uniformly bounded away from 0. (Cf. [Khalil, p. 159 or 371].)
Now we differentiate (93):
Ṗ(t)A(t) + Aᵀ(t)Ṗ(t) = −P(t)Ȧ(t) − Ȧᵀ(t)P(t) =: −Q(t)
which implies
Ṗ(t) = ∫₀^∞ e^{Aᵀ(t)s} Q(t) e^{A(t)s} ds   =⇒   ‖Ṗ(t)‖ ≤ ‖Q(t)‖ ∫₀^∞ c² e^{−2λ0 s} ds ≤ β2 ‖Q(t)‖
From the definition of Q,
‖Q(t)‖ ≤ 2‖P(t)‖ ‖Ȧ(t)‖ ≤ 2β2 ‖Ȧ(t)‖
Combining the two:
‖Ṗ(t)‖ ≤ 2β2² ‖Ȧ(t)‖
Plugging this into the earlier formula for V̇:
V̇ ≤ −|x|² + 2β2² ‖Ȧ(t)‖ |x|²
From the definition of V and the above bounds we have the following:
β1 |x|² ≤ V(t, x) ≤ β2 |x|²
This gives
V̇ ≤ −β2⁻¹ V + 2β2² β1⁻¹ ‖Ȧ(t)‖ V = −(β2⁻¹ − 2β2² β1⁻¹ ‖Ȧ(t)‖) V
By the standard comparison principle,
V(t) ≤ e^{−∫_{t0}^{t} (β2⁻¹ − 2β2² β1⁻¹ ‖Ȧ(s)‖) ds} V(t0) ≤ e^{2β2² β1⁻¹ α} e^{−(β2⁻¹ − 2β2² β1⁻¹ µ)(t − t0)} V(t0)
where the second inequality uses hypothesis a).
Therefore, V(t) decays to 0 exponentially if
µ < β1/(2β2³)
where b ≠ 0 and both a and b are unknown. (We don't assume that a < 0.)
The objective is to asymptotically stabilize this system (x → 0) using state feedback.
−→ A similar system arises in the context of adaptive tracking in [Ioannou-Sun], see examples in
Sect. 7.4.3 and 7.4.4. (There is a preliminary step in converting the tracking problem to stabilization
of the error system, which we skipped. The error system in [Ioannou-Sun] is similar to the above
system, but it has more equations because there is also an observer part there, while here we assume
that both components of x are available for control.)
To apply the certainty equivalence principle of control design, we first need to select a controller
that stabilizes our system for the case when a, b are known.
It is easy to check that the system is controllable (hence stabilizable). We can stabilize it with
a state feedback controller
u = −[ k1  k2 ] x = −k1 x1 − k2 x2        (95)
where k1 and k2 are selected by any standard linear control design method discussed in ECE 515.
One option is to place closed-loop eigenvalues at some chosen locations in the left half of the
complex plane (pole placement). The closed-loop matrix is
Acl = [ a  1 ; 0  0 ] + [ b ; b ] [ −k1  −k2 ] = [ a − bk1   1 − bk2 ; −bk1   −bk2 ]
Choosing, e.g.,
k1 = (a + 1)/b ,    k2 = 1/b
we get
Acl = [ −1   0 ; −a − 1   −1 ]
whose eigenvalues are −1, −1. (There is a systematic procedure for designing pole-placement feed-
back laws, via transforming to controllable canonical form, but here we can just find the gains by
trial and error.)
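A quick numerical check of this gain choice (the plant matrices A = [a 1; 0 0], B = [b; b] and the parameter values are the illustrative ones used in the computation above):

    import numpy as np

    a, b = 2.0, 0.5                           # illustrative values of the unknown parameters
    A = np.array([[a, 1.0],
                  [0.0, 0.0]])
    B = np.array([[b],
                  [b]])
    K = np.array([[(a + 1.0) / b, 1.0 / b]])  # k1 = (a+1)/b, k2 = 1/b

    Acl = A - B @ K
    print(Acl)                                # [[-1, 0], [-a-1, -1]]
    print(np.linalg.eigvals(Acl))             # both closed-loop eigenvalues at -1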
Begin optional material
Another option is, instead of arbitrarily selecting closed-loop poles, to consider an LQR problem
J = ∫₀^∞ ( xᵀ(s) Q x(s) + R u²(s) ) ds  −→  min over u
where the matrix Q = QT ≥ 0 and the scalar R > 0 are design parameters (they weight stability
against control effort). The solution is
u = −(1/R) Bᵀ P x
where P = P T > 0 satisfies the algebraic Riccati equation
P A + Aᵀ P − (1/R) P B Bᵀ P + Q = 0
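A minimal sketch of computing this LQR gain numerically; the weights and parameter values are illustrative assumptions:

    import numpy as np
    from scipy.linalg import solve_continuous_are

    a, b = 2.0, 0.5
    A = np.array([[a, 1.0], [0.0, 0.0]])
    B = np.array([[b], [b]])
    Q = np.eye(2)                                # state weight (assumption)
    R = np.array([[1.0]])                        # control weight (assumption)

    P = solve_continuous_are(A, B, Q, R)         # solves P A + A^T P - P B R^{-1} B^T P + Q = 0
    K = np.linalg.solve(R, B.T @ P)              # u = -R^{-1} B^T P x = -K x
    print("K =", K)
    print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))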
End optional material
Denote the parameter estimation errors by
ã := â − a ,    b̃ := b̂ − b
The design model has the same form as the original system but instead of the unknown parameters
we have their estimates which are available for control. We need a control law
u = −k̂1 x1 − k̂2 x2
such that the corresponding closed-loop matrix for the design model is Hurwitz. Repeating the above control design with â, b̂ in place of a, b, we get
k̂1 = (â + 1)/b̂ ,    k̂2 = 1/b̂
and
Âcl = [ −1   0 ; −â − 1   −1 ]        (96)
whose eigenvalues are −1, −1.
Now, we need an adaptive law for updating the parameter estimates â and b̂. We already devel-
oped parameter estimation schemes that will cover this example. We can use, e.g., the normalized
gradient law of Section 6.3.2, whose properties are listed in Theorem 5.¹² That theorem guarantees,
in particular:
â, b̂ ∈ L∞ ,    â̇, b̂̇ ∈ L2 ∩ L∞        (97)
(only item (i) of Theorem 5 applies since we don’t have persistency of excitation here).
−→ The design of the control law and the parameter estimation law are completely independent
of one another (“modular design”).
Combining the two, we get the closed-loop system
ẋ = Âcl(t) x − [ ã ; 0 ] x1 − [ b̃ ; b̃ ] u        (98)
where we write the argument t in Âcl to emphasize that it is time-varying because â evolves with time. (So does b̂, but Âcl doesn't depend on it.)
The analysis now proceeds in two steps:
Step 1 The properties (97) ensure that Âcl(·) satisfies the hypotheses of Theorem 10: it is Hurwitz for each frozen t, uniformly bounded since â is bounded, and its derivative involves only â̇, which is in L2 ∩ L∞. Hence, Theorem 10 applies and tells us that the part of (98) coming from the design model,
i.e.,
ẋ = Âcl(t) x
¹² There we had a single output, while here we have full state measurements. But this only makes things easier; we can always choose a scalar output and use it to design the adaptive law. For example, y = x1 gives the transfer function (bs + b)/(s² + as). Note that there are only two parameters to estimate.
is exponentially stable.
−→ What if b̂(t) = 0 at some time t? This is loss of stabilizability again. The fix is similar
to the case of indirect MRAC: if we know the sign of b and if
|b| ≥ b0 (known)
then we project the gradient law onto (−∞, −b0 ] or [b0 , ∞) (depending on the sign of b),
which we know doesn’t affect the properties (97).
Step 2 Using (97) and well-known facts about various induced gains of exponentially stable linear
systems being finite, it is possible (although not easy) to show that the perturbed system (98)
is still asymptotically stable, thus x → 0 as needed.
We skip Step 2. It can be found (modulo different notation and presence of extra dynamics)
in [Ioannou-Sun, pp. 481–482]. (Relies on a swapping lemma.)
We see that the basic idea behind the analysis is rather simple:
• The controller stabilizes the design model for each frozen value of the parameters
• The real closed-loop system has extra terms coming from parameter estimation errors, which
are small in an appropriate sense by the properties of the adaptive law
We also learn the following lessons from this: the control design should really be based on the
design model, and it should be robust with respect to the errors between the design model and
the real model. This suggests a departure from the certainty equivalence principle, which doesn’t
address such robustness. We already saw this idea in the nonlinear context (ISS).
8.3 Detectability
The above design is not completely satisfactory. First, Step 2 is messy (we didn’t give it).
Second, adaptive laws based on normalization are somewhat difficult to analyze and do not extend
well to nonlinear plants.
Recall that in indirect MRAC we had a different, estimator-based adaptive law with no normal-
ization, and the estimator coincided with the reference model in closed loop. (We had a reference
signal there but it’s not important, we can set it to 0.) It also guaranteed slow adaptation speed
We don’t pursue this alternative approach in detail here (mainly because the design we used
for indirect MRAC doesn’t generalize very well beyond plants of relative degree 1). This reasoning
will be taken up later in the course (see switching adaptive control).
However, we want to plant the seed now and discuss the last step above. In the full measurement
case (y = x), which includes the scalar case we studied in indirect MRAC, convergence of the
estimator state x̂ plus convergence of e immediately give convergence of the plant state x = x̂ − e.
In general, however, e is an output of lower dimension than x and x̂. The relevant concept then
is detectability with respect to this output. This is a refinement of our earlier reasoning based on
observability. For now we’ll just make a brief general remark on detectability, and will later revisit
it and will see its importance more clearly.
Suppose we write the overall closed-loop adaptive control system as
ẋ = Aθ̂(t) x
e = Cθ̂(t) x
Here x is the combined state of plant, controller, and estimator. As usual, θ̂ is the vector of
parameter estimates and e is the output estimation error. The above form is completely general,
as long as the plant, controller, and estimator dynamics are linear.
−→ The adaptive/tuning law, which is typically nonlinear, is not a part of the system dynamics.
Assume that the above system is detectable for each fixed value of θ̂. (This property is sometimes
also called tunability [Morse].)
Recall: detectability means stability of unobservable modes, meaning that if the output equals
0 (or even just converges to 0) then the state converges to 0. Note that this is strictly weaker than
observability. For now we’re talking about detectability of LTI systems, i.e., detectability of each
fixed pair (Aθ̂ , Cθ̂).
An equivalent characterization of detectability is the existence (for each frozen value of θ̂) of an
output injection matrix, Lθ̂ , such that Aθ̂ − Lθ̂ Cθ̂ is Hurwitz.
We can take A(·) and C(·) to be C¹ in θ̂. Then one can show that L(·) is also C¹ in θ̂.
Suppose that the adaptive law ensures that:
1) θ̂ is bounded
2) θ̂̇ is in L2, or converges to 0, or is small in some other sense so that the hypotheses of Theorem 10 are fulfilled
3) e is in L2 , or converges to 0, or is some other zeroing signal, i.e., signal which, when injected
into an exponentially stable linear system, produces a state converging to 0
As we said, one example of an adaptive law with these properties is the one we used for indirect
MRAC.
Combining this with the detectability assumption, we get stability immediately:
Rewrite the x-dynamics as
ẋ = (Aθ̂(t) − Lθ̂(t) Cθ̂(t)) x + Lθ̂(t) e
The homogeneous part is exponentially stable by Theorem 10. (Here we are using the facts
that dA/dθ̂, dC/dθ̂, dL/dθ̂ are continuous as functions of θ̂, and θ̂ is bounded.)
e is a zeroing signal, hence so is Lθ̂(t) e (because L is continuous with respect to θ̂ and θ̂(t) is
bounded).
Therefore, x → 0 and we are done.
This argument follows the same logic as in the previous subsection, but is much cleaner (espe-
cially the last step).
−→ The above argument shows that to get stability, we should design the adaptive system so that
it is detectable with respect to the output estimation error, for each frozen value of the parameter
estimates.
Detectability is not only sufficient for stability, but also necessary, in the following sense. Sup-
pose that for some value θ̂, the system is not detectable. This means that we can have e ≡ 0 but
x ↛ 0. Of course, if θ̂ changes, we don't necessarily have a problem. But if θ̂ is an equilibrium value of the adaptive law, i.e., if e ≡ 0 makes θ̂ stay stuck at this value, then x will never converge.
So, we must have detectability for all equilibrium values of the adaptive law.
Switched systems and switching control were already mentioned a couple of times earlier in
the course. In particular, we looked at an example of a switched system when discussing stability
of time-varying systems (Section 8.1) and we briefly discussed a switching control law when we
studied universal regulators (at the end of Section 3.1.3).
Switching adaptive control aims to solve the same problems as traditional adaptive control,
discussed so far in this course. It also uses many of the same concepts and ideas. The primary
difference is that instead of using continuous tuning laws to define the control gains, it relies on
logic-based switching among a family of candidate controllers.
We saw that continuous tuning/estimation has some limitations. To begin, we need a nice pa-
rameterization of the plant model. In fact, we only considered cases where the unknown parameters
enter linearly. One reason we needed this is that for the gradient law, this guaranteed that the
cost function was convex in θ̂. Without this property, the gradient law would not give any useful
convergence results.
Another issue that we encountered was loss of stabilizability. Continuous tuning can take us to
places in the parameter space where no stabilizing control gains exist. To overcome this, we had
to use some form of projection (which requires a priori information about unknown parameters).
Switching adaptive control aims to lift these restrictions by abandoning continuous tuning
altogether, and instead updating the controller gains in a discrete (switched) fashion. This gives us
greater design flexibility.
(Figure: supervisory control architecture. A supervisor generates the switching signal that selects which of the candidate controllers (Controller 1 producing u1, Controller 2 producing u2, and so on) supplies the control input u to the plant; the plant output y is fed back to the supervisor.)
−→ The above figure suggests that the set of controllers is discrete (even finite). This is an option
that was not possible in continuous adaptive control. However, we can still have a continuum of
controllers and pick controllers from this continuous family one at a time.
−→ The unknown parameters in the plant may take values in a discrete or a continuous set. Intu-
itively speaking, we should have one controller for each possible value of the unknown parameter,
and we’ll usually assume that this is the case. However, we don’t need to have exact correspondence
between the two parameter sets (plant parameters and controller indices). For example, we can
try to “cover” a continuous set of plant parameter values (such as a ball) with a finite number of
controllers (each of which is robust enough to stabilize nearby plants).
Notation: We will write P for the set in which the unknown parameters take values. We assume
that this is a compact set. The vector of unknown parameters itself will be denoted by p∗ , and its
estimates will be denoted by p. This is the same as what we earlier denoted as θ and θ̂, respectively,
but this new notation will be more convenient (because we will frequently use p as subscripts) and
it is more consistent with the switching adaptive control literature.
As in Section 7, we want to have a modular design, i.e., formulate separate design objectives
on the controllers and on the supervisor.
(Figure: structure of the supervisor. A multi-estimator driven by u and y produces the estimates yp, p ∈ P; the output estimation errors ep = yp − y feed a monitoring signal generator producing the signals µp, p ∈ P, which drive the switching logic that outputs σ.)
9.1.1 Multi-estimator
The scheme we discuss here is estimator-based, and is similar to the estimator equations we
saw earlier. The difference is that we will no longer design a continuous tuning law to drive the
output estimation error to 0, but instead we will generate a family of estimates
yp ,   p ∈ P
and the corresponding output estimation errors
ep := yp − y ,   p ∈ P
and will pick the smallest one at each time instant (roughly speaking).
−→ The basic idea is to design the multi-estimator so that ep∗ , i.e., the error corresponding to the
true parameter value, is small. Usually we cannot guarantee anything about the other, “wrong”
estimation errors (there is no a priori reason for them to be small). Thus, the smallness of ep
indicates the likelihood that p = p∗ . In other words, it seems intuitively reasonable (although not
yet justified formally in any way) to pick as a current estimate of p∗ the index of the smallest
estimation error.
This property is basically the same as what we had for the estimators we had earlier: in case
when the parameter estimates match the true values, e is small—e.g., converges to 0 exponentially
at a prescribed rate. (See the estimator for Example 6 in Section 6.2, which we saw again in indirect
MRAC.)
Example 10  Consider the scalar plant
ẏ = y² + p∗u
where p∗ is an unknown element of some set P ⊂ R containing both positive and negative values.
We can let the estimator equations be
ẏp = −am (yp − y) + y² + p u ,   p ∈ P        (99)
where am > 0 is a design constant; subtracting the plant equation from the equation for p = p∗ gives ėp∗ = −am ep∗, so ep∗ converges to 0 exponentially. If the plant is instead affected by a disturbance,
ẏ = y² + p∗u − d
then we have
ėp∗ = −am ep∗ + d
hence ep∗ exponentially converges to d/am, not to 0. In this case, the steady-state value of the error ep∗ is determined by the size of the disturbance.
One concern is that realizing the multi-estimator simply as a parallel connection of individual
estimator equations for p ∈ P is not efficient and actually impossible if P is an infinite set. The
estimator equations (99) can be implemented differently as follows. Consider the system
ż1 = −am z1 + am y + y²
ż2 = −am z2 + u        (100)
and define
yp := z1 + p z2 ,   p ∈ P        (101)
One can check that each yp defined in this way satisfies (99). This idea is known as state sharing. The family of signals (101) is of course still infinite, but at each particular time we can look any one of them up or perform mathematical operations—such as computing the minimum—with the entire family.
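A minimal simulation sketch of this state-shared multi-estimator for Example 10; the numerical values, and the use of the true parameter to generate a bounded test input, are illustrative assumptions:

    import numpy as np

    p_star, am = 1.5, 2.0                       # true parameter and estimator gain (illustrative)
    P_set = [-2.0, -0.5, 0.5, 1.5, 3.0]         # candidate parameter values

    dt, T = 1e-3, 6.0
    y, z1, z2 = 0.8, 0.0, 0.0
    for k in range(int(T / dt)):
        u = -(y**2 + y) / p_star                # bounded test input (uses p_star only for the demo)
        dy  = y**2 + p_star * u                 # plant
        dz1 = -am * z1 + am * y + y**2          # shared estimator states (100)
        dz2 = -am * z2 + u
        y, z1, z2 = y + dt * dy, z1 + dt * dz1, z2 + dt * dz2

    # state sharing: y_p = z1 + p z2, so e_p = z1 + p z2 - y for every p at once
    for p in P_set:
        print(p, abs(z1 + p * z2 - y))
    # the error for p = p_star is the smallest; all errors are small here only because the
    # input dies out (no persistency of excitation)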
Rather than basing decisions on the instantaneous values of the output estimation errors, we would
like to take their past behavior into account. Thus we need to implement an appropriate filter,
which we call the monitoring signal generator. This is a dynamical system whose inputs are the
estimation errors and whose outputs
µp , p ∈ P
are suitably defined integral norms of the estimation errors, called monitoring signals. For example,
we can simply work with the squared L2 norm:
µp(t) := ∫₀ᵗ |ep(s)|² ds ,   p ∈ P        (102)
If we don’t want the signals µp to grow unbounded, we can introduce a “forgetting factor”:
µp(t) := ∫₀ᵗ e^{−λ(t−s)} |ep(s)|² ds
Again, we do not want to generate each monitoring signal individually. The idea of state sharing
can be applied here as well. To see how this works, let us revisit the multi-estimator for Example 10,
given by (100) and (101). Each estimation error can be equivalently expressed as
ep = z1 + pz2 − y
so that we have
ep² = (z1 − y)² + 2p z2 (z1 − y) + p² z2² ,   p ∈ P
If we now define the monitoring signal generator via three shared scalar states driven by the inputs (z1 − y)², z2(z1 − y), and z2², with µp taken to be the corresponding quadratic polynomial in p (this is the formula (105) referred to below), then the equations (103) still hold. By now you should be able to see why this works and how it
can be extended to (104).
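One plausible state-shared realization of the monitoring signals with a forgetting factor is sketched below; the state names eta0, eta1, eta2 and the synthetic signals are ours, not from the notes:

    import numpy as np

    dt, T, lam = 1e-3, 5.0, 1.0
    t = np.arange(0.0, T, dt)
    # synthetic signals standing in for z1(t), z2(t), y(t)
    z1 = np.exp(-t) * np.sin(3 * t)
    z2 = 0.5 * np.cos(t)
    y  = np.exp(-t)

    eta0 = eta1 = eta2 = 0.0
    for k in range(len(t)):
        w = z1[k] - y[k]
        eta0 += dt * (-lam * eta0 + w * w)          # driven by (z1 - y)^2
        eta1 += dt * (-lam * eta1 + z2[k] * w)      # driven by z2 (z1 - y)
        eta2 += dt * (-lam * eta2 + z2[k]**2)       # driven by z2^2

    def mu(p):
        # quadratic polynomial in p, available for every p simultaneously
        return eta0 + 2 * p * eta1 + p**2 * eta2

    # sanity check against the direct forgetting-factor definition for one value of p
    p = 0.7
    e_p = z1 + p * z2 - y
    print(mu(p), dt * np.sum(np.exp(-lam * (t[-1] - t)) * e_p**2))   # approximately equal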
This is a dynamical system whose inputs are the monitoring signals µp , p ∈ P and whose output is
a piecewise constant switching signal σ. The switching signal determines the actual control law
u = uσ
applied to the plant, where up , p ∈ P are the control signals generated by the candidate controllers.
−→ We don't actually need to physically generate the out-of-the-loop control signals up , p ≠ σ(t).
The diagram with a bank of controllers is just for illustration. What we’re really implementing is
a switching controller.
−→ The controllers can be parameterized by a different set Q, which can for example be a subset
of P. This can be done by defining a controller assignment map from P to Q. To keep things
simple, we will assume that Q = P. But we’ll usually write q and not p for controller indices, to
distinguish them from the plant parameters and to remind us of the more general situation. We’ll
also sometimes denote the controllers themselves as Cq , q ∈ P.
Basic idea of the switching logic: at each time, point to the index of the currently smallest monitoring signal, i.e., set σ(t) = arg min_{p∈P} µp(t).
This is essentially what we'll do, but we'll only update the value of σ from time to time, rather
than continuously. This discrete update strategy for σ can be either time-based (use a fixed time
interval between updates—dwell time) or event-based (update when the difference between the old
minimum and the new one gets large enough—hysteresis). More details on this later.
Justification–?
This is tempting, but the second implication is not known to hold. In fact, what we have is its
converse.
We saw this issue before in continuous adaptive control: the estimation scheme drives e to 0,
and we plug the estimate θ̂ into the controller as if it were θ (certainty equivalence). However, even
if e → 0, we don’t know that θ̂ → θ, so a separate justification is required why this works. Before,
we were using Lyapunov functions to prove stability. Here the reasoning will be different, based on
detectability.
In fact, let's drop the wrong implication and state directly the property we want: for each fixed q ∈ P, if eq is small (bounded, converging to 0), then the state is correspondingly small.
This is precisely the detectability property, which we already discussed in Section 8.3. And it must
be true for all q since we don’t know the behavior of σ a priori.
More precisely, this is detectability of the supervisory control system with the q-th controller
in the loop, with respect to the q-th output estimation error. Assume that this system is linear,
and write it as
ẋ = Aq x
eq = C q x
Recall (we’ll need this later) that detectability is equivalent to the existence (for each frozen value
of q) of an output injection matrix, Lq , such that Aq − Lq Cq is Hurwitz. Use this to rewrite the
system as
ẋ = (Aq − Lq Cq )x + Lq eq
It is then clear that small eq (bounded, convergent to 0) implies the same property for x.
Consider now a linear plant
ẋ = Ap∗ x + Bp∗ u
y = Cp∗ x        (106)
where p∗ is an unknown element of a finite set P, together with a multi-estimator consisting of the Luenberger-like observers
ẋp = Ap xp + Bp u + Lp (y − Cp xp) ,   yp = Cp xp ,   p ∈ P
where each Lp is an output injection matrix such that Ap − Lp Cp is Hurwitz. It follows that the
estimation error ep∗ = yp∗ − y converges to zero exponentially fast, regardless of the control u that
is applied (just subtract the two right-hand sides when p = p∗ ).
We can also use the above observers to design the candidate control laws:
up = −Kp xp ,   p ∈ P        (107)
where the matrices Kp are such that Ap − Bp Kp are Hurwitz for each p ∈ P.
Let the monitoring signals be the L2 norms of the output estimation errors as in (102), i.e.,
generate them by the differential equations (103).
It remains to define the switching signal σ : [0, ∞) → P, which will give us the switching
controller u(t) = −Kσ(t) xσ(t) . One way to do this is by means of the so-called hysteresis switching
logic, see figure.
(Figure: flowchart of the hysteresis switching logic. Initialize σ; keep σ fixed while no p satisfies µp + h ≤ µσ; when such a p exists, set σ equal to the index of the smallest monitoring signal.)
This switching logic works as follows. Fix a positive number h called the hysteresis constant.
Set σ(0) = arg minp∈P µp (0). Now, suppose that at a certain time σ has just switched to some
q ∈ P. The value of σ is then held fixed until we have minp∈P µp (t) + h ≤ µq (t). If and when that
happens, we set σ equal to arg minp∈P µp (t). When the indicated arg min is not unique, break the
tie arbitrarily.
−→ In Example 10, for each fixed time t, µp is a quadratic polynomial in p given by the formula (105), hence the value of p that minimizes µp(t) is either a real root of ∂µp/∂p (t) or a boundary point of P. Thus for that example, the on-line minimization procedure required for implementing the switching logic is relatively straightforward.
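A minimal sketch of this (additive) hysteresis switching logic, run on synthetic monitoring signals rather than on a full closed-loop simulation; all numbers are illustrative:

    import numpy as np

    dt, T, h = 1e-2, 20.0, 0.5
    t = np.arange(0.0, T, dt)
    mu = {
        0: 0.3 * t,                 # keeps growing
        1: 1.0 - np.exp(-t),        # bounded (plays the role of mu_{p*})
        2: 2.0 + 0.1 * t,           # keeps growing
    }

    sigma = min(mu, key=lambda p: mu[p][0])    # sigma(0) = argmin_p mu_p(0)
    switches = []
    for k in range(len(t)):
        vals = {p: mu[p][k] for p in mu}
        best = min(vals, key=vals.get)
        if vals[best] + h <= vals[sigma]:      # switch only when undercut by at least h
            sigma = best
            switches.append((round(t[k], 2), sigma))

    print(switches)   # a single switch, after which switching stops (cf. Lemma 11 below)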
Lemma 11 The switching stops in finite time, i.e., there exists a time T ∗ and an index q ∗ ∈ P
such that σ(t) = q ∗ ∈ P for all t ≥ T ∗ . Moreover, eq∗ ∈ L2 .
Proof. Since ep∗ converges to 0 exponentially, the formula (102) implies that µp∗ (t) is bounded
from above by some number K for all t ≥ 0. In addition, all monitoring signals µp are nondecreasing
by construction. Using these two facts and the definition of the hysteresis switching logic, we now
prove that the switching must stop in finite time. Indeed, each µp has a limit (possibly ∞) as
t → ∞. Since P is finite, there exists a time T such that for each p ∈ P we either have µp (T ) > K
or µp (t2 ) − µp (t1 ) < h for all t2 > t1 ≥ T . Then for t ≥ T at most one more switch can occur. We
conclude that there exists a time T ∗ such that σ(t) = q ∗ ∈ P for all t ≥ T ∗ , and µq∗ is bounded.
Finally, eq∗ ∈ L2 by (102).
After the switching stops, the closed-loop system (excluding out-of-the-loop signals) can be written as
(ẋ ; ẋq∗) = A (x ; xq∗)
eq∗ = C (x ; xq∗)        (108)
where
A := [ Ap∗   −Bp∗ Kq∗ ;  Lq∗ Cp∗   Aq∗ − Lq∗ Cq∗ − Bq∗ Kq∗ ]
and
C := [ −Cp∗   Cq∗ ].
If we let
L := [ −Lp∗ ; −Lq∗ ]
then it is straightforward to check that
A − L C = [ Ap∗ − Lp∗ Cp∗   −Bp∗ Kq∗ + Lp∗ Cq∗ ;  0   Aq∗ − Bq∗ Kq∗ ].
The matrix on the right-hand side is Hurwitz, which shows that the system (108) is detectable with respect to eq∗.
It now remains to apply the standard output injection argument. Namely, write
(ẋ ; ẋq∗) = (A − L C)(x ; xq∗) + L eq∗
and observe that x and xq∗ converge to zero in view of stability of A − L C and the fact that eq∗ ∈ L2.
This reasoning is similar to what we had for slowly time-varying systems, but there's no time
variation here after the switching has stopped so this is easier.
The above problem is rather special, and the solution and the method of proof have several
drawbacks:
• The Luenberger-based multi-estimator that we gave only works when P is a finite set. It is
not suitable for state-shared implementation. However, there are standard results in linear
identification that can be used to design state-shared multi-estimators. (We will not discuss
this.)
• Our controller design and multi-estimator design were coupled because both relied on the
Luenberger observer. However, the particular choice of candidate control laws (107) is just
one example, and can be easily changed. Assume, for example, that every system in the
family to which the plant (106) belongs is stabilizable by a static linear output feedback. In
other words, assume that for each p ∈ P there exists a matrix Kp such that the eigenvalues
of Ap − Bp Kp Cp have negative real parts. A straightforward modification of the above
argument shows that if we keep the estimators as they are but replace the control laws (107)
by up = −Kp y, we still achieve state regulation.
• Detectability of the feedback connection of the plant with the q ∗ -th controller seems to come
out of the blue. Actually, it is a natural consequence of the fact that this controller stabilizes
the corresponding plant model (for p∗ = q ∗ ). Here is another proof of detectability which
makes this clearer. Consider first the dynamics of xq∗ after switching stops, i.e., under the
action of the q ∗ -th controller:
ẋq∗ = (Aq∗ − Lq∗ Cq∗) xq∗ − Bq∗ Kq∗ xq∗ + Lq∗ Cp∗ x = (Aq∗ − Bq∗ Kq∗) xq∗ − Lq∗ eq∗
where in the last step we used Cp∗ x = Cq∗ xq∗ − eq∗.
Suppose that eq∗ → 0 (this is just for showing detectability, it might not actually be true).
Then xq∗ → 0 because Aq∗ − Bq∗ Kq∗ is Hurwitz. This implies that y = Cq∗ xq∗ − eq∗ → 0,
and also u = −Kq∗ xq∗ → 0. Since the plant is assumed to be detectable and u, y → 0, we
have x → 0 (apply an output injection argument again, this time to the plant only). We have
just shown that the (x, xq∗ )-system, driven by the q ∗ -th controller, is detectable with respect
to eq∗ as desired. Thus we know that the output injection matrix L, which we so magically
found earlier, must exist.
In any case, the above example was a useful illustration of the main ideas, and we will now take
another pass through the approach to refine and generalize it.
If we write the switched closed-loop system compactly as
ẋ = Aσ x
eσ = Cσ x        (109)
then we want it to be detectable with respect to eσ . In other words, eσ being small (→ 0) should
imply that x is small (→ 0). Here x is the state of the plant, the multi-estimator, and the active
controller (at each time).
Does this switched detectability property follow from detectability for each frozen value of σ
(which we assumed earlier)?
No. This is similar to the stability issue, which for time-varying or switched systems doesn’t
follow from stability of individual fixed subsystems among which we are switching.
In fact, the connection with stability is quite direct. Recall that detectability for each fixed
index q is equivalent to the existence of a matrix Lq such that Aq − Lq Cq is Hurwitz. Use this to
rewrite the switched system as
ẋ = (Aσ − Lσ Cσ )x + Lσ eσ (110)
We know that the switched system
ẋ = (Aσ − Lσ Cσ )x
may not be stable even if each Aq − Lq Cq is Hurwitz. Some further properties of σ are required.
We saw this in stability of time-varying systems. One option is to require that σ be slow enough.
It no longer makes sense to bound its derivative (as in Chapter 8) because σ is not differentiable.
But there are several possible slow-switching conditions that work:
• Switching stops in finite time (as in the above example). Then stability is obvious. But this
condition is quite strong.
• There exists a sufficiently large dwell time τD , i.e., the time between any two consecutive
switches is lower-bounded by τD .
• There exists a sufficiently large average dwell time τAD. This means that the number of switches on any interval (t, T], which we denote by Nσ(T, t), satisfies
Nσ(T, t) ≤ N0 + (T − t)/τAD
Here N0 ≥ 1 is an arbitrary number (it cannot depend on the choice of the time interval).
For example, if N0 = 1, then σ cannot switch twice on any interval of length smaller than
τAD . This is exactly the dwell time property. Note also that N0 = 0 corresponds to the case
of no switching, since σ cannot switch at all on any interval of length smaller than τAD . In
general, if we discard the N0 “extra” switches, then the average time between consecutive
switches is at least τAD . Average dwell time is more general than dwell time, because it
allows us to switch fast when necessary and then compensate for it by switching sufficiently
slowly later.
There is a large literature on stability of switched systems. It is known that any of the above
conditions guarantees that stability is preserved under switching. The last result (on average dwell
time) is due to Hespanha.
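As a small illustration of the definition, the sketch below computes, for a made-up list of switching times and a given N0 ≥ 1, the largest τAD for which the bound Nσ(T, t) ≤ N0 + (T − t)/τAD holds:

    def largest_tau_ad(switch_times, N0):
        # Nsigma(T, t) counts switches in (t, T]; the binding intervals have t just before
        # one switching time and T equal to a later one (a sketch, assumes N0 >= 1).
        best = float("inf")
        for i in range(len(switch_times)):
            for j in range(i, len(switch_times)):
                count = j - i + 1                              # switches in the interval
                length = switch_times[j] - switch_times[i]
                if count > N0:
                    best = min(best, length / (count - N0))
        return best

    times = [0.1, 0.2, 0.3, 5.0, 10.0, 15.0]     # fast switching early, slow later
    print(largest_tau_ad(times, N0=3))            # about 4.9 for this example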
Now, suppose σ does satisfy one of the above conditions. Then we can conclude detectability
of (109), i.e., we know that x → 0 if eσ → 0, in view of (110).
But for this to be useful, we need to know that eσ is small. We hope that the switching logic
will somehow guarantee this, even if the switching doesn’t stop.
We are now ready to state, at least qualitatively, four main design objectives placed on the
individual components of the supervisory control system:
Matching  At least one of the estimation errors (in particular ep∗, the one corresponding to the true parameter value) is guaranteed to be small
Detectability  For each fixed controller, the closed-loop system is detectable through the corresponding estimation error
Bounded Error Gain  The signal eσ is bounded in terms of the smallest of the estimation errors
Non-Destabilization  The switching signal σ is slow enough (in one of the senses discussed above) that it preserves the detectability property along the switched trajectory
The Matching property is a requirement imposed on the multi-estimator design. The estimation
error that we can hope to be small is ep∗ , where p∗ is the true value of the unknown parameter.
The above somewhat more vague statement is a bit more general.
In Example 10, we could get ep∗ → 0 when there is no disturbance, and we could get ep∗
bounded when there is a bounded disturbance (regardless of the control law being applied).
For linear systems, there exists a theory for designing multi-estimators with such properties. For
some classes of nonlinear systems, this can be done along the lines of what we saw in Example 10
and earlier: put terms on the right-hand side of the multi-estimator that match the right-hand side
of the plant, and add linear damping.
We will not discuss the Matching property further.
The Detectability property is a requirement imposed on the candidate controllers. It is inter-
esting and we’ll discuss it further below.
The last two properties are requirements placed on the switching logic. We’ll also discuss them
a little bit below.
It is not difficult to see now, at least conceptually, how the above properties of the various
blocks of the supervisory control system can be put together to analyze its behavior.
Analysis:
Matching + Bounded Error Gain =⇒ eσ is small
Detectability + Non-Destabilization =⇒ detectability with respect to eσ
and together these two conclusions imply that x is small.
The first figure shows the closed-loop system which we want to be detectable, with eq viewed as
output.
(Figure: the plant, the q-th controller Cq, and the multi-estimator connected in feedback; the multi-estimator output yq is compared with the plant output y to form eq = yq − y, which is viewed as the output of the interconnection.)
The second figure shows an equivalent but more convenient representation of the same system.
(Figure: the same interconnection redrawn, with the loop formed by the controller Cq and the multi-estimator enclosed in a box and driven by eq; the plant output y is recovered as yq − eq.)
The system inside the box is called the injected system. It is the connection of the q-th controller
with the multi-estimator, and eq is an input injected into it.
−→ The reason this transformation is useful is because, as we already discussed earlier, the control
design should really be based on the design model (multi-estimator in the present setting) and not
on the unknown plant itself.
We now state, in very informal terms, a representative “theorem” which gives sufficient condi-
tions for detectability.
Theorem 12 Let xP, xC, xE be the states of the plant, the controller, and the multi-estimator, respectively. Assume that:
1) the injected system is such that if eq is small then xC and xE are small;
2) the plant is detectable: if its input u and output y are small then xP is small.
Then
eq small =⇒ xP , xC , xE small
“Proof”—almost immediate:
If eq is small, then 1) guarantees that xC and xE are small.
uq and yq are then small since they are functions of xC and xE .
Hence y = yq − eq is also small.
Finally, 2) guarantees that xP is small.
The above result can be made completely rigorous. Property 1) of the injected system can
be formalized using the ISS notion, i.e., by saying that the controller input-to-state stabilizes the
multi-estimator. Thus the material of Section 7.2 is directly relevant here.
−→ Actually, here things are even simpler because the estimation error eq , with respect to which
we want to achieve ISS, is known and can be used by the controller (while in Section 7.2 the
disturbance d was assumed to be unknown).
−→ If the dynamics are linear, then as we know this ISS property is automatic from the internal
asymptotic stabilization.
Property 2) is detectability of the plant, and can be formalized in a way similar to ISS (input-
output-to-state stability). A formal proof then requires some manipulations with class K∞ and KL
functions.
Actually, ISS is an overkill: one can use a weaker property, called integral-ISS, instead. This is suitable because estimation errors are typically small in an integral sense, since we integrate them to construct monitoring signals. The injected system we get for Example 10 with the obvious choice of the control laws
up = −(y² + y)/p ,   p ∈ P
is not ISS, but it can be shown to be integral ISS.
Here is another representative result:
y, uq , yq small =⇒ xE small
eq small =⇒ xP , xC , xE small
1) is input-to-output stability of the injected system. This is weaker than 1) in the previous
theorem.
2) is essentially¹³ the minimum-phase property of the plant. This is stronger than 2) in the
previous theorem. The requirement that the plant be minimum-phase is actually quite common in
adaptive control; see [Ioannou-Sun, pp. 25, 332, 412, and elsewhere].
3), 4) are detectability of the controller and multi-estimator. These are usually reasonable
assumptions.
Can “prove” this result by a simple signal-chasing argument along the lines of the previous one.
Try it!
We now turn to the Bounded Error Gain and Non-Destabilization properties, which need to be
enforced by the switching logic.
The Bounded Error Gain property is about making eσ small. Its counterpart in continuous
adaptive control was the smallness of the output estimation error e (in the sense of L2 or convergence
to 0). Here, the bound for eσ is usually stated in terms of suitable integral expressions for the
estimation errors, which are related to the monitoring signals.
The Non-Destabilization property, as we discussed, is about σ switching slowly enough (in the
sense of dwell time, average dwell time, or termination of switching if we’re lucky). Its counterpart
˙
in continuous adaptive control was slow adaptation speed (θ̂ being in L2 or converging to 0).
It’s not hard to see that these two properties are actually conflicting. To enforce bounded error
gain, we want to switch to arg minp µp (t) as quickly as possible. But for non-destabilization, we
want to switch slowly enough (or even stop). So, need to find a compromise.
One option is to use the dwell-time switching logic, which enforces dwell time by construction.
This switching logic ensures non-destabilization, and is easy to implement. Bounded error gain
is harder to get, because if the switching is only allowed at event times separated by τD , then µσ is
not always guaranteed to be small compared to the other monitoring signals. Although there are
¹³ The definition of minimum phase is stated in terms of y ≡ 0, while here we say “y is small.” For linear
plants this makes no difference. For nonlinear plants, we need the notion of a “strongly minimum phase”
system as defined on page 86.
(Figure: flowchart of the dwell-time switching logic. After initializing σ, reset a timer: τ = 0, τ̇ = 1; wait until τ ≥ τD; then, if there exists p with µp ≤ µσ, switch to that index (σ = p) and reset the timer, otherwise keep checking.)
ways to handle this problem, we see that dwell-time switching has significant shortcomings. With
a prespecified dwell time, the performance of the currently active controller might deteriorate to an
unacceptable level before the next switch is permitted. If the system is nonlinear, the trajectories
may even escape to infinity in finite time.
Another option (more suitable for nonlinear plants) is to use the hysteresis switching logic.
Hysteresis means that we do not switch every time minp∈P µp (t) becomes smaller than µσ (t), but
switch only when it becomes “significantly” smaller. The threshold of tolerance is determined by a
hysteresis constant h > 0.
We already used this idea in Section 9.2. The only difference is that here we are using multi-
plicative and not additive hysteresis. (This is better for technical reasons; in particular, σ doesn’t
change if we scale all monitoring signals by some, possibly time-varying, factor.)
We already saw that when ep∗ is exponentially converging to 0 (the case of no disturbance),
switching stops in finite time.
It turns out that when ep∗ is just bounded (the case of bounded disturbance), the switching
doesn’t stop but we do get average dwell time, which can be made large by increasing the hysteresis
constant h. This is quite a remarkable result14 due to Hespanha and Morse.
¹⁴ In fact, the concept of average dwell time originally arose out of the study of hysteresis-based switching
logics for adaptive control.
(Figure: flowchart of the scale-independent hysteresis switching logic. Keep σ fixed until some p satisfies (1 + h)µp ≤ µσ, then switch to σ = p.)
Summary of switching adaptive control Putting together the above design ingredients:
• Multi-estimator providing the Matching property (implemented via state sharing where possible)
• Candidate controllers providing the Detectability property (e.g., input-to-state stabilizing the multi-estimator)
• Hysteresis switching logic (giving average dwell time and bounded error gain)
one obtains quite general results for linear systems as well as some classes of nonlinear systems.
The analysis proceeds along the lines outlined on page 114. It is of course not quite as simple, but
arguably simpler than analysis of traditional, continuously-tuned adaptive control algorithms. The
observed performance is also typically quite good. See the references given at the beginning of this
chapter.
Note that the first two bullets above are reiterations of issues we’ve already seen in continuous
adaptive control. First, we need to be able to design an estimation scheme—this places constraints
on what types of systems we can deal with (and in particular, on the way in which uncertain
parameters can enter the dynamics). In continuous adaptive control we typically required the
parameters to enter linearly, but this is not necessary here as long as state-sharing is possible.
Second, the controllers need to provide robustness to parameter estimation errors, beyond the
usual certainty equivalence stabilization assumption. We saw this type of property earlier already.
The last bullet is specific to switching adaptive control. Switching in fact allows us to overcome
some difficulties associated with continuous tuning, as already discussed earlier.
Optional exercise: simulate Example 10. (The ISS condition is not easily enforceable, but the
controller works anyway.)
10 Singular perturbations
Reference: [Khalil, Chapter 11]
Consider, as an example, a plant with transfer function
g(s) = b / ((s − a)(εs + 1))
which we can realize as the cascade
y = (b/(s − a)) z ,    z = (1/(εs + 1)) u
or, in state-space form,
ẏ = ay + bz
εż = −z + u        (111)
Here we think of ε as a small “parasitic” time constant, corresponding to a very fast extra pole
at −1/ε. It gives fast unmodeled dynamics. In the limit as ε → 0 the z-dynamics disappear, and
for ε = 0 the system reduces to
y = (b/(s − a)) u
or
ẏ = ay + bu (112)
while the differential equation for z degenerates into the hard constraint
z=u
One says that the system (111) is a singular perturbation of (112). The perturbation is “singu-
lar” because for ε > 0 the dimension jumps from 1 to 2, which is very different from perturbations
which simply contribute extra terms on the right-hand side of the nominal system.
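A minimal simulation sketch comparing the singularly perturbed model (111) with the reduced model (112); the values of a, b, ε and the input are illustrative assumptions:

    import numpy as np

    a, b, eps = -1.0, 2.0, 0.05
    u = lambda t: np.sin(t)

    dt, T = 1e-4, 10.0
    y_full, z = 1.0, 0.0        # full model (111); z starts away from the constraint z = u
    y_red = 1.0                 # reduced model (112)
    err = 0.0
    for k in range(int(T / dt)):
        t = k * dt
        y_full += dt * (a * y_full + b * z)
        z      += dt * (-z + u(t)) / eps
        y_red  += dt * (a * y_red + b * u(t))   # z replaced by its quasi-steady-state value u(t)
        if t > 0.5:                             # ignore the initial fast transient
            err = max(err, abs(y_full - y_red))

    print("max |y_full - y_red| after the transient:", err)   # on the order of eps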
There are other tools for handling unmodeled dynamics, based on robust control theory (and
small-gain theorems). We will not discuss those in this course (see [Ioannou-Sun, Sect. 8.2]).
ẋ = f (t, x, z, ε)
εż = g(t, x, z, ε)
where x ∈ Rn and z ∈ Rm . We think of the x-system as the slow system and of the z-system as the
fast system, since for ε close to 0, ż becomes very large (imagine dividing the second ODE by ε).
For ε = 0, the second ODE becomes the algebraic constraint
g(t, x, z, 0) = 0
z = h(t, x)
Plugging this into the x-system, and keeping ε = 0, we obtain the reduced system
Denote the solution of this system (starting from a given initial condition) as x̄(t).
We would like to know to what extent x̄(t) serves as a good approximation to the actual solution
x(t), i.e., to the x-component of the solution of our overall singularly perturbed system, when ε is
very small.
Even if ε is small, initially z(t) may be quite far from h(t, x), and the reduced system will not
be a valid approximation. However, since the dynamics of z are fast, we can expect that z(t) will
converge to h(t, x) very fast, and after some (small) time t∗ , z will be close to h(t, x). Then, if we
have initialized x̄(t) correctly, this initial fast transient of z will not have significant effect and x̄(t)
will be close to x(t) for all time.
But, to make sure that z becomes close to its equilibrium value h(t, x), we need to have some
stability property for the z-system.
To analyze the z-system, it is convenient to make the following two transformations. First,
introduce
z̄ := z − h(t, x)
which has the effect of shifting the equilibrium to the origin. We have
εz̄̇ = εż − ε (d/dt) h(t, x) = g(t, x, z̄ + h(t, x), ε) − ε (d/dt) h(t, x)        (114)
Second, introduce the fast time scale τ, defined by t = ετ. In the τ time scale, and with ε set to 0, the equation (114) becomes
dz̄/dτ = g(t, x, z̄ + h(t, x), 0)        (115)
−→ Here t and τ give two completely different time scales.
Assumption 1: The system (115) is exponentially stable, uniformly in (t, x). This means that we
have an exponential estimate
|z̄(τ)| ≤ c e^{−λτ} |z̄(0)|
which holds for each fixed value of t and x in (115).
When ε is small but positive, t and x will vary—but slowly. In other words, in the τ time scale,
we’ll have a slowly time-varying system. And the ε-dependent terms will enter it. Using results on
stability of slowly-time varying systems (cf. Chapter 8) one can show that the actual z̄, described
by (114) for ε > 0, will eventually become of order ε, and x̄ will indeed be a good approximation
of x (again up to terms of order ε).
More precisely, the statement is as follows. Pick some times T > t∗ > 0. Let Assumption 1
hold, and let the system data satisfy appropriate technical conditions (smoothness, etc.). Then
there exists an ε∗ > 0 such that for all ε ∈ (0, ε∗) we have
x(t) − x̄(t) = O(ε)   ∀ t ∈ [0, T]
and
z(t) − h(t, x̄(t)) = O(ε)   ∀ t ∈ [t∗, T]
This is known as Tikhonov’s theorem. See [Khalil, Theorem 11.1] for a precise statement (and
a proof).
Geometrically: in the (x, z)-space, consider the surface z = h(x) (assume the system is au-
tonomous). On this surface, the dynamics are defined by ẋ = f (x, h(x)), z = h(x). Trajectories of
the full (x, z)-system approach this surface fast and then stay in an “ε-tube” around it.
The above estimates apply only on a finite interval [0, T ]. The O(ε) terms are valid on a
compact subset of the state space that contains the state up to time T , and are not in general valid
as t → ∞.
We can actually get an approximation on the infinite time interval, if we assume that the
reduced system is stable.
Assumption 2: x = 0 is an exponentially stable equilibrium of the reduced system (113).
With this additional assumption, the previous result is valid for T = ∞. See [Khalil, Theorem
11.2].
The above result does not guarantee exponential stability of the origin. In fact, the origin is in
general not an equilibrium point of the full system, for ε > 0; we only know that (x, z̄) = (0, 0) is an
equilibrium for ε = 0. Note that z = h(t, x) is not an equilibrium of the z-subsystem unless ε = 0.
If we make further assumptions to ensure that h(t, 0) = 0 and (x, z) = (0, 0) is an equilibrium of
the full system for ε > 0, then its exponential stability does follow from Assumptions 1 and 2 via
Lyapunov analysis. See [Khalil, Theorem 11.4].
r = sin ωt
Let us first ignore the z-dynamics and follow the direct control approach of Section 6.5.1 where
we designed the adaptive controller for the scalar plant (112).
The control law we derived for direct MRAC is
e := ym − y
satisfies
ė = −am e + bk̃y − bl̃r − b(z − u)
Next, use y = ym − e to rewrite the full closed-loop system using the (e, k̃, l̃) coordinates¹⁵:
(ė ; k̃̇ ; l̃̇) = F(t, e, k̃, l̃) − [ b ; 0 ; 0 ] (z − u)
where
F(t, e, k̃, l̃) := [ −am e + bk̃(A sin(ωt + α) − e) − bl̃ sin ωt ;  −γe(A sin(ωt + α) − e) ;  γe sin ωt ]
together with
εż = −z + u
Define
x := (e, k̃, l̃)ᵀ
We can write the control law (116) as
In particular, output tracking is achieved up to order ε. This characterizes robustness of our direct
MRAC design to fast unmodeled dynamics. (The general result also provides a similar estimate
for z, but it’s not important for us here.)
Can we try to show that the closed-loop system is exponentially stable for ε small enough?
No, because the origin is not even an equilibrium of the closed-loop system. This is because the control u needed for tracking is a nonzero time-varying signal, so the perturbation term z − u does not vanish at e = k̃ = l̃ = 0. (See also the last paragraph of the previous subsection.) The effect of unmodeled dynamics does not necessarily diminish with time; the displacement is small but persistent.
Optional exercise: simulate the system and investigate its behavior.
11 Conclusion
So, can we now define what “adaptive control” means?
Remaining lectures: final project presentations.