LQG Control With Missing Observations and Control Packets
∗
Elec. Eng. Comp. Scien. Dept., U.C. Berkeley, U.S.A.,
{sinopoli,poolla,sastry}@eecs.berkeley.edu
∗∗
Dept. of Inform. Eng., Univ. of Padova, Italy,
[email protected]
∗∗∗
Elec. and Comp. Eng. Dept., U.C. San Diego, U.S.A,
[email protected]
Abstract: The paper considers the Linear Quadratic Gaussian (LQG) optimal
control problem in the discrete time setting and when data loss may occur
between the sensors and the estimation-control unit and between the latter and
the actuation points. For protocols where packets are acknowledged at the receiver
(e.g. TCP type protocols), the separation principle holds. Moreover, the optimal
LQG control is a linear function of the state. Finally, building upon our previous
results on estimation with unreliable communication, the paper shows the existence
of critical arrival probabilities below which the optimal controller fails to stabilize
the system. This is done by providing analytic upper and lower bounds on the
cost functional, and stochastically characterizing their convergence properties in
the infinite horizon. More interestingly, it turns out that when there is no feedback
on whether a control packet has been delivered or not (e.g. UDP type protocols),
the LQG optimal controller is in general nonlinear. Copyright © 2005 IFAC.
Fig. 1. Overview of the system. We study the statistical convergence of the expected state covariance of the discrete-time LQG, where both the observation and the control signal, travelling over an unreliable communication channel, can be lost at each time step with probability 1 − γ̄ and 1 − ν̄ respectively.

implications of using unreliable networks for control. These require a generalization of classical control techniques that explicitly take into account the stochastic nature of the communication channel.

Communication channels typically use one of two kinds of protocols: Transmission Control (TCP) or User Datagram (UDP). In the first case received packets are acknowledged, while in the second case no feedback is provided on the communication link. This paper studies the effect of data losses due to the unreliability of the network links. It generalizes the Linear Quadratic Gaussian (LQG) optimal control problem by modeling the arrival of both observations and control packets as random processes whose parameters are related to the characteristics of the communication channel. Accordingly, two independent Bernoulli processes are considered, of parameters γ and ν, that govern packet loss between the sensors and the estimation-control unit, and between the latter and the actuation points, see Figure 1.

It turns out that in the TCP case the classic separation principle holds and the optimal controller is a linear function of the state. However, in the UDP case, a counter-example shows that the optimal controller is in general non-linear. A similar, but slightly less general special case was previously analyzed by (Imer et al., 2004), considering not only the observation noise but also the process noise to be zero and the input coefficient matrix to be invertible.

A final set of results that this paper provides concerns convergence in the infinite horizon. In this case, previous results on estimation with missing observation packets in (Sinopoli et al., 2004) are extended to the control case, showing the existence of critical values for the parameters of the Bernoulli arrival processes, below which a transition to instability occurs and the optimal controller fails to stabilize the system in both the TCP and the UDP settings. In other words, in order to have stability, the observation and control packet loss rates must be below a given threshold that depends on the dynamics of the system.

Finally, we want to mention some related work. The stability of dynamical systems whose components are connected asynchronously via communication channels has received considerable attention in the past few years, and our contribution can be put in the context of the previous literature. In (Gupta et al., 2004), the authors proposed to place an estimator, i.e. a Kalman filter, at the sensor side of the link without assuming any statistical model for the data loss process. Other work includes Nilsson (Nilsson, 1998), which presents the LQG optimal regulator with bounded delays between sensors and controller, and between the controller and the actuator. In this work bounds for the critical probability values are not provided. Additionally, there is no analytic solution for the optimal controller. The case where dropped measurements are replaced by zeros is considered by Hadjicostis and Touri (Hadjicostis and Touri, 2002), in the scalar case. Other approaches include using the last received sample for control, or designing a dropout compensator (Ling and Lemmon, 2003), which combines estimation and control in a single process.

This paper considers the alternative approach where the external compensator feeding the controller is the optimal time-varying Kalman gain. Moreover, the proposed solution is analyzed in the state space domain rather than in the frequency domain as it was presented in (Ling and Lemmon, 2003), and this paper considers the more general Multiple Input Multiple Output (MIMO) case. The work of (Imer et al., 2004) is the closest to the present paper. In addition we consider the more general case when the matrix C is not the identity and there is noise in the observation and in the process.

The paper is organized as follows. Section 2 provides a mathematical formulation for the problem. Section 3 provides some preliminary results. Section 4 illustrates the TCP case, while the UDP case is studied in Section 5. Finally, conclusions and directions for future work are presented in Section 6.

2. PROBLEM FORMULATION

Consider the following linear stochastic system with intermittent observations:
$$x_{k+1} = A x_k + \nu_k B u_k + w_k \tag{1}$$
$$y_k = C x_k + v_k, \tag{2}$$

where $(x_0, w_k, v_k)$ are Gaussian, uncorrelated, white, with mean $(\bar{x}_0, 0, 0)$ and covariance $(P_0, Q, R_k)$ respectively, $R_k = \gamma_k R + (1 - \gamma_k)\sigma^2 I$, and $(\gamma_k, \nu_k)$ are i.i.d. Bernoulli random variables with $P(\gamma_k = 1) = \bar{\gamma}$ and $P(\nu_k = 1) = \bar{\nu}$. Let us define the following information sets:

$$\mathcal{I}_k = \begin{cases} \mathcal{F}_k \triangleq \{y^k, \gamma^k, \nu^{k-1}\}, & \text{TCP comm. protocol} \\ \mathcal{G}_k \triangleq \{y^k, \gamma^k\}, & \text{UDP comm. protocol} \end{cases} \tag{3}$$

where $y^k = (y_k, y_{k-1}, \ldots, y_1)$, $\gamma^k = (\gamma_k, \gamma_{k-1}, \ldots, \gamma_1)$, and $\nu^k = (\nu_k, \nu_{k-1}, \ldots, \nu_1)$.

Consider also the following cost function:

$$J_N(u^{N-1}) = E\left[ x_N' W_N x_N + \sum_{k=0}^{N-1} \left( x_k' W_k x_k + \nu_k u_k' U_k u_k \right) \,\Big|\, \mathcal{I}_N \right]. \tag{4}$$

Note that we are weighting the input only if it is successfully received at the plant. In fact, if it is not received, the plant applies zero input and therefore there is no energy expenditure.

We now look for a control input sequence $u^{*N-1}$ as a function of the admissible information set $\mathcal{I}_k$, i.e. $u_k = g_k(\mathcal{I}_k)$, that minimizes the functional defined in Equation (4), i.e.

$$J_N^* \triangleq \min_{u^{N-1}} J_N(u^{N-1}) = J_N(u^{*N-1}). \tag{5}$$

Before proceeding, let us define the following variables:

$$\hat{x}_{k|k} \triangleq E[x_k \mid \mathcal{I}_k], \qquad e_{k|k} \triangleq x_k - \hat{x}_{k|k}, \qquad P_{k|k} \triangleq E[e_{k|k} e_{k|k}' \mid \mathcal{I}_k]. \tag{6}$$

Derivations below will make use of the following facts:

(b) $E[x_k' S x_k \mid \mathcal{I}_k] = \hat{x}_k' S \hat{x}_k + \operatorname{trace}(S P_{k|k}) = \hat{x}_k' S \hat{x}_k + E[e_k' S e_k \mid \mathcal{I}_k], \quad \forall S$
(c) $E[\, E[g(x_{k+1}) \mid \mathcal{I}_{k+1}] \mid \mathcal{I}_k ] = E[g(x_{k+1}) \mid \mathcal{I}_k], \quad \forall g(\cdot).$

The following properties will prove useful when deriving the equation for the optimal LQG controller. Let us compute the following expectation:

$$\begin{aligned} E[x_{k+1}' S x_{k+1} \mid \mathcal{I}_k] &= E[(A x_k + \nu_k B u_k + w_k)' S (A x_k + \nu_k B u_k + w_k) \mid \mathcal{I}_k] \\ &= E[x_k' A' S A x_k \mid \mathcal{I}_k] + \bar{\nu} u_k' B' S B u_k + 2 \bar{\nu} u_k' B' S A \hat{x}_{k|k} + \operatorname{trace}(SQ), \end{aligned} \tag{7}$$

where both the independence of $\nu_k$, $w_k$, $x_k$, and the zero-mean property of $w_k$ are exploited. The previous expectation holds true for both the information sets $\mathcal{I}_k = \{\mathcal{F}_k, \mathcal{G}_k\}$. Also

$$E[e_{k|k}' T e_{k|k} \mid \mathcal{I}_k] = \operatorname{trace}(T E[e_{k|k} e_{k|k}' \mid \mathcal{I}_k]) = \operatorname{trace}(T P_{k|k}). \tag{8}$$

4. TCP

First, equations for the optimal estimator are derived. They will be needed to solve the LQG controller design problem, as will be shown later.

$$e_{k+1|k+1} = (I - \gamma_{k+1} K_{k+1} C) e_{k+1|k} - \gamma_{k+1} K_{k+1} v_{k+1}$$
$$P_{k+1|k+1} = P_{k+1|k} - \gamma_{k+1} K_{k+1} C P_{k+1|k} \tag{14}$$
$$K_{k+1} \triangleq P_{k+1|k} C' (C P_{k+1|k} C' + R)^{-1}, \tag{15}$$

after taking the limit $\sigma \to +\infty$. The initial conditions for the estimator iterative equations are $\hat{x}_{0|-1} = 0$ and $P_{0|-1} = P_0$.
will follow the dynamic programming approach based on the cost-to-go iterative procedure.

Define the optimal value function $V_k(x_k)$ as follows:

$$V_N(x_N) \triangleq E[x_N' W_N x_N \mid \mathcal{F}_N]$$
$$V_k(x_k) \triangleq \min_{u_k} E[x_k' W_k x_k + \nu_k u_k' U_k u_k + V_{k+1}(x_{k+1}) \mid \mathcal{F}_k]$$

Using dynamic programming theory (Bertsekas and Tsitsiklis, 1996), one can show that $J_N^* = V_0(x_0)$. We claim that the value function $V_k(x_k)$ can be written as:

$$V_k(x_k) = E[x_k' S_k x_k \mid \mathcal{F}_k] + c_k, \qquad k = 0, \ldots, N \tag{16}$$

where the matrix $S_k$ and the scalar $c_k$ are to be determined and are independent of the information set $\mathcal{F}$. The proof follows an induction argument. The claim is certainly true for $k = N$ with the choice of parameters $S_N = W_N$ and $c_N = 0$. Suppose now that the claim is true for $k + 1$, i.e. $V_{k+1}(x_{k+1}) = E[x_{k+1}' S_{k+1} x_{k+1} \mid \mathcal{F}_{k+1}] + c_{k+1}$. The value function at time step $k$ is the following:

$$\begin{aligned} V_k(x_k) &= \min_{u_k} E[x_k' W_k x_k + \nu_k u_k' U_k u_k + V_{k+1}(x_{k+1}) \mid \mathcal{I}_k] \\ &= E[x_k' W_k x_k + x_k' A' S_{k+1} A x_k \mid \mathcal{I}_k] + \operatorname{trace}(S_{k+1} Q) + E[c_{k+1} \mid \mathcal{I}_k] \\ &\quad + \bar{\nu} \min_{u_k} \left( u_k' (U_k + B' S_{k+1} B) u_k + 2 u_k' B' S_{k+1} A \hat{x}_{k|k} \right) \end{aligned} \tag{17}$$

Therefore, the cost function for the optimal LQG using TCP is given by:

$$J_N^* = V_0(x_0) = \bar{x}_0' S_0 \bar{x}_0 + \operatorname{trace}(S_0 P_0) + \sum_{k=0}^{N-1} \left( \operatorname{trace}\left( (A' S_{k+1} A + W_k - S_k) E_\gamma[P_{k|k}] \right) + \operatorname{trace}(S_{k+1} Q) \right). \tag{23}$$

The matrices $\{P_{k|k}\}_{k=0}^{N}$ are stochastic since they are nonlinear functions of the sequence $\{\gamma_k\}$. The exact expected value of these matrices cannot be computed analytically, as shown in (Sinopoli et al., 2004). However, they can be bounded by computable deterministic quantities. In fact let us consider the following equations:

$$\hat{P}_{k+1|k} = A \hat{P}_{k|k-1} A' + Q - \bar{\gamma} A \hat{P}_{k|k-1} C' (C \hat{P}_{k|k-1} C' + R)^{-1} C \hat{P}_{k|k-1} A' \tag{24}$$
$$\hat{P}_{k|k} = \hat{P}_{k|k-1} - \bar{\gamma} \hat{P}_{k|k-1} C' (C \hat{P}_{k|k-1} C' + R)^{-1} C \hat{P}_{k|k-1}$$
$$\tilde{P}_{k+1|k} = (1 - \bar{\gamma}) A \tilde{P}_{k|k-1} A' + Q \tag{25}$$
$$\tilde{P}_{k|k} = (1 - \bar{\gamma}) \tilde{P}_{k|k-1} \tag{26}$$

initialized to $\hat{P}_{0|-1} = \tilde{P}_{0|-1} = P_0$. Using similar arguments as those in (Sinopoli et al., 2004), it is possible to show that the matrices $P_{k|k}$ are concave and monotonic functions of $P_{k|k-1}$. Therefore, the following bounds are true:

$$\tilde{P}_{k|k} \le E_\gamma[P_{k|k}] \le \hat{P}_{k|k}, \tag{27}$$
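The deterministic recursions (24)-(26) that bound the expected error covariance are straightforward to iterate numerically. The following is a minimal sketch under stated assumptions: the function name `covariance_bounds` and its argument names are hypothetical, matrices are passed as 2-D NumPy arrays, and the minus sign in front of the γ̄ term of Equation (24) is taken from the companion update for the upper-bound filtered covariance.

```python
import numpy as np


def covariance_bounds(A, C, Q, R, P0, gamma_bar, steps):
    """Iterate the lower/upper bound recursions of Equations (24)-(26).

    Returns two lists holding the lower bounds (P-tilde_{k|k}) and the
    upper bounds (P-hat_{k|k}) on E[P_{k|k}] for k = 0, ..., steps - 1.
    """
    P_hat = P0.copy()  # upper-bound prediction covariance, P-hat_{k|k-1}
    P_til = P0.copy()  # lower-bound prediction covariance, P-tilde_{k|k-1}
    lower, upper = [], []
    for _ in range(steps):
        # Term P C' (C P C' + R)^{-1} C P shared by both P-hat updates.
        G = P_hat @ C.T @ np.linalg.inv(C @ P_hat @ C.T + R) @ C @ P_hat
        upper.append(P_hat - gamma_bar * G)        # P-hat_{k|k}
        lower.append((1.0 - gamma_bar) * P_til)    # Eq. (26)
        P_hat = A @ P_hat @ A.T + Q - gamma_bar * (A @ G @ A.T)  # Eq. (24)
        P_til = (1.0 - gamma_bar) * (A @ P_til @ A.T) + Q        # Eq. (25)
    return lower, upper
```

Plotting the traces of the two returned sequences against `gamma_bar` for an unstable `A` is one way to see where the bounds stop converging, which is the regime the critical arrival probabilities describe.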
same conclusions can be drawn and the previous result can be summarized in the following theorem:

Theorem 2. (Finite Horizon LQG under TCP). Consider the system (1)-(2) and consider the problem of minimizing the cost function (4) with policy $u_k = f(\mathcal{F}_k)$, where $\mathcal{F}_k$ is the information available under TCP communication, given in Equation (3). Then, the optimal control is a linear function of the estimated system state given by Equation (18), where the matrix $S_k$ can be computed iteratively using Equation (21). The separation principle still holds under TCP communication, since the optimal estimator is independent of the control input $u_k$. The optimal state estimator is given by Equations (9)-(12) and (11)-(15), and the minimal achievable cost is given by Equation (23).

Theorem 3. (Infinite Horizon LQG under TCP). Consider the same system as defined in the previous theorem with the following additional hypothesis: $W_N = W_k = W$ and $U_k = U$. Moreover, let $(A, B)$ and $(A, Q^{1/2})$ be stabilizable, and let $(A, C)$ and $(A, W^{1/2})$ be detectable. Let us consider the limiting case $N \to +\infty$. There exist critical arrival probabilities $\nu_{\min}$ and $\gamma_{\min}$ which satisfy the following property:

$$\min\left(1,\; 1 - \frac{1}{|\lambda_{\max}(A)|^2}\right) \le \nu_{\min} \le 1, \tag{31}$$
$$\min\left(1,\; 1 - \frac{1}{|\lambda_{\max}(A)|^2}\right) \le \gamma_{\min} \le 1, \tag{32}$$

where $|\lambda_{\max}(A)|$ is the largest absolute value among the eigenvalues of the matrix $A$, such that for all $\bar{\nu} > \nu_{\min}$ and $\bar{\gamma} > \gamma_{\min}$ we have:

$$L_k = L_\infty = -(B' S_\infty B + U)^{-1} B' S_\infty A \tag{33}$$
$$\frac{1}{N} J_N^{\min} \le \frac{1}{N} J_N^* \le \frac{1}{N} J_N^{\max} \tag{34}$$

where the mean cost bounds $J_\infty^{\min}$, $J_\infty^{\max}$ are given by:

$$J_\infty^{\max} = \lim_{N \to +\infty} \frac{1}{N} J_N^{\max} = \operatorname{trace}\left( (A' S_\infty A + W_k - S_\infty)\left( \hat{P}_\infty - \bar{\gamma} \hat{P}_\infty C' (C \hat{P}_\infty C' + R)^{-1} C \hat{P}_\infty \right) \right) + \operatorname{trace}(S_\infty Q)$$
$$J_\infty^{\min} = \lim_{N \to +\infty} \frac{1}{N} J_N^{\min} = (1 - \bar{\gamma}) \operatorname{trace}\left( (A' S_\infty A + W_k - S_\infty) \tilde{P}_\infty \right) + \operatorname{trace}(S_\infty Q),$$

and the matrices $S_\infty$, $\hat{P}_\infty$, $\tilde{P}_\infty$ are:

probabilities $\nu_{\min}$ and $\gamma_{\min}$ can be computed via the solution of the following LMI optimization problems:

$$\gamma_{\min} = \arg\min_{\bar{\gamma}} \Psi_\gamma(Y, Z) > 0, \qquad 0 \le Y \le I.$$
$$\Psi_\gamma(Y, Z) = \begin{bmatrix} Y & \sqrt{\gamma}\,(Y A + Z C) & \sqrt{1 - \gamma}\, Y A \\ \sqrt{\gamma}\,(A' Y + C' Z') & Y & 0 \\ \sqrt{1 - \gamma}\, A' Y & 0 & Y \end{bmatrix}$$
$$\nu_{\min} = \arg\min_{\bar{\nu}} \Psi_\nu(Y, Z) > 0, \qquad 0 \le Y \le I.$$
$$\Psi_\nu(Y, Z) = \begin{bmatrix} Y & \sqrt{\nu}\,(Y A' + Z B') & \sqrt{1 - \nu}\, Y A' \\ \sqrt{\nu}\,(A Y + B Z') & Y & 0 \\ \sqrt{1 - \nu}\, A Y & 0 & Y \end{bmatrix}$$

5. UDP

In this section equations for the optimal estimator and controller design for the case of UDP communication protocol are derived. The UDP case corresponds to the information set $\mathcal{G}_k$, as defined in Equation (3). Some of the derivations are analogous to the previous section and are therefore skipped.

5.1 Estimator Design, $\sigma \to +\infty$

We derive the equations for the optimal estimator using similar arguments to the standard Kalman filtering equations. The innovation step is given by:

$$\begin{aligned} \hat{x}_{k+1|k} &\triangleq E[x_{k+1} \mid \mathcal{G}_k] = E[A x_k + \nu_k B u_k + w_k \mid \mathcal{G}_k] \\ &= A E[x_k \mid \mathcal{G}_k] + E[\nu_k] B u_k \\ &= A \hat{x}_{k|k} + \bar{\nu} B u_k \end{aligned} \tag{35}$$
$$e_{k+1|k} \triangleq x_{k+1} - \hat{x}_{k+1|k} \tag{36}$$
$$P_{k+1|k} \triangleq E[e_{k+1|k} e_{k+1|k}' \mid \mathcal{G}_k] = A P_{k|k} A' + Q + \bar{\nu}(1 - \bar{\nu}) B u_k u_k' B', \tag{37}$$

where we used the independence and zero-mean of $w_k$, $(\nu_k - \bar{\nu})$, and $\mathcal{G}_k$. Note that under the UDP communication protocol, unlike under TCP, the error covariance $P_{k+1|k}$ depends explicitly on the control input $u_k$. This is the main difference with TCP.

The correction step is the same as for the TCP case, given by:

$$\hat{x}_{k+1|k+1} = \hat{x}_{k+1|k} + \gamma_{k+1} K_{k+1} (y_{k+1} - C \hat{x}_{k+1|k})$$
$$P_{k+1|k+1} = P_{k+1|k} - \gamma_{k+1} K_{k+1} C P_{k+1|k}, \tag{38}$$
$$K_{k+1} \triangleq P_{k+1|k} C' (C P_{k+1|k} C' + R)^{-1}, \tag{39}$$
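The dependence of the UDP prediction covariance on the control input, Equations (35) and (37), can be made concrete with a short numerical sketch. The function name `udp_predict` and its argument names are hypothetical and chosen only for illustration; the numbers in the usage lines are arbitrary scalar values, not results from the paper.

```python
import numpy as np


def udp_predict(x_hat, P, A, B, Q, u, nu_bar):
    """UDP innovation (prediction) step following Equations (35) and (37).

    Without acknowledgements the estimator only knows the delivery
    probability nu_bar, so the input enters the mean scaled by nu_bar and
    adds the term nu_bar * (1 - nu_bar) * B u u' B' to the covariance.
    """
    Bu = B @ u
    x_pred = A @ x_hat + nu_bar * Bu                                   # Eq. (35)
    P_pred = A @ P @ A.T + Q + nu_bar * (1.0 - nu_bar) * np.outer(Bu, Bu)  # Eq. (37)
    return x_pred, P_pred


# Scalar illustration: a larger input inflates the predicted covariance.
A = np.array([[1.0]])
B = np.array([[1.0]])
Q = np.array([[0.0]])
_, P_small = udp_predict(np.array([1.0]), np.array([[1.0]]), A, B, Q, np.array([0.0]), 0.5)
_, P_large = udp_predict(np.array([1.0]), np.array([[1.0]]), A, B, Q, np.array([2.0]), 0.5)
```

Because `P_pred` grows with the squared input, a controller that minimizes a cost involving the error covariance can no longer be designed independently of the estimator.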
of the state estimate, since estimator and controller design cannot be separated anymore. To show this, we consider a simple scalar system and we proceed using the dynamic programming approach. Let us consider the scalar system where $A = 1$, $B = 1$, $C = 1$, $W_N = W_k = 1$, $U_k = 0$, $R = 1$, $Q = 0$. Similarly to the TCP case, the value function $V_k(x_k)$ for $k = N$ is given by $V_N(x_N) = E[x_N' W_N x_N \mid \mathcal{G}_N] = E[x_N^2 \mid \mathcal{G}_N]$. Also, it is easy to show that $V_{N-1} = (2 - \bar{\nu}) E[x_{N-1}^2 \mid \mathcal{G}_{N-1}] + \bar{\nu} P_{N-1|N-1}$ and $u_{N-1}^* = -\hat{x}_{N-1|N-1}$. Let us consider the value function for $k = N - 2$:

$$\begin{aligned} V_{N-2}(x_{N-2}) &= \min_{u_{N-2}} E[x_{N-2}^2 + V_{N-1}(x_{N-1}) \mid \mathcal{G}_{N-2}] \\ &= E[(3 - \bar{\nu}) x_{N-2}^2 \mid \mathcal{G}_{N-2}] + \bar{\gamma} + \bar{\nu} P_{N-2|N-2} + \bar{\nu}(1 - \bar{\gamma}) P_{N-2|N-2} \\ &\quad + \min_{u_{N-2}} \Big( \bar{\nu}(2 - \bar{\nu}) u_{N-2}^2 + 2 \bar{\nu}(2 - \bar{\nu}) u_{N-2} \hat{x}_{N-2|N-2} + \bar{\nu}^2 (1 - \bar{\nu})(1 - \bar{\gamma}) u_{N-2}^2 \\ &\qquad\qquad + \bar{\nu} \bar{\gamma} \frac{1}{P_{N-2|N-2} + \bar{\nu}(1 - \bar{\nu}) u_{N-2}^2 + 1} \Big) \end{aligned} \tag{40}$$

The first three terms inside the round parentheses are convex quadratic functions of the control input $u_{N-2}$; however, the last term is not. Therefore, the minimizer $u_{N-2}^*$ is, in general, a non-linear function of the information set $\mathcal{G}_k$. We can summarize this result in the following theorem:

Theorem 4. Let us consider the stochastic system defined in Equations (1)-(2) with horizon $N \ge 2$. Then, the optimal control feedback $u_k = g_k^*(\mathcal{G}_k)$ that minimizes the functional (4) under UDP is, in general, a nonlinear function of the information set $\mathcal{G}_k$.

The nonlinearity of the input feedback arises from the fact that the correction error covariance matrix $P_{k+1|k+1}$ is a non-linear function of the innovation error covariance $P_{k+1|k}$.

6. CONCLUSION AND FUTURE WORK

This paper analyzes the LQG control problem in the case where both observation and control packets may be lost when travelling through a communication channel. This is the case in many distributed systems, where sensors, controllers and actuators physically reside in different locations and have to rely on network communication to exchange information. In this context the paper presents an analysis of the LQG control problem for two types of protocols, i.e. TCP and UDP. In the first case acknowledgement of the arrival of control packets is available to the controller, while it is not available in general in the second case. For TCP-like protocols a solution for a general LTI stochastic system is provided for both the finite and infinite horizon case, showing that the optimal control is still a linear function of the state. Moreover, the infinite horizon cost function $J_\infty$ is bounded if the arrival probabilities $\bar{\gamma}$, $\bar{\nu}$ are higher than a specified threshold. UDP-like protocols present a much more complex scenario, as the lack of acknowledgement of the control packet at the controller means the separation principle no longer holds. Estimation and control are now coupled. The paper shows that in general the optimal control is non-linear. The control law cannot be determined in closed form, making this solution impractical.

Future work will involve the study of special cases where, under UDP, the optimal controller is still linear. From a practical standpoint, it is also useful to compute the optimal static linear control for the UDP case. Even though this constitutes a suboptimal solution for the original problem, ease of computation and implementation will make it a valuable resource for the designer.

REFERENCES

Bertsekas, D. and J. Tsitsiklis (1996). Neuro-Dynamic Programming. Athena Scientific.
Gupta, V., D. Spanos, B. Hassibi and R. M. Murray (2004). Optimal LQG control across a packet-dropping link. Technical report. California Institute of Technology. Preprint, submitted for publication.
Hadjicostis, C. N. and R. Touri (2002). Feedback control utilizing packet dropping network links. In: Proceedings of the 41st IEEE Conference on Decision and Control. Las Vegas, NV. Invited.
Imer, O. C., S. Yuksel and T. Basar (2004). Optimal control of dynamical systems over unreliable communication links. In: NOLCOS. Stuttgart, Germany.
Ling, Q. and M.D. Lemmon (2003). Optimal dropout compensation in networked control systems. In: IEEE Conference on Decision and Control. Maui, HI.
Nilsson, Johan (1998). Real-Time Control Systems with Delays. PhD thesis. Department of Automatic Control, Lund Institute of Technology.
Sinopoli, B., C. Sharp, S. Schaffert, L. Schenato and S. Sastry (2003). Distributed control applications within sensor networks. Proceedings of the IEEE, Special Issue on Distributed Sensor Networks 91(8), 1235–1246.
Sinopoli, B., L. Schenato, M. Franceschetti, K. Poolla, M.I. Jordan and S. Sastry (2004). Kalman filtering with intermittent observations. IEEE Transactions on Automatic Control 49(9), 1453–1464.
Sinopoli, B., L. Schenato, M. Franceschetti, K. Poolla, M.I. Jordan and S. Sastry (2005). Optimal control with unreliable communication: the TCP case. In: American Control Conference. Portland, OR.