
Gaussian Process Latent Force Models For Learning and Stochastic Control of Physical Systems

This document summarizes Gaussian process latent force models (LFMs) for learning and controlling physical systems with unknown input signals. LFMs combine a physical model with a non-parametric Gaussian process model for the unknown inputs. The document reviews statistical inference methods for LFMs, introduces stochastic control methods, and provides new theoretical results on observability and controllability. While LFMs are generally observable, they are never fully controllable, though output-controllability is still possible. Learning in LFMs involves estimating unknown functions from noisy measurements using Gaussian processes.


TO APPEAR IN IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. XX, NO. XX, XXXX 201X

Gaussian Process Latent Force Models for Learning and Stochastic Control of Physical Systems

Simo Särkkä, Senior Member, IEEE, Mauricio A. Álvarez, and Neil D. Lawrence

arXiv:1709.05409v2 [cs.SY] 13 Aug 2018

Manuscript received August 15, 2018; revised XXX.
Simo Särkkä is with the Department of Electrical Engineering and Automation (EEA), Aalto University, Rakentajanaukio 2c, 02150 Espoo, Finland ([email protected]). Tel. +358 50 512 4393.
Mauricio A. Álvarez is with the Department of Computer Science, University of Sheffield, Sheffield, UK S1 4DP.
Neil D. Lawrence is with the Department of Computer Science, University of Sheffield, Sheffield, UK S1 4DP and with Amazon, Cambridge, UK.

Abstract— This article is concerned with learning and stochastic control in physical systems which contain unknown input signals. These unknown signals are modeled as Gaussian processes (GPs) with certain parametrized covariance structures. The resulting latent force models (LFMs) can be seen as hybrid models that contain a first-principles physical model part and a non-parametric GP model part. We briefly review the statistical inference and learning methods for this kind of model, introduce stochastic control methodology for the models, and provide new theoretical observability and controllability results for them.

Index Terms— Machine learning, Stochastic optimal control, Stochastic systems, System identification, Kalman filtering

I. INTRODUCTION

This article is concerned with the methodology and theory for learning and stochastic control in Gaussian process latent force models (LFMs) [1]–[5]. An example of such an LFM is a second-order differential equation model of a physical system

  d²f(t)/dt² + λ df(t)/dt + γ f(t) = u(t) + c(t),   (1)

where λ, γ > 0 are parameters of the physical system, the input signal u(t) and the solution f(t) are unknown, and c(t) is a control function to be optimized. Further assume that we measure the function f(t) via noisy measurements at discrete instants of time t1, t2, ..., tn via the model yk = f(tk) + εk, where εk is a Gaussian measurement noise and k = 1, ..., n.

Another example of a problem of interest is the controlled heat equation, which we again measure via noisy measurements:

  ∂f(x, t)/∂t = D ∇²f(x, t) − λ f(x, t) + u(x, t) + c(x, t),   (2)

where D, λ > 0 are given constants. The aim is to learn both the input signal u(x, t) and the function f(x, t) from noisy observations yk = f(xk, tk) + εk, and to design a control c(x, t) for regulating the heat.

The model (1) is a special case of state-space models of the form

  df(t)/dt = Af f(t) + Bf u(t) + Mf c(t),
  yk = Cf f(tk) + εk,   (3)

where f(t), u(t), and c(t) are vector-valued functions, and Af, Bf, Cf, and Mf are given matrices with appropriate dimensions. The second model (2) is a special case of spatio-temporal state-space models

  ∂f(x, t)/∂t = Af f(x, t) + Bf u(x, t) + Mf c(x, t),
  yk = Cf f(xk, tk) + εk,   (4)

where now Af is a matrix of spatial operators and Bf, Cf, and Mf are given matrices.

In this article, we specifically concentrate on the above two general classes of models. The aim is to consider the problems of learning (estimating) the functions f(·) and u(·) from a set of noisy measurements {yk} as well as jointly designing the optimal control function c(·). When the input function u(·) is modeled as a Gaussian process [6] with a covariance structure allowing for a state-space representation [7]–[9], the models have a tight connection to classical stochastic control theory. In that case it turns out that we can readily apply some of the theory and methodology of Kalman filters and linear quadratic controllers to them, provided that we recast the model as an augmented white-noise-driven state-space system.

The main contributions of the article are the stochastic optimal control methods for LFMs as well as the theoretical results on observability and controllability of the models. In particular, we show that although LFMs are observable under quite general conditions, they are never controllable. However, as we discuss in the article, the non-controllability is not a problem in applications, because the models are still output-controllable with respect to the physical system part, and hence the only uncontrollable part is the unknown input.

The learning methods for LFMs have previously been presented in conference and journal articles [1]–[5], and they are also closely related to the regularization network methodology considered already earlier in [10], [11]. The learning problem is also related to the so-called input estimation problem that has previously been addressed in the target tracking literature (e.g. [12]) by replacing the input with a white or colored noise. Another approach to this problem is to use disturbance observers [13]. However, here we specifically concentrate on the Gaussian process based machine learning point of view, which allows for encoding prior information into the driving input as well as the use of modern machine learning methods for coping with the related hyperparameter estimation problems and model extensions.

A. Learning in Gaussian process latent force models

In machine learning, Gaussian processes (GPs) [6] are commonly used as prior distributions over functions f(ξ). When used for regression, the GP encodes the uncertainty we have over a function before seeing the data. Given a set of noisy measurement pairs D = {(ξk, yk)}, k = 1, ..., n, with, for example, yk = f(ξk) + εk, where εk is a vector of Gaussian noises, we can then compute the posterior Gaussian process using the Gaussian process regression equations [6] and use it to make predictions at test points. In the current article we consider cases where ξ = t is the time and where ξ = (x, t), that is, the input consists of both spatial and time components.

In Gaussian process regression notation [6] we write

  f(ξ) ∼ GP(0, K(ξ, ξ′)),
  yk = f(ξk) + εk,   (5)

where K(ξ, ξ′) is a given covariance function.
The computational aim is to do inference on the posterior distribution of f(·) conditioned on the measurements D (obtained by Bayes' rule) as well as on the parameters of the covariance function. Above, we have, without loss of generality, assumed that the a priori Gaussian process has zero mean.

As shown in [1]–[5], given a model of the form (3) with u(t) ∼ GP(0, K(t, t′)) or a model of the form (4) with u(x, t) ∼ GP(0, K(x, t; x′, t′)), the functions f(t) and f(x, t) are Gaussian processes as well, and their covariance functions can be expressed in terms of the impulse response or Green's function of the (partial) differential equation together with the covariance function of u. This allows us to reduce inference on LFMs to ordinary GP regression.

Another point of view is discussed in [4], [5] (see also [10], [11]). In that approach the input GP u(t) ∼ GP(0, K(t, t′)) is converted into an equivalent state-space representation by using a spectral factorization:

  dz(t)/dt = Au z(t) + Bu w(t),
  u(t) = Cu z(t).   (6)

Here the state vector typically consists of a set of derivatives of the process, z = (u, du/dt, ..., d^{s−1}u/dt^{s−1}), and w(t) is a vector-valued white-noise process with a given spectral density matrix. The advantage of this kind of model formulation is that it allows for solving the GP regression problem using Kalman filters and smoothers [14] in O(n) time, whereas traditional GP regression takes O(n³) time (here n denotes the number of measurements).

The same idea can be extended to spatio-temporal Gaussian processes [8], [15]. The conversion of a spatio-temporal covariance function into state-space form leads to a system of the form

  ∂z(x, t)/∂t = Au z(x, t) + Bu w(x, t),
  u(x, t) = Cu z(x, t),   (7)

where Au is a matrix of linear operators (typically pseudo-differential operators) acting on the x-variable and w(x, t) is a vector-valued time-white spatio-temporal Gaussian process with a given spectral density kernel. In this case the inference can be done using infinite-dimensional Kalman filters and smoothers, which typically are approximated with their finite-dimensional counterparts. More details can be found in [8], [15].

We can now combine the state-space ODE (3) with the state-space representation (6) of the latent force to obtain an augmented state-space representation of the LFM [4], [5]:

  df(t)/dt = Af f(t) + Bf Cu u(t) + Mf c(t),
  du(t)/dt = Au u(t) + Bu w(t),   (8)
  yk = Cf f(tk) + εk.

If we now define (with semicolons separating block rows)

  g = [f; u],  A = [Af, Bf Cu; 0, Au],  M = [Mf; 0],
  B = [0; Bu],  C = [Cf, 0],   (9)

then the model can be written as a white-noise-driven model

  dg(t)/dt = A g(t) + B w(t) + M c(t),
  yk = C g(tk) + εk.   (10)

Spatio-temporal models (4) driven by Gaussian processes can also often be represented in a similar state-space form, which now becomes

  ∂f(x, t)/∂t = Af f(x, t) + Bf Cu z(x, t) + Mf c(x, t),
  ∂z(x, t)/∂t = Au z(x, t) + Bu w(x, t),   (11)
  yk = Cf f(xk, tk) + εk.

In order to obtain a single augmented model, we can define

  g = [f; z],  A = [Af, Bf Cu; 0, Au],  M = [Mf; 0],
  B = [0; Bu],  C = [Cf, 0],   (12)

which leads to a model of the form

  ∂g(x, t)/∂t = A g(x, t) + B w(x, t) + M c(x, t),
  yk = C g(xk, tk) + εk.   (13)

The joint state-space representations (10) and (13) of the LFMs now allow full Bayesian inference in the models to be performed with Kalman filtering and smoothing methods [4], [5], [8]. Furthermore, these representations also allow us to study control problems on LFMs which aim at designing controller functions c. This problem is addressed in the next section.

II. STOCHASTIC CONTROL OF GAUSSIAN PROCESS LATENT FORCE MODELS

In this section, we discuss the stochastic control problems related to latent force models. In particular, we provide and analyze the solutions for the linear quadratic regulation (LQR) problem for them.

A. Controlled temporal LFMs

Let us consider the state-space model with a Gaussian process input (3):

  df(t)/dt = Af f(t) + Bf u(t) + Mf c(t).   (14)

We will specifically aim to consider optimal control problems which minimize the quadratic cost functional

  J[c] = (1/2) E[ fᵀ(T) Φ f(T) + ∫_0^T (fᵀ(t) X(t) f(t) + cᵀ(t) U(t) c(t)) dt ],   (15)

where E[·] denotes the expected value, Φ, X(t), and U(t) are positive semidefinite matrices for all t ≥ 0, and T is the target time. We use quadratic costs because they lead to computationally tractable control laws. However, the principle outlined here can also be extended to more general cost functionals, although the numerical methods become an order of magnitude more complicated.

A straightforward approach to optimal control with the quadratic cost (15) is to use the separation principle of linear estimation and control, which amounts to designing the optimal controller for the case u(t) = 0 and using it in cascade with a Kalman filter. This indeed is the optimal solution in the case of white u(t), but not in our case. The correct approach in this case, which also utilizes the learning outcome of the Gaussian process regression, is to use the augmented state-space model with the control signal. In this case it is given as (see (10))

  dg(t)/dt = A g(t) + B w(t) + M c(t),   (16)

with the measurement model given in (10) and the matrices A and B as defined in (9).
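The block construction in (9) is mechanical and can be sketched numerically as follows. This snippet is our illustration rather than the authors' code; the function name and example dimensions are assumptions.

```python
import numpy as np

def augment_lfm(Af, Bf, Mf, Cf, Au, Bu, Cu):
    # Assemble the augmented matrices of (9)/(10) from the physical-model
    # matrices (Af, Bf, Mf, Cf) and the latent-force state-space model
    # (Au, Bu, Cu) obtained by spectral factorization as in (6).
    nf, nu = Af.shape[0], Au.shape[0]
    A = np.block([[Af, Bf @ Cu],
                  [np.zeros((nu, nf)), Au]])
    B = np.vstack([np.zeros((nf, Bu.shape[1])), Bu])
    M = np.vstack([Mf, np.zeros((nu, Mf.shape[1]))])
    C = np.hstack([Cf, np.zeros((Cf.shape[0], nu))])
    return A, B, M, C
```

For instance, the spring model (1) with state (f, df/dt) and a one-dimensional exponentially stable latent-force model yields a three-dimensional augmented state whose lower-left block is zero, reflecting that the input drives the physical system but not vice versa.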
We now aim to design a controller for the above model by assuming a perfectly observed state and run it in cascade with a Kalman filter processing the measurements in the model. This yields a controller which jointly learns the functions f and u and jointly optimizes the control with respect to the cost criterion [16]. In this case the control cost function can be rewritten in the form

  J[c] = (1/2) E[ gᵀ(T) Φg g(T) + ∫_0^T (gᵀ(t) Xg(t) g(t) + cᵀ(t) U(t) c(t)) dt ],   (17)

where

  Φg = [Φ, 0; 0, 0],  Xg(t) = [X(t), 0; 0, 0].   (18)

The design of the optimal linear quadratic controller for the resulting model can be done by using the classical Riccati-equation-based approaches [17], [18]. Namely, the optimal control takes the form

  c(t) = −U⁻¹(t) Mᵀ P(t) ĝ(t),   (19)

where ĝ(t) is the Kalman filter estimate of g(t) and the matrix P(t) solves the backward Riccati differential equation

  dP(t)/dt = −Aᵀ P(t) − P(t) A + P(t) M U⁻¹(t) Mᵀ P(t) − Xg(t)   (20)

with the boundary condition P(T) = Φg. However, we can write this solution for the LFM model in a more explicit form which reveals its structure better. That is summarized in the following theorem.

Theorem 2.1: The control law in (19) can be written as

  c(t) = −[U⁻¹ Mfᵀ Pf(t), U⁻¹ Mfᵀ P12(t)] ĝ(t),   (21)

where Pf(t) ≜ P11(t) is the Riccati equation solution for the non-forced physical model. The full set of equations is

  dP11(t)/dt = −Afᵀ P11 − P11 Af + P11 Mf U⁻¹ Mfᵀ P11 − X(t),
  dP12(t)/dt = −Afᵀ P12 − P11 Bf Cu − P12 Au + P11 Mf U⁻¹ Mfᵀ P12,   (22)
  dP22(t)/dt = −Cuᵀ Bfᵀ P12 − Auᵀ P22 − P12ᵀ Bf Cu − P22 Au + P12ᵀ Mf U⁻¹ Mfᵀ P12.

Proof: The result can be obtained by inserting the partitioned P = [P11, P12; P12ᵀ, P22] into (20).

In the above theorem the gain for the physical system (i.e. f) portion of the state is exactly the same as in the optimal controller without an input. However, the second part of the gain is non-zero and uses the input states for control feedback as well.

In the next section we will simplify the control problem even more and consider the limit T → ∞, because it leads to a particularly convenient class of linear controllers which are computationally tractable while still being able to use the learning outcome of the Gaussian process inference.

B. Linear quadratic regulation of temporal LFMs

In the LFM case, namely because we have restricted our consideration to time-invariant models, a very convenient type of control problem is the infinite-time linear regulation problem, which corresponds to the cost function

  J[c] = ∫_0^∞ (fᵀ(t) X f(t) + cᵀ(t) U c(t)) dt,   (23)

where X and U are constant semidefinite matrices. By rewriting the model as an augmented state-space model as we did in the previous section and by following the classical results, the controller becomes

  c(t) = −U⁻¹ Mᵀ P ĝ(t),   (24)

where the matrix P is the solution to the algebraic Riccati equation (ARE)

  0 = −Aᵀ P − P A + P M U⁻¹ Mᵀ P − Xg,   (25)

where Xg = [X, 0; 0, 0].

By solving the control law from these equations, we get a controller which is a function of both the estimate of the function f and the estimate of the input u. Thus this control law is able to utilize the estimate of the function as well as the learned input function.

It is also possible to express the solution to the LFM control problem above in terms of the corresponding control solution to the non-forced problem, similarly to the time-varying case considered in the previous section. This result is summarized in the following theorem.

Theorem 2.2: The control law in (24) can be written as

  c(t) = −[U⁻¹ Mfᵀ Pf, U⁻¹ Mfᵀ P12] ĝ(t),   (26)

where U⁻¹ Mfᵀ Pf is just the non-forced-case gain and P12 can be solved from the Sylvester equation

  (Pf Mf U⁻¹ Mfᵀ − Afᵀ) P12 − P12 Au = Pf Bf Cu.   (27)

Proof: The result can be obtained by setting the time derivatives in Theorem 2.1 to zero.

Note that although the system is also stabilizable by setting the second term to zero, that is, using the non-forced gain (cf. Theorem 3.3), a better solution is obtained by using the control in Theorem 2.2, which depends on the input as well.

C. Controlled spatio-temporal LFMs

In the case of PDE LFMs we get models of the form

  ∂g(x, t)/∂t = A g(x, t) + B w(x, t) + M c(x, t),
  yk = C g(xk, tk) + εk,   (28)

where the control problem corresponds to designing the control function c(x, t) minimizing, for example, a linear quadratic cost functional. In principle, it is possible to directly analyze such infinite-dimensional control problems, which leads to, for example, generalizations of the controllability concepts [19]. However, in practice, after setting up the model, we replace the infinite-dimensional model with its finite-dimensional approximation. Therefore it is actually more fruitful to directly analyze the finite-dimensional approximation rather than the original infinite-dimensional model; this way we can also easily account for the effect of discretization. For the finite-dimensional approximate model the results in the previous and next sections apply as such.

III. OBSERVABILITY AND CONTROLLABILITY

In this section, our aim is to discuss the detectability and observability of latent force models along with their stabilizability and controllability. We only consider finite-dimensional models because, as discussed above, infinite-dimensional models need to be discretized anyway, and in order to ensure the detectability and observability of the resulting models, the finite-dimensional results are sufficient. The corresponding pure infinite-dimensional results could be derived using the results in [19].
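Returning briefly to Section II-B, the stationary control law of Theorem 2.2 is straightforward to compute numerically. The sketch below is our illustration (function name and example matrices are assumptions): it solves the physical-part ARE with SciPy and then the Sylvester equation (27) for the cross term.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_sylvester

def lfm_lqr_gain(Af, Bf, Mf, Au, Cu, X, U):
    # Stationary LQR gain for the augmented LFM as in Theorem 2.2:
    # Pf solves the non-forced ARE, P12 solves the Sylvester equation (27).
    Pf = solve_continuous_are(Af, Mf, X, U)
    Uinv = np.linalg.inv(U)
    S = Pf @ Mf @ Uinv @ Mf.T
    # (S - Af^T) P12 - P12 Au = Pf Bf Cu, i.e. a X + X b = q with b = -Au
    P12 = solve_sylvester(S - Af.T, -Au, Pf @ Bf @ Cu)
    # Control law (26): c(t) = -[U^{-1} Mf^T Pf, U^{-1} Mf^T P12] g_hat(t)
    K = np.hstack([Uinv @ Mf.T @ Pf, Uinv @ Mf.T @ P12])
    return K, Pf, P12
```

The first block of K is exactly the non-forced LQR gain; the second block feeds the estimated latent-force state back into the control, which is what lets the controller anticipate and cancel the learned input.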
A. Detectability and observability of latent force models

Let us now consider the detectability and observability of LFMs. We assume that we have a latent force model which has the following state-space representation

  dg(t)/dt = A g(t) + B w(t),
  yk = C g(tk) + εk,   (29)

where g and the matrices A, B, and C are defined as in (9). In this representation we have dropped the control signal, because it does not affect the detectability and observability.

It is also reasonable to assume that the state-space representation of the latent force model is stable and hence detectable. However, the physical system part itself often is not stable. We need to assume, though, that it is at least detectable, and preferably it should be observable. The most useful case occurs when the whole joint system is observable. The sampling procedure also affects the observability, and we need to ensure that we do not get an 'aliasing' kind of phenomenon, analogous to sampling a signal with a sampling frequency that is below the Nyquist frequency. Let us start with the following result for detectability.

Lemma 3.1: Assume that we have a latent force model which has the state-space representation given in (29). Assume that (exp(Af ∆tk), Cf) is detectable, and that the input function u(t) has an exponentially stable state-space representation. Then the full system is detectable and the Kalman filter for the model is exponentially stable.

Proof: We first discretize the system at arbitrary time points. The discretized system has the form (see, e.g., [12])

  gk = exp(A ∆tk) gk−1 + qk,
  yk = C gk + εk,   (30)

where qk is a Gaussian random variable. The system will be detectable provided that there exists a bounded gain sequence Gk such that the sequence g̃k defined as g̃k = (exp(A ∆tk) − Gk C) g̃k−1 is exponentially stable [20]. More explicitly, the following system for the sequences f̃k and ũk needs to be exponentially stable with some choice of sequence Gk:

  f̃k = exp(Af ∆tk) f̃k−1 + Γk ũk−1 − Gk Cf f̃k−1,
  ũk = exp(Au ∆tk) ũk−1.   (31)

As the process u(t) is exponentially stable, the sequence ũk is exponentially decreasing and bounded. Hence it does not affect the stability of the first equation. Therefore, the full system will be detectable provided that there exists a gain sequence Gk such that f̃k = (exp(Af ∆tk) − Gk Cf) f̃k−1 is exponentially stable. The gain sequence exists, because (exp(Af ∆tk), Cf) is detectable by assumption.

Above, in Lemma 3.1 we had to assume the detectability of the discretized system. There are many ways to assure this, but one way is to demand that the continuous physical model is observable and that we are not sampling critically [21], that is, in a way that would lead to aliasing of frequencies as in the Shannon–Nyquist theory. Although observability is a quite strong condition compared to detectability, it assures that we have the chance to reconstruct the physical system with arbitrary precision by improving the measurement protocol, which would not be true for mere detectability.

If we assume that the physical system part is observable and the sampling is not critical, we get the following detectability theorem. Note that we do not yet assume that the latent force model part is observable, although its stability already implies that it is detectable.

Theorem 3.1: Assume that (Af, Cf) is observable, the physical system is not critically sampled, and the latent force model part is stable. Then the full system is detectable and the Kalman filter for the model is exponentially stable.

Proof: According to [21], the observability of the continuous-time system together with the non-critical sampling ensures that the discrete-time system is also observable. As discrete-time observability implies discrete-time detectability, the result follows from Lemma 3.1.

Let us now consider the conditions for the observability of the full system. It turns out that, in general, the best way to determine the observability of the joint system is not to attempt to think of the physical system and the latent force model separately, but to explicitly consider the joint state-space model. There have been numerous attempts to map the properties of this kind of cascaded system to the properties of the joint system (e.g. [22]–[24]), but still the best way to go seems to be simply to use standard observability tests on the joint system. The properties of the sub-systems of this kind of cascade do not alone determine the observability, because we can have phenomena like zero-pole cancellation which lead to a non-observable system even when all the subsystems are observable (see, e.g., [22]). When we also account for the effect of sampling on observability, we get the following theorem.

Theorem 3.2: Assume that the continuous-time joint system (A, C) is observable and the observations are not critically sampled. Then the discrete-time full system is observable.

Proof: See [21].

In practical terms it is thus easiest to use, for example, the classical rank condition (see, e.g., [25]), which in the time-invariant case ensures that the (joint) system (A, C) is observable provided that the following matrix has full rank for some m:

  O = [C; C A; ...; C A^{m−1}],   (32)

and then ensure that the sampling is non-critical [21]. Fortunately, the continuous-time joint system will be observable in many practical scenarios provided that we do not have any zero-pole cancellations between the physical system and the force model.

B. Stabilizability and non-controllability of LFMs

The aim is now to discuss the controllability and stabilizability of state-space latent force models. We assume that the model has the form

  dg(t)/dt = A g(t) + B w(t) + M c(t),   (33)

where g and the matrices A, B, and M are defined in (9).

First of all, the stabilizability of the system is guaranteed solely by ensuring that the physical model part is stabilizable, provided that the state-space representation of the stationary GP is constructed such that it is exponentially stable. Thus we have the following theorem.

Theorem 3.3: Assume that (Af, Mf) is stabilizable and the latent force has an exponentially stable state-space representation. Then the full system is stabilizable.

Proof: The system is stabilizable if there exists a finite gain Gc such that the system dg̃/dt = (A + M Gc) g̃ is exponentially stable [26]. More explicitly, we should have

  df̃/dt = (Af + Mf Gf) f̃ + (Bf Cu + Mf Gu) ũ,
  dũ/dt = Au ũ,   (34)
where we have written Gc = [Gf, Gu]. Because ũ is exponentially decreasing and bounded, we can safely set Gu = 0. The remainder of the system will be stabilizable if there exists a gain Gf such that df̃/dt = (Af + Mf Gf) f̃ is exponentially stable. By our assumption on the stabilizability of (Af, Mf), this is true and hence the result follows.

The stabilizability also implies that the corresponding LQ controller is uniquely determined [18]. However, sole stabilizability is not very useful in practice, because it says that we might have randomly wandering subprocesses in the joint system which practically prevent us from controlling the process exactly where we wish it to go. A much stronger requirement is that the full system is controllable. Unfortunately, it turns out that latent force models are never fully controllable in the present formulation, because we cannot control the subsystem corresponding to the GP force. This is summarized in the following theorem.

Theorem 3.4: Latent force models are not controllable.

Proof: The model is in Kalman's canonical form [27], where the non-controllable part is the input signal.

In practice, the non-controllability of the input part is not a problem, as we are actually interested in controlling the physical system part of the model, not the input signal per se. It turns out that the physical system can be controllable even though the full system is not controllable. This result can be obtained as a corollary of so-called output controllability (see, e.g., [25]) as follows.

Corollary 3.1: Assume that (Af, Mf) is controllable. Then the full system is output-controllable with respect to the physical system part.

Proof: This can be derived by writing down the output controllability condition [25] and noticing that it reduces to controllability of the physical system part.

The above result is useful when the system is fully observable as well. Then it ensures that we can successfully control the physical system part although the full latent force model remains uncontrollable. However, if the latent force model is not fully observable, then the latent force inherently causes a disturbance to the physical system, and although we can keep the system stable, the state cannot be forced to follow a given trajectory.

As a conclusion, for all practical purposes a (time-invariant) latent force model is controllable if it is observable and the following matrix has full rank for some m:

  C = [Mf, Af Mf, ..., Af^{m−1} Mf].   (35)

IV. EXPERIMENTAL RESULTS

In this section, we illustrate the latent force model framework on two different problems: a controlled second-order ordinary differential equation modeling a spring, and a controlled heat source in two dimensions.

A. Controlled ODE Model

Our first illustrative example corresponds to the second-order differential equation model described in (1), which physically can be considered a damped spring. We consider a 100-second interval, where the first 50 seconds are used for learning the hyperparameters of the (state-space) GP, after which the hyperparameters are kept fixed. We then continue obtaining 40 seconds of additional measurements of the system, after which the measurements stop while we still continue to run the system for 10 seconds.

The unknown input signal is u(t) = sin(0.23 t) + sin(0.13 t) for t ∈ [0, 100], the parameters are λ = 0.1 and γ = 1, and we assume that only the position of the spring f(t) is measured at time intervals of ∆t = 0.01 seconds. The measurements contain Gaussian noise with a relatively small standard deviation of 0.01; the small noise is selected to better highlight the differences between the controllers. We selected the Gaussian process prior for the input process u(t) to have a zero mean and a squared exponential (SE) covariance function of the form K(t, t′) = σ² exp[−(t − t′)²/ℓ²], which was approximated with a state-space model using a 4/8-order Padé approximant [9]. During the training phase, the parameters σ and ℓ were estimated by maximizing the marginal likelihood. The simulated open-loop system along with the Gaussian process interpolation (implemented in state space with a Rauch–Tung–Striebel smoother) and extrapolation results are shown in Figure 1. It can be seen that the GP follows the true position well until the end of the measurements, after which it quite quickly reverts to the prior mean (which in this case is zero). Thus the extrapolation accuracy of the GP model is fairly limited, but fortunately the uncertainty estimate of the GP indicates that this should be expected.

Fig. 1: The open-loop spring position f(t) and measurements {yk} (which overlap with the position trajectory in the figure) along with the GP estimate and its 95% uncertainty quantiles. The GP was trained using the first 50 seconds of data, after which we obtained measurements for an additional 40 seconds. These time intervals are indicated with the vertical lines.

Fig. 2: The input signal u(t) to the spring model and its GP estimate along with the 95% uncertainty quantiles.

The results of inference for the input function u(t) are shown in Figure 2. Similarly to the position, the input estimate is good until the end of the measurements, after which it reverts to the zero mean. To demonstrate the benefit of modeling the input signal as a GP in the stochastic control context, we consider the model (1) with a linear closed-loop optimal control design for c(t). Similarly to the case shown in Figures 1 and 2, we run the first 50 seconds without control and train the hyperparameters during this period. After that, we turn on the control signal, aiming to keep the spring at zero.
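The open-loop part of this experiment can be sketched in a few lines of code. The following is our illustration, not the authors' implementation; the discretization scheme, initial conditions, and seed are assumptions, and no claim is made that it reproduces the paper's exact trajectories.

```python
import numpy as np

def simulate_spring(T=100.0, dt=0.01, lam=0.1, gamma=1.0, noise_std=0.01, seed=0):
    # Semi-implicit Euler simulation of d^2f/dt^2 + lam df/dt + gamma f = u(t)
    # in open loop (c(t) = 0), with u(t) = sin(0.23 t) + sin(0.13 t) as above.
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    ts = np.arange(n) * dt
    f, df = 0.0, 0.0
    ys = np.empty(n)
    for k, t in enumerate(ts):
        u = np.sin(0.23 * t) + np.sin(0.13 * t)
        df += dt * (u - lam * df - gamma * f)
        f += dt * df
        ys[k] = f + noise_std * rng.standard_normal()  # noisy position measurement
    return ts, ys
```

The resulting measurement sequence can then be fed to a state-space GP regression (Kalman filter and Rauch–Tung–Striebel smoother) as described in Section I-A.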
We consider two ways of designing the controller, which were discussed in Section II-A: using the assumed separability design based on putting u(t) = 0, and a controller designed by taking into account the existence of the input signal, as described in the same section. The results of using the basic linear quadratic regulator ("Basic LQR"), that is, the certainty-equivalent design, and the result of using the joint LFM control ("LFM LQR") are shown in Figure 3. It can be seen that the LFM controller is able to maintain the system much closer to the origin than the basic controller. The control signals are shown in Figure 4.

Fig. 3: Result of controlling the spring model with Basic LQR and LFM LQR. It can be seen that the control designed for the full LFM outperforms the basic LQR significantly. The average position tracking error for the Basic LQR was approximately 0.27 units, whereas in the case of LFM LQR it was approximately 0.11 units.

Fig. 4: The LQR control signals.

B. Controlled heat equation

In this experiment we consider the controlled heat equation (2), where x ∈ R². Figure 5 is a cartoon representation of the simulated setup, where the heat source moved across the field in the bottom-left direction and then it was turned off. The temperature then increases at the application point, and when the heat source moves away, the position starts cooling down. Figures 6a and 6b show the temperature field and the heat source at time t = 6.9 when no control is applied.

Fig. 5: A cartoon representation of a heat source moving across a 2D spatial field.

Fig. 6: The temperature function f(x, t) and the source function u(x, t) at time t = 6.9: (a) temperature field f(x, t); (b) source field u(x, t). The small circles mark the positions of the measurements.

Fig. 7: The results of using Basic LQR and LFM LQR controllers to regulate the temperature field to zero: (a) field f(x, t) with Basic LQR; (b) field f(x, t) with LFM LQR; (c) maximum temperatures; (d) LFM control signal. It can be seen from Figures 7a, 7b, and 7c that LFM LQR is able to keep the temperature closer to zero than Basic LQR. Figure 7d shows an example control signal which can be seen to effectively cancel out the input signal part, as one would expect.

We then formed a Fourier-basis approximation to the PDE (with 100 basis functions) and designed two controllers for it: one using an assumed separability design ("Basic LQR") and one taking the input signal into account ("LFM LQR"). We used SE covariance functions for the latent force model in both the time and space directions. A Kalman filter was used to estimate the physical system and input signal states from temperature measurements with low variance
scenario which is a heat source moving across a 2D spatial field. (σ 2 = 0.012 ) and the controller was applied using the estimate.
The field is measured at a discrete grid and the measurements are Figures 7a – 7d show the results when the controllers were used.
corrupted by Gaussian noise. In the simulation, the input signal It can be seen that the LFM LQR provides a significantly smaller
u(x, t) is the heat generated by the moving source and the aim is tracking error.
to reconstruct f and u from noisy observations as well as design an
optimal control signal c(x, t), which aims to regulate the temperature
f (x, t) to zero. V. C ONCLUSION AND D ISCUSSION
In the simulation, we used the parameters λ = 0.2 and D = 0.001 In this paper we have studied a latent force model (LFM)
and the heat source was moving for 10 seconds from top-right to framework for learning and control in hybrid models which are
PREPRINT 7

combinations of first-principles (physical) models and non-parametric Gaussian process (GP) models as their inputs. In particular, we have considered stochastic control problems associated with these models as well as analyzed the observability and controllability properties of the models. It turned out that although the models are often observable, they typically are not fully controllable. However, they still are output controllable with respect to the physical system part, and thus the control problem is well defined. We have also experimentally shown that learning the input signal improves the control performance. This is in line with the theoretical result that the optimal control is a combination of a classical control without an input signal and an additional term that modifies the control using the knowledge of the input signal.

The framework also allows for a number of extensions. For example, introducing non-linearities in the measurement model can be tackled by replacing the Kalman filter with its non-linear counterpart (e.g., [14], [28]–[30]), and another possible extension is to include an operator or a functional in the measurement model of a spatio-temporal system (e.g., [8], [15], [31]), leading to an inverse-problem type of model. With these extensions, the inference in the resulting system can still be performed using Kalman filter techniques, and the control problem can be kept intact. In the non-linear case this corresponds to an assumed certainty-equivalence approximation to the solution. It would also be possible to consider non-linear differential equation (physical) models which are driven by Gaussian processes. In that case we would need to resort to approximate Kalman filtering methods along with approximate non-linear control methods (e.g., [16], [32], [33]).

Finally, an important practical issue is the choice of an appropriate covariance function for the GP. As highlighted by the extrapolation experiment in Section IV-A, the typically used squared exponential covariance function is not always a good choice when extrapolation capability is required. The same applies to all stationary covariance functions, because they always revert to the prior mean after the data ends. One way to cope with this problem would be to use non-stationary covariance functions, such as once- or twice-integrated stationary GPs, which, instead of reverting to the prior mean, revert to a zero derivative (constant prediction) or a zero second derivative (linear prediction). An alternative approach would be to augment unknown constants or linear-in-parameters functions into the state-space model, which corresponds to replacing the zero mean function with a linear-in-parameters model (cf. [6]). However, for these kinds of models the present observability and controllability results no longer apply as such.

ACKNOWLEDGMENT

Simo Särkkä would like to thank the Academy of Finland for financial support. Mauricio A. Álvarez has been partially financed by the EPSRC Research Project EP/N014162/1. The work was done when Neil D. Lawrence was at the University of Sheffield.

REFERENCES

[1] M. Álvarez, D. Luengo, and N. D. Lawrence, "Latent force models," in JMLR Workshop and Conference Proceedings Volume 5 (AISTATS 2009), 2009, pp. 9–16.
[2] M. Álvarez, J. R. Peters, N. D. Lawrence, and B. Schölkopf, "Switched latent force models for movement segmentation," in Advances in Neural Information Processing Systems, 2010, pp. 55–63.
[3] M. A. Álvarez, D. Luengo, and N. D. Lawrence, "Linear latent force models using Gaussian processes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 11, pp. 2693–2705, 2013.
[4] J. Hartikainen and S. Särkkä, "Sequential inference for latent force models," in Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI 2011), 2011.
[5] J. Hartikainen, M. Seppänen, and S. Särkkä, "State-space inference for non-linear latent force models with application to satellite orbit prediction," in Proceedings of the 29th International Conference on Machine Learning (ICML), 2012.
[6] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning. The MIT Press, 2006.
[7] J. Hartikainen and S. Särkkä, "Kalman filtering and smoothing solutions to temporal Gaussian process regression models," in Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2010.
[8] S. Särkkä, A. Solin, and J. Hartikainen, "Spatiotemporal learning via infinite-dimensional Bayesian filtering and smoothing," IEEE Signal Processing Magazine, vol. 30, no. 4, pp. 51–61, 2013.
[9] S. Särkkä and R. Piché, "On convergence and accuracy of state-space approximations of squared exponential covariance functions," in Proceedings of the 2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2014, pp. 1–6.
[10] G. De Nicolao and G. Ferrari-Trecate, "Regularization networks: Fast weight calculation via Kalman filtering," IEEE Transactions on Neural Networks, vol. 12, no. 2, pp. 228–235, 2001.
[11] ——, "Regularization networks for inverse problems: A state-space approach," Automatica, vol. 39, no. 4, pp. 669–676, 2003.
[12] Y. Bar-Shalom, X.-R. Li, and T. Kirubarajan, Estimation with Applications to Tracking and Navigation. Wiley, 2001.
[13] W.-H. Chen, J. Yang, L. Guo, and S. Li, "Disturbance-observer-based control and related methods: an overview," IEEE Transactions on Industrial Electronics, vol. 63, no. 2, pp. 1083–1095, 2016.
[14] S. Särkkä, Bayesian Filtering and Smoothing. Cambridge University Press, 2013.
[15] S. Särkkä and J. Hartikainen, "Infinite-dimensional Kalman filtering approach to spatio-temporal Gaussian process regression," in JMLR Workshop and Conference Proceedings Volume 22 (AISTATS 2012), 2012, pp. 993–1001.
[16] P. Maybeck, Stochastic Models, Estimation and Control, Volume 3. Academic Press, 1982.
[17] R. E. Kalman, "Contributions to the theory of optimal control," Boletin de la Sociedad Matematica Mexicana, vol. 5, no. 1, pp. 102–119, 1960.
[18] B. D. O. Anderson and J. B. Moore, Optimal Control: Linear Quadratic Methods. Dover, 2007.
[19] R. F. Curtain and H. Zwart, An Introduction to Infinite-Dimensional Linear Systems Theory. Springer Science & Business Media, 2012, vol. 21.
[20] B. Anderson and J. B. Moore, "Detectability and stabilizability of time-varying discrete-time linear systems," SIAM Journal on Control and Optimization, vol. 19, no. 1, pp. 20–32, 1981.
[21] F. Ding, L. Qiu, and T. Chen, "Reconstruction of continuous-time systems from their non-uniformly sampled discrete-time systems," Automatica, vol. 45, no. 2, pp. 324–332, 2009.
[22] E. G. Gilbert, "Controllability and observability in multivariable control systems," Journal of the Society for Industrial and Applied Mathematics, Series A: Control, vol. 1, no. 2, pp. 128–151, 1963.
[23] C. T. Chen and C. Desoer, "Controllability and observability of composite systems," IEEE Transactions on Automatic Control, vol. 12, no. 4, 1967.
[24] E. Davison and S. Wang, "New results on the controllability and observability of general composite systems," IEEE Transactions on Automatic Control, vol. 20, no. 1, pp. 123–128, 1975.
[25] K. Ogata, Modern Control Engineering, 3rd ed. Prentice Hall, 1997.
[26] W. M. Wonham, Linear Multivariable Control: A Geometric Approach. Springer-Verlag, 1985.
[27] R. E. Kalman, "Mathematical description of linear dynamical systems," Journal of the Society for Industrial and Applied Mathematics, Series A: Control, vol. 1, no. 2, pp. 152–192, 1963.
[28] A. H. Jazwinski, Stochastic Processes and Filtering Theory. Academic Press, 1970.
[29] P. S. Maybeck, Stochastic Models, Estimation and Control. New York: Academic Press, 1982, vol. 2.
[30] S. Särkkä and J. Sarmavuori, "Gaussian filtering and smoothing for continuous-discrete dynamic systems," Signal Processing, vol. 93, no. 2, pp. 500–510, 2013.
[31] S. Särkkä, "Linear operators and stochastic partial differential equations in Gaussian process regression," in Proceedings of ICANN, 2011.
[32] R. F. Stengel, Optimal Control and Estimation. New York: Dover, 1994.
[33] E. B. Erdem and A. G. Alleyne, "Design of a class of nonlinear controllers via state dependent Riccati equations," IEEE Transactions on Control Systems Technology, vol. 12, no. 1, pp. 133–137, 2004.
