EPFL

arXiv:2312.07324v1 [math.OC] 12 Dec 2023

Abstract— We study control of constrained linear systems when faced with only partial statistical information about the […] the Kullback–Leibler divergence and the total variation distance […]
x_{t+1} = A x_t + B u_t + w_t ,   y_t = C x_t + v_t ,   (1)

where x_t ∈ R^n, u_t ∈ R^m, y_t ∈ R^p, w_t ∈ R^n and v_t ∈ R^p are the system state, the control input, the observable output, and the stochastic disturbances modeling process and measurement noise, respectively. We study infinite-horizon control when only partial statistical information about the distribution of the joint disturbance process ξ_t = (w_t, v_t) is available. Specifically, we assume availability of N ∈ N independent observations ξ_T^(1), ..., ξ_T^(N), where each sample

ξ_T^(i) = (w_T^(i), v_T^(i)) = (w_0^(i), ..., w_T^(i), v_0^(i), ..., v_T^(i)) ,   (2)

constitutes a trajectory of length T ∈ N of w_t and v_t. As no performance or safety guarantee can be established if the samples in (2) are not representative of the asymptotic statistics of w and v, we start by formulating the following stationarity assumption, see, e.g., [22, p. 154].

Assumption 1: For all t ∈ N, the stochastic process that generates the joint disturbance vector ξ_t = (w_t, v_t) is stationary of order T, i.e., P(ξ_0, ..., ξ_T) = P(ξ_t, ..., ξ_{t+T}).

We note that Assumption 1 subsumes the usual setting where each realization of the disturbance processes is independent and identically distributed, and more generally allows modeling temporal correlation between samples that are separated by up to T time steps. Further, as the order T can theoretically be arbitrarily large, this assumption is relatively mild, albeit, in practice, an upper bound on the order T is often dictated by computational complexity concerns.

Throughout the paper, we denote by Ξ ⊆ R^d, with d = (n + p)(T + 1), the support of the unknown probability distribution P, and we make the following assumption.

Assumption 2: The support set Ξ = {ξ ∈ R^d : Hξ ≤ h} is full-dimensional, that is, Ξ contains a d-dimensional ball with strictly positive radius.

We define polytopic safe sets X ⊆ R^n and U ⊆ R^m for the system state and input signals, respectively, as:

X = {x ∈ R^n : g_x(x) = max_{j∈[J_x]} G_{x_j}^⊤ x + g_{x_j} ≤ 0 , J_x ∈ N} ,
U = {u ∈ R^m : g_u(u) = max_{j∈[J_u]} G_{u_j}^⊤ u + g_{u_j} ≤ 0 , J_u ∈ N} ,

where [J_x] denotes the set {1, ..., J_x} ⊂ N and similarly for [J_u]. Then, given a safety parameter γ ∈ (0, 1) to control the level of acceptable constraint violations, we formulate the following chance-constrained stochastic optimization problem:

π⋆ = arg min_π  E_P[J(π, ξ)]   (3a)
subject to  CVaR_γ^P( max{g_x(x_t(ξ)), g_u(u_t(ξ))} ) ≤ 0 ,   (3b)

where CVaR constraints are defined according to

CVaR_γ^P(g(ξ)) = inf_{τ∈R}  τ + (1/γ) E_P[max{g(ξ) − τ, 0}] ,   (4)

for any measurable function g : R^d → R. We note that, besides implying that P[x_t ∈ X, u_t ∈ U] ≥ 1 − γ, (3b) also accounts for the expected amount of constraint violation in the γ percent of cases where any such violation occurs. As such, the CVaR formulation reflects the observation that, in most control applications, severe breaches of the safety constraints often have far more detrimental consequences than mild violations. As the probability distribution P is fundamentally unknown, however, we cannot address the decision problem (3) directly, and we instead rely on the following approximations.

First, we construct the empirical probability distribution

P̂ = (1/N) Σ_{i=1}^N δ_{ξ_T^(i)} ,   (5)

where δ_{ξ_T^(i)} denotes the Dirac delta distribution at ξ_T^(i). In order to immunize against any error in P̂, we replace the nominal objective (3a) with the minimization of the worst-case expected loss over the set of distributions B_ε(P̂) ⊆ P(Ξ) that are supported on Ξ and are sufficiently close to the empirical estimate P̂.¹ More formally, we define

B_ε(P̂) = {Q ∈ P(Ξ) : W(P̂, Q) ≤ ε} ,   (6)

where ε ≥ 0 is the radius of the ambiguity set B_ε(P̂), and W(P̂, Q) is the Wasserstein distance between P̂ and Q, i.e.,

W(P̂, Q) = inf_{π∈Π} ∫_{Ξ²} ‖ξ − ξ′‖₂² π(dξ, dξ′) ,   (7)

where Π denotes the set of joint probability distributions of ξ and ξ′ with marginal distributions P̂ and Q, respectively [1], [2]. In (7), the decision variable π encodes a transportation plan for moving a mass distribution described by P̂ to a distribution described by Q. Thus, B_ε(P̂) can be interpreted as the set of distributions onto which P̂ can be reshaped at a cost of at most ε, where the cost of moving a unit probability mass from ξ to ξ′ is given by ‖ξ − ξ′‖₂².

Second, since dynamic programming solutions are generally computationally intractable, we restrict our attention to policies π ∈ Π_L that are linear in the past observations y, that is, u = π(y) = K(z)y for some real-rational proper transfer function K(z). Besides computational advantages, our choice is supported by recent advances in DRC, which show that linear policies are globally optimal for a generalization of the classical unconstrained LQG problem, where the noise distributions belong to a Wasserstein ambiguity set (6) centered at a nominal Gaussian distribution P̂ [9].

We are now in a position to state our problem of interest as:

inf_{π∈Π_L}  sup_{Q∈B_ε(P̂)}  E_Q[J(π, ξ)]   (8a)
subject to  sup_{Q∈B_ε(P̂)}  CVaR_γ^Q(g_t(ξ)) ≤ 0 ,  ∀t ∈ N ,   (8b)

where g_t(ξ) = max{g_x(x_t(ξ)), g_u(u_t(ξ))} for compactness. Note that the worst-case distributions in (8a) and (8b) may not coincide. Despite the fact that in practice the uncertainty distribution is unique, the formulation in (8) proves necessary to ensure safety for all distributions in B_ε(P̂), and not simply for the one maximizing the expected control cost.

C. Expressivity of the problem formulation and related work

The solution to the DRO problem (8) depends on the radius ε defining (6). In particular, we argue that (8) generalizes classical H2 and H∞ control problems, which correspond to the limit cases of ε approaching 0 and ∞, respectively.

If ε = 0, the Wasserstein ball B_ε(P̂) reduces to the singleton {P̂} and the supremum disappears. This gives a simple Monte-Carlo-based control design problem [24], [25]. Moreover, because J(π, ξ) is quadratic, the resulting optimal controller, in the absence of constraints, is the LQG controller designed for P_N = N(E_{ξ∼P̂}[ξ], var_{ξ∼P̂}[ξ]) [26]. Indeed, because both the dynamics and the controller are linear, one has²

E_{P_N}[J(π, ξ)] = E_{P̂}[J(π, ξ)] ,

which means that the arg min_π of both expectations is also the same.

If ε is very large and Ξ is compact, (8) can also be seen as a generalization of H∞ synthesis methods [26], [27]. In fact, in the limit case of ε → ∞, and no matter how P̂ is constructed, (6) contains all distributions in P(Ξ) supported on Ξ, including the degenerate distribution taking value at the most adverse ξ almost surely.

Intermediate values of ε instead yield solutions that leverage the observations (2) to trade off robustness to adversarial perturbations or distribution shifts against performance under distributions in a neighborhood of P̂.

We conclude this section by remarking that, differently from [9], we do not assume that the nominal distribution P̂ is Gaussian, and instead use the empirical estimate (5) to provide greater design flexibility. In fact, if P is, e.g., bimodal, then the Wasserstein distance between P and its closest Gaussian distribution G will generally be larger than the Wasserstein distance between P and its empirical estimate P̂. In turn, this implies that a larger radius ε needs to be used to ensure that P ∈ B_ε(G) with high probability, leading to a more conservative design.

¹ It is well known that solving (3) upon naively replacing P with P̂, that is, setting ε to zero in (6), may lead to decisions that are unsafe or exhibit poor out-of-sample performance, as the optimization process often amplifies any estimation error in P̂. Instead, for any β > 0, if P is light-tailed and the radius ε is chosen as a sublinearly growing function of log(1/β)/N, then results from measure concentration theory ensure that P lies inside the ambiguity set (6) with confidence 1 − β, see [23, Theorem 2] and [2, Theorem 18]. Therefore, in this case, any solution to (8) retains finite-sample probabilistic guarantees in terms of out-of-sample control cost and constraint satisfaction.

² Both expectations are equal to the same linear transformation of the first and second moments of P̂ and P_N, which are equal.

III. BACKGROUND

In this section, we recall useful technical preliminaries, and we discuss the design assumptions that will allow us to compute an approximate solution to (8) through convex programming. In particular, we start by reviewing the system level approach to controller synthesis [19], and then present recent duality results from the DRO literature [3].

A. System level synthesis

The system level synthesis framework provides a convex parameterization of the non-convex set of internally stabilizing controllers K(z), allowing one to reformulate many control problems as optimizations over the closed-loop responses Φ_xw(z), Φ_xv(z), Φ_uw(z) and Φ_uv(z) that map w and v to x and u. To define these maps, we first combine the linear output-feedback policy u = K(z)y with the z-transform of the state dynamics in (1) to obtain:

(zI − (A + BK(z)C)) x = w + BK(z) v .
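The CVaR functional in (4) is straightforward to evaluate on a discrete sample such as the empirical distribution (5). The following minimal numpy sketch (the helper name `cvar` is ours, not from the paper) exploits the fact that the objective in (4) is convex and piecewise linear in τ, with breakpoints at the sample values, so the infimum is attained at one of the samples:

```python
import numpy as np

def cvar(losses, gamma):
    """Empirical conditional value-at-risk, following definition (4):
    CVaR_gamma(g) = inf_tau  tau + E[max(g - tau, 0)] / gamma.
    On a discrete sample the objective is convex and piecewise linear
    in tau, so the infimum is attained at one of the sample values and
    can be found by enumeration."""
    losses = np.asarray(losses, dtype=float)
    return min(tau + np.mean(np.maximum(losses - tau, 0.0)) / gamma
               for tau in losses)

# With gamma = 0.1, the CVaR of the losses 1, ..., 100 equals the
# average of the worst 10% of them, i.e. approximately 95.5.
print(cvar(range(1, 101), 0.1))
```

When γN is an integer, the value coincides with the average of the worst γ-fraction of the losses, which matches the interpretation of (3b) given above as also penalizing the expected amount of violation in the worst γ percent of cases.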
Then, since the transfer matrix (zI − (A + BK(z)C)) is invertible for any proper controller K(z), we have

[x]   [Φ_xw(z)  Φ_xv(z)] [w]
[u] = [Φ_uw(z)  Φ_uv(z)] [v]  =  Φ_ξ(z) ξ
    = [ (zI − (A + BK(z)C))^{-1}   Φ_xw(z)BK(z)          ]
      [ K(z)CΦ_xw(z)               Φ_uw(z)BK(z) + K(z)   ] ξ .

In particular, we note that causality of K(z) implies causality of Φ_uv and strict causality of Φ_xw, Φ_xv and Φ_uw. Further, one can show that the affine subspace defined by

[zI − A   −B] Φ_ξ(z) = [I   0] ,   (9a)
Φ_ξ(z) [zI − A; −C] = [I; 0] ,   (9b)

characterizes all and only the system responses Φ_ξ(z) that are achievable by an internally stabilizing controller K(z) [19]. Despite the fact that (9) defines a convex feasible set, minimizing a given convex objective with respect to the closed-loop transfer matrix Φ_ξ(z) = Σ_{k=0}^∞ Φ(k) z^{−k} proves challenging, as the resulting optimization problem remains infinite-dimensional. Therefore, to recover tractability and following [19], [28], we rely on a FIR approximation of Φ_ξ(z), i.e., we restrict our attention to the truncated system response Φ_ξ^T(z) = Σ_{k=0}^T Φ(k) z^{−k}. We remark that controllability and observability of (1) ensure that (9) admits a FIR solution [19, Theorem 4]. At the same time, since Φ_ξ(z) represents a stable map, the effect of this FIR approximation becomes negligible if T is sufficiently large; for the case of LQR regulators, for instance, it was shown that the performance degradation relative to the solution of the infinite-horizon problem decays exponentially with T, see [29, Section 5].

According to the discussed FIR approximation, we let:

Φ_x = [Φ_xw(T), ..., Φ_xw(0), Φ_xv(T), ..., Φ_xv(0)] ,
Φ_u = [Φ_uw(T), ..., Φ_uw(0), Φ_uv(T), ..., Φ_uv(0)] ,

and we define Φ = [Φ_x^⊤, Φ_u^⊤]^⊤ for compactness. With this notation in place, for any t ≥ T, we have that:

x_t = Φ_x ξ_{t−T:t} ,   u_t = Φ_u ξ_{t−T:t} ,   (10)

where ξ_{t−T:t} = [w_{t−T}, ..., w_t, v_{t−T}, ..., v_t]^⊤ collects the last T + 1 realizations of the process and measurement noises. The following proposition, for which we provide a proof in Appendix A for the sake of comprehensiveness, shows how to implement a controller that achieves a given pair of system responses Φ_x and Φ_u.

Proposition 1: If the closed-loop map Φ is achievable, the corresponding control policy π(Φ) can be implemented as a linear system with dynamics

δ_t = −Φ_x φ_{t−T:t} ,   u_t = Φ_u φ_{t−T:t} + Φ_uv(0) C δ_t ,   (11)

where φ_{t−T:t} = [δ_{t−T+1}^⊤, ..., δ_{t−1}^⊤, 0_{2n}^⊤, y_{t−T}^⊤, ..., y_t^⊤]^⊤.

B. A stationarity control problem

As we consider an infinite-horizon control problem, we focus on the steady-state behavior of the system, and are instead less interested in the transient behavior [30]. Motivated by this, and to take full advantage of the stationarity properties of ξ_t in Assumption 1, we focus on designing an optimal safe controller to operate the system for t ≥ T only. In this setting, we proceed to show that the distributionally robust worst-case control cost and CVaR constraints admit finite-dimensional representations.

Assumption 3: The system is initialized by an external controller with x_0, ..., x_{T−1} ∈ X and u_0, ..., u_{T−1} ∈ U.

We therefore redefine the optimization cost J in (8a) as

J_T(π(Φ), ξ) = lim_{T′→∞} (1/(T′ − T)) Σ_{t=T}^{T′} ξ_{t−T:t}^⊤ Φ^⊤ D Φ ξ_{t−T:t} .

Note that, due to the stationarity of Q (see Assumption 1), J_T satisfies

E_Q[J(π(Φ), ξ)]
  = lim_{T′→∞} E_{ξ_{0:T}∼Q, ..., ξ_{T′−T:T′}∼Q} [ (1/(T′ − T)) Σ_{t=T}^{T′} ξ_{t−T:t}^⊤ Φ^⊤ D Φ ξ_{t−T:t} ]
  = E_{ξ_T∼Q}[ ξ_T^⊤ Φ^⊤ D Φ ξ_T ] .   (12)

The problem statement (8) for DRInC synthesis can be reformulated as finding the optimal FIR map Φ⋆ of length T + 1 given by

Φ⋆ = arg min_{Φ achievable}  sup_{Q∈B_ε(P̂)}  E_{ξ_T∼Q}[ ξ_T^⊤ Φ^⊤ D Φ ξ_T ] ,   (13)

while satisfying the achievability constraints (9) as well as the conditional value-at-risk constraints

sup_{Q∈B_ε(P̂)}  CVaR_{1−γ}^{ξ_T∼Q}( G_j^⊤ Φ ξ_T + g_j ) ≤ 0 ,  ∀j ∈ [J] ,   (14)

where J = J_x + J_u and [J] = {1, ..., J} enumerates all the constraints on [x^⊤, u^⊤]^⊤, which are defined by

G = [G_x  0; 0  G_u] ,   g = [g_x; g_u] .

We highlight that while (8a) is an infimum problem, the minimum in (13) is attained. Indeed, as Ξ is full-dimensional per Assumption 2, there is always a distribution Q̂ such that E_{Q̂}[J(π(Φ), ξ)] is strongly convex in Φ (e.g., an empirical distribution containing samples that form a basis for R^d). Moreover, since E_{Q̂}[J(π(Φ), ξ)] ≤ sup_{Q∈B_ε(P̂)} E_Q[J(π(Φ), ξ)] by definition, the supremum in (13) is strongly convex and the minimizer Φ⋆ is attained. However, the control cost grows quadratically in ξ, which can render sup_{Q∈B_ε(P̂)} E_Q[J(π(Φ), ξ)] unattainable [3]³.

³ The ratio between the growth rates of the loss function and the transport cost is crucial in DRO problems. If the control cost grows faster than the transport cost, the adversary can make the control cost diverge by moving an infinitesimal amount of mass very far away from the empirical distribution. Conversely, if the control cost grows slower, there is always a point at which it is no longer worthwhile for the adversary to keep moving mass, and the supremum is attained. This is the case for the constraints, as their cost grows linearly.
In what follows, we use the recent advances in DRO theory presented in [3] to reformulate the control design problem as a finite-dimensional and tractable problem.

C. Strong duality for DRO of piecewise linear objectives

The minimization (13) subject to (14) is infinite-dimensional and therefore cannot be directly solved. The next proposition, which serves as a starting point for our derivations in Section IV, shows how DRO of piecewise linear objectives can be recast as a finite-dimensional convex program.

Proposition 2: Let a_j ∈ R^d and b_j ∈ R constitute a piecewise linear cost with J pieces. If Assumption 2 holds and ε > 0, then the risk

sup_{Q∈B_ε}  E_{ξ_T∼Q}[ max_{j∈[J]} a_j^⊤ ξ_T + b_j ] ,   (15)

can be equivalently computed as:

inf_{λ≥0, κ_ij≥0}  λε + (1/N) Σ_{i∈[N]} s^(i) ,  subject to   (16a)

s^(i) ≥ b_j + ‖a_j‖₂²/(4λ) − a_j^⊤ ξ_T^(i)
        + (1/(4λ)) κ_ij^⊤ H H^⊤ κ_ij − (1/(2λ)) a_j^⊤ H^⊤ κ_ij + (H ξ_T^(i) + h)^⊤ κ_ij ,   (16b)

for all i = 1, ..., N and j = 1, ..., J.

Proof: This proposition is a direct consequence of [3, Proposition 2.12]. For the sake of clarity, we report detailed derivations in Appendix C.

Proposition 2 uses strong duality to establish an equivalence between (16) and (15). In particular, the decision variables λ and κ_ij in (16) correspond to the Lagrange multipliers associated with the constraints Q ∈ B_ε and ξ_T ∈ Ξ, respectively. The optimal value of λ can thus be interpreted as the shadow cost of robustification, i.e., the amount by which the risk E_{ξ_T∼Q} max_{j∈[J]} a_j^⊤ ξ_T + b_j increases for each unit increase of ε. The variables s^(i) instead represent the empirical Lagrangian for each sample.

IV. MAIN RESULTS

In this section, we present our main results. Motivated by the observation that the operational costs of engineering applications usually relate to energy consumption and are thus often modeled using quadratic functions, we first extend the results of Proposition 2 beyond piecewise linear objectives.

A. Non-convexity challenges

While [3, Proposition 2.12] holds for general transport costs and no matter whether Ξ is bounded or not, this strong duality result does not directly apply to (13), as the objective J(π(Φ), ξ) is not piecewise concave. An extension of current state-of-the-art results in DRO is therefore required to minimize a risk of the form

R(Q) := sup_{Q∈B_ε}  E_{ξ_T∼Q}[ ξ_T^⊤ Q ξ_T ] ,   (17)

where Ξ does not necessarily equal R^d and Q ⪰ 0.

We start by observing that if the loss is not concave with respect to ξ_T, then the optimization problem in (17) may not be convex. In fact, while [2] shows that there is a hidden convexity when Ξ = R^d, this result does not hold in general. To illustrate this point, consider for example the situation drawn in Fig. 1. One can observe that if the constraint Q ∈ B_ε(δ) is active, then the problem (17) amounts to a Quadratically Constrained Quadratic Program (QCQP), which admits a tight convex relaxation as a Semi-Definite Program (SDP) [31]. Conversely, however, when the constraint Q ∈ B_ε(δ) is not active, the adversary must maximize a convex Quadratic Program (QP), which is not a convex problem.

Fig. 1. Illustration of two worst-case distributions Q ∈ B_ε(δ) and Q′ ∈ B_ε′(δ) in different Wasserstein balls around the Dirac delta distribution. The support Ξ is represented by the horizontal blue line above the ξ axis, and the left-most Dirac distribution represents a local minimum in B_ε′(δ) for the risk R(Q) in (17).

Whether the constraint Q ∈ B_ε(δ) is active or not depends on the value taken at the optimum by its Lagrange multiplier λ, which represents the shadow cost of robustification. The following proposition provides a sufficient condition for the constraint to be active by generalizing the example shown in Fig. 1 to R^d.

Proposition 3: Let ∂Ξ = {ξ : max_{k∈[n_H]} H_k ξ − h_k = 0}, where n_H is the number of rows in H, denote the boundary of Ξ. If

(1/N) Σ_{i∈[N]}  min_{ξ̃∈∂Ξ} ‖ξ_T^(i) − ξ̃‖₂²  >  ε ,   (18)

that is, if the average squared distance between the samples and the border ∂Ξ of the support Ξ is strictly greater than ε, then the optimal shadow cost of robustification λ⋆ is greater than λ_max(Q) for any Q ∈ R^{d×d}.

Proof: The proof is given in Appendix D.

Proposition 3 shows that λ is contingent on the radius ε, the support Ξ, and the realizations ξ^(i). The radius ε is usually small, as the samples should approximate the real distribution well enough, which means that condition (18) is often satisfied. In the next section, we utilize the inequality λ⋆ ≥ λ_max(Q) to propose a strong dual formulation for (17).

B. Tight convex relaxation for DRO of quadratic objectives

In this section, we present a convex upper bound for (17), and prove that it becomes tight if λ is greater than λ_max(Q), the largest eigenvalue of Q.
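Before moving to quadratic objectives, it is instructive to see Proposition 2 in its simplest instance: a single linear piece (J = 1), one sample (N = 1), and Ξ = R^d, so that H = 0 and h = 0 and the multipliers κ_ij drop out of (16b). The dual then reduces to a one-dimensional minimization over λ with a closed-form optimizer, which the following numpy sketch verifies numerically (all variable names and data are ours, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
a = rng.standard_normal(d)       # single linear piece  l(xi) = a^T xi + b
b = 0.7
xi_hat = rng.standard_normal(d)  # the only empirical sample (N = 1)
eps = 0.3                        # ambiguity-set radius

# Dual objective of Proposition 2 for J = 1, N = 1 and Xi = R^d
# (H = 0 and h = 0, so the kappa terms in (16b) vanish):
#     inf_{lambda >= 0}  lambda * eps + b + a^T xi_hat + ||a||^2 / (4 * lambda)
lams = np.linspace(1e-3, 50.0, 200_000)
dual = np.min(lams * eps + b + a @ xi_hat + (a @ a) / (4.0 * lams))

# The scalar minimization has the closed-form solution
# lambda* = ||a|| / (2 sqrt(eps)), with value  a^T xi_hat + b + ||a|| sqrt(eps).
closed_form = a @ xi_hat + b + np.linalg.norm(a) * np.sqrt(eps)
assert abs(dual - closed_form) < 1e-3
```

The closed-form value shows the regularizing effect of the ambiguity set in this special case: the worst-case risk is the nominal loss plus a norm penalty ‖a‖₂√ε that grows with the radius, consistent with the interpretation of λ as the shadow cost of robustification.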
Lemma 4: Let Q ∈ R^{d×d} be a symmetric and positive definite matrix. Under Assumption 2, if ε > 0 and if Ξ is bounded, the risk (17) satisfies

R(Q) ≤ inf_{λ≥0, µ_i≥0, ψ_i≥−µ_i, α≥0}  λε + (1/N) Σ_{i∈[N]} s^(i) ,   (19a)

subject to, ∀i ∈ [N]:

[ s^(i) − h^⊤ψ_i + λ‖ξ_T^(i)‖₂²      ⋆          ⋆  ]
[ 2λξ_T^(i) + H^⊤ψ_i              4(λI − Q)     ⋆  ]  ⪰ 0 ,   (19b)
[ H^⊤µ_i                              0         4Q ]

[ α        ⋆      ]
[ H^⊤µ_i   λI − Q ]  ⪰ 0 .   (19c)

Moreover, (19a) holds with equality and (19c) is inactive if the optimum λ⋆ of λ satisfies λ⋆ I ≻ Q.

Proof: This result is obtained by taking the limit of (16) when the number J of pieces tends to infinity. The detailed derivations are presented in Appendix E.

We stress that our results continue to hold even if H = 0 and h = 0, that is, if Ξ = R^d. In this case, (19) simplifies substantially.

Corollary 5: Lemma 4 also holds if Ξ = R^d, and (19) simplifies into

R(Q) = inf_{λ≥0}  λε + (1/N) Σ_{i∈[N]} s^(i) ,   (20a)

subject to

[ s^(i) + λ‖ξ_T^(i)‖₂²   ⋆      ]
[ λξ_T^(i)               λI − Q ]  ⪰ 0 .   (20b)

Proof: If Ξ = R^d, the problem (13) falls within the assumptions of [2, Theorem 11]. Additionally, we observe that, when H = 0 and h = 0, (20b) has the same Schur complement as (19b), and (19c) is always satisfied.

To understand the effect of having restricted our attention to distributions with bounded support, it is of interest to compare (19) with (20). In both problems, the presence of the term λI − Q in (19b) and (20b) implies that any feasible solution has a shadow cost λ greater than or equal to λ_max(Q). On the other hand, for (20b) to be feasible, λ should be large enough to guarantee s^(i) + λ‖ξ_T^(i)‖₂² ≥ 0, whereas the presence of the additional term −h^⊤ψ_i in the top-left entry of (19b) softens this requirement, demonstrating the helpful contribution of the bounded support.

C. Convex formulation of DRInC design

Our results of Section IV-B do not directly allow us to solve (13), as (12) shows that Q depends quadratically on Φ and may also be rank-deficient. In this subsection, we mitigate the issues associated with quadratic matrix inequalities by employing a Schur complement, and we address singularity concerns by examining the behavior of the system as Q approaches singularity, showing that this limit remains well-behaved.

Lemma 6: Under Assumption 2, if ε > 0 and Ξ is bounded, the optimal closed-loop map Φ⋆ in (13) is given by

Φ⋆ = arg min_{Φ achievable, Q}  lim_{η→0} R(Q + |η|I) ,   (21a)

subject to

[ Q          ⋆ ]
[ D^{1/2}Φ   I ]  ⪰ 0 .   (21b)

Proof: The proof can be found in Appendix F.

We continue our derivations by presenting an equivalent convex reformulation of the safety constraints in (14). In particular, in the next proposition, we embed the function max{· − τ, 0} in (4) as a (J+1)-th constraint.

Lemma 7: Under Assumption 2 and if ε > 0, the constraints (14) can be reformulated as the following convex LMIs

ρε + ((γ − 1)/γ) τ + (1/N) Σ_{i∈[N]} ζ^(i) ≤ 0 ,  ρ ≥ 0 ,   (22a)

∀i ∈ [N], ∀j ∈ [J+1]:  κ_ij ≥ 0 ,   (22b)

[ ζ^(i) − (1/γ)(G_j^⊤ Φ ξ_T^(i) + g_j) − (H ξ_T^(i) + h)^⊤ κ_ij    ⋆     ]
[ (1/γ) Φ^⊤ G_j − H^⊤ κ_ij                                       4ργ² I ]  ⪰ 0 ,   (22c)

where G_{J+1} = 0 and g_{J+1} = τ.

Proof: The proof can be found in Appendix G.

Leveraging Lemmas 4, 6, and 7, we are now ready to reformulate (13) subject to (14) as an SDP.

Theorem 8: Under Assumption 2 and if ε > 0, the closed-loop map given by

Φ⋆ = arg min_{Φ achievable}  inf_{Q, s^(i), ζ^(i), τ, λ≥0, ρ≥0, α≥0, µ_i≥0, κ_ij≥0, ψ_i≥−µ_i}  λε + (1/N) Σ_{i∈[N]} s^(i) ,

subject to

(21b), (22a),
(19b), (19c), ∀i ∈ [N],
(22c), ∀i ∈ [N], j ∈ [J+1],

is stable and satisfies the safety constraints (14). Moreover, it optimizes (13) if Ξ is bounded and the optimizer λ⋆ is greater than λ_max(Φ⋆^⊤ D Φ⋆).

Proof: We first highlight that Φ is FIR and therefore stable by definition. Second, the safety constraints (14) are equivalent to (22), as shown in Lemma 7. Third, consider a closed-loop map Φ̂ which optimizes the expectation of ξ_T^⊤ Φ̂^⊤ D Φ̂ ξ_T + |η|‖ξ_T‖₂² for η ≠ 0. With Q = Φ^⊤ D Φ + |η|I ≻ 0, Lemma 4 shows that R(Φ^⊤ D Φ) is tightly upper-bounded by (19). Fourth and finally, as shown in Lemma 6, taking the limit η → 0 yields Φ̂ → Φ⋆ from (13), which concludes the proof.

We remark that the reformulation proposed in Theorem 8 is exact whenever the true shadow cost of robustification λ is greater than or equal to λ_max(Q), a condition which is always satisfied for sufficiently small ε as per Proposition 3. When λ is lower than λ_max(Q), the solution computed using Theorem 8 may instead be suboptimal. Nevertheless, our solution retains safety and stability guarantees in face of the
uncertain distribution, since neither (22) nor the achievability constraints depend on λ.

V. CONCLUSION

We have presented an end-to-end synthesis method from a collection of a finite number of disturbance realizations to the design of a stabilizing linear policy with DR safety and performance guarantees. Our approach consists in estimating an empirical distribution using samples of the uncertainty, and then computing a feedback policy that safely minimizes the worst-case expected cost over all distributions within a Wasserstein ball around the nominal estimate through the solution of an SDP. We have shown that, as the radius of this ambiguity set varies, our problem statement recovers classical control formulations. To address the resulting optimal control problem, we have established a novel tight convex relaxation for DRO of quadratic objectives, and we have combined our results with the system level synthesis framework, presenting conditions under which our design method is non-conservative.

Future work will validate the effectiveness of our approach by means of numerical simulations and real-world experiments.

REFERENCES

[1] P. Mohajerin Esfahani and D. Kuhn, "Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations," Mathematical Programming, vol. 171, no. 1-2, pp. 115–166, 2018.
[2] D. Kuhn, P. M. Esfahani, V. A. Nguyen, and S. Shafieezadeh-Abadeh, "Wasserstein distributionally robust optimization: Theory and applications in machine learning," in Operations Research & Management Science in the Age of Analytics. INFORMS, 2019, pp. 130–166.
[3] S. Shafieezadeh-Abadeh, L. Aolaritei, F. Dörfler, and D. Kuhn, "New perspectives on regularization and computation in optimal transport-based distributionally robust optimization," arXiv preprint arXiv:2303.03900, 2023.
[4] A. L. Gibbs and F. E. Su, "On choosing and bounding probability metrics," International Statistical Review, vol. 70, no. 3, pp. 419–435, 2002.
[5] C. Villani et al., Optimal Transport: Old and New. Springer, 2009, vol. 338.
[6] Z. Chen, D. Kuhn, and W. Wiesemann, "Data-driven chance constrained programs over Wasserstein balls," Operations Research, 2022.
[7] C. Frogner, C. Zhang, H. Mobahi, M. Araya, and T. A. Poggio, "Learning with a Wasserstein loss," Advances in Neural Information Processing Systems, vol. 28, 2015.
[8] D. O. Adu, T. Başar, and B. Gharesifard, "Optimal transport for a class of linear quadratic differential games," IEEE Transactions on Automatic Control, vol. 67, no. 11, pp. 6287–6294, 2022.
[9] B. Taşkesen, D. A. Iancu, Ç. Koçyiğit, and D. Kuhn, "Distributionally robust linear quadratic control," arXiv preprint arXiv:2305.17037, 2023.
[10] C. Mark and S. Liu, "Stochastic MPC with distributionally robust chance constraints," IFAC-PapersOnLine, vol. 53, no. 2, pp. 7136–7141, 2020.
[11] M. Fochesato and J. Lygeros, "Data-driven distributionally robust bounds for stochastic model predictive control," in 2022 IEEE 61st Conference on Decision and Control (CDC). IEEE, 2022, pp. 3611–3616.
[12] L. Aolaritei, M. Fochesato, J. Lygeros, and F. Dörfler, "Wasserstein tube MPC with exact uncertainty propagation," arXiv preprint arXiv:2304.12093, 2023.
[13] I. Yang, "Wasserstein distributionally robust stochastic control: A data-driven approach," IEEE Transactions on Automatic Control, vol. 66, no. 8, pp. 3863–3870, 2020.
[14] K. Kim and I. Yang, "Distributional robustness in minimax linear quadratic control with Wasserstein distance," SIAM Journal on Control and Optimization, vol. 61, no. 2, pp. 458–483, 2023.
[15] V. Krishnan and S. Martínez, "A probabilistic framework for moving-horizon estimation: Stability and privacy guarantees," IEEE Transactions on Automatic Control, vol. 66, no. 4, pp. 1817–1824, 2020.
[16] J.-S. Brouillon, F. Dörfler, and G. Ferrari-Trecate, "Regularization for distributionally robust state estimation and prediction," arXiv preprint arXiv:2304.09921, 2023.
[17] L. Aolaritei, N. Lanzetti, H. Chen, and F. Dörfler, "Distributional uncertainty propagation via optimal transport," arXiv preprint arXiv:2205.00343, 2023.
[18] L. Aolaritei, N. Lanzetti, and F. Dörfler, "Capture, propagate, and control distributional uncertainty," arXiv preprint arXiv:2304.02235, 2023.
[19] Y.-S. Wang, N. Matni, and J. C. Doyle, "A system-level approach to controller synthesis," IEEE Transactions on Automatic Control, vol. 64, no. 10, pp. 4079–4093, 2019.
[20] J. Coulson, J. Lygeros, and F. Dörfler, "Distributionally robust chance constrained data-enabled predictive control," IEEE Transactions on Automatic Control, vol. 67, no. 7, pp. 3289–3304, 2021.
[21] A. Hakobyan and I. Yang, "Wasserstein distributionally robust control of partially observable linear systems: Tractable approximation and performance guarantee," in 2022 IEEE 61st Conference on Decision and Control (CDC). IEEE, 2022, pp. 4800–4807.
[22] K. I. Park, Fundamentals of Probability and Stochastic Processes with Applications to Communications. Springer, 2018.
[23] N. Fournier and A. Guillin, "On the rate of convergence in Wasserstein distance of the empirical measure," Probability Theory and Related Fields, vol. 162, no. 3-4, pp. 707–738, 2015.
[24] T. S. Badings, A. Abate, N. Jansen, D. Parker, H. A. Poonawala, and M. Stoelinga, "Sampling-based robust control of autonomous systems with non-Gaussian noise," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 9, 2022, pp. 9669–9678.
[25] L. Blackmore, M. Ono, A. Bektassov, and B. C. Williams, "A probabilistic particle-control approximation of chance-constrained stochastic predictive control," IEEE Transactions on Robotics, vol. 26, no. 3, pp. 502–517, 2010.
[26] B. Hassibi, A. H. Sayed, and T. Kailath, Indefinite-Quadratic Estimation and Control: A Unified Approach to H2 and H∞ Theories. SIAM, 1999.
[27] K. Zhou and J. C. Doyle, Essentials of Robust Control. Prentice Hall, Upper Saddle River, NJ, 1998, vol. 104.
[28] J. Anderson, J. C. Doyle, S. H. Low, and N. Matni, "System level synthesis," Annual Reviews in Control, vol. 47, pp. 364–393, 2019.
[29] S. Dean, H. Mania, N. Matni, B. Recht, and S. Tu, "On the sample complexity of the linear quadratic regulator," Foundations of Computational Mathematics, vol. 20, no. 4, pp. 633–679, 2020.
[30] B. P. Van Parys, D. Kuhn, P. J. Goulart, and M. Morari, "Distributionally robust control of constrained stochastic systems," IEEE Transactions on Automatic Control, vol. 61, no. 2, pp. 430–442, 2015.
[31] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[32] J. Borwein and A. Lewis, Convex Analysis and Nonlinear Optimization: Theory and Examples. Springer New York, 2005.
[33] R. G. Bartle, The Elements of Integration and Lebesgue Measure. John Wiley & Sons, 2014.
[34] M. Sion, "On general minimax theorems," Pacific Journal of Mathematics, vol. 8, no. 4, pp. 171–176, 1958.

APPENDIX

A. SLS controller implementation

From [19],

δ = (I − zΦ_xw(z)) δ − Φ_xv(z) y ,
u = zΦ_uw(z) δ + Φ_uv(z) y ,
which means that at each timestep t ≥ T , one has ∀k = 1, . . . , T and Φ(T +1) = 0. In matrix form, this yields
T
X T
X 0 I ... 0 0 0
δt = δt − Φxw (k)δt−k+1 − Φxv (k)yt−k , .. .. . . .. .. ..
. . . .
. .
k=1 k=1 0
0 0 . . . I 0 0
T
X T
X
0 0 ... 0 I 0
ut = Φuw (k)δt−k+1 + Φuv (k)yt−k . (23)
0 0 ... 0 0 I
k=1 k=0
0 I ... 0 0 0
The achievability constraints (24) imply Φxw (1) = I and [I, 0]Φ .. .. . . .. .. ..
. . . . . .
Φuw (1) = Φuv (0)C, (see Appendix B). Hence, (23) can be 0 0 . . . I 0 0
0
reformulated at as 0 0 . . . 0 I 0
T
X −1 T
X 0 0 ... 0 0 I
δt = − Φxw (k + 1)δt−k − Φxv (k)yt−k ,
k=1 k=1
|} | {z } |} |} |} | {z } |} |}
(25g) (25c) (25b) (25a) (25g) (25d) (25b) (25a)
T
X −1 T
X
ut = Φuv (0)Cδt + Φuw (k + 1)δt−k + Φuv (k)yt−k . = [0, 0, . . . , 0, I, 0], [0, 0, . . . , 0, 0, 0] +
k=1 k=0 I 0 ... 0 0 0
0 I . . . 0 0 0
Writing this controller implementation in matrix form and . . .
. . . . .. .. ..
noting that Φxv (0) = 0 yields (11). . . . . . 0
0 0 . . . I 0 0
B. Infinite horizon achievability

Proposition 9: The achievability constraints (9) are equivalent to
\[
[I,\,0]\,\Phi
\begin{bmatrix} Z^{-}\otimes I_n & 0 \\ 0 & Z^{-}\otimes I_p \end{bmatrix}
= [A,\,B]\,\Phi
\begin{bmatrix} Z^{+}\otimes I_n & 0 \\ 0 & Z^{+}\otimes I_p \end{bmatrix}
+ \big[\,Z^{+}_{T+1}\otimes I_n,\; Z^{+}_{T+1}\otimes 0\,\big], \tag{24a}
\]
\[
\Phi
\begin{bmatrix} Z^{-}\otimes I_n \\ Z^{-}\otimes (0C) \end{bmatrix}
= \Phi
\begin{bmatrix} Z^{+}\otimes A \\ Z^{+}\otimes C \end{bmatrix}
+
\begin{bmatrix} Z^{+}_{T+1}\otimes I_n \\ Z^{+}_{T+1}\otimes (0C) \end{bmatrix}, \tag{24b}
\]
where Z^+ = [I_{T+1}, 0] and Z^- = [0, I_{T+1}] are in R^{(T+1)×(T+2)}, and Z^+_{T+1} is the last row of Z^+.

Proof: By treating Φ_x and Φ_u as FIR filters:
\[
\sum_{k=1}^{T} \Phi_{xw}(k)z^{-k+1} - A\Phi_{xw}(k)z^{-k} - B\Phi_{uw}(k)z^{-k} = I,
\]
\[
\sum_{k=1}^{T} \Phi_{xv}(k)z^{-k+1} - A\Phi_{xv}(k)z^{-k} - B\Phi_{uv}(k)z^{-k} = B\Phi_{uv}(0),
\]
\[
\sum_{k=1}^{T} \Phi_{xw}(k)z^{-k+1} - \Phi_{xw}(k)Az^{-k} - \Phi_{xv}(k)Cz^{-k} = I,
\]
\[
\sum_{k=1}^{T} \Phi_{uw}(k)z^{-k+1} - \Phi_{uw}(k)Az^{-k} - \Phi_{uv}(k)Cz^{-k} = \Phi_{uv}(0)C,
\]
which is equivalent to
\[
\Phi_{xw}(0) = 0, \quad \Phi_{xv}(0) = 0, \quad \Phi_{uw}(0) = 0, \tag{25a}
\]
\[
\Phi_{xw}(1) = I, \quad \Phi_{xv}(1) = B\Phi_{uv}(0), \quad \Phi_{uw}(1) = \Phi_{uv}(0)C, \tag{25b}
\]
\[
\Phi_{xw}(k+1) = A\Phi_{xw}(k) + B\Phi_{uw}(k), \quad \forall k = 1, \dots, T, \tag{25c}
\]
\[
\Phi_{xv}(k+1) = A\Phi_{xv}(k) + B\Phi_{uv}(k), \quad \forall k = 1, \dots, T, \tag{25d}
\]
\[
\Phi_{xw}(k+1) = \Phi_{xw}(k)A + \Phi_{xv}(k)C, \quad \forall k = 1, \dots, T, \tag{25e}
\]
\[
\Phi_{uw}(k+1) = \Phi_{uw}(k)A + \Phi_{uv}(k)C, \quad \forall k = 1, \dots, T, \tag{25f}
\]
\[
\Phi(T+1) = 0. \tag{25g}
\]
In matrix form, the relations (25a)–(25g) can be written compactly as (24), which concludes the proof.

C. Proof of Proposition 2

The risk (15) is contingent on three mathematical objects:
1) a loss function max_{j∈[J]} ℓ_j(ξ_T) = max_{j∈[J]} a_j^⊤ξ_T + b_j,
2) a transport cost c(ξ_T^{(i)}, ξ_T) = ‖ξ_T − ξ_T^{(i)}‖₂²,
3) and a support Ξ = {ξ : max_{k∈[n_H]} f_k(ξ) ≤ 0}, where n_H is the number of rows in H and f_k(ξ) = H_k ξ − h_k.

Moreover, since the loss is concave and both the transport cost and the support are convex, (15) shows strong duality properties if and only if it is strictly feasible. The strict feasibility is guaranteed by the full-dimensionality of Ξ and the strict positivity of ǫ. The dual problem is given by [3] as
\[
\inf_{\lambda\ge 0}\ \lambda\epsilon + \frac{1}{N}\sum_{i\in[N]} s^{(i)},
\qquad \text{subject to} \quad
\sup_{\xi_T\in\Xi}\ \ell(\xi_T) - \lambda c(\xi_T,\xi_T^{(i)}) \le s^{(i)}, \ \forall i\in[N].
\]
While the dual problem does not seem much simpler to solve than the primal at first glance, we use [3, Proposition 2.12] to reformulate it using convex conjugates. In our own notation, this gives
\[
\inf_{\lambda\ge 0,\ \kappa_{ijk}\ge 0}\ \lambda\epsilon + \frac{1}{N}\sum_{i\in[N]} s^{(i)},
\qquad \text{subject to},\ \forall i\in[N],\ \forall j\in[J]:
\]
\[
s^{(i)} \ge (-\ell_j)^{\star}\big(\zeta^{\ell}_{ij}\big)
+ \lambda\, c^{\star}\Big(\frac{\zeta^{c}_{ij}}{\lambda},\, \hat{\xi}^{(i)}\Big)
+ \sum_{k\in[n_H]} \kappa_{ijk}\, f_k^{\star}\Big(\frac{\zeta^{f}_{ijk}}{\kappa_{ijk}}\Big),
\qquad
\zeta^{\ell}_{ij} + \zeta^{c}_{ij} + \sum_{k\in[n_H]} \zeta^{f}_{ijk} = 0. \tag{26}
\]

from the boundary of Ξ. Third and finally, let w_max(Q) be an eigenvector of Q associated with λ_max(Q). The distribution Q⋆ satisfies
\[
\frac{dR(Q)}{d\epsilon}
= \lim_{d\epsilon\to 0^+} \frac{1}{d\epsilon}
\sup_{Q'\in B_{d\epsilon}(Q^\star)}
\mathbb{E}_{\substack{\xi_T'\sim Q' \\ \xi_T\sim Q^\star}}
\Big[\xi_T'^{\,\top} Q\,\xi_T' - \xi_T^{\top} Q\,\xi_T\Big]
\]
\[
= \lim_{d\epsilon\to 0^+} \frac{1}{d\epsilon}
\sup_{Q'\in B_{d\epsilon}(Q^\star)}
\mathbb{E}_{\substack{\xi_T'\sim Q' \\ \xi_T\sim Q^\star}}
\Big[(\xi_T'-\xi_T)^\top Q(\xi_T'-\xi_T) - 2(\xi_T'-\xi_T)^\top Q\,\xi_T\Big]
\]
\[
\ge \lim_{d\epsilon\to 0^+} \frac{1}{d\epsilon}
\max_{\delta\|d\xi\|_2^2\le d\epsilon} \delta\,\lambda_{\max}(Q)\,\|d\xi\|_2^2, \tag{28}
\]
because moving δ of Q⋆'s mass by ‖dξ‖ ≤ √(δ⁻¹ dǫ) in the direction of ±w_max(Q) to obtain Q′ remains in B_dǫ(Q⋆) if dǫ ≤ δ², which is true in the limit dǫ → 0⁺. Hence, inequality (28) implies that
\[
\lambda^\star = \frac{dR(Q)}{d\epsilon}
\ge \lim_{d\epsilon\to 0^+} \frac{1}{d\epsilon}\,\lambda_{\max}(Q)\,d\epsilon
= \lambda_{\max}(Q),
\]
which concludes the proof.
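To make the per-sample dual constraint concrete, the following one-dimensional sketch evaluates s^{(i)} = sup_{ξ∈Ξ} ℓ(ξ) − λc(ξ, ξ^{(i)}) in closed form for an affine loss, the quadratic transport cost, and a box support. The function names and all numerical values are illustrative, not part of the paper's formulation.

```python
# One-dimensional illustration of the per-sample dual constraint
#   s_i = sup_{xi in Xi} l(xi) - lam * c(xi, xi_i),
# with affine loss l(xi) = a*xi + b, quadratic transport cost c = (xi - xi_i)^2,
# and box support Xi = [lo, hi]. Names and numbers are illustrative.

def s_i(a, b, lam, xi_i, lo, hi):
    # The objective is concave in xi; its unconstrained maximizer
    # xi_i + a/(2*lam) is clipped onto the support.
    xi_star = min(max(xi_i + a / (2 * lam), lo), hi)
    return a * xi_star + b - lam * (xi_star - xi_i) ** 2

def dual_value(lam, eps, a, b, samples, lo, hi):
    # Dual objective: lam*eps + (1/N) * sum_i s_i.
    return lam * eps + sum(s_i(a, b, lam, x, lo, hi) for x in samples) / len(samples)

val = dual_value(lam=2.0, eps=0.1, a=1.0, b=0.0,
                 samples=[0.0, 0.5, -0.5], lo=-1.0, hi=1.0)
```

In higher dimensions and with piecewise-affine losses this inner supremum is no longer available in closed form, which is precisely why the conjugate reformulation (26) is used instead.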
where r(Ξ) < ∞ is the radius of a ball containing Ξ, which is finite because Ξ is bounded. This gives the following inequality
\[
\xi_T^\top Q\,\xi_T - \Delta_Q J^{-\frac{1}{d}}
\le \max_{j\in[J]}\ 2\xi_T^\top Q\,\xi_j - \xi_j^\top Q\,\xi_j
\le \xi_T^\top Q\,\xi_T,
\]
where ∆_Q = 2r(Ξ)√d λ_max(Q). Finally, if all points of a function satisfy an inequality, its supremum must satisfy it as well, hence
\[
R(Q) - \Delta_Q J^{-\frac{1}{d}}
\le \sup_{Q\in B_\epsilon} \mathbb{E}_{\xi_T\sim Q} \max_{j\in[J]}\ 2\xi_T^\top Q\,\xi_j - \xi_j^\top Q\,\xi_j
\le R(Q).
\]
The limit lim_{J→∞} R(Q) − ∆_Q J^{−1/d} is equal to R(Q). Therefore, the supremum of the piecewise-linear approximation is squeezed into the equality
\[
\lim_{J\to\infty} \sup_{Q\in B_\epsilon} \mathbb{E}_{\xi_T\sim Q} \max_{j\in[J]}\ 2\xi_T^\top Q\,\xi_j - \xi_j^\top Q\,\xi_j = R(Q).
\]

In general, one has max_{ξ̄∈Ξ} min_{κ_i≥0} f(ξ̄, κ_i, λ) ≤ min_{κ_i≥0} max_{ξ̄∈Ξ} f(ξ̄, κ_i, λ). This means that (29b) is a stricter constraint than (30a), yielding a larger infimum. Nevertheless, if f is not only convex in κ but also concave in ξ, then Sion's minimax theorem proves that the max and min operators commute [34, Corollary 3.3]. This means that if Q − λ⁻¹Q² ⪰ 0, then (29b) and (30a) are equivalent, which concludes the proof.
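The piecewise-linear under-approximation behind the squeeze argument can be checked numerically in the scalar case; Q, the anchor grid, and the names below are illustrative.

```python
# Scalar check of the piecewise-linear under-approximation used in the squeeze
# argument: each tangent 2*Q*xi*xi_j - Q*xi_j**2 lower-bounds Q*xi**2, with
# equality at its anchor xi_j. Q and the anchor grid are illustrative.

Q = 1.5
anchors = [-1.0, -0.5, 0.0, 0.5, 1.0]  # xi_j on a uniform grid over [-1, 1]

def pwl(xi):
    # max_j 2*xi*Q*xi_j - xi_j*Q*xi_j
    return max(2 * Q * xi * xj - Q * xj * xj for xj in anchors)

def quad(xi):
    return Q * xi * xi

grid = [i / 100 - 1 for i in range(201)]
gap = max(quad(x) - pwl(x) for x in grid)  # worst-case approximation error
exact = all(abs(pwl(xj) - quad(xj)) < 1e-12 for xj in anchors)
```

With anchor spacing h = 0.5 the worst-case gap is Q(h/2)² = 0.09375, attained midway between anchors; refining the grid shrinks it at the rate mirrored by the ∆_Q J^{−1/d} term above.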
The second part of the proof aims at bringing the limit back into the problem and evaluating it. Using the previous result and Proposition 2 with a_j = 2Qξ_j and b_j = −ξ_j^⊤Qξ_j, we know that R(Q) as defined in (17) is equal to the value of the dual program whose constraints read
\[
s^{(i)} \ge \lim_{J\to\infty} \max_{j\in[J]} \min_{\kappa_{ij}\ge 0} f(\xi_j, \kappa_{ij}, \lambda)\,, \quad \forall i \in [N],
\]
or equivalently,
\[
\frac14\big(2\lambda\xi_T^{(i)}+H^\top\kappa_i\big)^\top\big(\lambda^2Q^{-1}-\lambda I\big)^{\dagger}\big(2\lambda\xi_T^{(i)}+H^\top\kappa_i\big) \tag{33a}
\]
\[
-\ \frac12\big(2\lambda\xi_T^{(i)}+H^\top\kappa_i\big)^\top(\lambda I - Q)^{\dagger}H^\top\mu_i \tag{33b}
\]
\[
+\ \frac14\,\mu_i^\top HQ^2H^\top\mu_i\,, \tag{33c}
\]
In order to obtain some simplifications, we use the following Woodbury-like identities:
\[
\big(\lambda^2Q^{-1}-\lambda I\big)^{\dagger}
= \frac{1}{\lambda^2}\Big(Q^{-1}-\frac{1}{\lambda} I\Big)^{\dagger}
= \cdots
= (\lambda I-Q)^{\dagger} - \frac{1}{\lambda}\,(I - P_\lambda). \tag{34a}
\]

Using Proposition 10, we are now ready to prove Lemma 4 by dualizing (29b) to remove the max operator, and by using Schur's complement to obtain linear inequalities. We start by highlighting that (29b) contains the maximization of the quadratic cost
\[
-\,\bar\xi^\top\big(Q-\lambda^{-1}Q^2\big)\bar\xi
\;-\; \bar\xi^\top\big(2Q\xi_T^{(i)}+\lambda^{-1}QH^\top\kappa_i\big)
\;+\; \frac{1}{4\lambda}\,\kappa_i^\top HH^\top\kappa_i + \big(H\xi_T^{(i)}+h\big)^\top\kappa_i,
\]
where the first term is quadratic, the second linear, and the last two constant in ξ̄. Dualizing this inner maximization leads to problem (38), whose objective includes
\[
-\,\frac12\Big(2Q\xi_T^{(i)}+\frac{1}{\lambda}(HQ)^\top\kappa_i\Big)^{\!\top} Q^2H^\top\mu_i
+ \frac14\,\mu_i^\top HQ^2H^\top\mu_i.
\]
In general, the right-hand side of (29b) is smaller than (38), which means that s^{(i)} ≥ (38) implies (29b). Moreover, if λI − Q ≻ 0, the problem (38) is a convex and strictly feasible QP. Strong duality therefore shows that the right-hand side of (29b) is equal to (38) in this case. Finally, we replace the upper bound on a minimum by an existence constraint and perform the change of variable ψ_i = κ_i − µ_i to rewrite (29b) as
\[
s^{(i)} \ge h^\top\psi_i - \lambda\big\|\xi_T^{(i)}\big\|_2^2
+ \frac14\,\mu_i^\top HQ^{-1}H^\top\mu_i
+ \frac14\big(2\lambda\xi_T^{(i)}+H^\top\psi_i\big)^\top(\lambda I-Q)^{\dagger}\big(2\lambda\xi_T^{(i)}+H^\top\psi_i\big).
\]
Applying Schur's lemma to the two quadratic terms above and using (32), we obtain
\[
R(Q) \le \inf_{\lambda\ge 0,\ \mu_i\ge 0,\ \psi_i\ge-\mu_i}\ \lambda\epsilon + \frac{1}{N}\sum_{i\in[N]} s^{(i)},
\]
subject to, for all i ∈ [N] and j ∈ [J+1]:
\[
s^{(i)} \ge \frac{1}{\gamma}\big(g_j-\tau+\gamma\tau\big)
- \frac{1}{\gamma}G_j^\top\Phi\xi_T^{(i)}
+ \big(H\xi_T^{(i)}+h\big)^\top\kappa_{ij}
+ \frac{\|H^\top\kappa_{ij}\|_2^2}{4\rho}
- \frac{1}{2\rho\gamma}G_j^\top\Phi H^\top\kappa_{ij}
+ \frac{\|\Phi^\top G_j\|_2^2}{4\rho\gamma^2}.
\]
One can factorize the last three terms of the constraint and perform the change of variable ζ^{(i)} = s^{(i)} + γ⁻¹τ − τ, which gives
\[
\inf_{\rho\ge 0,\ \kappa_{ij}\ge 0}\ \rho\epsilon - \tau + \frac{1}{\gamma}\tau + \frac{1}{N}\sum_{i\in[N]} \zeta^{(i)} \le 0, \tag{43a}
\]
subject to, for all i ∈ [N] and j ∈ [J+1]:
\[
\zeta^{(i)} \ge \frac{1}{\gamma}g_j
- \frac{1}{\gamma}G_j^\top\Phi\xi_T^{(i)}
+ \big(H\xi_T^{(i)}+h\big)^\top\kappa_{ij}
+ \frac{1}{4\rho\gamma^2}\big(\Phi^\top G_j-\gamma H^\top\kappa_{ij}\big)^\top\big(\Phi^\top G_j-\gamma H^\top\kappa_{ij}\big). \tag{43b}
\]
Finally, a zero upper-bound constraint on an infimum is equivalent to an existence constraint. Moreover, because ρ ≥ 0, (43b) can be written as an LMI using Schur's complement, which concludes the proof.
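The CVaR functional (4) that underlies the reformulation above is easy to evaluate on an empirical distribution: the objective is convex and piecewise linear in τ, so the infimum is attained at one of the sample losses and a scan suffices. The loss values below are illustrative.

```python
# Sample-based evaluation of the CVaR formula (4):
#   CVaR_gamma(g) = inf_tau { tau + (1/gamma) * E[max(g - tau, 0)] }.
# On an empirical distribution the objective is convex piecewise-linear in tau,
# so the infimum is attained at one of the sample losses; scanning them suffices.
# The loss values are illustrative.

def cvar(samples, gamma):
    n = len(samples)
    def objective(tau):
        return tau + sum(max(g - tau, 0.0) for g in samples) / (gamma * n)
    return min(objective(tau) for tau in samples)

losses = [0.0, 1.0, 2.0, 3.0]
val = cvar(losses, gamma=0.25)  # average of the worst 25% of samples
```

For γ = 0.25 and four samples this is the single worst loss; for γ = 0.5 it is the mean of the two worst losses; and for γ = 1 it degenerates to the plain expectation.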