Optimal Transport: Fast Probabilistic Approximation With Exact Solvers
Abstract
We propose a simple subsampling scheme for fast randomized approximate computation of optimal
transport distances on finite spaces. This scheme operates on a random subset of the full data and
can use any exact algorithm as a black-box back-end, including state-of-the-art solvers and entropically penalized versions. It is based on averaging the exact distances between empirical measures
generated from independent samples from the original measures and can easily be tuned towards
higher accuracy or shorter computation times. To this end, we give non-asymptotic deviation
bounds for its accuracy in the case of discrete optimal transport problems. In particular, we show
that in many important instances, including images (2D-histograms), the approximation error is
independent of the size of the full problem. We present numerical experiments that demonstrate
that a very good approximation in typical applications can be obtained in a computation time
that is several orders of magnitude smaller than what is required for exact computation of the full
problem.
1. Introduction
Optimal transport distances, a.k.a. Wasserstein, earth-mover’s, Monge-Kantorovich-Rubinstein or
Mallows distances, as metrics to compare probability measures (Rachev and Rüschendorf, 1998;
Villani, 2008) have become a popular tool in a wide range of applications in computer science,
machine learning and statistics. Important examples are image retrieval (Rubner et al., 2000) and
classification (Zhang et al., 2007), computer vision (Ni et al., 2009), but also therapeutic equivalence
(Munk and Czado, 1998), generative modeling (Bousquet et al., 2017), biometrics (Sommerfeld and
Munk, 2018), metagenomics (Evans and Matsen, 2012) and medical imaging (Ruttenberg et al.,
2013).
Optimal transport distances compare probability measures by incorporating a suitable ground distance on the underlying space, typically driven by the particular application, e.g. the Euclidean distance. This often makes them preferable to competing distances such as the total variation or $\chi^2$ distances, which are oblivious to any metric or similarity structure on the ground space. Note that total variation is the Wasserstein distance with respect to the trivial metric, which usually does not carry the geometry of the underlying ground space. In this setting, optimal transport distances have a clear and intuitive interpretation as the amount of 'work' required to transport one probability distribution onto the other. This notion is typically well-aligned with human perception of similarity (Rubner et al., 2000).
1.1. Computation
The outstanding theoretical and practical performance of optimal transport distances is contrasted by their excessive computational cost. For example, optimal transport distances can be computed with an auction algorithm (Bertsekas, 1992). For two probability measures supported on N points this algorithm has a worst-case run time of $O(N^3 \log N)$. Other methods like the transportation simplex have sub-cubic empirical average runtime (compare Gottschlich and Schuhmacher, 2014), but exponential worst-case runtimes.
Therefore, many attempts have been made to design improved algorithms. We give some selective references: Ling and Okada (2007) proposed a specialized algorithm for the $L_1$ ground distance and $\mathcal{X}$ a regular grid, and report an empirical runtime of $O(N^2)$. Gottschlich and Schuhmacher (2014) improved existing general-purpose algorithms by initializing with a greedy heuristic. Their Shortlist algorithm achieves an empirical average runtime of the order $O(N^{5/2})$. Schmitzer (2016) solves the optimal transport problem by solving a sequence of sparse problems. The theoretical runtime of his algorithm is not known, but it exhibits excellent performance on two-dimensional grids (Schrieber et al., 2016). The literature on this topic is rapidly growing, and we refer to Liu et al. (2018), Dvurechensky et al. (2018), Lin et al. (2019), and the references given there for further recent work.
Despite these efforts, many practically relevant problems still remain well outside the scope of available algorithms. See Schrieber et al. (2016) for an overview and a numerical comparison of state-of-the-art algorithms for discrete optimal transport. This is true in particular for two- or three-dimensional images and spatio-temporal imaging, which constitute an important area of potential applications. Here, N is the number of pixels or voxels and is typically of size $10^5$ to $10^7$. Naturally, this problem is aggravated when many distances have to be computed, as is the case for Wasserstein barycenters (Agueh and Carlier, 2011; Cuturi and Doucet, 2014), which have become an important use case.
To bypass the computational bottleneck, many surrogates for optimal transport distances that are more amenable to fast computation have also been proposed. Shirdhonkar and Jacobs (2008) proposed to use an equivalent distance based on wavelets that can be computed in linear time, but cannot be calibrated to approximate the Wasserstein distance with arbitrary accuracy. Pele and Werman (2009) threshold the ground distance to reduce the complexity of the underlying linear program, obtaining a lower bound for the exact distance. Cuturi (2013) altered the optimization problem by adding an entropic penalty term in order to use faster and more stable algorithms; see
Figure 1: Relative error and relative runtime of the proposed scheme compared to the exact computation. Optimal transport distances and their approximations were computed between images of different sizes (32 × 32, 64 × 64, 128 × 128). Each point represents a specific parameter choice in the scheme and is a mean over different problem instances, solvers and cost exponents. For the relative runtimes the geometric mean is reported. For details on the parameters see Figure 2.
also Altschuler et al. (2017). Bonneel et al. (2015) consider the 1D Wasserstein distances of radial
projections of the original measures, exploiting the fact that, in one dimension, computing the
Wasserstein distance amounts to sorting the point masses and hence has quasi-linear computation
time.
1.2. Contribution
We do not propose a new algorithm to solve the optimal transport problem. Instead, we propose a
simple probabilistic scheme as a meta-algorithm that can use any algorithm (e.g., those mentioned
above) solving finitely supported optimal transport problems as a black-box back-end and gives a
random but fast approximation of the exact distance. This scheme
a) is extremely easy to implement, to parallelize and to tune towards higher accuracy or shorter
computation time as desired (see Figure 1);
b) can be used with any algorithm for transportation problems as a back-end, including general LP
solvers, specialized network solvers and algorithms using entropic penalization (Cuturi, 2013);
c) comes with theoretical non-asymptotic guarantees for the approximation error of the Wasserstein
distance—in particular, this error is independent of the size of the original problem in many
important cases, including images;
d) works well in practice. For example, the Wasserstein distance between two $128^2$-pixel images can typically be approximated with a relative error of less than 5% in only 1% of the time required for exact computation.
via $P_r(\{x\}) = r_x$. We will not distinguish between the vector $r$ and the measure it defines. For $p \ge 1$, the $p$-th Wasserstein distance between two probability measures $r, s \in \mathcal{P}_{\mathcal{X}}$ is defined as
\[
W_p(r, s) = \left( \min_{w \in \Pi(r,s)} \sum_{x, x' \in \mathcal{X}} d^p(x, x')\, w_{x,x'} \right)^{1/p}, \tag{1}
\]
where $\Pi(r, s)$ is the set of all probability measures on $\mathcal{X} \times \mathcal{X}$ with marginal distributions $r$ and $s$, respectively. The minimization in (1) can be written as the linear program
\[
\min_{w} \sum_{x, x' \in \mathcal{X}} w_{x,x'}\, d^p(x, x') \quad \text{s.t.} \quad \sum_{x' \in \mathcal{X}} w_{x,x'} = r_x, \quad \sum_{x \in \mathcal{X}} w_{x,x'} = s_{x'}, \quad w_{x,x'} \ge 0, \tag{2}
\]
with $N^2$ variables $w_{x,x'}$ and $2N$ constraints, where the weights $d^p(x, x')$ are known and have been precalculated.
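To make the size of this linear program concrete, the following is a minimal sketch (ours, not part of the original text) that solves (2) directly with SciPy's general-purpose LP solver; the function name and the convention that M holds the precalculated costs $d^p(x, x')$ are our own. It is only practical for small N, which is precisely the motivation for the subsampling scheme.

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein_lp(r, s, M):
    """Solve the linear program (2): returns W_p^p(r, s) for marginal
    vectors r, s of length N and an N x N cost matrix M = d^p(x, x').
    The LP has N^2 variables and 2N constraints, so this is only
    feasible for small N."""
    N = len(r)
    # sum_{x'} w_{x,x'} = r_x : one row of ones per source point
    A_rows = np.kron(np.eye(N), np.ones(N))
    # sum_{x} w_{x,x'} = s_{x'} : one row per target point
    A_cols = np.kron(np.ones(N), np.eye(N))
    # one marginal constraint is redundant (both marginals sum to 1)
    A_eq = np.vstack([A_rows, A_cols[:-1]])
    b_eq = np.concatenate([r, s[:-1]])
    res = linprog(M.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun  # take res.fun ** (1/p) to recover W_p
```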
In each of the B iterations of Algorithm 1, the Wasserstein distance between two sets of S point masses has to be computed. For the exact Wasserstein distance, two measures on N points need to be compared. If we take for example the super-cubic runtime of the auction algorithm as a basis, Algorithm 1 has worst-case runtime
\[
O(B S^3 \log S)
\]
compared to $O(N^3 \log N)$ for the exact distance. This means a dramatic reduction of computation time if S (and B) are small compared to N.
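Since Algorithm 1 itself is not reproduced here, the following sketch (ours, with the back-end solver left abstract) illustrates the scheme as described: draw S i.i.d. samples from each measure, solve the resulting small problem exactly, and average over B repetitions. The helper `wasserstein_lp` sketched above, or any exact solver with the same interface, can serve as the black box.

```python
import numpy as np

def subsampled_wasserstein(r, s, X, solver, S=1000, B=1, p=2, seed=None):
    """Sketch of Algorithm 1: approximate W_p(r, s) by averaging exact
    distances between empirical measures of size S.

    r, s   : probability vectors of length N
    X      : (N, D) array of support points
    solver : black-box exact solver, solver(a, b, M) -> optimal cost of
             the transport problem with marginals a, b and cost matrix M
    """
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(B):
        # S i.i.d. samples from each measure -> empirical measures
        i = rng.choice(len(r), size=S, p=r)
        j = rng.choice(len(s), size=S, p=s)
        xs, a = np.unique(i, return_counts=True)
        ys, b = np.unique(j, return_counts=True)
        # cost matrix d(x, x')^p restricted to the sampled supports
        diff = X[xs][:, None, :] - X[ys][None, :, :]
        M = np.linalg.norm(diff, axis=-1) ** p
        values.append(solver(a / S, b / S, M) ** (1.0 / p))
    return float(np.mean(values))
```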
The application of Algorithm 1 to other optimal transport distances is straightforward. One can simply replace $W_p(\hat r_S, \hat s_S)$ with the desired distance, e.g., the Sinkhorn distance (Cuturi, 2013); see also our numerical experiments below. Further, the algorithm can be applied to non-discrete instances as long as we can sample from the measures. However, the theoretical results below only apply to the empirical optimal transport (EOT) approximation on a finite ground space $\mathcal{X}$.
3. Theoretical Results
We give general non-asymptotic guarantees for the quality of the approximation $\hat W_p^{(S)}(r, s) = B^{-1} \sum_{i=1}^{B} W_p(\hat r_{S,i}, \hat s_{S,i})$ (where the $\hat r_{S,i}$ are independent empirical measures of size S from r; see Algorithm 1) in terms of the expected $L_1$-error. That is, we give bounds of the form
\[
E\left| \hat W_p^{(S)}(r, s) - W_p(r, s) \right| \le g(S, \mathcal{X}, p), \tag{4}
\]
for some function g. We are particularly interested in the dependence of the bound on the size N of $\mathcal{X}$ and on the sample size S, as this determines how the number of sampling points S (and hence the computational effort of Algorithm 1) must be increased for increasing problem size N in order to retain (on average) a certain approximation quality. In a second step, we obtain deviation inequalities for $\hat W_p^{(S)}(r, s)$ via concentration of measure techniques.
Related work The question of the convergence of empirical measures to the true measure in expected Wasserstein distance has been considered in detail by Boissard and Le Gouic (2014) and Fournier and Guillin (2015). The case of the underlying measures being different (that is, the convergence of $E\, W_p(\hat r_S, \hat s_S)$ to $W_p(r, s)$ when $r \ne s$) has not been considered, to the best of our knowledge. Theorem 1 is reminiscent of the main result of Boissard and Le Gouic (2014). However, we give a result here which is explicitly tailored to finite spaces and makes explicit the dependence of the constants on the size N of the underlying set $\mathcal{X}$. In fact, when we consider finite spaces $\mathcal{X}$ which are subsets of $\mathbb{R}^D$ later in Theorem 3, we will see that, in contrast to the results of Boissard and Le Gouic (2014), the rate of convergence (in S) does not change when the dimension gets large; rather, the dependence of the constants on N changes. This is a valuable insight, as our main concern here is how the subsample size S (driving the computational cost) must be chosen when N grows in order to retain a certain approximation quality.
Theorem 1. Let $\hat r_S$ be the empirical measure obtained from i.i.d. samples $X_1, \dots, X_S \sim r$. Then, for any integer $q \ge 2$ and any $l_{\max} \in \mathbb{N}$,
\[
E\, W_p^p(\hat r_S, r) \le \mathcal{E}_q / \sqrt{S}, \tag{5}
\]
where
\[
\mathcal{E}_q = 2^{p-1} q^{2p} \left( \operatorname{diam}(\mathcal{X}) \right)^p \left( q^{-(l_{\max}+1)p} \sqrt{N} + \sum_{l=0}^{l_{\max}} q^{-lp} \sqrt{\mathcal{N}(\mathcal{X}, q^{-l} \operatorname{diam}(\mathcal{X}))} \right) \tag{6}
\]
and $\mathcal{N}(\mathcal{X}, \delta)$ denotes the covering number of $\mathcal{X}$, that is, the minimal number of closed balls of radius $\delta$ needed to cover $\mathcal{X}$.
Remark 1. Since Theorem 1 holds for any integer $q \ge 2$ and any $l_{\max} \in \mathbb{N}$, these can be chosen freely to minimize the constant $\mathcal{E}_q$. In the proof they appear as the branching number and depth of a spanning tree that is constructed on $\mathcal{X}$ (see appendix). In general, an optimal choice of q and $l_{\max}$ cannot be given. However, in the Euclidean case the optimal values for q and $l_{\max}$ will be determined, and in particular we will show that q = 2 is optimal (see the discussion after Theorem 3, and Lemma 1).
Remark 2 (covering by arbitrary sets). At the price of a factor $2^p$, we can replace the balls defining the covering numbers $\mathcal{N}$ with arbitrary sets, and obtain the bound
\[
\mathcal{E}_q = 2^{2p-1} q^{2p} \left( \operatorname{diam}(\mathcal{X}) \right)^p \left( q^{-(l_{\max}+1)p} \sqrt{N} + \sum_{l=0}^{l_{\max}} q^{-lp} \sqrt{\mathcal{N}_1(\mathcal{X}, q^{-l} \operatorname{diam}(\mathcal{X}))} \right),
\]
where $\mathcal{N}_1(\mathcal{X}, \delta)$ is the minimal number of closed sets of diameter at most $2\delta$ needed to cover $\mathcal{X}$. The proof is given in the appendix. These alternative covering numbers lead to better bounds in high-dimensional Euclidean spaces when p > 2.5 (see Remark 3).
Based on Theorem 1, we can formulate a bound for the mean approximation error of Algorithm 1. A mean squared error version is given below in Theorem 5.
Theorem 2. Let $\hat W_p^{(S)}(r, s)$ be as in Algorithm 1 for any choice of $B \in \mathbb{N}$. Then for every integer $q \ge 2$,
\[
E\left| \hat W_p^{(S)}(r, s) - W_p(r, s) \right| \le 2\, \mathcal{E}_q^{1/p} S^{-1/(2p)}. \tag{7}
\]
Proof. The statement is an immediate consequence of the reverse triangle inequality for the Wasserstein distance, Jensen's inequality and Theorem 1:
\[
E\left| \hat W_p^{(S)}(r, s) - W_p(r, s) \right| \le E\left[ W_p(\hat r_S, r) + W_p(\hat s_S, s) \right] \le \left( E\, W_p^p(\hat r_S, r) \right)^{1/p} + \left( E\, W_p^p(\hat s_S, s) \right)^{1/p} \le 2\, \mathcal{E}_q^{1/p} / S^{1/(2p)}.
\]
Measures on Euclidean Space While the constant $\mathcal{E}_q$ in Theorem 1 may be difficult to compute or estimate in general, we give explicit bounds in the case when $\mathcal{X}$ is a finite subset of a Euclidean space. They exhibit the dependence of the approximation error on $N = |\mathcal{X}|$. In particular, this comprises the case when the measures represent images (two- or more-dimensional).
Theorem 3. Let $\mathcal{X}$ be a finite subset of $\mathbb{R}^D$ with the usual Euclidean metric. Then
\[
\mathcal{E}_2 \le D^{p/2}\, 2^{3p-1} \left( \operatorname{diam}(\mathcal{X}) \right)^p \cdot C_{D,p}(N),
\]
where $N = |\mathcal{X}|$ and
\[
C_{D,p}(N) = \begin{cases} 1/(1 - 2^{D/2-p}) & D < 2p, \\ 2 + D^{-1} \log_2 N & D = 2p, \\ N^{1/2 - p/D} \left[ 2 + 1/(2^{D/2-p} - 1) \right] & D > 2p. \end{cases} \tag{8}
\]
One can obtain bounds for $\mathcal{E}_q$, q > 2 (see the proof), but the choice q = 2 leads to the smallest bound (Lemma 1(a) in the appendix). Further, if p is an integer, then
\[
C_{D,p}(N) \le \begin{cases} 2 + \sqrt{2} & D < 2p, \\ 2 + D^{-1} \log_2 N & D = 2p, \\ (3 + \sqrt{2})\, N^{1/2 - p/D} & D > 2p. \end{cases}
\]
Remark 3 (improved bounds in high dimensions). The term $D^{p/2}$ appears because in the proof of Theorem 3 we switch between the Euclidean norm and the supremum norm. One may wonder whether this change of norms is necessary. We can stay in the Euclidean setting, and may assume without loss of generality that $\mathcal{X}$ is included in $B_{\operatorname{diam}(\mathcal{X})}(0)$, where $B_r(x) = \{y : \|y - x\|_2 \le r\}$ is the closed ball of radius r around x. According to Verger-Gaugry (2005), there exists an absolute constant C such that $\mathcal{N}(B_1(0), \epsilon) \le C\, 2^D D^{5/2} \epsilon^{-D}$. Using this would allow one to replace $D^{p/2}$ by $C 2^{D/2} D^{5/4}$, or, combining the alternative covering numbers $\mathcal{N}_1$ (Remark 2), by $C 2^p D^{5/4}$. This is better than $D^{p/2}$ when p > 2.5 and D is large.
Theorem 3 gives control over the error made by the approximation $\hat W_p^{(S)}(r, s)$ of $W_p(r, s)$. Of particular interest is the behavior of this error as N gets large (e.g., for high-resolution images). We distinguish three cases. In the low-dimensional case $p' = D/2 - p < 0$, we have $C_{D,p}(N) = O(1)$ and the approximation error is $O(S^{-1/(2p)})$, independent of the size of the image. In the critical case $p' = 0$ the approximation error is no longer independent of N but is of order $O\left( \log(N)\, S^{-1/(2p)} \right)$. Finally, in the high-dimensional case $p' > 0$ the dependence on N becomes stronger, with an approximation error of order
\[
O\left( \left( \frac{N^{1 - 2p/D}}{S} \right)^{1/(2p)} \right).
\]
In all cases one can choose S = o(N) while still guaranteeing a vanishing approximation error for N → ∞. In practice, this means that S can typically be chosen (much) smaller than N to obtain a good approximation of the Wasserstein distance. In particular, this implies that for low-dimensional applications with two- or three-dimensional histograms (for example grayscale images, where N corresponds to the number of pixels/voxels and r, s correspond to the gray value distribution after normalization), the approximation error is essentially not affected by the size of the problem when p is not too small, e.g., p = 2.
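As a concrete illustration (our own arithmetic, combining Theorems 2 and 3): for two-dimensional images with p = 2 we have $p' = D/2 - p = -1 < 0$, so with the integer-p bound $C_{2,2}(N) \le 2 + \sqrt{2}$,
\[
E\left| \hat W_2^{(S)}(r, s) - W_2(r, s) \right| \le 2\, \mathcal{E}_2^{1/2} S^{-1/4}, \qquad \mathcal{E}_2 \le 2 \cdot 2^5 \cdot (2 + \sqrt{2}) \operatorname{diam}(\mathcal{X})^2,
\]
with no dependence on the number of pixels N.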
While the three cases in Theorem 3 resemble those given by Boissard and Le Gouic (2014), the rate of convergence in S as seen in Theorem 1 is $O(S^{-1/2})$, regardless of the dimension of the underlying space $\mathcal{X}$. The constant depends on D, however, roughly at the polynomial rate $D^{p/2}$ and through $C_{D,p}(N)$. It is also worth mentioning that, by considering the dual transport problem, our results can be recast in the framework of Shalev-Shwartz et al. (2010).
Remark 4. The results presented here extend to the case where $\mathcal{X}$ is a bounded, countable subset of $\mathbb{R}^D$. However, our bounds for $\mathcal{E}_q$ contain the term $C_{D,p}(N)$, which stays finite as N → ∞ in the low-dimensional case (D < 2p) but diverges otherwise. Finding a better bound for $\mathcal{E}_q$ when $\mathcal{X}$ is countable is challenging and an interesting topic for further research.
Theorem 4. If $\hat W_p^{(S)}(r, s)$ is obtained from Algorithm 1, then for every $z \ge 0$
\[
P\left[ \left| \hat W_p^{(S)}(r, s) - W_p(r, s) \right| \ge z + \frac{2\, \mathcal{E}_q^{1/p}}{S^{1/(2p)}} \right] \le 2 \exp\left( - \frac{S B z^{2p}}{8 \operatorname{diam}(\mathcal{X})^{2p}} \right). \tag{9}
\]
Note that while the mean approximation quality $2\, \mathcal{E}_q^{1/p} / S^{1/(2p)}$ only depends on the subsample size S, the stochastic variability (see the right-hand side of Equation 9) depends on the product SB. This means that the repetition number B cannot decrease the expected error, but it decreases the magnitude of fluctuation around it.
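Equivalently (a direct rearrangement of (9), not stated in this form above): setting the right-hand side of (9) equal to $\delta \in (0, 1)$ and solving for z shows that, with probability at least $1 - \delta$,
\[
\left| \hat W_p^{(S)}(r, s) - W_p(r, s) \right| \le \frac{2\, \mathcal{E}_q^{1/p}}{S^{1/(2p)}} + \operatorname{diam}(\mathcal{X}) \left( \frac{8 \log(2/\delta)}{S B} \right)^{1/(2p)},
\]
which makes explicit that B shrinks only the fluctuation term.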
From these concentration bounds we can obtain a mean squared error version of Theorem 2:
Theorem 5. Let $\hat W_p^{(S)}(r, s)$ be as in Algorithm 1 for any choice of $B \in \mathbb{N}$. Then for every integer $q \ge 2$ the mean squared error of the EOT approximation can be bounded as
\[
E\left( \hat W_p^{(S)}(r, s) - W_p(r, s) \right)^2 \le 18\, \mathcal{E}_q^{2/p} S^{-1/p} = O(S^{-1/p}).
\]
Remark 5. The power 2 can be replaced by any $\alpha \le 2p$, with rate $S^{-\alpha/(2p)}$, as can be seen from a straightforward modification of the first lines of the proof.
with the constant $C_{D,p}(N)$ given in (8). Thus, we qualitatively observe the same dependence on N as in Theorem 3; e.g., the mean squared error is independent of N when D < 2p.
4. Simulations
This section covers the numerical findings of the simulations. Runtimes and returned values of
Algorithm 1 for each back-end solver are reported in relation to the results of that solver on the
original problem. Four different solvers are tested.
covered by the theoretical results from Section 3. The errors reported for the Sinkhorn scaling are
relative to the values returned by the algorithm on the full problems, which themselves differ from
the actual Wasserstein distances.
The instances of optimal transport considered here are discrete instances of two different types: regular grids in two dimensions, that is, images in various resolutions, as well as point clouds in $[0, 1]^D$ with dimensions D = 2, 3 and 4. For the image case, three instances were chosen from the DOTmark, which contains images of various types intended to be used as optimal transport instances in the form of two-dimensional histograms: two images of each of the classes White Noise, Cauchy Density, and Classic Images, which are then treated in the three resolutions 32 × 32, 64 × 64 and 128 × 128. Images are interpreted as finitely supported measures. The mass of a pixel is given by the grayscale value, and the support of the measure is the grid {1, . . . , R} × {1, . . . , R} for an image with resolution R × R.
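For illustration, a short sketch (ours; the function name is hypothetical) of this convention, turning an R × R grayscale image into a normalized measure on the grid:

```python
import numpy as np

def image_to_measure(img):
    """Interpret a grayscale image as a finitely supported probability
    measure: support = grid {1,...,R} x {1,...,R}, mass = pixel value."""
    R1, R2 = img.shape
    xs, ys = np.meshgrid(np.arange(1, R1 + 1), np.arange(1, R2 + 1),
                         indexing="ij")
    support = np.stack([xs.ravel(), ys.ravel()], axis=1)
    mass = img.ravel().astype(float)
    return support, mass / mass.sum()  # normalize to total mass one
```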
In the White Noise class the grayscale values of the pixels are independent of each other, the
Cauchy Density images show bivariate Cauchy densities with random centers and varying scale
ellipses, while Classic Images contains grayscale test images. See Schrieber et al. (2016) for further
details on the different image classes and example images. The instances were chosen to cover
different types of images, while still allowing for the simulation of a large variety of parameters for
subsampling.
The point cloud type instances were created as follows: the support points of the measures are independently, uniformly distributed on $[0, 1]^D$. The number of points N was chosen as $32^2$, $64^2$ and $128^2$ in order to match the size of the grid-based instances. For each choice of D and N, three instances were generated with regard to the three image types used in the grid-based case. Two measures on the points are drawn from the Dirichlet distribution with all parameters equal to one. That is, the masses on different points are independent of each other, similar to the white noise images. To create point cloud versions of the Cauchy Density and Classic Images classes, the grayscale values of the same images were used as the mass values for the support points. In three and four dimensions, the product measure of the images with their sum of columns and with themselves, respectively, was used.
All original instances were solved by each back-end solver in each resolution for the values p = 1, p = 2, and p = 3 in order to be compared to the approximate results for the subsamples in terms of runtime and accuracy, with the exception of CPLEX, where the 128 × 128 instances could not be solved due to memory limitations. Algorithm 1 was applied to each of these instances with parameters S ∈ {100, 500, 1000, 2000, 4000} and B ∈ {1, 2, 5}. For every combination of instance and parameters, the subsampling algorithm was run 5 times in order to mitigate the randomness of the results.
Since the linear programming solvers had a very similar performance on the grid-based instances (see below), only one of them, the transportation simplex, was tested on the point cloud instances.
Figure 2: Relative errors $|\hat W_p^{(S)}(r, s) - W_p(r, s)| / W_p(r, s)$ vs. relative runtimes $\hat t / t$ for different parameters S and B and different problem sizes for images. $\hat t$ is the runtime of Algorithm 1 and t is the runtime of the respective back-end solver without subsampling.
Figure 3: Relative errors vs. relative runtimes for different parameters S and B and different problem
sizes for point clouds. The number of support points matches the number of pixels in the
images.
[Figure 4: plot residue removed; only the legend (problem sizes 32 × 32, 64 × 64, 128 × 128) and a percentage scale ranging from 0% to about 150% are recoverable.]
lower runtimes. With S = 500 the runtime is reduced by over four orders of magnitude, with an average relative error of less than 10%. As is to be expected, runtime increases linearly with the number of repetitions B. However, the impact on the relative errors is rather inconsistent. This is due to the fact that the costs returned by the subsampling algorithm are often overestimated, so averaging over multiple tries does not yield improvements (see Figure 4). This means that in order to increase the accuracy of the algorithm it is advisable to keep B = 1 and instead increase the sample size S. However, increasing B can be useful to lower the variability of the results.
In contrast, there is a big difference in accuracy between the image classes. While Algorithm 1 has consistently low relative errors on the Cauchy Density images, the exact optimal costs for White Noise images cannot be approximated as reliably. The relative errors fluctuate more and are generally much higher, as one can see from Figure 5 (left). In images with smooth structures and regular features the subsamples are able to capture that structure and therefore deliver a more precise representation of the images and a more precise value. This is not possible in images that are very irregular or noisy, such as the White Noise images, which have no structure to begin with. The Classic Images contain both regular structures and more irregular regions; therefore their relative errors are slightly higher than in the Cauchy Density cases. The algorithm has a similar performance on the point cloud instances that are modelled after the Cauchy Density and Classic Images classes, while the Dirichlet instances have a more favorable accuracy compared to the White Noise images, as seen in Figure 5 (right).
There are no significant differences in performance between the different back-end solvers for the Wasserstein distance. As Figure 6 shows, accuracy seems to be better for the Sinkhorn distance compared to the other three solvers, which report the exact Wasserstein distance.
In the results of the point cloud instances we can observe the influence of the value $p' = D/2 - p$ on the scaling of the relative error with the instance size N for constant sample size (S = 4000). This is shown in Figure 7. We observe an increase of the relative error with $p'$, as expected from the theory. However, we are not able to clearly distinguish between the three cases $p' < 0$, $p' = 0$
[Plot residue removed; recoverable information: relative error (0.01% to 1000%, log scale) vs. sample size S ∈ {100, 500, 1000, 2000, 4000}, with legend entries CauchyDensity, ClassicImages, Dirichlet, and WhiteNoise.]
Figure 5: A comparison of the relative errors for different image classes (left) and point cloud instance classes (right).
[Plot residue removed; recoverable information: relative error (0.01% to 100%, log scale), with legend entries Wasserstein and Sinkhorn.]
Figure 6: A comparison between the approximations of the Wasserstein and Sinkhorn distances.
[Plot residue removed; recoverable information: relative error (1% to 5%) vs. $D/2 - p \in [-2, 1]$, for problem sizes 32 × 32, 64 × 64 and 128 × 128.]
Figure 7: A comparison of the mean relative errors in the point cloud instances with sample size S = 4000 for different values of $p' = D/2 - p$.
and $p' > 0$. This might be due to the relatively small instance sizes N in the experiments. While we see that the relative errors are independent of N in the image case (compare Figure 2), for the point clouds N has an influence on the accuracy that depends on $p'$.
5. Discussion
As our simulations demonstrate, subsampling is a simple yet powerful tool to obtain good approximations to Wasserstein distances with only a small fraction of the required runtime and memory. It is especially remarkable that in the case of two-dimensional images, for a fixed number of subsampled points, and therefore a fixed amount of time and memory, the relative error is independent of the resolution/size of the images. Based on these results, we expect the subsampling algorithm to return similarly precise results at even higher image resolutions, while the effort to obtain them stays the same. Even for point cloud instances the relative error only scales mildly with the original input size N, in a way that depends on the value $p'$.
The numerical results (Figure 2) show an inverse polynomial decrease of the approximation error with S, in accordance with the theoretical results. In fact, the rate $O(S^{-1/(2p)})$ is optimal. Indeed, when r = s (are nontrivial measures), Sommerfeld and Munk (2018) show that $Z_S = S^{1/(2p)} \left[ W_p(\hat r_S, \hat s_S) - W_p(r, s) \right]$ has a nondegenerate limiting distribution Z. For each R > 0 the function $x \mapsto \min(R, |x|)$ is nonnegative, continuous and bounded, so
\[
\liminf_{S \to \infty} E\left\{ S^{1/(2p)} |W_p(\hat r_S, \hat s_S) - W_p(r, s)| \right\} = \liminf_{S \to \infty} E|Z_S| \ge \liminf_{S \to \infty} E \min\{R, |Z_S|\} = E \min(R, |Z|).
\]
Letting R → ∞ and invoking monotone convergence yields
\[
\liminf_{S \to \infty} E\left\{ S^{1/(2p)} |W_p(\hat r_S, \hat s_S) - W_p(r, s)| \right\} \ge E|Z| > 0.
\]
When applying the algorithm, it is important to note that the quality of the returned values
depends on the structure of the data. In very irregular instances it is necessary to increase the
sample size in order to obtain similarly precise results, while in regular structures a small sample
size suffices.
Our scheme allows the parameters S and B to be easily tuned towards faster runtimes or more precise results, as desired. Increasing (resp. decreasing) the sample size S will decrease (resp. increase) the mean approximation error of $W_p$ by $\hat W_p^{(S)}$, while B only affects the concentration around $E\, \hat W_p^{(S)}$. Empirically, we found that for fixed computational cost the best performance is achieved with B = 1 (compare Figure 2), suggesting that the bias is more dominant than the variance in the mean squared error.
The scheme presented here can readily be applied to other optimal transport distances, as long as
a solver is available, as we demonstrated with the Sinkhorn distance (Cuturi, 2013). Empirically, we
can report good performance in this case, suggesting that entropically regularized distances might be
even more amenable to subsampling approximation than the Wasserstein distance itself. Extending
the theoretical results to this case would require an analysis of the mean speed of convergence of
empirical Sinkhorn distances, which is an interesting task for future research.
All in all, subsampling proves to be a general, powerful and versatile tool that can be used with virtually any optimal transport solver as back-end, and has both theoretical approximation error guarantees and a convincing performance in practice. It remains a challenge to extend this method in a way that is specifically tailored to the geometry of the underlying space $\mathcal{X}$, which may result in further improvements.
Acknowledgments
Jörn Schrieber and Max Sommerfeld acknowledge support by the German Research Foundation
(DFG) through the RTG 2088. Yoav Zemel is supported by Swiss National Science Foundation
Grant #178220. We thank an Associate Editor and three reviewers for insightful comments on
previous versions of this work.
Appendix A. Proofs
This Appendix contains proofs of all our theoretical results.
and Munk (2018) for examples and comparisons of different spanning trees on two-dimensional grids. Assume $\mathcal{T}$ is rooted at $\mathrm{root}(\mathcal{T}) \in \mathcal{X}$. Then, for $x \in \mathcal{X}$ with $x \ne \mathrm{root}(\mathcal{T})$ we may define $\mathrm{par}(x) \in \mathcal{X}$ as the immediate neighbor of x in the unique path connecting x and $\mathrm{root}(\mathcal{T})$. We set $\mathrm{par}(\mathrm{root}(\mathcal{T})) = \mathrm{root}(\mathcal{T})$. We also define $\mathrm{children}(x)$ as the set of vertices $x' \in \mathcal{X}$ such that there exists a sequence $x' = x_1, \dots, x_K = x$ with $\mathrm{par}(x_j) = x_{j+1}$ for $j = 1, \dots, K-1$.
Building the tree We build a q-ary tree on $\mathcal{X}$. To this end, we split $\mathcal{X}$ into $l_{\max} + 2$ groups and build the tree in such a way that a node at level l + 1 has a unique parent at level l, with edge length $q^{-l} \operatorname{diam}(\mathcal{X})$. The formal construction follows.
For $l \in \{0, \dots, l_{\max}\}$ we let $Q_l \subset \mathcal{X}$ be the center points of a $q^{-l} \operatorname{diam}(\mathcal{X})$-covering of $\mathcal{X}$, that is,
\[
\bigcup_{x \in Q_l} B(x, q^{-l} \operatorname{diam}(\mathcal{X})) = \mathcal{X}, \quad \text{and} \quad |Q_l| = \mathcal{N}(\mathcal{X}, q^{-l} \operatorname{diam}(\mathcal{X})),
\]
where $B(x, \epsilon) = \{x' \in \mathcal{X} : d(x, x') \le \epsilon\}$. Additionally set $Q_{l_{\max}+1} = \mathcal{X}$. Now define $\tilde Q_l = Q_l \times \{l\}$; we will build a tree structure on $\bigcup_{l=0}^{l_{\max}+1} \tilde Q_l$.
Since we must have $|\tilde Q_0| = 1$, we can take this element as the root. Assume now that the tree already contains all elements of $\bigcup_{j=0}^{l} \tilde Q_j$. Then, we add to the tree all elements of $\tilde Q_{l+1}$ by choosing for $(x, l+1) \in \tilde Q_{l+1}$ (exactly one) parent element $(x', l) \in \tilde Q_l$ such that $d(x, x') \le q^{-l} \operatorname{diam}(\mathcal{X})$. This is possible, since $Q_l$ is a $q^{-l} \operatorname{diam}(\mathcal{X})$-covering of $\mathcal{X}$. We set the length of the connecting edge to $q^{-l} \operatorname{diam}(\mathcal{X})$.
In this fashion we obtain a spanning tree $\mathcal{T}$ of $\bigcup_{l=0}^{l_{\max}+1} \tilde Q_l$ and a partition $\{\tilde Q_l\}_{l=0,\dots,l_{\max}+1}$. About this tree we know that:
• it is in fact a tree. First, it is connected, because the construction starts with one connected component and in every subsequent step all additional vertices are connected to it. Second, it contains no cycles. To see this, let $((x_1, l_1), \dots, (x_K, l_K))$ be a cycle in $\mathcal{T}$. Without loss of generality we may assume $l_1 = \min\{l_1, \dots, l_K\}$. Then $(x_1, l_1)$ must have at least two edges connecting it to vertices in a $\tilde Q_l$ with $l \ge l_1$, which is impossible by construction.
• $|\tilde Q_l| = \mathcal{N}(\mathcal{X}, q^{-l} \operatorname{diam}(\mathcal{X}))$ for $0 \le l \le l_{\max}$.
• $d_{\mathcal{T}}(x, \mathrm{par}(x)) = q^{-l+1} \operatorname{diam}(\mathcal{X})$ whenever $x \in \tilde Q_l$, $l \ge 1$.
• $d(x, x') \le d_{\mathcal{T}}((x, l_{\max}+1), (x', l_{\max}+1))$.
Since the leaves of $\mathcal{T}$ can be identified with $\mathcal{X}$, a measure $r \in \mathcal{P}(\mathcal{X})$ canonically defines a probability measure $r^{\mathcal{T}} \in \mathcal{P}(\mathcal{T})$ for which $r^{\mathcal{T}}_{(x, l_{\max}+1)} = r_x$ and $r^{\mathcal{T}}_{(x, l)} = 0$ for $l \le l_{\max}$. In slight abuse of notation we will denote the measure $r^{\mathcal{T}}$ simply by r. With this notation, we have $W_p(r, s) \le W_p^{\mathcal{T}}(r, s)$ for all $r, s \in \mathcal{P}(\mathcal{X})$.
Wasserstein distance on trees Note also that $\mathcal{T}$ is ultra-metric, that is, all its leaves are at the same distance from the root. For trees of this type, we can define a height function $h \colon \mathcal{X} \to [0, \infty)$ such that $h(x) = 0$ if $x \in \mathcal{X}$ is a leaf and $h(\mathrm{par}(x)) - h(x) = d_{\mathcal{T}}(x, \mathrm{par}(x))$ for all $x \in \mathcal{X} \setminus \{\mathrm{root}(\mathcal{T})\}$. There is an explicit formula for the Wasserstein distance on ultra-metric trees (Kloeckner, 2015). Indeed, if $r, s \in \mathcal{P}(\mathcal{X})$ then
$$\big(W_p^{\mathcal{T}}(r, s)\big)^p = 2^{p-1} \sum_{x \in \mathcal{X}} \big( h(\mathrm{par}(x))^p - h(x)^p \big)\, \big| (S_{\mathcal{T}} r)_x - (S_{\mathcal{T}} s)_x \big|, \qquad (11)$$
with the operator $S_{\mathcal{T}}$ as defined in (10). For the tree $\mathcal{T}$ constructed above and $x \in \tilde{Q}_l$ with $l = 0, \ldots, l_{\max}$ we have
$$h(x) = \sum_{j=l}^{l_{\max}} q^{-j} \operatorname{diam}(\mathcal{X}).$$
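Formula (11) makes $W_p^{\mathcal{T}}$ computable in one sweep over the vertices once the quantities $(S_{\mathcal{T}} r)_x$ are available. The sketch below assumes, consistently with (10), that $(S_{\mathcal{T}} r)_x$ is the total $r$-mass of the subtree rooted at $x$; the data structures and names are ours for illustration:

    def subtree_masses(parent, leaf_mass):
        # Push each leaf's mass up its parent chain; the result maps each
        # vertex x to (S_T r)_x, the total mass of the subtree rooted at x.
        S = {}
        for leaf, mass in leaf_mass.items():
            node = leaf
            while True:
                S[node] = S.get(node, 0.0) + mass
                if parent[node] == node:  # reached the root
                    break
                node = parent[node]
        return S

    def tree_wasserstein(p, parent, height, r_leaf, s_leaf):
        # Evaluate (11); parent maps each vertex to par(x) (root to itself)
        # and height maps each vertex to h(x).
        Sr = subtree_masses(parent, r_leaf)
        Ss = subtree_masses(parent, s_leaf)
        total = 0.0
        for x, par in parent.items():
            if par == x:  # the root contributes nothing since par(root) = root
                continue
            total += (height[par] ** p - height[x] ** p) \
                * abs(Sr.get(x, 0.0) - Ss.get(x, 0.0))
        return (2 ** (p - 1) * total) ** (1.0 / p)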
Since $(S_{\mathcal{T}}\, \hat{r}_S)_x$ is the mean of $S$ i.i.d. Bernoulli variables with expectation $(S_{\mathcal{T}}\, r)_x$, we have
$$\sum_{x \in \tilde{Q}_l} \mathbb{E}\, \big| (S_{\mathcal{T}}\, \hat{r}_S)_x - (S_{\mathcal{T}}\, r)_x \big| \le \sum_{x \in \tilde{Q}_l} \sqrt{\frac{(S_{\mathcal{T}}\, r)_x \big(1 - (S_{\mathcal{T}}\, r)_x\big)}{S}} \le \frac{1}{\sqrt{S}} \bigg( \sum_{x \in \tilde{Q}_l} (S_{\mathcal{T}}\, r)_x \bigg)^{1/2} \bigg( \sum_{x \in \tilde{Q}_l} \big(1 - (S_{\mathcal{T}}\, r)_x\big) \bigg)^{1/2} \le \sqrt{|\tilde{Q}_l| / S},$$
using Hölder's inequality and the fact that $\sum_{x \in \tilde{Q}_l} (S_{\mathcal{T}}\, r)_x = 1$ for all $l = 0, \ldots, l_{\max} + 1$. This finally yields
$$\mathbb{E}\, W_p^p(\hat{r}_S, r) \le 2^{p-1} q^{2p} \operatorname{diam}(\mathcal{X})^p \bigg( q^{-(l_{\max}+1)p} \sqrt{N} + \sum_{l=0}^{l_{\max}} q^{-lp} \sqrt{\mathcal{N}\big(\mathcal{X}, q^{-l} \operatorname{diam}(\mathcal{X})\big)} \bigg) \Big/ \sqrt{S} \le \mathcal{E}_q(\mathcal{X}, p) \big/ \sqrt{S}.$$
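The Bernoulli step above is easy to test numerically: restricted to a single level $l$, the vector $((S_{\mathcal{T}}\, \hat{r}_S)_x)_{x \in \tilde{Q}_l}$ consists of multinomial frequencies with cell probabilities $(S_{\mathcal{T}}\, r)_x$. A quick simulation, with all parameters hypothetical:

    import numpy as np

    rng = np.random.default_rng(0)
    S, K, reps = 200, 16, 5000        # subsample size, |Q_l|, Monte Carlo runs
    prob = rng.dirichlet(np.ones(K))  # hypothetical level-l masses (S_T r)_x
    freq = rng.multinomial(S, prob, size=reps) / S
    lhs = np.abs(freq - prob).sum(axis=1).mean()  # estimates the left-hand side
    print(lhs, np.sqrt(K / S))        # the bound sqrt(|Q_l| / S) should dominate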
Covering by arbitrary sets We now explain how to obtain the second formula for $\mathcal{E}_q$ as stated in Remark 2. The idea is to define the coverings with arbitrary sets, not necessarily balls. Let

In comparison with (6), we replaced $\mathcal{N}$ by $\mathcal{N}_1$. The price to pay for this is an additional factor of $2^p$.
This yields
$$\sum_{l=0}^{l_{\max}} q^{-lp} \sqrt{\mathcal{N}\big(\mathcal{X}, q^{-l} \operatorname{diam}(\mathcal{X})\big)} \le \sum_{l=0}^{l_{\max}} q^{l(D/2 - p)} = \begin{cases} \big(1 - q^{(l_{\max}+1)(D/2 - p)}\big) \big/ \big(1 - q^{D/2 - p}\big) & D \ne 2p, \\ l_{\max} + 1 & D = 2p. \end{cases}$$
Denote for brevity $p' = D/2 - p$ and plug this into (6) to bound $S^{1/2}\, \mathbb{E}\, W_p^p(\hat{r}_S, r, \|\cdot\|_\infty)$ by
$$2^{p-1} q^{2p} \operatorname{diam}(\mathcal{X})^p \left[ q^{-p(l_{\max}+1)} \sqrt{N} + \begin{cases} \big(1 - q^{(l_{\max}+1)p'}\big) \big/ \big(1 - q^{p'}\big) & p' \ne 0, \\ l_{\max} + 1 & p' = 0. \end{cases} \right]$$
If $p' < 0$, then let $l_{\max} \to \infty$. Otherwise, choose $l_{\max} = \lfloor D^{-1} \log_q N \rfloor$ (giving the best dependence on $N$), so that the element inside the square brackets is smaller than
$$\begin{cases} 1/(1 - q^{p'}) & p' < 0, \\ 2 + D^{-1} \log_q N & p' = 0, \\ N^{1/2 - p/D} + \big( N^{1/2 - p/D} q^{p'} - 1 \big) \big/ \big( q^{p'} - 1 \big) & p' > 0 \end{cases} \;\le\; \begin{cases} 1/(1 - q^{p'}) & p' < 0, \\ 2 + D^{-1} \log_q N & p' = 0, \\ \big( 2 q^{p'} - 1 \big) N^{1/2 - p/D} \big/ \big( q^{p'} - 1 \big) & p' > 0. \end{cases} \qquad (12)$$
The right-hand side is $C_{D,p}(N)$ for $q = 2$. To get back to the Euclidean norm use $\|a\|_2 \le \sqrt{D}\, \|a\|_\infty$, so that
$$\mathbb{E}\, W_p^p(\hat{r}_S, r) \le D^{p/2}\, \mathbb{E}\, W_p^p(\hat{r}_S, r, \|\cdot\|_\infty) \le D^{p/2}\, 2^{p-1} q^{2p} \operatorname{diam}(\mathcal{X})^p\, C_{D,p}(N) \big/ \sqrt{S}.$$
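As a concrete instance (a worked example of ours, read off from (12) with $q = 2$), consider images, that is, grids in dimension $D = 2$:
$$p = 1:\ p' = 0, \quad C_{2,1}(N) = 2 + \tfrac{1}{2} \log_2 N; \qquad p = 2:\ p' = -1, \quad C_{2,2}(N) = \frac{1}{1 - 2^{-1}} = 2.$$
Thus for $p \ge 2$ (where $p' < 0$) the factor $C_{D,p}(N)$ is independent of $N$, while for $p = 1$ it grows only logarithmically in $N$.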
Lemma 1. (a) Let $\tilde{C}_{D,p}(q, N)$ denote the right-hand side of (12). Then the minimum of the function $q \mapsto q^{2p}\, \tilde{C}_{D,p}(q, N)$ on $[2, \infty)$ is attained at $q = 2$.
(b) Let $q \ge 2$, let $p, D$ be integers, and let $p' = D/2 - p$. If $p' < 0$, then $1/(1 - q^{p'}) \le 2 + \sqrt{2}$, and if $p' > 0$, then $2 + 1/(q^{p'} - 1) \le 3 + \sqrt{2}$.
Proof. We begin with (b). If $p' < 0$ then $1/(1 - q^{p'})$ is decreasing in $q$ and increasing in $p'$. The integer constraints on $D$ and $p$ imply that the maximal value $p'$ can attain is $-0.5$; the smallest value $q$ can attain is $2$. Thus
$$\frac{1}{1 - q^{p'}} \le \frac{1}{1 - 2^{-0.5}} = \frac{\sqrt{2}}{\sqrt{2} - 1} = \sqrt{2}\, (\sqrt{2} + 1) = 2 + \sqrt{2}.$$
When $p' > 0$, the term $2 + 1/(q^{p'} - 1)$ is decreasing in $p' \ge 0.5$ and in $q \ge 2$, so it is bounded by $2 + 1/(\sqrt{2} - 1) = 3 + \sqrt{2}$.
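Both extreme values in (b) can be confirmed numerically; a plain Python check, for illustration only:

    from math import sqrt

    q = 2.0
    # p' = -0.5: 1 / (1 - q**p') should equal 2 + sqrt(2) = 3.414...
    print(1.0 / (1.0 - q ** (-0.5)), 2.0 + sqrt(2.0))
    # p' = +0.5: 2 + 1 / (q**p' - 1) should equal 3 + sqrt(2) = 4.414...
    print(2.0 + 1.0 / (q ** 0.5 - 1.0), 3.0 + sqrt(2.0))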
To prove (a) we shall differentiate the function $q^{2p}\, \tilde{C}_{D,p}(q, N)$ with respect to $q$ and show that the derivative is positive for all $q \ge 2$ and $p, D, N \ge 1$. According to the sign of $p'$, the right-hand side of (12) leads to the three functions $f_1$, $f_2$, $f_3$ below (for $p' > 0$ up to the $q$-free factor $N^{1/2 - p/D}$).
$$f_1(q) = \frac{q^{2p}}{1 - q^{p'}}, \qquad q \ge 2,\ p \ge 1,\ p' < 0.$$
Its derivative is
$$f_1'(q) = \frac{2p\, q^{2p-1} (1 - q^{p'}) + p'\, q^{p'-1} q^{2p}}{(1 - q^{p'})^2} = \frac{q^{2p-1}}{1 - q^{p'}} \left[ 2p + \frac{p'\, q^{p'}}{1 - q^{p'}} \right].$$
It suffices to show that the term in square brackets is positive, since $1 - q^{p'} > 0$. Let us bound $q^{p'}$ and the denominator $(1 - q^{p'})^{-1}$. Since $e^x \ge 1 + x$ for $x \ge 0$, we have $e^{-x} \le 1/(1 + x)$, and setting $x = -p' \log q$ gives
$$q^{p'} = e^{p' \log q} \le \frac{1}{1 - p' \log q}.$$
Hence
$$1 - q^{p'} \ge 1 - \frac{1}{1 - p' \log q} = \frac{1 - p' \log q - 1}{1 - p' \log q} = \frac{-p' \log q}{1 - p' \log q},$$
so that
$$\frac{q^{p'}}{1 - q^{p'}} \le \frac{1}{1 - p' \log q} \cdot \frac{1 - p' \log q}{-p' \log q} = \frac{1}{-p' \log q}.$$
Conclude that, since $p' < 0$,
$$2p + \frac{p'\, q^{p'}}{1 - q^{p'}} \ge 2p + p' \cdot \frac{1}{-p' \log q} = 2p - \frac{1}{\log q} \ge 2p - \frac{1}{\log 2} \ge 2 - \frac{1}{\log 2} > 0.$$
Next,
$$f_2(q) = q^{2p} \big( 2 + D^{-1} \log_q N \big) = 2 q^{2p} + \frac{q^{2p} \log N}{D \log q}, \qquad q \ge 2,\ D = 2p \ge 2.$$
Its derivative is
$$f_2'(q) = 4p\, q^{2p-1} + \big( 2p\, q^{2p-1} \log q - q^{2p} q^{-1} \big) \frac{\log N}{D (\log q)^2} = q^{2p-1} \left[ 4p + \frac{\log N}{D (\log q)^2} \big( 2p \log q - 1 \big) \right] > 0,$$
since $2p \log q \ge 2 \log 2 > 1$.
Finally,
$$f_3(q) = q^{2p} \big[ 2 + 1/(q^{p'} - 1) \big] = 2 q^{2p} + \frac{q^{2p}}{q^{p'} - 1} = 2 q^{2p} - f_1(q), \qquad q \ge 2,\ p \ge 1,\ p' > 0.$$
The derivative is
$$4p\, q^{2p-1} - \frac{q^{2p-1}}{1 - q^{p'}} \left[ 2p + \frac{p'\, q^{p'}}{1 - q^{p'}} \right] = 4p\, q^{2p-1} + \frac{q^{2p-1}}{q^{p'} - 1} \left[ 2p - \frac{p'\, q^{p'}}{q^{p'} - 1} \right].$$
This function is more complicated and we need to split into cases according to small, large or moderate values of $p'$.
Case 1: $p' \le 0.5$. Then the negative term can be bounded using $q^{p'} - 1 \ge p' \log q$ as
$$\frac{p'\, q^{p'}}{q^{p'} - 1} = p' + \frac{p'}{q^{p'} - 1} \le p' + \frac{1}{\log q} \le p' + \frac{1}{\log 2} \le 0.5 + \frac{1}{\log 2} < 2 \le 2p.$$
$$1 + \frac{1}{e^{1/2} - 1} \le 1 + \frac{1}{\sqrt{2} - 1} = 2 + \sqrt{2} < 4 \le 4p.$$
Hence the derivative is positive in this case.
Case 4: $q \le e$ and $p' \in [1/2, 1]$. The negative term is bounded above by
$$\frac{1}{\log q} + \frac{1}{(q^{p'} - 1) \log q} \le \frac{1}{\log 2} + \frac{1}{(q^{p'} - 1) \log 2} \le \frac{1}{\log 2} + \frac{1}{(\sqrt{2} - 1) \log 2} = \frac{2 + \sqrt{2}}{\log 2} \approx 4.93,$$
Our first goal is to show that $Z$ is Lipschitz continuous. Let $((x_{11}, y_{11}), \ldots, (x_{SB}, y_{SB}))$ and $((x'_{11}, y'_{11}), \ldots, (x'_{SB}, y'_{SB}))$ be arbitrary elements of $(\mathcal{X}^2)^{SB}$. Then, using the reverse triangle inequality and the relations above,
$$\begin{aligned}
\big| Z((x_{11}, y_{11}), \ldots) - Z((x'_{11}, y'_{11}), \ldots) \big| &\le \frac{1}{B} \sum_{i=1}^B \bigg| W_p\Big( \frac{1}{S} \sum_{j=1}^S \delta_{x_{ji}}, \frac{1}{S} \sum_{j=1}^S \delta_{y_{ji}} \Big) - W_p\Big( \frac{1}{S} \sum_{j=1}^S \delta_{x'_{ji}}, \frac{1}{S} \sum_{j=1}^S \delta_{y'_{ji}} \Big) \bigg| \\
&\le \frac{1}{B} \sum_{i=1}^B \bigg[ W_p\Big( \frac{1}{S} \sum_{j=1}^S \delta_{x_{ji}}, \frac{1}{S} \sum_{j=1}^S \delta_{x'_{ji}} \Big) + W_p\Big( \frac{1}{S} \sum_{j=1}^S \delta_{y_{ji}}, \frac{1}{S} \sum_{j=1}^S \delta_{y'_{ji}} \Big) \bigg] \\
&\le \frac{S^{-1/p}}{B} \sum_{i=1}^B \bigg[ \Big( \sum_{j=1}^S d^p(x_{ji}, x'_{ji}) \Big)^{1/p} + \Big( \sum_{j=1}^S d^p(y_{ji}, y'_{ji}) \Big)^{1/p} \bigg] \\
&\le \frac{S^{-1/p}}{B} (2B)^{\frac{p-1}{p}} \Big( \sum_{i,j} d^p_{\mathcal{X}^2}\big( (x_{ji}, y_{ji}), (x'_{ji}, y'_{ji}) \big) \Big)^{1/p}.
\end{aligned}$$
Since $S^{-1/p} (2B)^{(p-1)/p} / B = 2^{(p-1)/p} (SB)^{-1/p} \le 2\, (SB)^{-1/p}$, it follows that $Z/2$ is Lipschitz continuous with constant $(SB)^{-1/p}$ relative to the $p$-metric generated by $d_{\mathcal{X}^2}$ on $(\mathcal{X}^2)^{SB}$.
For $\tilde{r} \in \mathcal{P}(\mathcal{X}^2)$ let $H(\cdot \mid \tilde{r})$ denote the relative entropy with respect to $\tilde{r}$. Since $\mathcal{X}^2$ has $d_{\mathcal{X}^2}$-diameter $2^{1/p} \operatorname{diam}(\mathcal{X})$, we have by Bolley and Villani (2005, Particular case 2.5, page 337) that for every $\tilde{s}$
$$W_p(\tilde{r}, \tilde{s}) \le \Big( 8 \operatorname{diam}(\mathcal{X})^{2p}\, H(\tilde{r} \mid \tilde{s}) \Big)^{1/(2p)}. \qquad (13)$$
If $X_{11}, \ldots, X_{SB} \sim r$ and $Y_{11}, \ldots, Y_{SB} \sim s$ are all independent, we have
$$Z\big( (X_{11}, Y_{11}), \ldots, (X_{SB}, Y_{SB}) \big) \sim \hat{W}_p^{(S)}(r, s) - W_p(r, s).$$
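The random variable on the right is exactly the error of the subsampling estimator. For orientation, here is a minimal sketch of $\hat{W}_p^{(S)}$ for $p = 1$ on a one-dimensional ground space, with scipy's exact 1-D solver as the black-box back-end; the support and the measures r and s below are purely illustrative, and any exact solver could be substituted:

    import numpy as np
    from scipy.stats import wasserstein_distance

    def w_hat_S(support, r, s, S, B, rng):
        # Average exact distances between B independent pairs of empirical
        # measures, each built from S i.i.d. samples drawn from r and s.
        vals = [wasserstein_distance(rng.choice(support, size=S, p=r),
                                     rng.choice(support, size=S, p=s))
                for _ in range(B)]
        return float(np.mean(vals))

    rng = np.random.default_rng(1)
    support = np.linspace(0.0, 1.0, 500)
    r = np.full(500, 1.0 / 500)                         # uniform measure
    s = np.exp(-0.5 * ((support - 0.3) / 0.1) ** 2)
    s /= s.sum()                                        # discretized Gaussian
    print(w_hat_S(support, r, s, S=100, B=5, rng=rng))  # approximates W_1(r, s)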
The Lipschitz continuity of $Z$ and the transportation inequality (13) yield a concentration result for this random variable. In fact, by Gozlan and Léonard (2007, Lemma 6) we have
$$\mathbb{P}\Big[ \hat{W}_p^{(S)}(r, s) - W_p(r, s) \ge \mathbb{E}\big[ \hat{W}_p^{(S)}(r, s) - W_p(r, s) \big] + z \Big] \le \exp\bigg( \frac{-SB\, z^{2p}}{8 \operatorname{diam}(\mathcal{X})^{2p}} \bigg)$$
for all $z \ge 0$. Note that $-Z$ is Lipschitz continuous as well and hence, by the union bound,
$$\mathbb{P}\Big[ \big| \hat{W}_p^{(S)}(r, s) - W_p(r, s) \big| \ge \mathbb{E}\big| \hat{W}_p^{(S)}(r, s) - W_p(r, s) \big| + z \Big] \le 2 \exp\bigg( \frac{-SB\, z^{2p}}{8 \operatorname{diam}(\mathcal{X})^{2p}} \bigg).$$
Now, with the reverse triangle inequality, Jensen's inequality and Theorem 1,
$$\mathbb{E}\big| \hat{W}_p^{(S)}(r, s) - W_p(r, s) \big| \le \mathbb{E}\big[ W_p(\hat{r}_S, r) + W_p(\hat{s}_S, s) \big] \le \big( \mathbb{E}\, W_p^p(\hat{r}_S, r) \big)^{1/p} + \big( \mathbb{E}\, W_p^p(\hat{s}_S, s) \big)^{1/p} \le 2\, \mathcal{E}_q^{1/p} \big/ S^{1/(2p)}.$$
Together with the last concentration inequality above, this concludes the proof of Theorem 4.
$$\le 2 \int_{-C}^{C} (z + C)\, dz + 4 \int_{C}^{\infty} \mathbb{P}(V > z + C)\, z\, dz \le 4 C^2 + 8 \int_{C}^{\infty} z \exp\bigg( - \frac{SB\, z^{2p}}{8 \operatorname{diam}(\mathcal{X})^{2p}} \bigg)\, dz,$$
by Theorem 4. Changing variables and using the inequality $y^{2p} \ge y^2$ (valid for $y, p \ge 1$) gives
$$8 \int_{C}^{\infty} z \exp\bigg( - \frac{SB\, z^{2p}}{8 \operatorname{diam}(\mathcal{X})^{2p}} \bigg)\, dz = 8 C^2 \int_{1}^{\infty} y \exp\bigg( - \frac{SB\, (Cy)^{2p}}{8 \operatorname{diam}(\mathcal{X})^{2p}} \bigg)\, dy \le 8 C^2 \int_{1}^{\infty} y \exp\bigg( - \frac{SB\, C^{2p} y^2}{8 \operatorname{diam}(\mathcal{X})^{2p}} \bigg)\, dy$$
$$= 8 C^2\, \frac{4 \operatorname{diam}(\mathcal{X})^{2p}}{SB\, C^{2p}} \exp\bigg( - \frac{SB\, C^{2p}}{8 \operatorname{diam}(\mathcal{X})^{2p}} \bigg) = 4 C^2\, \frac{\operatorname{diam}(\mathcal{X})^{2p}}{2^{2p-3}\, \mathcal{E}_q^2\, B} \exp\bigg( - \frac{4^p\, \mathcal{E}_q^2\, B}{8 \operatorname{diam}(\mathcal{X})^{2p}} \bigg),$$
where we have used $C^2 = 4\, \mathcal{E}_q^{2/p} S^{-1/p}$. Deduce that
$$\mathbb{E}\Big[ \hat{W}_p^{(S)}(r, s) - W_p(r, s) \Big]^2 \le 16\, \mathcal{E}_q^{2/p} \left\{ 1 + \frac{\operatorname{diam}(\mathcal{X})^{2p}}{2^{2p-3}\, \mathcal{E}_q^2\, B} \exp\bigg( - \frac{4^p\, \mathcal{E}_q^2\, B}{8 \operatorname{diam}(\mathcal{X})^{2p}} \bigg) \right\} S^{-1/p}.$$
Now note that (6) implies $\mathcal{E}_q^2 \ge 2^{6p-2} \operatorname{diam}(\mathcal{X})^{2p}$, and hence $\operatorname{diam}(\mathcal{X})^{2p} \big/ \big[ 2^{2p-3}\, \mathcal{E}_q^2\, B \big] \le 2^{5-8p} \le 1/8$, so the term in braces is smaller than $1 + 1/8$. Consequently, the mean squared error is bounded by $18\, \mathcal{E}_q^{2/p}\, S^{-1/p}$.
Similar computations show that $\mathbb{E}\big| \hat{W}_p^{(S)}(r, s) - W_p(r, s) \big|^{\alpha} = O\big( S^{-\alpha/(2p)} \big)$ for all $0 \le \alpha \le 2p$.
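Continuing the hypothetical one-dimensional snippet from the proof of Theorem 4 above (with w_hat_S, support, r and s as defined there), the $S^{-1/p}$ decay of the mean squared error, here with $p = 1$, can be observed directly:

    w_exact = wasserstein_distance(support, support, u_weights=r, v_weights=s)
    for S in (50, 200, 800):
        errs = [w_hat_S(support, r, s, S=S, B=1, rng=rng) - w_exact
                for _ in range(300)]
        print(S, np.mean(np.square(errs)))  # should decay roughly like 1/S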
References
Martial Agueh and Guillaume Carlier. Barycenters in the Wasserstein space. SIAM J. Math. Anal.,
43(2):904–924, 2011.
Jason Altschuler, Jonathan Weed, and Philippe Rigollet. Near-linear time approximation algorithms
for optimal transport via Sinkhorn iteration. In Advances in Neural Information Processing Sys-
tems, pages 1964–1974, 2017.
Dimitri P. Bertsekas. Auction algorithms for network flow problems: A tutorial introduction. Com-
putational Optimization and Applications, 1(1):7–66, 1992.
Emmanuel Boissard and Thibaut Le Gouic. On the mean speed of convergence of empirical and
occupation measures in Wasserstein distance. Ann. Inst. H. Poincaré Probab. Statist., 50(2):
539–563, 2014.
François Bolley and Cédric Villani. Weighted Csiszár-Kullback-Pinsker inequalities and applications to transportation inequalities. Annales de la Faculté des Sciences de Toulouse: Mathématiques, 14(3):331–352, 2005.
Nicolas Bonneel, Julien Rabin, Gabriel Peyré, and Hanspeter Pfister. Sliced and Radon Wasserstein
barycenters of measures. Journal of Mathematical Imaging and Vision, 51(1):22–45, 2015.
Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Carl-Johann Simon-Gabriel, and Bernhard
Schoelkopf. From optimal transport to generative modeling: the VEGAN cookbook. 2017. URL
https://arxiv.org/abs/1705.07642.
Hector Corrada Bravo and Stefan Theussl. Rcplex: R interface to CPLEX, 2016. URL https://CRAN.R-project.org/package=Rcplex. R package version 0.3-3.
Brian E. Ruttenberg, Gabriel Luna, Geoffrey P. Lewis, Steven K. Fisher, and Ambuj K. Singh.
Quantifying spatial relationships from whole retinal images. Bioinformatics, 29(7):940–946, 2013.
Bernhard Schmitzer. A sparse multi-scale algorithm for dense optimal transport. Journal of Math-
ematical Imaging and Vision, 56(2):238–259, 2016.
Jörn Schrieber, Dominic Schuhmacher, and Carsten Gottschlich. DOTmark — a benchmark for
discrete optimal transport. IEEE Access, 5:271–282, 2016. doi: 10.1109/ACCESS.2016.2639065.
Dominic Schuhmacher, Carsten Gottschlich, and Bjoern Baehre. R-package transport: Optimal
transport in various forms, 2014. URL https://cran.r-project.org/package=transport. R
package version 0.6-3.
Shai Shalev-Shwartz, Ohad Shamir, Nathan Srebro, and Karthik Sridharan. Learnability, stability
and uniform convergence. Journal of Machine Learning Research, 11(Oct):2635–2670, 2010.
Sameer Shirdhonkar and David W. Jacobs. Approximate earth mover’s distance in linear time. In
IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8, 2008.
Max Sommerfeld and Axel Munk. Inference for empirical Wasserstein distances on finite spaces. J.
R. Stat. Soc. B, 80(1):219–238, 2018.
Carla Tameling and Axel Munk. Computational strategies for statistical inference based on empirical
optimal transport. In 2018 IEEE Data Science Workshop, pages 175–179. IEEE, 2018.
Jean-Louis Verger-Gaugry. Covering a ball with smaller equal balls in $\mathbb{R}^n$. Discrete & Computational Geometry, 33(1):143–155, 2005.
Cédric Villani. Optimal Transport: Old and New. Springer, New York, 2008.
Jianguo Zhang, Marcin Marszalek, Svetlana Lazebnik, and Cordelia Schmid. Local features and
kernels for classification of texture and object categories: A comprehensive study. International
Journal of Computer Vision, 73(2):213–238, 2007.