-
Non-convex entropic mean-field optimization via Best Response flow
Authors:
Razvan-Andrei Lascu,
Mateusz B. Majka
Abstract:
We study the problem of minimizing non-convex functionals on the space of probability measures, regularized by the relative entropy (KL divergence) with respect to a fixed reference measure, as well as the corresponding problem of solving entropy-regularized non-convex-non-concave min-max problems. We utilize the Best Response flow (also known in the literature as the fictitious play flow) and study how its convergence is influenced by the relation between the degree of non-convexity of the functional under consideration, the regularization parameter and the tail behaviour of the reference measure. In particular, we demonstrate how to choose the regularizer, given the non-convex functional, so that the Best Response operator becomes a contraction with respect to the $L^1$-Wasserstein distance, which then ensures the existence of its unique fixed point, which is then shown to be the unique global minimizer for our optimization problem. This extends recent results where the Best Response flow was applied to solve convex optimization problems regularized by the relative entropy with respect to arbitrary reference measures, and with arbitrary values of the regularization parameter. Our results explain precisely how the assumption of convexity can be relaxed, at the expense of making a specific choice of the regularizer. Additionally, we demonstrate how these results can be applied in reinforcement learning in the context of policy optimization for Markov Decision Processes and Markov games with softmax parametrized policies in the mean-field regime.
Submitted 28 May, 2025;
originally announced May 2025.
-
Geometric ergodicity of modified Euler schemes for SDEs with super-linearity
Authors:
Jianhai Bao,
Mateusz B. Majka,
Jian Wang
Abstract:
As a well-known fact, the classical Euler scheme works merely for SDEs with coefficients of linear growth. In this paper, we study a general framework of modified Euler schemes, which is applicable to SDEs with super-linear drifts and encompasses numerical methods such as the tamed Euler scheme and the truncated Euler scheme. On the one hand, by exploiting an approach based on the refined basic coupling, we show that all Euler recursions within our proposed framework are geometrically ergodic under a mixed probability distance (i.e., the total variation distance plus the $L^1$-Wasserstein distance) and the weighted total variation distance. On the other hand, by utilizing the coupling by reflection, we demonstrate that the tamed Euler scheme is geometrically ergodic under the $L^1$-Wasserstein distance. In addition, as an important application, we provide a quantitative $L^1$-Wasserstein error bound between the exact invariant probability measure of an SDE with super-linearity, and the invariant probability measure of the tamed Euler scheme which is its numerical counterpart.
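To make the idea of taming concrete, here is a minimal one-dimensional sketch, not the paper's general framework. It assumes the commonly used taming factor $1/(1 + h|b(x)|)$, a cubic drift $b(x) = -x^3$ and noise coefficient $\sqrt{2}$; the names `tamed_euler_1d` and `cubic_drift` are hypothetical and chosen only for illustration.

```python
import math
import random


def tamed_euler_1d(b, x0, h, n_steps, sigma=math.sqrt(2.0), seed=0):
    """Simulate a tamed Euler recursion for dX = b(X) dt + sigma dW in 1D.

    Assumed taming: the drift b(x) is divided by 1 + h * |b(x)|, so the
    drift increment per step has absolute value below 1 for any x.
    """
    rng = random.Random(seed)
    x = float(x0)
    path = [x]
    for _ in range(n_steps):
        drift = b(x)
        tamed = drift / (1.0 + h * abs(drift))  # tamed drift
        x = x + h * tamed + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def cubic_drift(x):
    # Super-linear drift (illustrative assumption): b(x) = -x^3.
    return -x ** 3


# Start far from the origin, where the untamed Euler step would overshoot.
path = tamed_euler_1d(cubic_drift, x0=50.0, h=0.1, n_steps=200)
```

From `x0 = 50` with `h = 0.1`, the untamed Euler drift step would be `50 - 0.1 * 50**3 = -12450`, which overshoots the origin by a wide margin, whereas the tamed step moves by less than 1.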
Submitted 26 December, 2024;
originally announced December 2024.
-
Linear convergence of proximal descent schemes on the Wasserstein space
Authors:
Razvan-Andrei Lascu,
Mateusz B. Majka,
David Šiška,
Łukasz Szpruch
Abstract:
We investigate proximal descent methods, inspired by the minimizing movement scheme introduced by Jordan, Kinderlehrer and Otto, for optimizing entropy-regularized functionals on the Wasserstein space. We establish linear convergence under flat convexity assumptions, thereby relaxing the common reliance on geodesic convexity. Our analysis circumvents the need for discrete-time adaptations of the Evolution Variational Inequality (EVI). Instead, we leverage a uniform logarithmic Sobolev inequality (LSI) and the entropy "sandwich" lemma, extending the analysis from arXiv:2201.10469 and arXiv:2202.01009. The major challenge in the proof via LSI is to show that the relative Fisher information $I(\cdot|π)$ is well-defined at every step of the scheme. Since the relative entropy is not Wasserstein differentiable, we prove that along the scheme the iterates belong to a certain class of Sobolev regularity, and hence the relative entropy $\operatorname{KL}(\cdot|π)$ has a unique Wasserstein sub-gradient, and that the relative Fisher information is indeed finite.
Submitted 22 November, 2024;
originally announced November 2024.
-
A Fisher-Rao gradient flow for entropic mean-field min-max games
Authors:
Razvan-Andrei Lascu,
Mateusz B. Majka,
Łukasz Szpruch
Abstract:
Gradient flows play a substantial role in addressing many machine learning problems. We examine the convergence in continuous-time of a \textit{Fisher-Rao} (Mean-Field Birth-Death) gradient flow in the context of solving convex-concave min-max games with entropy regularization. We propose appropriate Lyapunov functions to demonstrate convergence with explicit rates to the unique mixed Nash equilibrium.
Submitted 18 September, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.
-
Mirror Descent-Ascent for mean-field min-max problems
Authors:
Razvan-Andrei Lascu,
Mateusz B. Majka,
Łukasz Szpruch
Abstract:
We study two variants of the mirror descent-ascent algorithm for solving min-max problems on the space of measures: simultaneous and sequential. We work under assumptions of convexity-concavity and relative smoothness of the payoff function with respect to a suitable Bregman divergence, defined on the space of measures via flat derivatives. We show that the convergence rates to mixed Nash equilibria, measured in the Nikaidò-Isoda error, are of order $\mathcal{O}\left(N^{-1/2}\right)$ and $\mathcal{O}\left(N^{-2/3}\right)$ for the simultaneous and sequential schemes, respectively, which is in line with the state-of-the-art results for related finite-dimensional algorithms.
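As a finite-dimensional analogue only (the paper works on spaces of measures), the simultaneous scheme can be sketched for a bilinear matrix game $\min_x \max_y x^\top A y$ on probability simplices, where the mirror step induced by the entropy is a multiplicative-weights update. The matrix `A`, the step size, the starting points and all function names below are illustrative assumptions.

```python
import math


def mirror_step(p, grad, eta, sign):
    # Entropic mirror step on the simplex: reweight by exp(sign * eta * grad_i)
    # and renormalise. sign = -1 gives descent, sign = +1 gives ascent.
    w = [pi * math.exp(sign * eta * gi) for pi, gi in zip(p, grad)]
    total = sum(w)
    return [wi / total for wi in w]


def simultaneous_mda(A, x0, y0, eta, n_steps):
    """Simultaneous mirror descent-ascent for min_x max_y x^T A y.

    Both players update from the same (old) pair of iterates.
    Returns the time-averaged iterates.
    """
    n, m = len(A), len(A[0])
    x, y = list(x0), list(y0)
    x_avg, y_avg = [0.0] * n, [0.0] * m
    for _ in range(n_steps):
        x_avg = [a + xi / n_steps for a, xi in zip(x_avg, x)]
        y_avg = [a + yi / n_steps for a, yi in zip(y_avg, y)]
        gx = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]  # A y
        gy = [sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]  # A^T x
        x, y = mirror_step(x, gx, eta, -1.0), mirror_step(y, gy, eta, +1.0)
    return x_avg, y_avg


def nikaido_isoda(A, x, y):
    # NI error for a bilinear game: max_{y'} x^T A y' - min_{x'} x'^T A y.
    n, m = len(A), len(A[0])
    best_y = max(sum(A[i][j] * x[i] for i in range(n)) for j in range(m))
    best_x = min(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    return best_y - best_x


# Rock-paper-scissors payoff matrix (illustrative choice).
A = [[0, 1, -1], [-1, 0, 1], [1, -1, 0]]
x_bar, y_bar = simultaneous_mda(A, [0.6, 0.3, 0.1], [0.2, 0.2, 0.6],
                                eta=0.05, n_steps=2000)
ni_error = nikaido_isoda(A, x_bar, y_bar)
```

A sequential variant would instead compute `gy` from the already updated `x`; that is the only structural change in this toy setting.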
Submitted 28 May, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
$L^2$-Wasserstein contraction for Euler schemes of elliptic diffusions and interacting particle systems
Authors:
Linshan Liu,
Mateusz B. Majka,
Pierre Monmarché
Abstract:
We show the $L^2$-Wasserstein contraction for the transition kernel of a discretised diffusion process, under a contractivity at infinity condition on the drift and a sufficiently high diffusivity requirement. This extends recent results that, under similar assumptions on the drift but without the diffusivity restrictions, showed the $L^1$-Wasserstein contraction, or $L^p$-Wasserstein bounds for $p > 1$ that were, however, not true contractions. We explain how showing the true $L^2$-Wasserstein contraction is crucial for obtaining the local Poincaré inequality for the transition kernel of the Euler scheme of a diffusion. Moreover, we discuss other consequences of our contraction results, such as concentration inequalities and convergence rates in KL-divergence and total variation. We also study the corresponding $L^2$-Wasserstein contraction for discretisations of interacting diffusions. As a particular application, this allows us to analyse the behaviour of particle systems that can be used to approximate a class of McKean-Vlasov SDEs that were recently studied in the mean-field optimization literature.
Submitted 24 October, 2023;
originally announced October 2023.
-
Entropic mean-field min-max problems via Best Response flow
Authors:
Razvan-Andrei Lascu,
Mateusz B. Majka,
Łukasz Szpruch
Abstract:
We investigate the convergence properties of a continuous-time optimization method, the \textit{Mean-Field Best Response} flow, for solving convex-concave min-max games with entropy regularization. We introduce suitable Lyapunov functions to establish exponential convergence to the unique mixed Nash equilibrium. Additionally, we demonstrate the convergence of the fictitious play flow as a by-product of our analysis.
Submitted 9 March, 2025; v1 submitted 5 June, 2023;
originally announced June 2023.
-
Optimal Markovian coupling for finite activity Lévy processes
Authors:
Wilfrid S. Kendall,
Mateusz B. Majka,
Aleksandar Mijatović
Abstract:
We study optimal Markovian couplings of Markov processes, where the optimality is understood in terms of minimization of concave transport costs between the time-marginal distributions of the coupled processes. We provide explicit constructions of such optimal couplings for one-dimensional finite-activity Lévy processes (continuous-time random walks) whose jump distributions are unimodal but not necessarily symmetric. Remarkably, the optimal Markovian coupling does not depend on the specific concave transport cost. To this end, we combine McCann's results on optimal transport and Rogers' results on random walks with a novel uniformization construction that allows us to characterize all Markovian couplings of finite-activity Lévy processes. In particular, we show that the optimal Markovian coupling for finite-activity Lévy processes with non-symmetric unimodal Lévy measures has to allow for non-simultaneous jumps of the two coupled processes.
Submitted 20 October, 2022;
originally announced October 2022.
-
Polyak-Łojasiewicz inequality on the space of measures and convergence of mean-field birth-death processes
Authors:
Linshan Liu,
Mateusz B. Majka,
Łukasz Szpruch
Abstract:
The Polyak-Łojasiewicz inequality (PLI) in $\mathbb{R}^d$ is a natural condition for proving convergence of gradient descent algorithms. In the present paper, we study an analogue of PLI on the space of probability measures $\mathcal{P}(\mathbb{R}^d)$ and show that it is a natural condition for showing exponential convergence of a class of birth-death processes related to certain mean-field optimization problems. We verify PLI for a broad class of such problems for energy functions regularised by the KL-divergence.
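For orientation, the classical finite-dimensional statement can be written out as follows. This is the standard $\mathbb{R}^d$ version only, not the measure-space analogue developed in the paper; the symbols $f$, $f^*$, $\mu$ and $x_t$ are notation assumed here for a differentiable objective, its infimum, the PL constant and the gradient flow.

```latex
% PL inequality with constant \mu > 0, where f^* = \inf_x f(x):
\frac{1}{2}\,\lvert \nabla f(x) \rvert^2 \;\ge\; \mu \,\bigl(f(x) - f^*\bigr)
\qquad \text{for all } x \in \mathbb{R}^d .

% Along the gradient flow \dot{x}_t = -\nabla f(x_t):
\frac{d}{dt}\bigl(f(x_t) - f^*\bigr)
  = -\lvert \nabla f(x_t) \rvert^2
  \;\le\; -2\mu\,\bigl(f(x_t) - f^*\bigr),

% so Gronwall's lemma gives exponential convergence of the objective:
f(x_t) - f^* \;\le\; e^{-2\mu t}\,\bigl(f(x_0) - f^*\bigr).
```

No convexity of $f$ is used in this argument, which is why the inequality is a convenient substitute for convexity when proving exponential convergence.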
Submitted 5 June, 2023; v1 submitted 6 June, 2022;
originally announced June 2022.
-
Strict Kantorovich contractions for Markov chains and Euler schemes with general noise
Authors:
Lu-Jing Huang,
Mateusz B. Majka,
Jian Wang
Abstract:
We study contractions of Markov chains on general metric spaces with respect to some carefully designed distance-like functions, which are comparable to the total variation and the standard $L^p$-Wasserstein distances for $p \ge 1$. We present explicit lower bounds of the corresponding contraction rates. By employing the refined basic coupling and the coupling by reflection, the results are applied to Markov chains whose transitions include additive stochastic noises that are not necessarily isotropic. This can be useful in the study of Euler schemes for SDEs driven by Lévy noises. In particular, motivated by recent works on the use of heavy tailed processes in Markov Chain Monte Carlo, we show that chains driven by the $α$-stable noise can have better contraction rates than corresponding chains driven by the Gaussian noise, due to the heavy tails of the $α$-stable distribution.
Submitted 1 September, 2021;
originally announced September 2021.
-
Approximation of heavy-tailed distributions via stable-driven SDEs
Authors:
Lu-Jing Huang,
Mateusz B. Majka,
Jian Wang
Abstract:
Constructions of numerous approximate sampling algorithms are based on the well-known fact that certain Gibbs measures are stationary distributions of ergodic stochastic differential equations (SDEs) driven by the Brownian motion. However, for some heavy-tailed distributions it can be shown that the associated SDE is not exponentially ergodic and that related sampling algorithms may perform poorly. A natural idea that has recently been explored in the machine learning literature in this context is to make use of stochastic processes with heavy tails instead of the Brownian motion. In this paper we provide a rigorous theoretical framework for studying the problem of approximating heavy-tailed distributions via ergodic SDEs driven by symmetric (rotationally invariant) $α$-stable processes.
Submitted 4 July, 2020;
originally announced July 2020.
-
Multi-index Antithetic Stochastic Gradient Algorithm
Authors:
Mateusz B. Majka,
Marc Sabate-Vidales,
Łukasz Szpruch
Abstract:
Stochastic Gradient Algorithms (SGAs) are ubiquitous in computational statistics, machine learning and optimisation. Recent years have brought an influx of interest in SGAs, and the non-asymptotic analysis of their bias is by now well-developed. However, relatively little is known about the optimal choice of the random approximation (e.g. mini-batching) of the gradient in SGAs as this relies on the analysis of the variance and is problem specific. While there have been numerous attempts to reduce the variance of SGAs, these typically exploit a particular structure of the sampled distribution by requiring a priori knowledge of its density's mode. It is thus unclear how to adapt such algorithms to non-log-concave settings. In this paper, we construct a Multi-index Antithetic Stochastic Gradient Algorithm (MASGA) whose implementation is independent of the structure of the target measure and which achieves performance on par with Monte Carlo estimators that have access to unbiased samples from the distribution of interest. In other words, MASGA is an optimal estimator from the mean square error-computational cost perspective within the class of Monte Carlo estimators. We prove this fact rigorously for log-concave settings and verify it numerically for some examples where the log-concavity assumption is not satisfied.
Submitted 30 September, 2021; v1 submitted 10 June, 2020;
originally announced June 2020.
-
Exponential ergodicity for SDEs and McKean-Vlasov processes with Lévy noise
Authors:
Mingjie Liang,
Mateusz B. Majka,
Jian Wang
Abstract:
We study stochastic differential equations (SDEs) of McKean-Vlasov type with distribution dependent drifts and driven by pure jump Lévy processes. We prove a uniform in time propagation of chaos result, providing quantitative bounds on convergence rate of interacting particle systems with Lévy noise to the corresponding McKean-Vlasov SDE. By applying techniques that combine couplings, appropriately constructed $L^1$-Wasserstein distances and Lyapunov functions, we show exponential convergence of solutions of such SDEs to their stationary distributions. Our methods allow us to obtain results that are novel even for a broad class of Lévy-driven SDEs with distribution independent coefficients.
Submitted 8 November, 2020; v1 submitted 30 January, 2019;
originally announced January 2019.
-
Non-asymptotic bounds for sampling algorithms without log-concavity
Authors:
Mateusz B. Majka,
Aleksandar Mijatović,
Lukasz Szpruch
Abstract:
Discrete time analogues of ergodic stochastic differential equations (SDEs) are one of the most popular and flexible tools for sampling high-dimensional probability measures. Non-asymptotic analysis in the $L^2$ Wasserstein distance of sampling algorithms based on Euler discretisations of SDEs has been recently developed by several authors for log-concave probability distributions. In this work we replace the log-concavity assumption with a log-concavity at infinity condition. We provide novel $L^2$ convergence rates for Euler schemes, expressed explicitly in terms of problem parameters. From there we derive non-asymptotic bounds on the distance between the laws induced by Euler schemes and the invariant laws of SDEs, both for schemes with standard and with randomised (inaccurate) drifts. We also obtain bounds for the hierarchy of discretisation, which enables us to deploy a multi-level Monte Carlo estimator. Our proof relies on a novel construction of a coupling for the Markov chains that can be used to control both the $L^1$ and $L^2$ Wasserstein distances simultaneously. Finally, we provide a weak convergence analysis that covers both the standard and the randomised (inaccurate) drift case. In particular, we reveal that the variance of the randomised drift does not influence the rate of weak convergence of the Euler scheme to the SDE.
Submitted 10 October, 2019; v1 submitted 21 August, 2018;
originally announced August 2018.
-
Quantitative contraction rates for Markov chains on general state spaces
Authors:
Andreas Eberle,
Mateusz B. Majka
Abstract:
We investigate the problem of quantifying contraction coefficients of Markov transition kernels in Kantorovich ($L^1$ Wasserstein) distances. For diffusion processes, relatively precise quantitative bounds on contraction rates have recently been derived by combining appropriate couplings with carefully designed Kantorovich distances. In this paper, we partially carry over this approach from diffusions to Markov chains. We derive quantitative lower bounds on contraction rates for Markov chains on general state spaces that are powerful if the dynamics is dominated by small local moves. For Markov chains on $\mathbb{R}^d$ with isotropic transition kernels, the general bounds can be used efficiently together with a coupling that combines maximal and reflection coupling. The results are applied to Euler discretizations of stochastic differential equations with non-globally contractive drifts, and to the Metropolis adjusted Langevin algorithm for sampling from a class of probability measures on high dimensional state spaces that are not globally log-concave.
Submitted 21 August, 2018;
originally announced August 2018.
-
A note on existence of global solutions and invariant measures for jump SDEs with locally one-sided Lipschitz drift
Authors:
Mateusz B. Majka
Abstract:
We extend some methods developed by Albeverio, Brzeźniak and Wu and we show how to apply them in order to prove existence of global strong solutions of stochastic differential equations with jumps, under a local one-sided Lipschitz condition on the drift (also known as a monotonicity condition) and a local Lipschitz condition on the diffusion and jump coefficients, while an additional global one-sided linear growth assumption is satisfied. Then we use these methods to prove existence of invariant measures for a broad class of such equations.
Submitted 12 December, 2016;
originally announced December 2016.
-
Transportation inequalities for non-globally dissipative SDEs with jumps via Malliavin calculus and coupling
Authors:
Mateusz B. Majka
Abstract:
By using the mirror coupling for solutions of SDEs driven by pure jump Lévy processes, we extend some transportation and concentration inequalities, which were previously known only in the case where the coefficients in the equation satisfy a global dissipativity condition. Furthermore, by using the mirror coupling for the jump part and the coupling by reflection for the Brownian part, we extend analogous results for jump diffusions. To this end, we improve some previous results concerning such couplings and show how to combine the jump and the Brownian case. As a crucial step in our proof, we develop a novel method of bounding Malliavin derivatives of solutions of SDEs with both jump and Gaussian noise, which involves the coupling technique and which might be of independent interest. The bounds we obtain are new even in the case of diffusions without jumps.
Submitted 11 November, 2019; v1 submitted 21 October, 2016;
originally announced October 2016.
-
Multilevel Monte Carlo methods for the approximation of invariant measures of stochastic differential equations
Authors:
Michael B. Giles,
Mateusz B. Majka,
Lukasz Szpruch,
Sebastian Vollmer,
Konstantinos Zygalakis
Abstract:
We develop a framework that allows the use of the multi-level Monte Carlo (MLMC) methodology (Giles, 2015) to calculate expectations with respect to the invariant measure of an ergodic SDE. In that context, we study the (over-damped) Langevin equations with a strongly concave potential. We show that, when appropriate contracting couplings for the numerical integrators are available, one can obtain a uniform in time estimate of the MLMC variance in contrast to the majority of the results in the MLMC literature. As a consequence, a root mean square error of $\mathcal{O}(\varepsilon)$ is achieved with $\mathcal{O}(\varepsilon^{-2})$ complexity on par with Markov Chain Monte Carlo (MCMC) methods, which however can be computationally intensive when applied to large data sets. Finally, we present a multi-level version of the recently introduced Stochastic Gradient Langevin Dynamics (SGLD) method (Welling and Teh, 2011) built for large-dataset applications. We show that this is the first stochastic gradient MCMC method with complexity $\mathcal{O}(\varepsilon^{-2}|\log {\varepsilon}|^{3})$, in contrast to the complexity $\mathcal{O}(\varepsilon^{-3})$ of currently available methods. Numerical experiments confirm our theoretical findings.
Submitted 12 August, 2019; v1 submitted 4 May, 2016;
originally announced May 2016.
-
Coupling and exponential ergodicity for stochastic differential equations driven by Lévy processes
Authors:
Mateusz B. Majka
Abstract:
We present a novel idea for a coupling of solutions of stochastic differential equations driven by Lévy noise, inspired by some results from the optimal transportation theory. Then we use this coupling to obtain exponential contractivity of the semigroups associated with these solutions with respect to an appropriately chosen Kantorovich distance. As a corollary, we obtain exponential convergence rates in the total variation and standard $L^1$-Wasserstein distances.
Submitted 29 April, 2017; v1 submitted 29 September, 2015;
originally announced September 2015.