Search | arXiv e-print repository

Progressive Inference-Time Annealing of Diffusion Models for Sampling from Boltzmann Densities

Authors: Tara Akhound-Sadegh, Jungyoon Lee, Avishek Joey Bose, Valentin De Bortoli, Arnaud Doucet, Michael M. Bronstein, Dominique Beaini, Siamak Ravanbakhsh, Kirill Neklyudov, Alexander Tong

Abstract: Sampling efficiently from a target unnormalized probability density remains a core challenge, with relevance across countless high-impact scientific applications. A promising approach towards this challenge is the design of amortized samplers that borrow key ideas, such as probability path design, from state-of-the-art generative diffusion models. However, all existing diffusion-based samplers rem… ▽ More Sampling efficiently from a target unnormalized probability density remains a core challenge, with relevance across countless high-impact scientific applications. A promising approach towards this challenge is the design of amortized samplers that borrow key ideas, such as probability path design, from state-of-the-art generative diffusion models. However, all existing diffusion-based samplers remain unable to draw samples from distributions at the scale of even simple molecular systems. In this paper, we propose Progressive Inference-Time Annealing (PITA), a novel framework to learn diffusion-based samplers that combines two complementary interpolation techniques: I.) Annealing of the Boltzmann distribution and II.) Diffusion smoothing. PITA trains a sequence of diffusion models from high to low temperatures by sequentially training each model at progressively higher temperatures, leveraging engineered easy access to samples of the temperature-annealed target density. In the subsequent step, PITA enables simulating the trained diffusion model to procure training samples at a lower temperature for the next diffusion model through inference-time annealing using a novel Feynman-Kac PDE combined with Sequential Monte Carlo. Empirically, PITA enables, for the first time, equilibrium sampling of N-body particle systems, Alanine Dipeptide, and tripeptides in Cartesian coordinates with dramatically lower energy function evaluations. Code available at: https://github.com/taraak/pita △ Less

Submitted 19 June, 2025; originally announced June 2025.

arXiv:2505.14177 [pdf, ps, other]

From stability of Langevin diffusion to convergence of proximal MCMC for non-log-concave sampling

Authors: Marien Renaud, Valentin De Bortoli, Arthur Leclaire, Nicolas Papadakis

Abstract: We consider the problem of sampling distributions stemming from non-convex potentials with Unadjusted Langevin Algorithm (ULA). We prove the stability of the discrete-time ULA to drift approximations under the assumption that the potential is strongly convex at infinity. In many context, e.g. imaging inverse problems, potentials are non-convex and non-smooth. Proximal Stochastic Gradient Langevin… ▽ More We consider the problem of sampling distributions stemming from non-convex potentials with Unadjusted Langevin Algorithm (ULA). We prove the stability of the discrete-time ULA to drift approximations under the assumption that the potential is strongly convex at infinity. In many context, e.g. imaging inverse problems, potentials are non-convex and non-smooth. Proximal Stochastic Gradient Langevin Algorithm (PSGLA) is a popular algorithm to handle such potentials. It combines the forward-backward optimization algorithm with a ULA step. Our main stability result combined with properties of the Moreau envelope allows us to derive the first proof of convergence of the PSGLA for non-convex potentials. We empirically validate our methodology on synthetic data and in the context of imaging inverse problems. In particular, we observe that PSGLA exhibits faster convergence rates than Stochastic Gradient Langevin Algorithm for posterior sampling while preserving its restoration properties. △ Less

Submitted 20 May, 2025; originally announced May 2025.

arXiv:2502.02483 [pdf, ps, other]

Distributional Diffusion Models with Scoring Rules

Authors: Valentin De Bortoli, Alexandre Galashov, J. Swaroop Guntupalli, Guangyao Zhou, Kevin Murphy, Arthur Gretton, Arnaud Doucet

Abstract: Diffusion models generate high-quality synthetic data. They operate by defining a continuous-time forward process which gradually adds Gaussian noise to data until fully corrupted. The corresponding reverse process progressively "denoises" a Gaussian sample into a sample from the data distribution. However, generating high-quality outputs requires many discretization steps to obtain a faithful app… ▽ More Diffusion models generate high-quality synthetic data. They operate by defining a continuous-time forward process which gradually adds Gaussian noise to data until fully corrupted. The corresponding reverse process progressively "denoises" a Gaussian sample into a sample from the data distribution. However, generating high-quality outputs requires many discretization steps to obtain a faithful approximation of the reverse process. This is expensive and has motivated the development of many acceleration methods. We propose to accomplish sample generation by learning the posterior {\em distribution} of clean data samples given their noisy versions, instead of only the mean of this distribution. This allows us to sample from the probability transitions of the reverse process on a coarse time scale, significantly accelerating inference with minimal degradation of the quality of the output. This is accomplished by replacing the standard regression loss used to estimate conditional means with a scoring rule. We validate our method on image and robot trajectory generation, where we consistently outperform standard diffusion models at few discretization steps. △ Less

Submitted 7 July, 2025; v1 submitted 4 February, 2025; originally announced February 2025.

arXiv:2501.05370 [pdf, ps, other]

Accelerated Diffusion Models via Speculative Sampling

Authors: Valentin De Bortoli, Alexandre Galashov, Arthur Gretton, Arnaud Doucet

Abstract: Speculative sampling is a popular technique for accelerating inference in Large Language Models by generating candidate tokens using a fast draft model and accepting or rejecting them based on the target model's distribution. While speculative sampling was previously limited to discrete sequences, we extend it to diffusion models, which generate samples via continuous, vector-valued Markov chains.… ▽ More Speculative sampling is a popular technique for accelerating inference in Large Language Models by generating candidate tokens using a fast draft model and accepting or rejecting them based on the target model's distribution. While speculative sampling was previously limited to discrete sequences, we extend it to diffusion models, which generate samples via continuous, vector-valued Markov chains. In this context, the target model is a high-quality but computationally expensive diffusion model. We propose various drafting strategies, including a simple and effective approach that does not require training a draft model and is applicable out of the box to any diffusion model. Our experiments demonstrate significant generation speedup on various diffusion models, halving the number of function evaluations, while generating exact samples from the target model. △ Less

Submitted 6 July, 2025; v1 submitted 9 January, 2025; originally announced January 2025.

arXiv:2410.19191 [pdf, other]

doi 10.1049/iet-ipr.2017.1005

Review of wavelet-based unsupervised texture segmentation, advantage of adaptive wavelets

Authors: Yuan Huang, Valentin De Bortoli, Fugen Zhou, Jerome Gilles

Abstract: Wavelet-based segmentation approaches are widely used for texture segmentation purposes because of their ability to characterize different textures. In this paper, we assess the influence of the chosen wavelet and propose to use the recently introduced empirical wavelets. We show that the adaptability of the empirical wavelet permits to reach better results than classic wavelets. In order to focus… ▽ More Wavelet-based segmentation approaches are widely used for texture segmentation purposes because of their ability to characterize different textures. In this paper, we assess the influence of the chosen wavelet and propose to use the recently introduced empirical wavelets. We show that the adaptability of the empirical wavelet permits to reach better results than classic wavelets. In order to focus only on the textural information, we also propose to perform a cartoon + texture decomposition step before applying the segmentation algorithm. The proposed method is tested on six classic benchmarks, based on several popular texture images. △ Less

Submitted 24 October, 2024; originally announced October 2024.

MSC Class: 42C40 ACM Class: G.0

Journal ref: IET Image Processing, Vol.12, No.9, 1626--1638, August 2018

arXiv:2409.09347 [pdf, other]

Schrödinger Bridge Flow for Unpaired Data Translation

Authors: Valentin De Bortoli, Iryna Korshunova, Andriy Mnih, Arnaud Doucet

Abstract: Mass transport problems arise in many areas of machine learning whereby one wants to compute a map transporting one distribution to another. Generative modeling techniques like Generative Adversarial Networks (GANs) and Denoising Diffusion Models (DDMs) have been successfully adapted to solve such transport problems, resulting in CycleGAN and Bridge Matching respectively. However, these methods do… ▽ More Mass transport problems arise in many areas of machine learning whereby one wants to compute a map transporting one distribution to another. Generative modeling techniques like Generative Adversarial Networks (GANs) and Denoising Diffusion Models (DDMs) have been successfully adapted to solve such transport problems, resulting in CycleGAN and Bridge Matching respectively. However, these methods do not approximate Optimal Transport (OT) maps, which are known to have desirable properties. Existing techniques approximating OT maps for high-dimensional data-rich problems, such as DDM-based Rectified Flow and Schrödinger Bridge procedures, require fully training a DDM-type model at each iteration, or use mini-batch techniques which can introduce significant errors. We propose a novel algorithm to compute the Schrödinger Bridge, a dynamic entropy-regularised version of OT, that eliminates the need to train multiple DDM-like models. This algorithm corresponds to a discretisation of a flow of path measures, which we call the Schrödinger Bridge Flow, whose only stationary point is the Schrödinger Bridge. We demonstrate the performance of our algorithm on a variety of unpaired data translation tasks. △ Less

Submitted 14 September, 2024; originally announced September 2024.

arXiv:2405.06780 [pdf, other]

Deep MMD Gradient Flow without adversarial training

Authors: Alexandre Galashov, Valentin de Bortoli, Arthur Gretton

Abstract: We propose a gradient flow procedure for generative modeling by transporting particles from an initial source distribution to a target distribution, where the gradient field on the particles is given by a noise-adaptive Wasserstein Gradient of the Maximum Mean Discrepancy (MMD). The noise-adaptive MMD is trained on data distributions corrupted by increasing levels of noise, obtained via a forward… ▽ More We propose a gradient flow procedure for generative modeling by transporting particles from an initial source distribution to a target distribution, where the gradient field on the particles is given by a noise-adaptive Wasserstein Gradient of the Maximum Mean Discrepancy (MMD). The noise-adaptive MMD is trained on data distributions corrupted by increasing levels of noise, obtained via a forward diffusion process, as commonly used in denoising diffusion probabilistic models. The result is a generalization of MMD Gradient Flow, which we call Diffusion-MMD-Gradient Flow or DMMD. The divergence training procedure is related to discriminator training in Generative Adversarial Networks (GAN), but does not require adversarial training. We obtain competitive empirical performance in unconditional image generation on CIFAR10, MNIST, CELEB-A (64 x64) and LSUN Church (64 x 64). Furthermore, we demonstrate the validity of the approach when MMD is replaced by a lower bound on the KL divergence. △ Less

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2402.18491 [pdf, other]

doi 10.1038/s41467-024-54281-3

Dynamical Regimes of Diffusion Models

Authors: Giulio Biroli, Tony Bonnaire, Valentin de Bortoli, Marc Mézard

Abstract: Using statistical physics methods, we study generative diffusion models in the regime where the dimension of space and the number of data are large, and the score function has been trained optimally. Our analysis reveals three distinct dynamical regimes during the backward generative diffusion process. The generative dynamics, starting from pure noise, encounters first a 'speciation' transition wh… ▽ More Using statistical physics methods, we study generative diffusion models in the regime where the dimension of space and the number of data are large, and the score function has been trained optimally. Our analysis reveals three distinct dynamical regimes during the backward generative diffusion process. The generative dynamics, starting from pure noise, encounters first a 'speciation' transition where the gross structure of data is unraveled, through a mechanism similar to symmetry breaking in phase transitions. It is followed at later time by a 'collapse' transition where the trajectories of the dynamics become attracted to one of the memorized data points, through a mechanism which is similar to the condensation in a glass phase. For any dataset, the speciation time can be found from a spectral analysis of the correlation matrix, and the collapse time can be found from the estimation of an 'excess entropy' in the data. The dependence of the collapse time on the dimension and number of data provides a thorough characterization of the curse of dimensionality for diffusion models. Analytical solutions for simple models like high-dimensional Gaussian mixtures substantiate these findings and provide a theoretical framework, while extensions to more complex scenarios and numerical validations with real datasets confirm the theoretical predictions. △ Less

Submitted 28 February, 2024; originally announced February 2024.

Comments: 22 pages, 11 figures

Journal ref: Nature Communications 15, 9957 (2024)

arXiv:2402.18012 [pdf, other]

Diffusion Models as Constrained Samplers for Optimization with Unknown Constraints

Authors: Lingkai Kong, Yuanqi Du, Wenhao Mu, Kirill Neklyudov, Valentin De Bortoli, Dongxia Wu, Haorui Wang, Aaron Ferber, Yi-An Ma, Carla P. Gomes, Chao Zhang

Abstract: Addressing real-world optimization problems becomes particularly challenging when analytic objective functions or constraints are unavailable. While numerous studies have addressed the issue of unknown objectives, limited research has focused on scenarios where feasibility constraints are not given explicitly. Overlooking these constraints can lead to spurious solutions that are unrealistic in pra… ▽ More Addressing real-world optimization problems becomes particularly challenging when analytic objective functions or constraints are unavailable. While numerous studies have addressed the issue of unknown objectives, limited research has focused on scenarios where feasibility constraints are not given explicitly. Overlooking these constraints can lead to spurious solutions that are unrealistic in practice. To deal with such unknown constraints, we propose to perform optimization within the data manifold using diffusion models. To constrain the optimization process to the data manifold, we reformulate the original optimization problem as a sampling problem from the product of the Boltzmann distribution defined by the objective function and the data distribution learned by the diffusion model. Depending on the differentiability of the objective function, we propose two different sampling methods. For differentiable objectives, we propose a two-stage framework that begins with a guided diffusion process for warm-up, followed by a Langevin dynamics stage for further correction. For non-differentiable objectives, we propose an iterative importance sampling strategy using the diffusion model as the proposal distribution. Comprehensive experiments on a synthetic dataset, six real-world black-box optimization datasets, and a multi-objective molecule optimization dataset show that our method achieves better or comparable performance with previous state-of-the-art baselines. △ Less

Submitted 21 October, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

arXiv:2402.08667 [pdf, other]

Target Score Matching

Authors: Valentin De Bortoli, Michael Hutchinson, Peter Wirnsberger, Arnaud Doucet

Abstract: Denoising Score Matching estimates the score of a noised version of a target distribution by minimizing a regression loss and is widely used to train the popular class of Denoising Diffusion Models. A well known limitation of Denoising Score Matching, however, is that it yields poor estimates of the score at low noise levels. This issue is particularly unfavourable for problems in the physical sci… ▽ More Denoising Score Matching estimates the score of a noised version of a target distribution by minimizing a regression loss and is widely used to train the popular class of Denoising Diffusion Models. A well known limitation of Denoising Score Matching, however, is that it yields poor estimates of the score at low noise levels. This issue is particularly unfavourable for problems in the physical sciences and for Monte Carlo sampling tasks for which the score of the clean original target is known. Intuitively, estimating the score of a slightly noised version of the target should be a simple task in such cases. In this paper, we address this shortcoming and show that it is indeed possible to leverage knowledge of the target score. We present a Target Score Identity and corresponding Target Score Matching regression loss which allows us to obtain score estimates admitting favourable properties at low noise levels. △ Less

Submitted 13 February, 2024; originally announced February 2024.

arXiv:2402.06320 [pdf, other]

Particle Denoising Diffusion Sampler

Authors: Angus Phillips, Hai-Dang Dau, Michael John Hutchinson, Valentin De Bortoli, George Deligiannidis, Arnaud Doucet

Abstract: Denoising diffusion models have become ubiquitous for generative modeling. The core idea is to transport the data distribution to a Gaussian by using a diffusion. Approximate samples from the data distribution are then obtained by estimating the time-reversal of this diffusion using score matching ideas. We follow here a similar strategy to sample from unnormalized probability densities and comput… ▽ More Denoising diffusion models have become ubiquitous for generative modeling. The core idea is to transport the data distribution to a Gaussian by using a diffusion. Approximate samples from the data distribution are then obtained by estimating the time-reversal of this diffusion using score matching ideas. We follow here a similar strategy to sample from unnormalized probability densities and compute their normalizing constants. However, the time-reversed diffusion is here simulated by using an original iterative particle scheme relying on a novel score matching loss. Contrary to standard denoising diffusion models, the resulting Particle Denoising Diffusion Sampler (PDDS) provides asymptotically consistent estimates under mild assumptions. We demonstrate PDDS on multimodal and high dimensional sampling tasks. △ Less

Submitted 15 June, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

Comments: To be published in ICML 2024. 37 pages, 20 figures, 3 tables, 5 algorithms

arXiv:2402.05468 [pdf, other]

Implicit Diffusion: Efficient Optimization through Stochastic Sampling

Authors: Pierre Marion, Anna Korba, Peter Bartlett, Mathieu Blondel, Valentin De Bortoli, Arnaud Doucet, Felipe Llinares-López, Courtney Paquette, Quentin Berthet

Abstract: We present a new algorithm to optimize distributions defined implicitly by parameterized stochastic diffusions. Doing so allows us to modify the outcome distribution of sampling processes by optimizing over their parameters. We introduce a general framework for first-order optimization of these processes, that performs jointly, in a single loop, optimization and sampling steps. This approach is in… ▽ More We present a new algorithm to optimize distributions defined implicitly by parameterized stochastic diffusions. Doing so allows us to modify the outcome distribution of sampling processes by optimizing over their parameters. We introduce a general framework for first-order optimization of these processes, that performs jointly, in a single loop, optimization and sampling steps. This approach is inspired by recent advances in bilevel optimization and automatic implicit differentiation, leveraging the point of view of sampling as optimization over the space of probability distributions. We provide theoretical guarantees on the performance of our method, as well as experimental results demonstrating its effectiveness. We apply it to training energy-based models and finetuning denoising diffusions. △ Less

Submitted 5 March, 2025; v1 submitted 8 February, 2024; originally announced February 2024.

Comments: 40 pages, 18 figures. Accepted to AISTATS 2025

arXiv:2311.06978 [pdf, other]

Augmented Bridge Matching

Authors: Valentin De Bortoli, Guan-Horng Liu, Tianrong Chen, Evangelos A. Theodorou, Weilie Nie

Abstract: Flow and bridge matching are a novel class of processes which encompass diffusion models. One of the main aspect of their increased flexibility is that these models can interpolate between arbitrary data distributions i.e. they generalize beyond generative modeling and can be applied to learning stochastic (and deterministic) processes of arbitrary transfer tasks between two given distributions. I… ▽ More Flow and bridge matching are a novel class of processes which encompass diffusion models. One of the main aspect of their increased flexibility is that these models can interpolate between arbitrary data distributions i.e. they generalize beyond generative modeling and can be applied to learning stochastic (and deterministic) processes of arbitrary transfer tasks between two given distributions. In this paper, we highlight that while flow and bridge matching processes preserve the information of the marginal distributions, they do \emph{not} necessarily preserve the coupling information unless additional, stronger optimality conditions are met. This can be problematic if one aims at preserving the original empirical pairing. We show that a simple modification of the matching process recovers this coupling by augmenting the velocity field (or drift) with the information of the initial sample point. Doing so, we lose the Markovian property of the process but preserve the coupling information between distributions. We illustrate the efficiency of our augmentation in learning mixture of image translation tasks. △ Less

Submitted 12 November, 2023; originally announced November 2023.

arXiv:2310.13102 [pdf, other]

Particle Guidance: non-I.I.D. Diverse Sampling with Diffusion Models

Authors: Gabriele Corso, Yilun Xu, Valentin de Bortoli, Regina Barzilay, Tommi Jaakkola

Abstract: In light of the widespread success of generative models, a significant amount of research has gone into speeding up their sampling time. However, generative models are often sampled multiple times to obtain a diverse set incurring a cost that is orthogonal to sampling time. We tackle the question of how to improve diversity and sample efficiency by moving beyond the common assumption of independen… ▽ More In light of the widespread success of generative models, a significant amount of research has gone into speeding up their sampling time. However, generative models are often sampled multiple times to obtain a diverse set incurring a cost that is orthogonal to sampling time. We tackle the question of how to improve diversity and sample efficiency by moving beyond the common assumption of independent samples. We propose particle guidance, an extension of diffusion-based generative sampling where a joint-particle time-evolving potential enforces diversity. We analyze theoretically the joint distribution that particle guidance generates, how to learn a potential that achieves optimal diversity, and the connections with methods in other disciplines. Empirically, we test the framework both in the setting of conditional image generation, where we are able to increase diversity without affecting quality, and molecular conformer generation, where we reduce the state-of-the-art median error by 13% on average. △ Less

Submitted 24 November, 2023; v1 submitted 19 October, 2023; originally announced October 2023.

arXiv:2310.03546 [pdf, other]

Plug-and-Play Posterior Sampling under Mismatched Measurement and Prior Models

Authors: Marien Renaud, Jiaming Liu, Valentin de Bortoli, Andrés Almansa, Ulugbek S. Kamilov

Abstract: Posterior sampling has been shown to be a powerful Bayesian approach for solving imaging inverse problems. The recent plug-and-play unadjusted Langevin algorithm (PnP-ULA) has emerged as a promising method for Monte Carlo sampling and minimum mean squared error (MMSE) estimation by combining physical measurement models with deep-learning priors specified using image denoisers. However, the intrica… ▽ More Posterior sampling has been shown to be a powerful Bayesian approach for solving imaging inverse problems. The recent plug-and-play unadjusted Langevin algorithm (PnP-ULA) has emerged as a promising method for Monte Carlo sampling and minimum mean squared error (MMSE) estimation by combining physical measurement models with deep-learning priors specified using image denoisers. However, the intricate relationship between the sampling distribution of PnP-ULA and the mismatched data-fidelity and denoiser has not been theoretically analyzed. We address this gap by proposing a posterior-L2 pseudometric and using it to quantify an explicit error bound for PnP-ULA under mismatched posterior distribution. We numerically validate our theory on several inverse problems such as sampling from Gaussian mixture models and image deblurring. Our results suggest that the sensitivity of the sampling distribution of PnP-ULA to a mismatch in the measurement model and the denoiser can be precisely characterized. △ Less

Submitted 10 March, 2025; v1 submitted 5 October, 2023; originally announced October 2023.

arXiv:2308.03686 [pdf, other]

Nearly $d$-Linear Convergence Bounds for Diffusion Models via Stochastic Localization

Authors: Joe Benton, Valentin De Bortoli, Arnaud Doucet, George Deligiannidis

Abstract: Denoising diffusions are a powerful method to generate approximate samples from high-dimensional data distributions. Recent results provide polynomial bounds on their convergence rate, assuming $L^2$-accurate scores. Until now, the tightest bounds were either superlinear in the data dimension or required strong smoothness assumptions. We provide the first convergence bounds which are linear in the… ▽ More Denoising diffusions are a powerful method to generate approximate samples from high-dimensional data distributions. Recent results provide polynomial bounds on their convergence rate, assuming $L^2$-accurate scores. Until now, the tightest bounds were either superlinear in the data dimension or required strong smoothness assumptions. We provide the first convergence bounds which are linear in the data dimension (up to logarithmic factors) assuming only finite second moments of the data distribution. We show that diffusion models require at most $\tilde O(\frac{d \log^2(1/δ)}{\varepsilon^2})$ steps to approximate an arbitrary distribution on $\mathbb{R}^d$ corrupted with Gaussian noise of variance $δ$ to within $\varepsilon^2$ in KL divergence. Our proof extends the Girsanov-based methods of previous works. We introduce a refined treatment of the error from discretizing the reverse SDE inspired by stochastic localization. △ Less

Submitted 5 March, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

arXiv:2307.05439 [pdf, other]

Metropolis Sampling for Constrained Diffusion Models

Authors: Nic Fishman, Leo Klarner, Emile Mathieu, Michael Hutchinson, Valentin de Bortoli

Abstract: Denoising diffusion models have recently emerged as the predominant paradigm for generative modelling on image domains. In addition, their extension to Riemannian manifolds has facilitated a range of applications across the natural sciences. While many of these problems stand to benefit from the ability to specify arbitrary, domain-informed constraints, this setting is not covered by the existing… ▽ More Denoising diffusion models have recently emerged as the predominant paradigm for generative modelling on image domains. In addition, their extension to Riemannian manifolds has facilitated a range of applications across the natural sciences. While many of these problems stand to benefit from the ability to specify arbitrary, domain-informed constraints, this setting is not covered by the existing (Riemannian) diffusion model methodology. Recent work has attempted to address this issue by constructing novel noising processes based on the reflected Brownian motion and logarithmic barrier methods. However, the associated samplers are either computationally burdensome or only apply to convex subsets of Euclidean space. In this paper, we introduce an alternative, simple noising scheme based on Metropolis sampling that affords substantial gains in computational efficiency and empirical performance compared to the earlier samplers. Of independent interest, we prove that this new process corresponds to a valid discretisation of the reflected Brownian motion. We demonstrate the scalability and flexibility of our approach on a range of problem settings with convex and non-convex constraints, including applications from geospatial modelling, robotics and protein design. △ Less

Submitted 9 November, 2023; v1 submitted 11 July, 2023; originally announced July 2023.

Comments: NeurIPS 2023

arXiv:2307.05431 [pdf, other]

Geometric Neural Diffusion Processes

Authors: Emile Mathieu, Vincent Dutordoir, Michael J. Hutchinson, Valentin De Bortoli, Yee Whye Teh, Richard E. Turner

Abstract: Denoising diffusion models have proven to be a flexible and effective paradigm for generative modelling. Their recent extension to infinite dimensional Euclidean spaces has allowed for the modelling of stochastic processes. However, many problems in the natural sciences incorporate symmetries and involve data living in non-Euclidean spaces. In this work, we extend the framework of diffusion models… ▽ More Denoising diffusion models have proven to be a flexible and effective paradigm for generative modelling. Their recent extension to infinite dimensional Euclidean spaces has allowed for the modelling of stochastic processes. However, many problems in the natural sciences incorporate symmetries and involve data living in non-Euclidean spaces. In this work, we extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimension modelling. We do so by a) constructing a noising process which admits, as limiting distribution, a geometric Gaussian process that transforms under the symmetry group of interest, and b) approximating the score with a neural network that is equivariant w.r.t. this group. We show that with these conditions, the generative functional model admits the same symmetry. We demonstrate scalability and capacity of the model, using a novel Langevin-based conditional sampler, to fit complex scalar and vector fields, with Euclidean and spherical codomain, on synthetic and real-world weather data. △ Less

Submitted 11 July, 2023; originally announced July 2023.

arXiv:2306.09099 [pdf, other]

Unbalanced Diffusion Schrödinger Bridge

Authors: Matteo Pariset, Ya-Ping Hsieh, Charlotte Bunne, Andreas Krause, Valentin De Bortoli

Abstract: Schrödinger bridges (SBs) provide an elegant framework for modeling the temporal evolution of populations in physical, chemical, or biological systems. Such natural processes are commonly subject to changes in population size over time due to the emergence of new species or birth and death events. However, existing neural parameterizations of SBs such as diffusion Schrödinger bridges (DSBs) are re… ▽ More Schrödinger bridges (SBs) provide an elegant framework for modeling the temporal evolution of populations in physical, chemical, or biological systems. Such natural processes are commonly subject to changes in population size over time due to the emergence of new species or birth and death events. However, existing neural parameterizations of SBs such as diffusion Schrödinger bridges (DSBs) are restricted to settings in which the endpoints of the stochastic process are both probability measures and assume conservation of mass constraints. To address this limitation, we introduce unbalanced DSBs which model the temporal evolution of marginals with arbitrary finite mass. This is achieved by deriving the time reversal of stochastic differential equations with killing and birth terms. We present two novel algorithmic schemes that comprise a scalable objective function for training unbalanced DSBs and provide a theoretical analysis alongside challenging applications on predicting heterogeneous molecular single-cell responses to various cancer drugs and simulating the emergence and spread of new viral variants. △ Less

Submitted 15 June, 2023; originally announced June 2023.

arXiv:2305.16557 [pdf, other]

Tree-Based Diffusion Schrödinger Bridge with Applications to Wasserstein Barycenters

Authors: Maxence Noble, Valentin De Bortoli, Arnaud Doucet, Alain Durmus

Abstract: Multi-marginal Optimal Transport (mOT), a generalization of OT, aims at minimizing the integral of a cost function with respect to a distribution with some prescribed marginals. In this paper, we consider an entropic version of mOT with a tree-structured quadratic cost, i.e., a function that can be written as a sum of pairwise cost functions between the nodes of a tree. To address this problem, we… ▽ More Multi-marginal Optimal Transport (mOT), a generalization of OT, aims at minimizing the integral of a cost function with respect to a distribution with some prescribed marginals. In this paper, we consider an entropic version of mOT with a tree-structured quadratic cost, i.e., a function that can be written as a sum of pairwise cost functions between the nodes of a tree. To address this problem, we develop Tree-based Diffusion Schrödinger Bridge (TreeDSB), an extension of the Diffusion Schrödinger Bridge (DSB) algorithm. TreeDSB corresponds to a dynamic and continuous state-space counterpart of the multimarginal Sinkhorn algorithm. A notable use case of our methodology is to compute Wasserstein barycenters which can be recast as the solution of a mOT problem on a star-shaped tree. We demonstrate that our methodology can be applied in high-dimensional settings such as image interpolation and Bayesian fusion. △ Less

Submitted 28 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

arXiv:2305.16261 [pdf, other]

Trans-Dimensional Generative Modeling via Jump Diffusion Models

Authors: Andrew Campbell, William Harvey, Christian Weilbach, Valentin De Bortoli, Tom Rainforth, Arnaud Doucet

Abstract: We propose a new class of generative models that naturally handle data of varying dimensionality by jointly modeling the state and dimension of each datapoint. The generative process is formulated as a jump diffusion process that makes jumps between different dimensional spaces. We first define a dimension destroying forward noising process, before deriving the dimension creating time-reversed gen… ▽ More We propose a new class of generative models that naturally handle data of varying dimensionality by jointly modeling the state and dimension of each datapoint. The generative process is formulated as a jump diffusion process that makes jumps between different dimensional spaces. We first define a dimension destroying forward noising process, before deriving the dimension creating time-reversed generative process along with a novel evidence lower bound training objective for learning to approximate it. Simulating our learned approximation to the time-reversed generative process then provides an effective way of sampling data of varying dimensionality by jointly generating state values and dimensions. We demonstrate our approach on molecular and video datasets of varying dimensionality, reporting better compatibility with test-time diffusion guidance imputation tasks and improved interpolation capabilities versus fixed dimensional models that generate state values and dimensions separately. △ Less

Submitted 30 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: 41 pages, 11 figures, 8 tables; NeurIPS 2023

arXiv:2304.05364 [pdf, other]

Diffusion Models for Constrained Domains

Authors: Nic Fishman, Leo Klarner, Valentin De Bortoli, Emile Mathieu, Michael Hutchinson

Abstract: Denoising diffusion models are a novel class of generative algorithms that achieve state-of-the-art performance across a range of domains, including image generation and text-to-image tasks. Building on this success, diffusion models have recently been extended to the Riemannian manifold setting, broadening their applicability to a range of problems from the natural and engineering sciences. Howev… ▽ More Denoising diffusion models are a novel class of generative algorithms that achieve state-of-the-art performance across a range of domains, including image generation and text-to-image tasks. Building on this success, diffusion models have recently been extended to the Riemannian manifold setting, broadening their applicability to a range of problems from the natural and engineering sciences. However, these Riemannian diffusion models are built on the assumption that their forward and backward processes are well-defined for all times, preventing them from being applied to an important set of tasks that consider manifolds defined via a set of inequality constraints. In this work, we introduce a principled framework to bridge this gap. We present two distinct noising processes based on (i) the logarithmic barrier metric and (ii) the reflected Brownian motion induced by the constraints. As existing diffusion model techniques cannot be applied in this setting, we derive new tools to define such models in our framework. We then demonstrate the practical utility of our methods on a number of synthetic and real-world tasks, including applications from robotics and protein design. △ Less

Submitted 7 March, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: Published in Transactions on Machine Learning Research (07/2023)

arXiv:2303.16852 [pdf, other]

Diffusion Schrödinger Bridge Matching

Authors: Yuyang Shi, Valentin De Bortoli, Andrew Campbell, Arnaud Doucet

Abstract: Solving transport problems, i.e. finding a map transporting one given distribution to another, has numerous applications in machine learning. Novel mass transport methods motivated by generative modeling have recently been proposed, e.g. Denoising Diffusion Models (DDMs) and Flow Matching Models (FMMs) implement such a transport through a Stochastic Differential Equation (SDE) or an Ordinary Diffe… ▽ More Solving transport problems, i.e. finding a map transporting one given distribution to another, has numerous applications in machine learning. Novel mass transport methods motivated by generative modeling have recently been proposed, e.g. Denoising Diffusion Models (DDMs) and Flow Matching Models (FMMs) implement such a transport through a Stochastic Differential Equation (SDE) or an Ordinary Differential Equation (ODE). However, while it is desirable in many applications to approximate the deterministic dynamic Optimal Transport (OT) map which admits attractive properties, DDMs and FMMs are not guaranteed to provide transports close to the OT map. In contrast, Schrödinger bridges (SBs) compute stochastic dynamic mappings which recover entropy-regularized versions of OT. Unfortunately, existing numerical methods approximating SBs either scale poorly with dimension or accumulate errors across iterations. In this work, we introduce Iterative Markovian Fitting (IMF), a new methodology for solving SB problems, and Diffusion Schrödinger Bridge Matching (DSBM), a novel numerical algorithm for computing IMF iterates. DSBM significantly improves over previous SB numerics and recovers as special/limiting cases various recent transport methods. We demonstrate the performance of DSBM on a variety of problems. △ Less

Submitted 11 December, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

arXiv:2302.02277 [pdf, other]

SE(3) diffusion model with application to protein backbone generation

Authors: Jason Yim, Brian L. Trippe, Valentin De Bortoli, Emile Mathieu, Arnaud Doucet, Regina Barzilay, Tommi Jaakkola

Abstract: The design of novel protein structures remains a challenge in protein engineering for applications across biomedicine and chemistry. In this line of work, a diffusion model over rigid bodies in 3D (referred to as frames) has shown success in generating novel, functional protein backbones that have not been observed in nature. However, there exists no principled methodological framework for diffusi… ▽ More The design of novel protein structures remains a challenge in protein engineering for applications across biomedicine and chemistry. In this line of work, a diffusion model over rigid bodies in 3D (referred to as frames) has shown success in generating novel, functional protein backbones that have not been observed in nature. However, there exists no principled methodological framework for diffusion on SE(3), the space of orientation preserving rigid motions in R3, that operates on frames and confers the group invariance. We address these shortcomings by developing theoretical foundations of SE(3) invariant diffusion models on multiple frames followed by a novel framework, FrameDiff, for learning the SE(3) equivariant score over multiple frames. We apply FrameDiff on monomer backbone generation and find it can generate designable monomers up to 500 amino acids without relying on a pretrained protein structure prediction network that has been integral to previous methods. We find our samples are capable of generalizing beyond any known protein structure. △ Less

Submitted 22 May, 2023; v1 submitted 4 February, 2023; originally announced February 2023.

Journal ref: International Conference of Machine Learning (ICML) 2023

arXiv:2211.03595 [pdf, other]

From Denoising Diffusions to Denoising Markov Models

Authors: Joe Benton, Yuyang Shi, Valentin De Bortoli, George Deligiannidis, Arnaud Doucet

Abstract: Denoising diffusions are state-of-the-art generative models exhibiting remarkable empirical performance. They work by diffusing the data distribution into a Gaussian distribution and then learning to reverse this noising process to obtain synthetic datapoints. The denoising diffusion relies on approximations of the logarithmic derivatives of the noised data densities using score matching. Such mod… ▽ More Denoising diffusions are state-of-the-art generative models exhibiting remarkable empirical performance. They work by diffusing the data distribution into a Gaussian distribution and then learning to reverse this noising process to obtain synthetic datapoints. The denoising diffusion relies on approximations of the logarithmic derivatives of the noised data densities using score matching. Such models can also be used to perform approximate posterior simulation when one can only sample from the prior and likelihood. We propose a unifying framework generalising this approach to a wide class of spaces and leading to an original extension of score matching. We illustrate the resulting models on various applications. △ Less

Submitted 18 February, 2024; v1 submitted 7 November, 2022; originally announced November 2022.

arXiv:2210.11925 [pdf, other]

Unbiased constrained sampling with Self-Concordant Barrier Hamiltonian Monte Carlo

Authors: Maxence Noble, Valentin De Bortoli, Alain Durmus

Abstract: In this paper, we propose Barrier Hamiltonian Monte Carlo (BHMC), a version of the HMC algorithm which aims at sampling from a Gibbs distribution $π$ on a manifold $\mathrm{M}$, endowed with a Hessian metric $\mathfrak{g}$ derived from a self-concordant barrier. Our method relies on Hamiltonian dynamics which comprises $\mathfrak{g}$. Therefore, it incorporates the constraints defining… ▽ More In this paper, we propose Barrier Hamiltonian Monte Carlo (BHMC), a version of the HMC algorithm which aims at sampling from a Gibbs distribution $π$ on a manifold $\mathrm{M}$, endowed with a Hessian metric $\mathfrak{g}$ derived from a self-concordant barrier. Our method relies on Hamiltonian dynamics which comprises $\mathfrak{g}$. Therefore, it incorporates the constraints defining $\mathrm{M}$ and is able to exploit its underlying geometry. However, the corresponding Hamiltonian dynamics is defined via non separable Ordinary Differential Equations (ODEs) in contrast to the Euclidean case. It implies unavoidable bias in existing generalization of HMC to Riemannian manifolds. In this paper, we propose a new filter step, called "involution checking step", to address this problem. This step is implemented in two versions of BHMC, coined continuous BHMC (c-BHMC) and numerical BHMC (n-BHMC) respectively. Our main results establish that these two new algorithms generate reversible Markov chains with respect to $π$ and do not suffer from any bias in comparison to previous implementations. Our conclusions are supported by numerical experiments where we consider target distributions defined on polytopes. △ Less

Submitted 28 October, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

arXiv:2209.14125 [pdf, other]

Spectral Diffusion Processes

Authors: Angus Phillips, Thomas Seror, Michael Hutchinson, Valentin De Bortoli, Arnaud Doucet, Emile Mathieu

Abstract: Score-based generative modelling (SGM) has proven to be a very effective method for modelling densities on finite-dimensional spaces. In this work we propose to extend this methodology to learn generative models over functional spaces. To do so, we represent functional data in spectral space to dissociate the stochastic part of the processes from their space-time part. Using dimensionality reducti… ▽ More Score-based generative modelling (SGM) has proven to be a very effective method for modelling densities on finite-dimensional spaces. In this work we propose to extend this methodology to learn generative models over functional spaces. To do so, we represent functional data in spectral space to dissociate the stochastic part of the processes from their space-time part. Using dimensionality reduction techniques we then sample from their stochastic component using finite dimensional SGM. We demonstrate our method's effectiveness for modelling various multimodal datasets. △ Less

Submitted 28 November, 2022; v1 submitted 28 September, 2022; originally announced September 2022.

Comments: 17 pages, 11 figures, Score-based Method Workshop at 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

arXiv:2208.05314 [pdf, other]

Convergence of denoising diffusion models under the manifold hypothesis

Authors: Valentin De Bortoli

Abstract: Denoising diffusion models are a recent class of generative models exhibiting state-of-the-art performance in image and audio synthesis. Such models approximate the time-reversal of a forward noising process from a target distribution to a reference density, which is usually Gaussian. Despite their strong empirical results, the theoretical analysis of such models remains limited. In particular, al… ▽ More Denoising diffusion models are a recent class of generative models exhibiting state-of-the-art performance in image and audio synthesis. Such models approximate the time-reversal of a forward noising process from a target distribution to a reference density, which is usually Gaussian. Despite their strong empirical results, the theoretical analysis of such models remains limited. In particular, all current approaches crucially assume that the target density admits a density w.r.t. the Lebesgue measure. This does not cover settings where the target distribution is supported on a lower-dimensional manifold or is given by some empirical distribution. In this paper, we bridge this gap by providing the first convergence results for diffusion models in this more general setting. In particular, we provide quantitative bounds on the Wasserstein distance of order one between the target data distribution and the generative distribution of the diffusion model. △ Less

Submitted 29 May, 2023; v1 submitted 10 August, 2022; originally announced August 2022.

arXiv:2208.05003 [pdf, other]

Wavelet Score-Based Generative Modeling

Authors: Florentin Guth, Simon Coste, Valentin De Bortoli, Stephane Mallat

Abstract: Score-based generative models (SGMs) synthesize new data samples from Gaussian white noise by running a time-reversed Stochastic Differential Equation (SDE) whose drift coefficient depends on some probabilistic score. The discretization of such SDEs typically requires a large number of time steps and hence a high computational cost. This is because of ill-conditioning properties of the score that… ▽ More Score-based generative models (SGMs) synthesize new data samples from Gaussian white noise by running a time-reversed Stochastic Differential Equation (SDE) whose drift coefficient depends on some probabilistic score. The discretization of such SDEs typically requires a large number of time steps and hence a high computational cost. This is because of ill-conditioning properties of the score that we analyze mathematically. We show that SGMs can be considerably accelerated, by factorizing the data distribution into a product of conditional probabilities of wavelet coefficients across scales. The resulting Wavelet Score-based Generative Model (WSGM) synthesizes wavelet coefficients with the same number of time steps at all scales, and its time complexity therefore grows linearly with the image size. This is proved mathematically over Gaussian distributions, and shown numerically over physical processes at phase transition and natural image datasets. △ Less

Submitted 9 August, 2022; originally announced August 2022.

arXiv:2207.03024 [pdf, other]

Riemannian Diffusion Schrödinger Bridge

Authors: James Thornton, Michael Hutchinson, Emile Mathieu, Valentin De Bortoli, Yee Whye Teh, Arnaud Doucet

Abstract: Score-based generative models exhibit state of the art performance on density estimation and generative modeling tasks. These models typically assume that the data geometry is flat, yet recent extensions have been developed to synthesize data living on Riemannian manifolds. Existing methods to accelerate sampling of diffusion models are typically not applicable in the Riemannian setting and Rieman… ▽ More Score-based generative models exhibit state of the art performance on density estimation and generative modeling tasks. These models typically assume that the data geometry is flat, yet recent extensions have been developed to synthesize data living on Riemannian manifolds. Existing methods to accelerate sampling of diffusion models are typically not applicable in the Riemannian setting and Riemannian score-based methods have not yet been adapted to the important task of interpolation of datasets. To overcome these issues, we introduce \emph{Riemannian Diffusion Schrödinger Bridge}. Our proposed method generalizes Diffusion Schrödinger Bridge introduced in \cite{debortoli2021neurips} to the non-Euclidean setting and extends Riemannian score-based models beyond the first time reversal. We validate our proposed method on synthetic data and real Earth and climate data. △ Less

Submitted 6 July, 2022; originally announced July 2022.

Comments: Accepted to Continuous Time Methods for Machine Learning, ICML 2022

arXiv:2206.14476 [pdf, other]

Can Push-forward Generative Models Fit Multimodal Distributions?

Authors: Antoine Salmona, Valentin de Bortoli, Julie Delon, Agnès Desolneux

Abstract: Many generative models synthesize data by transforming a standard Gaussian random variable using a deterministic neural network. Among these models are the Variational Autoencoders and the Generative Adversarial Networks. In this work, we call them "push-forward" models and study their expressivity. We show that the Lipschitz constant of these generative networks has to be large in order to fit mu… ▽ More Many generative models synthesize data by transforming a standard Gaussian random variable using a deterministic neural network. Among these models are the Variational Autoencoders and the Generative Adversarial Networks. In this work, we call them "push-forward" models and study their expressivity. We show that the Lipschitz constant of these generative networks has to be large in order to fit multimodal distributions. More precisely, we show that the total variation distance and the Kullback-Leibler divergence between the generated and the data distribution are bounded from below by a constant depending on the mode separation and the Lipschitz constant. Since constraining the Lipschitz constants of neural networks is a common way to stabilize generative models, there is a provable trade-off between the ability of push-forward models to approximate multimodal distributions and the stability of their training. We validate our findings on one-dimensional and image datasets and empirically show that generative models consisting of stacked networks with stochastic input at each step, such as diffusion models do not suffer of such limitations. △ Less

Submitted 12 October, 2022; v1 submitted 29 June, 2022; originally announced June 2022.

Comments: In Thirty-sixth Conference on Neural Information Processing Systems

arXiv:2205.14987 [pdf, other]

A Continuous Time Framework for Discrete Denoising Models

Authors: Andrew Campbell, Joe Benton, Valentin De Bortoli, Tom Rainforth, George Deligiannidis, Arnaud Doucet

Abstract: We provide the first complete continuous time framework for denoising diffusion models of discrete data. This is achieved by formulating the forward noising process and corresponding reverse time generative process as Continuous Time Markov Chains (CTMCs). The model can be efficiently trained using a continuous time version of the ELBO. We simulate the high dimensional CTMC using techniques develo… ▽ More We provide the first complete continuous time framework for denoising diffusion models of discrete data. This is achieved by formulating the forward noising process and corresponding reverse time generative process as Continuous Time Markov Chains (CTMCs). The model can be efficiently trained using a continuous time version of the ELBO. We simulate the high dimensional CTMC using techniques developed in chemical physics and exploit our continuous time framework to derive high performance samplers that we show can outperform discrete time methods for discrete data. The continuous time treatment also enables us to derive a novel theoretical result bounding the error between the generated sample distribution and the true data distribution. △ Less

Submitted 14 October, 2022; v1 submitted 30 May, 2022; originally announced May 2022.

Comments: 44 pages, 15 figures; NeurIPS 2022

arXiv:2202.13460 [pdf, other]

Conditional Simulation Using Diffusion Schrödinger Bridges

Authors: Yuyang Shi, Valentin De Bortoli, George Deligiannidis, Arnaud Doucet

Abstract: Denoising diffusion models have recently emerged as a powerful class of generative models. They provide state-of-the-art results, not only for unconditional simulation, but also when used to solve conditional simulation problems arising in a wide range of inverse problems. A limitation of these models is that they are computationally intensive at generation time as they require simulating a diffus… ▽ More Denoising diffusion models have recently emerged as a powerful class of generative models. They provide state-of-the-art results, not only for unconditional simulation, but also when used to solve conditional simulation problems arising in a wide range of inverse problems. A limitation of these models is that they are computationally intensive at generation time as they require simulating a diffusion process over a long time horizon. When performing unconditional simulation, a Schrödinger bridge formulation of generative modeling leads to a theoretically grounded algorithm shortening generation time which is complementary to other proposed acceleration techniques. We extend the Schrödinger bridge framework to conditional simulation. We demonstrate this novel methodology on various applications including image super-resolution, optimal filtering for state-space models and the refinement of pre-trained networks. Our code can be found at https://github.com/vdeborto/cdsb. △ Less

Submitted 26 June, 2022; v1 submitted 27 February, 2022; originally announced February 2022.

Comments: 29 pages, 15 figures. UAI 2022 camera-ready version

arXiv:2202.02763 [pdf, other]

Riemannian Score-Based Generative Modelling

Authors: Valentin De Bortoli, Emile Mathieu, Michael Hutchinson, James Thornton, Yee Whye Teh, Arnaud Doucet

Abstract: Score-based generative models (SGMs) are a powerful class of generative models that exhibit remarkable empirical performance. Score-based generative modelling (SGM) consists of a ``noising'' stage, whereby a diffusion is used to gradually add Gaussian noise to data, and a generative model, which entails a ``denoising'' process defined by approximating the time-reversal of the diffusion. Existing S… ▽ More Score-based generative models (SGMs) are a powerful class of generative models that exhibit remarkable empirical performance. Score-based generative modelling (SGM) consists of a ``noising'' stage, whereby a diffusion is used to gradually add Gaussian noise to data, and a generative model, which entails a ``denoising'' process defined by approximating the time-reversal of the diffusion. Existing SGMs assume that data is supported on a Euclidean space, i.e. a manifold with flat geometry. In many domains such as robotics, geoscience or protein modelling, data is often naturally described by distributions living on Riemannian manifolds and current SGM techniques are not appropriate. We introduce here Riemannian Score-based Generative Models (RSGMs), a class of generative models extending SGMs to Riemannian manifolds. We demonstrate our approach on a variety of manifolds, and in particular with earth and climate science spherical data. △ Less

Submitted 22 November, 2022; v1 submitted 6 February, 2022; originally announced February 2022.

Comments: Neurips 2022 camera ready

arXiv:2201.06133 [pdf, other]

doi 10.1007/s10851-022-01134-7

On Maximum-a-Posteriori estimation with Plug & Play priors and stochastic gradient descent

Authors: Rémi Laumont, Valentin de Bortoli, Andrés Almansa, Julie Delon, Alain Durmus, Marcelo Pereyra

Abstract: Bayesian methods to solve imaging inverse problems usually combine an explicit data likelihood function with a prior distribution that explicitly models expected properties of the solution. Many kinds of priors have been explored in the literature, from simple ones expressing local properties to more involved ones exploiting image redundancy at a non-local scale. In a departure from explicit model… ▽ More Bayesian methods to solve imaging inverse problems usually combine an explicit data likelihood function with a prior distribution that explicitly models expected properties of the solution. Many kinds of priors have been explored in the literature, from simple ones expressing local properties to more involved ones exploiting image redundancy at a non-local scale. In a departure from explicit modelling, several recent works have proposed and studied the use of implicit priors defined by an image denoising algorithm. This approach, commonly known as Plug & Play (PnP) regularisation, can deliver remarkably accurate results, particularly when combined with state-of-the-art denoisers based on convolutional neural networks. However, the theoretical analysis of PnP Bayesian models and algorithms is difficult and works on the topic often rely on unrealistic assumptions on the properties of the image denoiser. This papers studies maximum-a-posteriori (MAP) estimation for Bayesian models with PnP priors. We first consider questions related to existence, stability and well-posedness, and then present a convergence proof for MAP computation by PnP stochastic gradient descent (PnP-SGD) under realistic assumptions on the denoiser used. We report a range of imaging experiments demonstrating PnP-SGD as well as comparisons with other PnP schemes. △ Less

Submitted 16 January, 2022; originally announced January 2022.

Report number: https://hal.science/hal-03348735/ MSC Class: 65K10 (Primary) 65K05; 62F15; 62C10; 68Q25; 68U10; 90C26 (Secondary) 65K10; 65K05; 62F15; 62C10; 68Q25; 68U10; 90C26

Journal ref: Journal of Mathematical Imaging and Vision, Volume 65, Pages 140-163 (2023)

arXiv:2111.07243 [pdf, ps, other]

Simulating Diffusion Bridges with Score Matching

Authors: Jeremy Heng, Valentin De Bortoli, Arnaud Doucet, James Thornton

Abstract: We consider the problem of simulating diffusion bridges, which are diffusion processes that are conditioned to initialize and terminate at two given states. The simulation of diffusion bridges has applications in diverse scientific fields and plays a crucial role in the statistical inference of discretely-observed diffusions. This is known to be a challenging problem that has received much attenti… ▽ More We consider the problem of simulating diffusion bridges, which are diffusion processes that are conditioned to initialize and terminate at two given states. The simulation of diffusion bridges has applications in diverse scientific fields and plays a crucial role in the statistical inference of discretely-observed diffusions. This is known to be a challenging problem that has received much attention in the last two decades. This article contributes to this rich body of literature by presenting a new avenue to obtain diffusion bridge approximations. Our approach is based on a backward time representation of a diffusion bridge, which may be simulated if one can time-reverse the unconditioned diffusion. We introduce a variational formulation to learn this time-reversal with function approximation and rely on a score matching method to circumvent intractability. Another iteration of our proposed methodology approximates the Doob's $h$-transform defining the forward time representation of a diffusion bridge. We discuss algorithmic considerations and extensions, and present numerical results on an Ornstein--Uhlenbeck process, a model from financial econometrics for interest rates, and a model from genetics for cell differentiation and development to illustrate the effectiveness of our approach. △ Less

Submitted 18 June, 2025; v1 submitted 14 November, 2021; originally announced November 2021.

Comments: Revised

arXiv:2110.12922 [pdf, other]

On quantitative Laplace-type convergence results for some exponential probability measures, with two applications

Authors: Valentin De Bortoli, Agnès Desolneux

Abstract: Laplace-type results characterize the limit of sequence of measures $(π_\varepsilon)_{\varepsilon >0}$ with density w.r.t the Lebesgue measure $(\mathrm{d} π_\varepsilon / \mathrm{d} \mathrm{Leb})(x) \propto \exp[-U(x)/\varepsilon]$ when the temperature $\varepsilon>0$ converges to $0$. If a limiting distribution $π_0$ exists, it concentrates on the minimizers of the potential $U$. Classical resul… ▽ More Laplace-type results characterize the limit of sequence of measures $(π_\varepsilon)_{\varepsilon >0}$ with density w.r.t the Lebesgue measure $(\mathrm{d} π_\varepsilon / \mathrm{d} \mathrm{Leb})(x) \propto \exp[-U(x)/\varepsilon]$ when the temperature $\varepsilon>0$ converges to $0$. If a limiting distribution $π_0$ exists, it concentrates on the minimizers of the potential $U$. Classical results require the invertibility of the Hessian of $U$ in order to establish such asymptotics. In this work, we study the particular case of norm-like potentials $U$ and establish quantitative bounds between $π_\varepsilon$ and $π_0$ w.r.t. the Wasserstein distance of order $1$ under an invertibility condition of a generalized Jacobian. One key element of our proof is the use of geometric measure theory tools such as the coarea formula. We apply our results to the study of maximum entropy models (microcanonical/macrocanonical distributions) and to the convergence of the iterates of the Stochastic Gradient Langevin Dynamics (SGLD) algorithm at low temperatures for non-convex minimization. △ Less

Submitted 25 October, 2021; originally announced October 2021.

arXiv:2108.08129 [pdf, ps, other]

Quantitative Uniform Stability of the Iterative Proportional Fitting Procedure

Authors: George Deligiannidis, Valentin De Bortoli, Arnaud Doucet

Abstract: We establish the uniform in time stability, w.r.t. the marginals, of the Iterative Proportional Fitting Procedure, also known as Sinkhorn algorithm, used to solve entropy-regularised Optimal Transport problems. Our result is quantitative and stated in terms of the 1-Wasserstein metric. As a corollary we establish a quantitative stability result for Schrödinger bridges. We establish the uniform in time stability, w.r.t. the marginals, of the Iterative Proportional Fitting Procedure, also known as Sinkhorn algorithm, used to solve entropy-regularised Optimal Transport problems. Our result is quantitative and stated in terms of the 1-Wasserstein metric. As a corollary we establish a quantitative stability result for Schrödinger bridges. △ Less

Submitted 22 October, 2021; v1 submitted 18 August, 2021; originally announced August 2021.

Comments: 14 pages

arXiv:2106.01357 [pdf, other]

Diffusion Schrödinger Bridge with Applications to Score-Based Generative Modeling

Authors: Valentin De Bortoli, James Thornton, Jeremy Heng, Arnaud Doucet

Abstract: Progressively applying Gaussian noise transforms complex data distributions to approximately Gaussian. Reversing this dynamic defines a generative model. When the forward noising process is given by a Stochastic Differential Equation (SDE), Song et al. (2021) demonstrate how the time inhomogeneous drift of the associated reverse-time SDE may be estimated using score-matching. A limitation of this… ▽ More Progressively applying Gaussian noise transforms complex data distributions to approximately Gaussian. Reversing this dynamic defines a generative model. When the forward noising process is given by a Stochastic Differential Equation (SDE), Song et al. (2021) demonstrate how the time inhomogeneous drift of the associated reverse-time SDE may be estimated using score-matching. A limitation of this approach is that the forward-time SDE must be run for a sufficiently long time for the final distribution to be approximately Gaussian. In contrast, solving the Schrödinger Bridge problem (SB), i.e. an entropy-regularized optimal transport problem on path spaces, yields diffusions which generate samples from the data distribution in finite time. We present Diffusion SB (DSB), an original approximation of the Iterative Proportional Fitting (IPF) procedure to solve the SB problem, and provide theoretical analysis along with generative modeling experiments. The first DSB iteration recovers the methodology proposed by Song et al. (2021), with the flexibility of using shorter time intervals, as subsequent DSB iterations reduce the discrepancy between the final-time marginal of the forward (resp. backward) SDE with respect to the prior (resp. data) distribution. Beyond generative modeling, DSB offers a widely applicable computational optimal transport tool as the continuous state-space analogue of the popular Sinkhorn algorithm (Cuturi, 2013). △ Less

Submitted 5 April, 2023; v1 submitted 1 June, 2021; originally announced June 2021.

Comments: NeurIPS 2021 (spotlight)

arXiv:2103.04715 [pdf, other]

doi 10.1137/21M1406349

Bayesian imaging using Plug & Play priors: when Langevin meets Tweedie

Authors: Rémi Laumont, Valentin de Bortoli, Andrés Almansa, Julie Delon, Alain Durmus, Marcelo Pereyra

Abstract: Since the seminal work of Venkatakrishnan et al. in 2013, Plug & Play (PnP) methods have become ubiquitous in Bayesian imaging. These methods derive Minimum Mean Square Error (MMSE) or Maximum A Posteriori (MAP) estimators for inverse problems in imaging by combining an explicit likelihood function with a prior that is implicitly defined by an image denoising algorithm. The PnP algorithms proposed… ▽ More Since the seminal work of Venkatakrishnan et al. in 2013, Plug & Play (PnP) methods have become ubiquitous in Bayesian imaging. These methods derive Minimum Mean Square Error (MMSE) or Maximum A Posteriori (MAP) estimators for inverse problems in imaging by combining an explicit likelihood function with a prior that is implicitly defined by an image denoising algorithm. The PnP algorithms proposed in the literature mainly differ in the iterative schemes they use for optimisation or for sampling. In the case of optimisation schemes, some recent works guarantee the convergence to a fixed point, albeit not necessarily a MAP estimate. In the case of sampling schemes, to the best of our knowledge, there is no known proof of convergence. There also remain important open questions regarding whether the underlying Bayesian models and estimators are well defined, well-posed, and have the basic regularity properties required to support these numerical schemes. To address these limitations, this paper develops theory, methods, and provably convergent algorithms for performing Bayesian inference with PnP priors. We introduce two algorithms: 1) PnP-ULA (Unadjusted Langevin Algorithm) for Monte Carlo sampling and MMSE inference; and 2) PnP-SGD (Stochastic Gradient Descent) for MAP inference. Using recent results on the quantitative convergence of Markov chains, we establish detailed convergence guarantees for these two algorithms under realistic assumptions on the denoising operators used, with special attention to denoisers based on deep neural networks. We also show that these algorithms approximately target a decision-theoretically optimal Bayesian model that is well-posed. The proposed algorithms are demonstrated on several canonical problems such as image deblurring, inpainting, and denoising, where they are used for point estimation as well as for uncertainty visualisation and quantification. △ Less

Submitted 12 January, 2022; v1 submitted 8 March, 2021; originally announced March 2021.

Report number: https://hal.science/hal-03161400/ MSC Class: 65K10; 65K05; 65D18; 62F15; 62C10; 68Q25; 68U10; 90C26

Journal ref: SIAM Journal on Imaging Sciences, Volume 15, Issue 2 (2022)

arXiv:2007.06352 [pdf, other]

Quantitative Propagation of Chaos for SGD in Wide Neural Networks

Authors: Valentin De Bortoli, Alain Durmus, Xavier Fontaine, Umut Simsekli

Abstract: In this paper, we investigate the limiting behavior of a continuous-time counterpart of the Stochastic Gradient Descent (SGD) algorithm applied to two-layer overparameterized neural networks, as the number or neurons (ie, the size of the hidden layer) $N \to +\infty$. Following a probabilistic approach, we show 'propagation of chaos' for the particle system defined by this continuous-time dynamics… ▽ More In this paper, we investigate the limiting behavior of a continuous-time counterpart of the Stochastic Gradient Descent (SGD) algorithm applied to two-layer overparameterized neural networks, as the number or neurons (ie, the size of the hidden layer) $N \to +\infty$. Following a probabilistic approach, we show 'propagation of chaos' for the particle system defined by this continuous-time dynamics under different scenarios, indicating that the statistical interaction between the particles asymptotically vanishes. In particular, we establish quantitative convergence with respect to $N$ of any particle to a solution of a mean-field McKean-Vlasov equation in the metric space endowed with the Wasserstein distance. In comparison to previous works on the subject, we consider settings in which the sequence of stepsizes in SGD can potentially depend on the number of neurons and the iterations. We then identify two regimes under which different mean-field limits are obtained, one of them corresponding to an implicitly regularized version of the minimization problem at hand. We perform various experiments on real datasets to validate our theoretical results, assessing the existence of these two regimes on classification problems and illustrating our convergence results. △ Less

Submitted 14 July, 2020; v1 submitted 13 July, 2020; originally announced July 2020.

arXiv:1912.01691 [pdf, other]

Maximum entropy methods for texture synthesis: theory and practice

Authors: Valentin De Bortoli, Agnes Desolneux, Alain Durmus, Bruno Galerne, Arthur Leclaire

Abstract: Recent years have seen the rise of convolutional neural network techniques in exemplar-based image synthesis. These methods often rely on the minimization of some variational formulation on the image space for which the minimizers are assumed to be the solutions of the synthesis problem. In this paper we investigate, both theoretically and experimentally, another framework to deal with this proble… ▽ More Recent years have seen the rise of convolutional neural network techniques in exemplar-based image synthesis. These methods often rely on the minimization of some variational formulation on the image space for which the minimizers are assumed to be the solutions of the synthesis problem. In this paper we investigate, both theoretically and experimentally, another framework to deal with this problem using an alternate sampling/minimization scheme. First, we use results from information geometry to assess that our method yields a probability measure which has maximum entropy under some constraints in expectation. Then, we turn to the analysis of our method and we show, using recent results from the Markov chain literature, that its error can be explicitly bounded with constants which depend polynomially in the dimension even in the non-convex setting. This includes the case where the constraints are defined via a differentiable neural network. Finally, we present an extensive experimental study of the model, including a comparison with state-of-the-art methods and an extension to style transfer. △ Less

Submitted 3 December, 2019; originally announced December 2019.

Showing 1–42 of 42 results for author: De Bortoli, V