-
Generative diffusion models from a PDE perspective
Authors:
Fei Cao,
Kimball Johnston,
Thomas Laurent,
Justin Le,
Sébastien Motsch
Abstract:
Diffusion models have become the de facto framework for generating new datasets. The core of these models lies in the ability to reverse a diffusion process in time. The goal of this manuscript is to explain, from a PDE perspective, how this method works and how to derive the PDE governing the reverse dynamics as well as to study its solution analytically. By linking forward and reverse dynamics, we show that the reverse process's distribution has its support contained within the original distribution. Consequently, diffusion methods, in their analytical formulation, do not inherently regularize the original distribution, and thus, there is no generalization principle. This raises a question: where does generalization arise, given that in practice it does occur? Moreover, we derive an explicit solution to the reverse process's SDE under the assumption that the starting point of the forward process is fixed. This provides a new derivation that links two popular approaches to generative diffusion models: stable diffusion (discrete dynamics) and the score-based approach (continuous dynamics). Finally, we explore the case where the original distribution consists of a finite set of data points. In this scenario, the reverse dynamics are explicit (i.e., the loss function has a clear minimizer), and solving the dynamics fails to generate new samples: the dynamics converge to the original samples. In a sense, solving the minimization problem exactly is "too good for its own good" (i.e., an overfitting regime).
Submitted 28 January, 2025;
originally announced January 2025.
-
Combining Reinforcement Learning and Optimal Transport for the Traveling Salesman Problem
Authors:
Yong Liang Goh,
Wee Sun Lee,
Xavier Bresson,
Thomas Laurent,
Nicholas Lim
Abstract:
The traveling salesman problem is a fundamental combinatorial optimization problem with strong exact algorithms. However, as problems scale up, these exact algorithms fail to provide a solution in a reasonable time. To resolve this, current works look at utilizing deep learning to construct reasonable solutions. Such efforts have been very successful, but tend to be slow and compute intensive. This paper exemplifies the integration of entropic regularized optimal transport techniques as a layer in a deep reinforcement learning network. We show that we can construct a model that learns without supervision and performs inference significantly faster than current autoregressive approaches. We also empirically evaluate the benefits of including optimal transport algorithms within deep learning models to enforce assignment constraints during end-to-end training.
Submitted 2 March, 2022;
originally announced March 2022.
-
Spatial Point Pattern Analysis of the Unidentified Aerial Phenomena in France
Authors:
Thibault Laurent,
Christine Thomas-Agnan,
Michaël Vaillant
Abstract:
We model the unidentified aerial phenomena observed in France during the last 60 years as a spatial point pattern. We use public information such as population density, rate of moisture, or presence of airports to model the intensity of the unidentified aerial phenomena. Spatial exploratory data analysis is a first approach to appreciate the link between the intensity of the unidentified aerial phenomena and the covariates. We then fit an inhomogeneous spatial Poisson process model with covariates. We find that the significant variables are the population density, the presence of factories with a nuclear risk and of contaminated land, and the rate of moisture. The analysis of the residuals shows that some parts of France (the Belgian border, the tip of Brittany, some parts of the Southeast, the Picardie and Haute-Normandie regions, the Loiret and Corrèze departments) present a high local intensity that is not explained by our model.
Submitted 2 September, 2015;
originally announced September 2015.
-
The regularity of the boundary of a multidimensional aggregation patch
Authors:
Andrea Bertozzi,
John Garnett,
Thomas Laurent,
Joan Verdera
Abstract:
Let $d \geq 2$ and let $N(y)$ be the fundamental solution of the Laplace equation in $R^d$. We consider the aggregation equation $$ \frac{\partial ρ}{\partial t} + \operatorname{div}(ρv) = 0, \quad v = -\nabla N * ρ$$ with initial data $ρ(x,0) = χ_{D_0}$, where $χ_{D_0}$ is the indicator function of a bounded domain $D_0 \subset R^d$. We now fix $0 < γ < 1$ and take $D_0$ to be a bounded $C^{1+γ}$ domain (a domain with smooth boundary of class $C^{1+γ}$). Then we have the following theorem: if $D_0$ is a $C^{1+γ}$ domain, then the initial value problem above has a solution given by $$ρ(x,t) = \frac{1}{1-t} χ_{D_t}(x), \quad x \in R^d, \quad 0 \leq t < 1$$ where $D_t$ is a $C^{1+γ}$ domain for all $0 \leq t < 1$.
Submitted 27 August, 2016; v1 submitted 28 July, 2015;
originally announced July 2015.
-
Consistency of Cheeger and Ratio Graph Cuts
Authors:
Nicolas Garcia Trillos,
Dejan Slepcev,
James von Brecht,
Thomas Laurent,
Xavier Bresson
Abstract:
This paper establishes the consistency of a family of graph-cut-based algorithms for clustering of data clouds. We consider point clouds obtained as samples of a ground-truth measure. We investigate approaches to clustering based on minimizing objective functionals defined on proximity graphs of the given sample. Our focus is on functionals based on graph cuts like the Cheeger and ratio cuts. We show that minimizers of these cuts converge, as the sample size increases, to a minimizer of a corresponding continuum cut (which partitions the ground-truth measure). Moreover, we obtain sharp conditions on how the connectivity radius can be scaled with respect to the number of sample points for the consistency to hold. We provide results for two-way and for multiway cuts. Furthermore, we provide numerical experiments that illustrate the results and explore the optimality of scaling in dimension two.
Submitted 24 November, 2014;
originally announced November 2014.
-
Multiclass Total Variation Clustering
Authors:
Xavier Bresson,
Thomas Laurent,
David Uminsky,
James H. von Brecht
Abstract:
Ideas from the image processing literature have recently motivated a new set of clustering algorithms that rely on the concept of total variation. While these algorithms perform well for bi-partitioning tasks, their recursive extensions yield unimpressive results for multiclass clustering tasks. This paper presents a general framework for multiclass total variation clustering that does not rely on recursion. The results greatly outperform previous total variation algorithms and compare well with state-of-the-art NMF approaches.
Submitted 5 June, 2013;
originally announced June 2013.
-
A Method Based on Total Variation for Network Modularity Optimization using the MBO Scheme
Authors:
Huiyi Hu,
Thomas Laurent,
Mason A. Porter,
Andrea L. Bertozzi
Abstract:
The study of network structure is pervasive in sociology, biology, computer science, and many other disciplines. One of the most important areas of network science is the algorithmic detection of cohesive groups of nodes called "communities". One popular approach to find communities is to maximize a quality function known as {\em modularity} to achieve some sort of optimal clustering of nodes. In this paper, we interpret the modularity function from a novel perspective: we reformulate modularity optimization as a minimization problem of an energy functional that consists of a total variation term and an $\ell_2$ balance term. By employing numerical techniques from image processing and $\ell_1$ compressive sensing -- such as convex splitting and the Merriman-Bence-Osher (MBO) scheme -- we develop a variational algorithm for the minimization problem. We present our computational results using both synthetic benchmark networks and real data.
Submitted 17 April, 2013;
originally announced April 2013.
-
An Adaptive Total Variation Algorithm for Computing the Balanced Cut of a Graph
Authors:
Xavier Bresson,
Thomas Laurent,
David Uminsky,
James H. von Brecht
Abstract:
We propose an adaptive version of the total variation algorithm proposed in [3] for computing the balanced cut of a graph. The algorithm from [3] used a sequence of inner total variation minimizations to guarantee descent of the balanced cut energy as well as convergence of the algorithm. In practice the total variation minimization step is never solved exactly. Instead, an accuracy parameter is specified and the total variation minimization terminates once this level of accuracy is reached. The choice of this parameter can vastly impact both the computational time of the overall algorithm and the accuracy of the result. Moreover, since the total variation minimization step is not solved exactly, the algorithm is not guaranteed to be monotonic. In the present work we introduce a new adaptive stopping condition for the total variation minimization that guarantees monotonicity. This results in an algorithm that is actually monotonic in practice and is also significantly faster than previous, non-adaptive algorithms.
Submitted 12 February, 2013;
originally announced February 2013.
-
Dimensionality of Local Minimizers of the Interaction Energy
Authors:
D. Balagué,
J. A. Carrillo,
T. Laurent,
G. Raoul
Abstract:
In this work we consider local minimizers (in the topology of transport distances) of the interaction energy associated to a repulsive-attractive potential. We show how the dimensionality of the support of local minimizers is related to the repulsive strength of the potential at the origin.
Submitted 25 October, 2012;
originally announced October 2012.
-
Convergence of a Steepest Descent Algorithm for Ratio Cut Clustering
Authors:
Xavier Bresson,
Thomas Laurent,
David Uminsky,
James H. von Brecht
Abstract:
Unsupervised clustering of scattered, noisy and high-dimensional data points is an important and difficult problem. Tight continuous relaxations of balanced cut problems have recently been shown to provide excellent clustering results. In this paper, we present an explicit-implicit gradient flow scheme for the relaxed ratio cut problem, and prove that the algorithm converges to a critical point of the energy. We also show the efficiency of the proposed algorithm on the two moons dataset.
Submitted 29 April, 2012;
originally announced April 2012.
-
Characterization of radially symmetric finite time blowup in multidimensional aggregation equations
Authors:
Andrea L. Bertozzi,
John B. Garnett,
Thomas Laurent
Abstract:
This paper studies the transport of a mass $μ$ in $\real^d, d \geq 2,$ by a flow field $v= -\nabla K*μ$. We focus on kernels $K=|x|^α/ α$ for $2-d\leq α<2$ for which the smooth densities are known to develop singularities in finite time. For this range we prove the existence for all time of radially symmetric measure solutions that are monotone decreasing as a function of the radius, thus allowing for continuation of the solution past the blowup time. The monotone constraint on the data is consistent with the typical blowup profiles observed in recent numerical studies of these singularities. We prove monotonicity is preserved for all time, even after blowup, in contrast to the case $α>2$ where radially symmetric solutions are known to lose monotonicity. In the case of the Newtonian potential ($α=2-d$), under the assumption of radial symmetry the equation can be transformed into the inviscid Burgers equation on a half line. This enables us to prove preservation of monotonicity using the classical theory of conservation laws. In the case $2 -d < α< 2$ and at the critical exponent $p$ we exhibit initial data in $L^p$ for which the solution immediately develops a Dirac mass singularity. This extends recent work on the local ill-posedness of solutions at the critical exponent.
Submitted 4 April, 2012;
originally announced April 2012.
-
Nonlocal interactions by repulsive-attractive potentials: radial ins/stability
Authors:
D. Balagué,
J. A. Carrillo,
T. Laurent,
G. Raoul
Abstract:
In this paper, we investigate nonlocal interaction equations with repulsive-attractive radial potentials. Such equations describe the evolution of a continuum density of particles that repel each other at short range and attract each other at long range. We prove that under some conditions on the potential, radially symmetric solutions converge exponentially fast in some transport distance toward a spherical shell stationary state. Otherwise we prove that it is not possible for a radially symmetric solution to converge weakly toward the spherical shell stationary state. We also investigate under which condition it is possible for a non-radially symmetric solution to converge toward a singular stationary state supported on a general hypersurface. Finally we provide a detailed analysis of the specific case of the repulsive-attractive power law potential as well as numerical results. We point out that the conditions of radial ins/stability are sharp.
Submitted 24 September, 2011;
originally announced September 2011.