Model-free learning of probability flows: Elucidating the nonequilibrium dynamics of flocking

Authors: Nicholas M. Boffi, Eric Vanden-Eijnden

Abstract: Active systems comprise a class of nonequilibrium dynamics in which individual components autonomously dissipate energy. Efforts towards understanding the role played by activity have centered on computation of the entropy production rate (EPR), which quantifies the breakdown of time reversal symmetry. A fundamental difficulty in this program is that high dimensionality of the phase space renders traditional computational techniques infeasible for estimating the EPR. Here, we overcome this challenge with a novel deep learning approach that estimates probability currents directly from stochastic system trajectories. We derive a new physical connection between the probability current and two local definitions of the EPR for inertial systems, which we apply to characterize the departure from equilibrium in a canonical model of flocking. Our results highlight that entropy is produced and consumed on the spatial interface of a flock as the interplay between alignment and fluctuation dynamically creates and annihilates order. By enabling the direct visualization of when and where a given system is out of equilibrium, we anticipate that our methodology will advance the understanding of a broad class of complex nonequilibrium dynamics.

Submitted 21 November, 2024; originally announced November 2024.

arXiv:2410.05163 [pdf, other]

An Efficient On-Policy Deep Learning Framework for Stochastic Optimal Control

Authors: Mengjian Hua, Mathieu Laurière, Eric Vanden-Eijnden

Abstract: We present a novel on-policy algorithm for solving stochastic optimal control (SOC) problems. By leveraging the Girsanov theorem, our method directly computes on-policy gradients of the SOC objective without expensive backpropagation through stochastic differential equations or adjoint problem solutions. This approach significantly accelerates the optimization of neural network control policies while scaling efficiently to high-dimensional problems and long time horizons. We evaluate our method on classical SOC benchmarks as well as applications to sampling from unnormalized distributions via Schrödinger-Föllmer processes and fine-tuning pre-trained diffusion models. Experimental results demonstrate substantial improvements in both computational speed and memory efficiency compared to existing approaches.

Submitted 12 May, 2025; v1 submitted 7 October, 2024; originally announced October 2024.

arXiv:2406.07507 [pdf, ps, other]

Flow map matching with stochastic interpolants: A mathematical framework for consistency models

Authors: Nicholas M. Boffi, Michael S. Albergo, Eric Vanden-Eijnden

Abstract: Generative models based on dynamical equations such as flows and diffusions offer exceptional sample quality, but require computationally expensive numerical integration during inference. The advent of consistency models has enabled efficient one-step or few-step generation, yet despite their practical success, a systematic understanding of their design has been hindered by the lack of a comprehensive theoretical framework. Here we introduce Flow Map Matching (FMM), a principled framework for learning the two-time flow map of an underlying dynamical generative model, thereby providing this missing mathematical foundation. Leveraging stochastic interpolants, we propose training objectives both for distillation from a pre-trained velocity field and for direct training of a flow map over an interpolant or a forward diffusion process. Theoretically, we show that FMM unifies and extends a broad class of existing approaches for fast sampling, including consistency models, consistency trajectory models, and progressive distillation. Experiments on CIFAR-10 and ImageNet-32 highlight that our approach can achieve sample quality comparable to flow matching while reducing generation time by a factor of 10-20.

Submitted 2 June, 2025; v1 submitted 11 June, 2024; originally announced June 2024.

arXiv:2404.01145 [pdf, ps, other]

Sequential-in-time training of nonlinear parametrizations for solving time-dependent partial differential equations

Authors: Huan Zhang, Yifan Chen, Eric Vanden-Eijnden, Benjamin Peherstorfer

Abstract: Sequential-in-time methods solve a sequence of training problems to fit nonlinear parametrizations such as neural networks to approximate solution trajectories of partial differential equations over time. This work shows that sequential-in-time training methods can be understood broadly as either optimize-then-discretize (OtD) or discretize-then-optimize (DtO) schemes, which are well known concepts in numerical analysis. The unifying perspective leads to novel stability and a posteriori error analysis results that provide insights into theoretical and numerical aspects that are inherent to either OtD or DtO schemes such as the tangent space collapse phenomenon, which is a form of over-fitting. Additionally, the unified perspective facilitates establishing connections between variants of sequential-in-time training methods, which is demonstrated by identifying natural gradient descent methods on energy functionals as OtD schemes applied to the corresponding gradient flows.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2310.03695 [pdf, other]

Multimarginal generative modeling with stochastic interpolants

Authors: Michael S. Albergo, Nicholas M. Boffi, Michael Lindsey, Eric Vanden-Eijnden

Abstract: Given a set of $K$ probability densities, we consider the multimarginal generative modeling problem of learning a joint distribution that recovers these densities as marginals. The structure of this joint distribution should identify multi-way correspondences among the prescribed marginals. We formalize an approach to this task within a generalization of the stochastic interpolant framework, leading to efficient learning algorithms built upon dynamical transport of measure. Our generative models are defined by velocity and score fields that can be characterized as the minimizers of simple quadratic objectives, and they are defined on a simplex that generalizes the time variable in the usual dynamical transport framework. The resulting transport on the simplex is influenced by all marginals, and we show that multi-way correspondences can be extracted. The identification of such correspondences has applications to style transfer, algorithmic fairness, and data decorruption. In addition, the multimarginal perspective enables an efficient algorithm for reducing the dynamical transport cost in the ordinary two-marginal setting. We demonstrate these capacities with several numerical examples.

Submitted 5 October, 2023; originally announced October 2023.

arXiv:2309.12991 [pdf, other]

doi 10.1073/pnas.2318106121

Deep learning probability flows and entropy production rates in active matter

Authors: Nicholas M. Boffi, Eric Vanden-Eijnden

Abstract: Active matter systems, from self-propelled colloids to motile bacteria, are characterized by the conversion of free energy into useful work at the microscopic scale. They involve physics beyond the reach of equilibrium statistical mechanics, and a persistent challenge has been to understand the nature of their nonequilibrium states. The entropy production rate and the probability current provide quantitative ways to do so by measuring the breakdown of time-reversal symmetry. Yet, their efficient computation has remained elusive, as they depend on the system's unknown and high-dimensional probability density. Here, building upon recent advances in generative modeling, we develop a deep learning framework to estimate the score of this density. We show that the score, together with the microscopic equations of motion, gives access to the entropy production rate, the probability current, and their decomposition into local contributions from individual particles. To represent the score, we introduce a novel, spatially-local transformer network architecture that learns high-order interactions between particles while respecting their underlying permutation symmetry. We demonstrate the broad utility and scalability of the method by applying it to several high-dimensional systems of active particles undergoing motility-induced phase separation (MIPS). We show that a single network trained on a system of 4096 particles at one packing fraction can generalize to other regions of the phase diagram, including systems with as many as 32768 particles. We use this observation to quantify the spatial structure of the departure from equilibrium in MIPS as a function of the number of particles and the packing fraction.

Submitted 17 June, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

arXiv:2306.15630 [pdf, ps, other]

Coupling parameter and particle dynamics for adaptive sampling in Neural Galerkin schemes

Authors: Yuxiao Wen, Eric Vanden-Eijnden, Benjamin Peherstorfer

Abstract: Training nonlinear parametrizations such as deep neural networks to numerically approximate solutions of partial differential equations is often based on minimizing a loss that includes the residual, which is analytically available in limited settings only. At the same time, empirically estimating the training loss is challenging because residuals and related quantities can have high variance, especially for transport-dominated and high-dimensional problems that exhibit local features such as waves and coherent structures. Thus, estimators based on data samples from un-informed, uniform distributions are inefficient. This work introduces Neural Galerkin schemes that estimate the training loss with data from adaptive distributions, which are empirically represented via ensembles of particles. The ensembles are actively adapted by evolving the particles with dynamics coupled to the nonlinear parametrizations of the solution fields so that the ensembles remain informative for estimating the training loss. Numerical experiments indicate that few dynamic particles are sufficient for obtaining accurate empirical estimates of the training loss, even for problems with local features and with high-dimensional spatial domains.

Submitted 27 June, 2023; originally announced June 2023.

arXiv:2305.19414 [pdf, other]

Efficient Training of Energy-Based Models Using Jarzynski Equality

Authors: Davide Carbone, Mengjian Hua, Simon Coste, Eric Vanden-Eijnden

Abstract: Energy-based models (EBMs) are generative models inspired by statistical physics with a wide range of applications in unsupervised learning. Their performance is best measured by the cross-entropy (CE) of the model distribution relative to the data distribution. Using the CE as the objective for training is however challenging because the computation of its gradient with respect to the model parameters requires sampling the model distribution. Here we show how results for nonequilibrium thermodynamics based on Jarzynski equality together with tools from sequential Monte-Carlo sampling can be used to perform this computation efficiently and avoid the uncontrolled approximations made using the standard contrastive divergence algorithm. Specifically, we introduce a modification of the unadjusted Langevin algorithm (ULA) in which each walker acquires a weight that enables the estimation of the gradient of the cross-entropy at any step during GD, thereby bypassing sampling biases induced by slow mixing of ULA. We illustrate these results with numerical experiments on Gaussian mixture distributions as well as the MNIST dataset. We show that the proposed approach outperforms methods based on the contrastive divergence algorithm in all the considered situations.

Submitted 11 December, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

arXiv:2303.08797 [pdf, other]

Stochastic Interpolants: A Unifying Framework for Flows and Diffusions

Authors: Michael S. Albergo, Nicholas M. Boffi, Eric Vanden-Eijnden

Abstract: A class of generative models that unifies flow-based and diffusion-based methods is introduced. These models extend the framework proposed in Albergo & Vanden-Eijnden (2023), enabling the use of a broad class of continuous-time stochastic processes called "stochastic interpolants" to bridge any two arbitrary probability density functions exactly in finite time. These interpolants are built by combining data from the two prescribed densities with an additional latent variable that shapes the bridge in a flexible way. The time-dependent probability density function of the stochastic interpolant is shown to satisfy a first-order transport equation as well as a family of forward and backward Fokker-Planck equations with tunable diffusion coefficient. Upon consideration of the time evolution of an individual sample, this viewpoint immediately leads to both deterministic and stochastic generative models based on probability flow equations or stochastic differential equations with an adjustable level of noise. The drift coefficients entering these models are time-dependent velocity fields characterized as the unique minimizers of simple quadratic objective functions, one of which is a new objective for the score of the interpolant density. We show that minimization of these quadratic objectives leads to control of the likelihood for generative models built upon stochastic dynamics, while likelihood control for deterministic dynamics is more stringent. We also discuss connections with other methods such as score-based diffusion models, stochastic localization processes, probabilistic denoising techniques, and rectifying flows. In addition, we demonstrate that stochastic interpolants recover the Schrödinger bridge between the two target densities when explicitly optimizing over the interpolant. Finally, algorithmic aspects are discussed and the approach is illustrated on numerical examples.

Submitted 6 November, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

arXiv:2210.16286 [pdf, other]

A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks

Authors: Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna

Abstract: To understand the training dynamics of neural networks (NNs), prior studies have considered the infinite-width mean-field (MF) limit of two-layer NN, establishing theoretical guarantees of its convergence under gradient flow training as well as its approximation and generalization capabilities. In this work, we study the infinite-width limit of a type of three-layer NN model whose first layer is random and fixed. To define the limiting model rigorously, we generalize the MF theory of two-layer NNs by treating the neurons as belonging to functional spaces. Then, by writing the MF training dynamics as a kernel gradient flow with a time-varying kernel that remains positive-definite, we prove that its training loss in $L_2$ regression decays to zero at a linear rate. Furthermore, we define function spaces that include the solutions obtainable through the MF training dynamics and prove Rademacher complexity bounds for these spaces. Our theory accommodates different scaling choices of the model, resulting in two regimes of the MF limit that demonstrate distinctive behaviors while both exhibiting feature learning.

Submitted 28 October, 2022; originally announced October 2022.

arXiv:2206.09908 [pdf, other]

Learning Optimal Flows for Non-Equilibrium Importance Sampling

Authors: Yu Cao, Eric Vanden-Eijnden

Abstract: Many applications in computational sciences and statistical inference require the computation of expectations with respect to complex high-dimensional distributions with unknown normalization constants, as well as the estimation of these constants. Here we develop a method to perform these calculations based on generating samples from a simple base distribution, transporting them by the flow generated by a velocity field, and performing averages along these flowlines. This non-equilibrium importance sampling (NEIS) strategy is straightforward to implement and can be used for calculations with arbitrary target distributions. On the theory side, we discuss how to tailor the velocity field to the target and establish general conditions under which the proposed estimator is a perfect estimator with zero-variance. We also draw connections between NEIS and approaches based on mapping a base distribution onto a target via a transport map. On the computational side, we show how to use deep learning to represent the velocity field by a neural network and train it towards the zero variance optimum. These results are illustrated numerically on benchmark examples (with dimension up to $10$), where after training the velocity field, the variance of the NEIS estimator is up to $6$ orders of magnitude smaller than that of a vanilla estimator. We also compare the performances of NEIS with those of Neal's annealed importance sampling (AIS).

Submitted 24 October, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

arXiv:2206.04642 [pdf, other]

Probability flow solution of the Fokker-Planck equation

Authors: Nicholas M. Boffi, Eric Vanden-Eijnden

Abstract: The method of choice for integrating the time-dependent Fokker-Planck equation in high dimension is to generate samples from the solution via integration of the associated stochastic differential equation. Here, we study an alternative scheme based on integrating an ordinary differential equation that describes the flow of probability. Acting as a transport map, this equation deterministically pushes samples from the initial density onto samples from the solution at any later time. Unlike integration of the stochastic dynamics, the method has the advantage of giving direct access to quantities that are challenging to estimate from trajectories alone, such as the probability current, the density itself, and its entropy. The probability flow equation depends on the gradient of the logarithm of the solution (its "score"), and so is a priori unknown. To resolve this dependence, we model the score with a deep neural network that is learned on-the-fly by propagating a set of samples according to the instantaneous probability current. We show theoretically that the proposed approach controls the KL divergence from the learned solution to the target, while learning on external samples from the stochastic differential equation does not control either direction of the KL divergence. Empirically, we consider several high-dimensional Fokker-Planck equations from the physics of interacting particle systems. We find that the method accurately matches analytical solutions when they are available as well as moments computed via Monte-Carlo when they are not. Moreover, the method offers compelling predictions for the global entropy production rate that out-perform those obtained from learning on stochastic trajectories, and can effectively capture non-equilibrium steady-state probability currents over long time intervals.

Submitted 15 February, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

arXiv:2204.10782 [pdf, other]

On Feature Learning in Neural Networks with Global Convergence Guarantees

Authors: Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna

Abstract: We study the optimization of wide neural networks (NNs) via gradient flow (GF) in setups that allow feature learning while admitting non-asymptotic global convergence guarantees. First, for wide shallow NNs under the mean-field scaling and with a general class of activation functions, we prove that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF. Building upon this analysis, we study a model of wide multi-layer NNs whose second-to-last layer is trained via GF, for which we also prove a linear-rate convergence of the training loss to zero, but regardless of the input dimension. We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.

Submitted 22 April, 2022; originally announced April 2022.

Comments: Accepted by the 10th International Conference on Learning Representations (ICLR 2022)

arXiv:2203.01360 [pdf, other]

doi 10.1016/j.jcp.2023.112588

Neural Galerkin Schemes with Active Learning for High-Dimensional Evolution Equations

Authors: Joan Bruna, Benjamin Peherstorfer, Eric Vanden-Eijnden

Abstract: Deep neural networks have been shown to provide accurate function approximations in high dimensions. However, fitting network parameters requires informative training data that are often challenging to collect in science and engineering applications. This work proposes Neural Galerkin schemes based on deep learning that generate training data with active learning for numerically solving high-dimensional partial differential equations. Neural Galerkin schemes build on the Dirac-Frenkel variational principle to train networks by minimizing the residual sequentially over time, which enables adaptively collecting new training data in a self-informed manner that is guided by the dynamics described by the partial differential equations. This is in contrast to other machine learning methods that aim to fit network parameters globally in time without taking into account training data acquisition. Our finding is that the active form of gathering training data of the proposed Neural Galerkin schemes is key for numerically realizing the expressive power of networks in high dimensions. Numerical experiments demonstrate that Neural Galerkin schemes have the potential to enable simulating phenomena and processes with many variables for which traditional and other deep-learning-based solvers fail, especially when features of the solutions evolve locally such as in high-dimensional wave propagation problems and interacting particle systems described by Fokker-Planck and kinetic equations.

Submitted 29 February, 2024; v1 submitted 2 March, 2022; originally announced March 2022.

Journal ref: Journal of Computational Physics, Volume 496, 2024

arXiv:2111.14325 [pdf, ps, other]

doi 10.1007/s00024-023-03281-3

Estimating earthquake-induced tsunami height probabilities without sampling

Authors: Shanyin Tong, Eric Vanden-Eijnden, Georg Stadler

Abstract: Given a distribution of earthquake-induced seafloor elevations, we present a method to compute the probability of the resulting tsunamis reaching a certain size on shore. Instead of sampling, the proposed method relies on optimization to compute the most likely fault slips that result in a seafloor deformation inducing a large tsunami wave. We model tsunamis induced by bathymetry change using the shallow water equations on an idealized slice through the sea. The earthquake slip model is based on a sum of multivariate log-normal distributions, and follows the Gutenberg-Richter law for moment magnitudes 7–9. For a model problem inspired by the Tohoku-Oki 2011 earthquake and tsunami, we quantify annual probabilities of differently sized tsunami waves. Our method also identifies the most effective tsunami mechanisms. These mechanisms have smoothly varying fault slip patches that lead to an expansive but moderately large bathymetry change. The resulting tsunami waves are compressed as they approach shore and reach a close-to-vertical leading wave edge close to shore.

Submitted 7 April, 2023; v1 submitted 28 November, 2021; originally announced November 2021.

arXiv:2107.05134 [pdf, other]

Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks

Authors: Carles Domingo-Enrich, Alberto Bietti, Marylou Gabrié, Joan Bruna, Eric Vanden-Eijnden

Abstract: Energy-based models (EBMs) are generative models that are usually trained via maximum likelihood estimation. This approach becomes challenging in generic situations where the trained energy is non-convex, due to the need to sample the Gibbs distribution associated with this energy. Using general Fenchel duality results, we derive variational principles dual to maximum likelihood EBMs with shallow overparametrized neural network energies, both in the feature-learning and lazy linearized regimes. In the feature-learning regime, this dual formulation justifies using a two time-scale gradient ascent-descent (GDA) training algorithm in which one updates concurrently the particles in the sample space and the neurons in the parameter space of the energy. We also consider a variant of this algorithm in which the particles are sometimes restarted at random samples drawn from the data set, and show that performing these restarts at every iteration step corresponds to score matching training. These results are illustrated in simple numerical experiments, which indicate that GDA performs best when features and particles are updated using similar time scales.

Submitted 15 February, 2022; v1 submitted 11 July, 2021; originally announced July 2021.

arXiv:2103.04837 [pdf, other]

Sharp Asymptotic Estimates for Expectations, Probabilities, and Mean First Passage Times in Stochastic Systems with Small Noise

Authors: Tobias Grafke, Tobias Schäfer, Eric Vanden-Eijnden

Abstract: Freidlin-Wentzell theory of large deviations can be used to compute the likelihood of extreme or rare events in stochastic dynamical systems via the solution of an optimization problem. The approach gives exponential estimates that often need to be refined via calculation of a prefactor. Here it is shown how to perform these computations in practice. Specifically, sharp asymptotic estimates are derived for expectations, probabilities, and mean first passage times in a form that is geared towards numerical purposes: they require solving well-posed matrix Riccati equations involving the minimizer of the Freidlin-Wentzell action as input, either forward or backward in time with appropriate initial or final conditions tailored to the estimate at hand. The usefulness of our approach is illustrated on several examples. In particular, invariant measure probabilities and mean first passage times are calculated in models involving stochastic partial differential equations of reaction-advection-diffusion type.

Submitted 15 September, 2021; v1 submitted 8 March, 2021; originally announced March 2021.

arXiv:2008.09623 [pdf, other]

A Dynamical Central Limit Theorem for Shallow Neural Networks

Authors: Zhengdao Chen, Grant M. Rotskoff, Joan Bruna, Eric Vanden-Eijnden

Abstract: Recent theoretical works have characterized the dynamics of wide shallow neural networks trained via gradient descent in an asymptotic mean-field limit when the width tends towards infinity. At initialization, the random sampling of the parameters leads to deviations from the mean-field limit dictated by the classical Central Limit Theorem (CLT). However, since gradient descent induces correlations among the parameters, it is of interest to analyze how these fluctuations evolve. Here, we use a dynamical CLT to prove that the asymptotic fluctuations around the mean limit remain bounded in mean square throughout training. The upper bound is given by a Monte-Carlo resampling error, with a variance that depends on the 2-norm of the underlying measure, which also controls the generalization error. This motivates the use of this 2-norm as a regularization term during training. Furthermore, if the mean-field dynamics converges to a measure that interpolates the training data, we prove that the asymptotic deviation eventually vanishes in the CLT scaling. We also complement these results with numerical experiments.

Submitted 26 March, 2022; v1 submitted 21 August, 2020; originally announced August 2020.

Comments: Appeared in Advances in Neural Information Processing Systems 33 (NeurIPS 2020). An error in Theorem 3.5 has been corrected

arXiv:2007.13930 [pdf, other]

doi 10.2140/camcos.2021.16.181

Extreme event probability estimation using PDE-constrained optimization and large deviation theory, with application to tsunamis

Authors: Shanyin Tong, Eric Vanden-Eijnden, Georg Stadler

Abstract: We propose and compare methods for the analysis of extreme events in complex systems governed by PDEs that involve random parameters, in situations where we are interested in quantifying the probability that a scalar function of the system's solution is above a threshold. If the threshold is large, this probability is small and its accurate estimation is challenging. To tackle this difficulty, we blend theoretical results from large deviation theory (LDT) with numerical tools from PDE-constrained optimization. Our methods first compute parameters that minimize the LDT-rate function over the set of parameters leading to extreme events, using adjoint methods to compute the gradient of this rate function. The minimizers give information about the mechanism of the extreme events as well as estimates of their probability. We then propose a series of methods to refine these estimates, either via importance sampling or geometric approximation of the extreme event sets. Results are formulated for general parameter distributions and detailed expressions are provided for Gaussian distributions. We give theoretical and numerical arguments showing that the performance of our methods is insensitive to the extremeness of the events we are interested in. We illustrate the application of our approach to quantify the probability of extreme tsunami events on shore. Tsunamis are typically caused by a sudden, unpredictable change of the ocean floor elevation during an earthquake. We model this change as a random process, which takes into account the underlying physics. We use the one-dimensional shallow water equation to model tsunamis numerically. In the context of this example, we present a comparison of our methods for extreme event probability estimation, and find which type of ocean floor elevation change leads to the largest tsunamis on shore.

Submitted 22 November, 2023; v1 submitted 27 July, 2020; originally announced July 2020.

MSC Class: 65K10; 35Q93; 76B15; 60F10; 60H35

Journal ref: Commun. Appl. Math. Comput. Sci. 16 (2021) 181-225

arXiv:1809.05066 [pdf, other]

doi 10.1088/1742-5468/aaf323

Simulated Tempering Method in the Infinite Switch Limit with Adaptive Weight Learning

Authors: Anton Martinsson, Jianfeng Lu, Benedict Leimkuhler, Eric Vanden-Eijnden

Abstract: We investigate the theoretical foundations of the simulated tempering method and use our findings to design efficient algorithms. Employing a large deviation argument first used for replica exchange molecular dynamics [Plattner et al., J. Chem. Phys. 135:134111 (2011)], we demonstrate that the most efficient approach to simulated tempering is to vary the temperature infinitely rapidly. In this limit, we can replace the equations of motion for the temperature and physical variables by averaged equations for the latter alone, with the forces rescaled according to a position-dependent function defined in terms of temperature weights. The averaged equations are similar to those used in Gao's integrated-over-temperature method, except that we show that it is better to use a continuous rather than a discrete set of temperatures. We give a theoretical argument for the choice of the temperature weights as the reciprocal partition function, thereby relating simulated tempering to Wang-Landau sampling. Finally, we describe a self-consistent algorithm for simultaneously sampling the canonical ensemble and learning the weights during simulation. This algorithm is tested on a system of harmonic oscillators as well as a continuous variant of the Curie-Weiss model, where it is shown to perform well and to accurately capture the second-order phase transition observed in this model.

Submitted 7 February, 2019; v1 submitted 13 September, 2018; originally announced September 2018.

arXiv:1808.10764 [pdf, other]

Extreme event quantification in dynamical systems with random components

Authors: Giovanni Dematteis, Tobias Grafke, Eric Vanden-Eijnden

Abstract: A central problem in uncertainty quantification is how to characterize the impact that our incomplete knowledge about models has on the predictions we make from them. This question naturally lends itself to a probabilistic formulation, by making the unknown model parameters random with given statistics. Here this approach is used in concert with tools from large deviation theory (LDT) and optimal control to estimate the probability that some observables in a dynamical system go above a large threshold after some time, given the prior statistical information about the system's parameters and/or its initial conditions. Specifically, it is established under which conditions such extreme events occur in a predictable way, as the minimizer of the LDT action functional. It is also shown how this minimization can be numerically performed in an efficient way using tools from optimal control. These findings are illustrated on the examples of a rod with random elasticity pulled by a time-dependent force, and the nonlinear Schrödinger equation (NLSE) with random initial conditions.

Submitted 31 August, 2018; originally announced August 2018.

arXiv:1712.06947 [pdf, ps, other]

Methodological and computational aspects of parallel tempering methods in the infinite swapping limit

Authors: Jianfeng Lu, Eric Vanden-Eijnden

Abstract: A variant of the parallel tempering method is proposed in terms of a stochastic switching process for the coupled dynamics of replica configuration and temperature permutation. This formulation is shown to facilitate the analysis of the convergence properties of parallel tempering by large deviation theory, which indicates that the method should be operated in the infinite swapping limit to maximize sampling efficiency. The effective equation for the replica alone that arises in this infinite swapping limit simply involves replacing the original potential by a mixture potential. The analysis of the geometric properties of this potential offers a new perspective on the issues of how to choose the temperature ladder, and why many temperatures should typically be introduced to boost the sampling efficiency. It is also shown how to simulate the effective equation in this many-temperature regime using multiscale integrators. Finally, similar ideas are also used to discuss extensions of the infinite swapping limits to the technique of simulated tempering.

Submitted 19 December, 2017; originally announced December 2017.

arXiv:1708.08800 [pdf, ps, other]

doi 10.1088/1361-6544/aac541

Longtime convergence of the Temperature-Accelerated Molecular Dynamics Method

Authors: Gabriel Stoltz, Eric Vanden-Eijnden

Abstract: The equations of the temperature-accelerated molecular dynamics (TAMD) method for the calculations of free energies and partition functions are analyzed. Specifically, the exponential convergence of the law of these stochastic processes is established, with a convergence rate close to the one of the limiting, effective dynamics at higher temperature obtained with infinite acceleration. It is also shown that the invariant measures of TAMD are close to a known reference measure, with an error that can be quantified precisely. Finally, a Central Limit Theorem is proven, which allows the estimation of errors on properties calculated by ergodic time averages. These results not only demonstrate the usefulness and validity range of the TAMD equations, but also make it possible, in principle, to adjust the parameter in these equations to optimize their efficiency.

Submitted 2 May, 2018; v1 submitted 29 August, 2017; originally announced August 2017.

arXiv:1704.01496 [pdf, other]

doi 10.1073/pnas.1710670115

Rogue Waves and Large Deviations in Deep Sea

Authors: Giovanni Dematteis, Tobias Grafke, Eric Vanden-Eijnden

Abstract: The appearance of rogue waves in deep sea is investigated using the modified nonlinear Schrödinger (MNLS) equation in one spatial dimension with random initial conditions that are assumed to be normally distributed, with a spectrum approximating realistic conditions of a uni-directional sea state. It is shown that one can use the incomplete information contained in this spectrum as a prior and supplement this information with the MNLS dynamics to reliably estimate the probability distribution of the sea surface elevation far in the tail at later times. Our results indicate that rogue waves occur when the system hits unlikely pockets of wave configurations that trigger large disturbances of the surface height. The rogue wave precursors in these pockets are wave patterns of regular height but with a very specific shape that is identified explicitly, thereby allowing for early detection. The method proposed here combines Monte Carlo sampling with tools from large deviations theory that reduce the calculation of the most likely rogue wave precursors to an optimization problem that can be solved efficiently. This approach is transferable to other problems in which the system's governing equations contain random initial conditions and/or parameters.

Submitted 23 January, 2018; v1 submitted 5 April, 2017; originally announced April 2017.

arXiv:1608.01355 [pdf, ps, other]

doi 10.1007/s00332-016-9358-x

Metastability of the Nonlinear Wave Equation: Insights from Transition State Theory

Authors: Katherine A Newhall, Eric Vanden-Eijnden

Abstract: This paper is concerned with the long-time dynamics of the nonlinear wave equation in one space dimension, $$ u_{tt} - δ^2 u_{xx} +V'(u) =0 \qquad x\in [0,1] $$ where $δ>0$ is a parameter and $V(u)$ is a potential bounded from below and growing at least like $u^2$ as $|u|\to\infty$. Infinite energy solutions of this equation preserve a natural Gibbsian invariant measure and when the potential is double-welled, for example when $V(u) = \tfrac14(1-u^2)^2$, there is a regime such that two small disjoint sets in the system's phase-space concentrate most of the mass of this measure. This suggests that the solutions to the nonlinear wave equation can be metastable over these sets, in the sense that they spend long periods of time in these sets and only rarely transition between them. Here we quantify this phenomenon by calculating exactly via Transition State Theory (TST) the mean frequency at which the solutions of the nonlinear wave equation with initial conditions drawn from its invariant measure cross a dividing surface lying in between the metastable sets. Numerical results suggest that the dynamics of the nonlinear wave equation is ergodic and rapidly mixing with respect to the Gibbs invariant measure when the parameter $δ$ is small enough. This is a regime in which the dynamics of the nonlinear wave equation displays a metastable behavior that is not fundamentally different from that observed in its stochastic counterpart in which random noise and damping terms are added to the equation. For larger $δ$, however, the dynamics either stops being ergodic, or its mixing time becomes larger than the inverse of the TST frequency, indicating that successive transitions between the metastable sets are correlated and the coarse-graining to a Markov chain fails.

Submitted 3 August, 2016; originally announced August 2016.

Comments: 30 pages

MSC Class: 60G15; 82B05

arXiv:1604.03818 [pdf, other]

doi 10.1007/978-1-4939-6969-2_2

Long Term Effects of Small Random Perturbations on Dynamical Systems: Theoretical and Computational Tools

Authors: Tobias Grafke, Tobias Schaefer, Eric Vanden-Eijnden

Abstract: Small random perturbations may have a dramatic impact on the long time evolution of dynamical systems, and large deviation theory is often the right theoretical framework to understand these effects. At the core of the theory lies the minimization of an action functional, which in many cases of interest has to be computed by numerical means. Here we review the theoretical and computational aspects behind these calculations, and propose an algorithm that simplifies the geometric minimum action method to minimize the action in the space of arc-length parametrized curves. We then illustrate this algorithm's capabilities by applying it to various examples from material sciences, fluid dynamics, atmosphere/ocean sciences, and reaction kinetics. In terms of models, these examples involve stochastic (ordinary or partial) differential equations with multiplicative or degenerate noise, Markov jump processes, and systems with fast and slow degrees of freedom, which all violate detailed balance, so that simpler computational methods are not applicable.

Submitted 10 October, 2017; v1 submitted 13 April, 2016; originally announced April 2016.

arXiv:1601.02147 [pdf, ps, other]

Fluctuations in the heterogeneous multiscale methods for fast-slow systems

Authors: David Kelly, Eric Vanden-Eijnden

Abstract: How heterogeneous multiscale methods (HMM) handle fluctuations acting on the slow variables in fast-slow systems is investigated. In particular, it is shown via analysis of central limit theorems (CLT) and large deviation principles (LDP) that the standard version of HMM artificially amplifies these fluctuations. A simple modification of HMM, termed parallel HMM, is introduced and is shown to remedy this problem, capturing fluctuations correctly both at the level of the CLT and the LDP. Similar arguments can also be used to justify that the tau-leaping method used in the context of Gillespie's stochastic simulation algorithm for Markov jump processes captures the right CLT and LDP for these processes.

Submitted 9 January, 2016; originally announced January 2016.

Comments: Dedicated with admiration and friendship to Bjorn Engquist on the occasion of his 70th birthday

MSC Class: 34E13; 60F10; 70K70

arXiv:1502.05034 [pdf, other]

Continuous-time Random Walks for the Numerical Solution of Stochastic Differential Equations

Authors: Nawaf Bou-Rabee, Eric Vanden-Eijnden

Abstract: This paper introduces time-continuous numerical schemes to simulate stochastic differential equations (SDEs) arising in mathematical finance, population dynamics, chemical kinetics, epidemiology, biophysics, and polymeric fluids. These schemes are obtained by spatially discretizing the Kolmogorov equation associated with the SDE in such a way that the resulting semi-discrete equation generates a Markov jump process that can be realized exactly using a Monte Carlo method. In this construction the spatial increment of the approximation can be bounded uniformly in space, which guarantees that the schemes are numerically stable for both finite and long time simulation of SDEs. By directly analyzing the generator of the approximation, we prove that the approximation has a sharp stochastic Lyapunov function when applied to an SDE with a drift field that is locally Lipschitz continuous and weakly dissipative. We use this stochastic Lyapunov function to extend a local semimartingale representation of the approximation. This extension makes it possible to analyze the complexity of the approximation. Using the theory of semigroups of linear operators on Banach spaces, we show that the approximation is (weakly) accurate in representing finite and infinite-time statistics, with an order of accuracy identical to that of its generator. The proofs are carried out in the context of both fixed and variable spatial step sizes. Theoretical and numerical studies confirm these statements, and provide evidence that these schemes have several advantages over standard methods based on time-discretization. In particular, they are accurate, eliminate nonphysical moves in simulating SDEs with boundaries (or confined domains), prevent exploding trajectories from occurring when simulating stiff SDEs, and solve first exit problems without time-interpolation errors.

Submitted 11 March, 2015; v1 submitted 17 February, 2015; originally announced February 2015.

Comments: v2: 135 pages; added references to works of C. Doering, T. Elston, and H. Kushner

MSC Class: 65C30

arXiv:1410.3004 [pdf, other]

Stochastic Mode-Reduction in Models with Conservative Fast Sub-Systems

Authors: Ankita Jain, Ilya Timofeyev, Eric Vanden-Eijnden

Abstract: A stochastic mode reduction strategy is applied to multiscale models with a deterministic energy-conserving fast sub-system. Specifically, we consider situations where the slow variables are driven stochastically and interact with the fast sub-system in an energy-conserving fashion. Since the stochastic terms only affect the slow variables, the fast sub-system evolves deterministically on a sphere of constant energy. However, in the full model the radius of the sphere slowly changes due to the coupling between the slow and fast dynamics. Therefore, the energy of the fast sub-system becomes an additional hidden slow variable that must be accounted for in order to apply the stochastic mode reduction technique to systems of this type.

Submitted 11 October, 2014; originally announced October 2014.

arXiv:1309.5037 [pdf, other]

doi 10.1137/130937470

Metropolis Integration Schemes for Self-Adjoint Diffusions

Authors: Nawaf Bou-Rabee, Aleksandar Donev, Eric Vanden-Eijnden

Abstract: We present explicit methods for simulating diffusions whose generator is self-adjoint with respect to a known (but possibly not normalizable) density. These methods exploit this property and combine an optimized Runge-Kutta algorithm with a Metropolis-Hastings Monte-Carlo scheme. The resulting numerical integration scheme is shown to be weakly accurate at finite noise and to gain higher order accuracy in the small noise limit. It also avoids explicitly computing certain terms in the equation, such as the divergence of the mobility tensor, which can be tedious to calculate. Finally, the scheme is shown to be ergodic with respect to the exact equilibrium probability distribution of the diffusion when it exists. These results are illustrated on several examples including a Brownian dynamics simulation of DNA in a solvent. In this example, the proposed scheme is able to accurately compute dynamics at time step sizes that are an order of magnitude (or more) larger than those permitted with commonly used explicit predictor-corrector schemes.

Submitted 31 March, 2014; v1 submitted 19 September, 2013; originally announced September 2013.

Comments: 54 pages, 8 figures, To appear in MMS

MSC Class: 65C30 (Primary); 65C05; 60J05 (Secondary)

Journal ref: Multiscale Modeling & Simulation Vol. 12 No. 2 (2014) pp. 781-831

arXiv:1210.6253 [pdf, ps, other]

doi 10.1063/1.4804070

Averaged equation for energy diffusion on a graph reveals bifurcation diagram and thermally assisted reversal times in spin-torque driven nanomagnets

Authors: Katherine Newhall, Eric Vanden-Eijnden

Abstract: Driving nanomagnets by spin-polarized currents offers exciting prospects in magnetoelectronics, but the response of the magnets to such currents remains poorly understood. We show that an averaged equation describing the diffusion of energy on a graph captures the low-damping dynamics of these systems. From this equation we obtain the bifurcation diagram of the magnets, including the critical currents to induce stable precessional states and magnetization switching, as well as the mean times of thermally assisted magnetization reversal in situations where the standard reaction rate theory of Kramers is no longer valid. These results match experimental observations and give a theoretical basis for a Néel-Brown-type formula with an effective energy barrier for the reversal times.

Submitted 19 January, 2022; v1 submitted 23 October, 2012; originally announced October 2012.

Comments: 13 pages, 5 figures

MSC Class: 82B31; 37H20; 82D40

Journal ref: J. Appl. Phys. 113, 184105 (2013)

arXiv:1010.4278 [pdf, other]

A patch that imparts unconditional stability to certain explicit integrators for SDEs

Authors: Nawaf Bou-Rabee, Eric Vanden-Eijnden

Abstract: This paper proposes a simple strategy to simulate stochastic differential equations (SDEs) arising in constant temperature molecular dynamics. The main idea is to patch an explicit integrator with Metropolis accept or reject steps. The resulting 'Metropolized integrator' preserves the SDE's equilibrium distribution and is pathwise accurate on finite time intervals. As a corollary the integrator can be used to estimate finite-time dynamical properties along an infinitely long solution. The paper explains how to implement the patch (even in the presence of multiple-time-stepsizes and holonomic constraints), how it scales with system size, and how much overhead it requires. We test the integrator on a Lennard-Jones cluster of particles and 'dumbbells' at constant temperature.

Submitted 20 October, 2010; originally announced October 2010.

Comments: 29 pages, 5 figures

MSC Class: 82C80 (65C30; 65C05; 65P10)
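
The "patch" idea in the abstract above (an explicit integrator followed by a Metropolis accept or reject step) can be illustrated in its simplest setting. The sketch below is not the paper's integrator: it assumes overdamped Langevin dynamics dX = -∇U(X) dt + √2 dW with an Euler-Maruyama proposal, which is the standard MALA construction. The names `metropolized_euler_step` and `sample`, and the quartic test potential, are hypothetical choices made for illustration only.

```python
import numpy as np


def metropolized_euler_step(x, U, grad_U, h, rng):
    """One Euler-Maruyama step for dX = -grad U dt + sqrt(2) dW,
    followed by a Metropolis-Hastings accept/reject test (assumed setup)."""
    # Explicit proposal move.
    y = x - h * grad_U(x) + np.sqrt(2.0 * h) * rng.standard_normal(x.shape)

    def log_q(a, b):
        # Log density (up to a constant) of proposing b when starting from a.
        diff = b - a + h * grad_U(a)
        return -np.dot(diff, diff) / (4.0 * h)

    # Log acceptance ratio for the target density proportional to exp(-U).
    log_alpha = U(x) - U(y) + log_q(y, x) - log_q(x, y)
    if np.log(rng.uniform()) < log_alpha:
        return y, True
    return x, False  # rejected: the chain stays where it is


def sample(x0, U, grad_U, h, n_steps, seed=0):
    """Run the Metropolized chain and return (trajectory, acceptance rate)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    traj = np.empty((n_steps, x.size))
    n_accepted = 0
    for k in range(n_steps):
        x, accepted = metropolized_euler_step(x, U, grad_U, h, rng)
        n_accepted += int(accepted)
        traj[k] = x
    return traj, n_accepted / n_steps


if __name__ == "__main__":
    # Hypothetical test potential U(x) = |x|^4 / 4, whose drift -x^3
    # is not globally Lipschitz.
    U = lambda x: 0.25 * np.sum(x**4)
    grad_U = lambda x: x**3
    traj, rate = sample([0.0], U, grad_U, h=0.1, n_steps=2000)
    print("acceptance rate:", rate)
```

Because rejected proposals leave the state unchanged, the chain cannot blow up the way the unpatched explicit step can for a steep potential, at the cost of evaluating U and ∇U at the proposed point on every step.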

arXiv:1008.3514 [pdf, ps, other]

Non-asymptotic mixing of the MALA algorithm

Authors: Nawaf Bou-Rabee, Martin Hairer, Eric Vanden-Eijnden

Abstract: The Metropolis-Adjusted Langevin Algorithm (MALA), originally introduced to sample exactly the invariant measure of certain stochastic differential equations (SDEs) on infinitely long time intervals, can also be used to approximate pathwise the solution of these SDEs on finite time intervals. However, when applied to an SDE with a nonglobally Lipschitz drift coefficient, the algorithm may not have a spectral gap even when the SDE does. This paper reconciles MALA's lack of a spectral gap with its ergodicity to the invariant measure of the SDE and finite time accuracy. In particular, the paper shows that its convergence to equilibrium happens at an exponential rate up to terms exponentially small in time-stepsize. This quantification relies on MALA's ability to exactly preserve the SDE's invariant measure and accurately represent the SDE's transition probability on finite time intervals.

Submitted 20 August, 2010; originally announced August 2010.

Comments: 34 pages

MSC Class: 60J05 (Primary) 65C30; 65C05 (Secondary)
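
As an illustration of the accept/reject idea described in the MALA abstract above, here is a minimal sketch, not the authors' implementation. It assumes overdamped Langevin dynamics dX = -U'(X) dt + sqrt(2) dW with the quartic potential U(x) = x^4/4, chosen here only because its drift is not globally Lipschitz. The names `mala_step` and `run_mala`, the step size, and the seed are hypothetical choices for the example.

```python
import numpy as np


def mala_step(x, log_pi, grad_log_pi, h, rng):
    """One MALA step: Euler-Maruyama proposal, then Metropolis accept/reject."""
    # Explicit (Euler-Maruyama) proposal for dX = grad log pi(X) dt + sqrt(2) dW.
    y = x + h * grad_log_pi(x) + np.sqrt(2.0 * h) * rng.standard_normal(x.shape)

    def log_q(a, b):
        # Log density (up to a constant) of proposing b from a.
        diff = b - a - h * grad_log_pi(a)
        return -np.sum(diff ** 2) / (4.0 * h)

    # Metropolis-Hastings log acceptance ratio.
    log_alpha = log_pi(y) + log_q(y, x) - log_pi(x) - log_q(x, y)
    if np.log(rng.uniform()) < log_alpha:
        return y, True
    return x, False


def run_mala(n_steps, h, seed):
    """Sample from pi(x) ~ exp(-x^4 / 4) (assumed target) and report acceptance."""
    rng = np.random.default_rng(seed)
    log_pi = lambda x: -np.sum(x ** 4) / 4.0
    grad_log_pi = lambda x: -(x ** 3)

    x = np.zeros(1)
    samples = np.empty(n_steps)
    accepted = 0
    for i in range(n_steps):
        x, ok = mala_step(x, log_pi, grad_log_pi, h, rng)
        accepted += ok
        samples[i] = x[0]
    return samples, accepted / n_steps


if __name__ == "__main__":
    samples, acc = run_mala(n_steps=2000, h=0.1, seed=0)
    print("acceptance rate:", acc)
    print("sample mean:", samples.mean())
```

Because a rejected proposal leaves the state unchanged, the chain stays finite even where the plain explicit scheme could blow up on the cubic drift; the price is that the acceptance rate falls as the step size `h` grows.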

arXiv:0905.4218 [pdf, other]

doi 10.1002/cpa.20306

Pathwise Accuracy and Ergodicity of Metropolized Integrators for SDEs

Authors: Nawaf Bou-Rabee, Eric Vanden-Eijnden

Abstract: Metropolized integrators for ergodic stochastic differential equations (SDE) are proposed which (i) are ergodic with respect to the (known) equilibrium distribution of the SDE and (ii) approximate pathwise the solutions of the SDE on finite time intervals. Both these properties are demonstrated in the paper and precise strong error estimates are obtained. It is also shown that the Metropolized integrator retains these properties even in situations where the drift in the SDE is nonglobally Lipschitz, and vanilla explicit integrators for SDEs typically become unstable and fail to be ergodic.

Submitted 26 May, 2009; originally announced May 2009.

Comments: 46 pages, 5 figures

MSC Class: 65C30

arXiv:0806.1621 [pdf, ps, other]

Some Critical Issues for the "Equation-Free" Approach to Multiscale Modeling

Authors: Weinan E, Eric Vanden-Eijnden

Abstract: The "equation-free" approach has been proposed in recent years as a general framework for developing multiscale methods to efficiently capture the macroscale behavior of a system using only the microscale models. In this paper, we take a close look at some of the algorithms proposed under the "equation-free" umbrella, the projective integrators and the patch dynamics. We discuss some very simple examples in the context of the "equation-free" approach. These examples seem to indicate that while its general philosophy is quite attractive and indeed similar to many other approaches in concurrent multiscale modeling, there are severe limitations to the specific implementation proposed by the equation-free approach.

Submitted 10 June, 2008; originally announced June 2008.

MSC Class: 65L99; 65M99

arXiv:math-ph/0607047 [pdf, ps, other]

doi 10.1007/s00220-007-0333-0

Simple Systems with Anomalous Dissipation and Energy Cascade

Authors: Jonathan C. Mattingly, Toufic Suidan, Eric Vanden-Eijnden

Abstract: We analyze a class of linear shell models subject to stochastic forcing in finitely many degrees of freedom. The unforced systems considered formally conserve energy. Despite being formally conservative, we show that these dynamical systems support dissipative solutions (suitably defined) and, as a result, may admit unique (statistical) steady states when the forcing term is nonzero. This claim is demonstrated via the complete characterization of the solutions of the system above for specific choices of the coupling coefficients. The mechanism of anomalous dissipation is shown to arise via a cascade of the energy towards the modes ($a_n$) with higher $n$; this is responsible for solutions with interesting energy spectra, namely $\mathbb{E} |a_n|^2$ scales as $n^{-\alpha}$ as $n\to\infty$. Here the exponents $\alpha$ depend on the coupling coefficients $c_n$ and $\mathbb{E}$ denotes expectation with respect to the equilibrium measure. This is reminiscent of the conjectured properties of the solutions of the Navier-Stokes equations in the inviscid limit and their accepted relationship with fully developed turbulence. Hence, these simple models illustrate some of the heuristic ideas that have been advanced to characterize turbulence, similar in that respect to the random passive scalar or random Burgers equation, but even simpler and fully solvable.

Submitted 22 July, 2006; originally announced July 2006.

Comments: 32 pages

MSC Class: 76F55; 76M35; 60J60; 37L55

arXiv:math/0212415 [pdf, ps, other]

Energy landscapes and rare events

Authors: Weinan E, Weiqing Ren, Eric Vanden-Eijnden

Abstract: Many problems in physics, material sciences, chemistry and biology can be abstractly formulated as a system that navigates over a complex energy landscape of high or infinite dimensions. Well-known examples include phase transitions of condensed matter, conformational changes of biopolymers, and chemical reactions. The energy landscape typically exhibits multiscale features, giving rise to the multiscale nature of the dynamics. This is one of the main challenges that we face in computational science. In this report, we will review the recent work done by scientists from several disciplines on probing such energy landscapes. Of particular interest is the analysis and computation of transition pathways and transition rates between metastable states. We will then present the string method that has proven to be very effective for some truly complex systems in material science and chemistry.

Submitted 30 November, 2002; originally announced December 2002.

Report number: ICM-2002

MSC Class: 60-08; 60F10; 65C

Journal ref: Proceedings of the ICM, Beijing 2002, vol. 1, 621--630

Showing 1–37 of 37 results for author: Vanden-Eijnden, E