-
Efficiently Vectorized MCMC on Modern Accelerators
Authors:
Hugh Dance,
Pierre Glaser,
Peter Orbanz,
Ryan Adams
Abstract:
With the advent of automatic vectorization tools (e.g., JAX's $\texttt{vmap}$), writing multi-chain MCMC algorithms is often now as simple as invoking those tools on single-chain code. Whilst convenient, for various MCMC algorithms this results in a synchronization problem -- loosely speaking, at each iteration all chains running in parallel must wait until the last chain has finished drawing its…
▽ More
With the advent of automatic vectorization tools (e.g., JAX's $\texttt{vmap}$), writing multi-chain MCMC algorithms is often now as simple as invoking those tools on single-chain code. Whilst convenient, for various MCMC algorithms this results in a synchronization problem -- loosely speaking, at each iteration all chains running in parallel must wait until the last chain has finished drawing its sample. In this work, we show how to design single-chain MCMC algorithms in a way that avoids synchronization overheads when vectorizing with tools like $\texttt{vmap}$ by using the framework of finite state machines (FSMs). Using a simplified model, we derive an exact theoretical form of the obtainable speed-ups using our approach, and use it to make principled recommendations for optimal algorithm design. We implement several popular MCMC algorithms as FSMs, including Elliptical Slice Sampling, HMC-NUTS, and Delayed Rejection, demonstrating speed-ups of up to an order of magnitude in experiments.
△ Less
Submitted 20 March, 2025;
originally announced March 2025.
-
(De)-regularized Maximum Mean Discrepancy Gradient Flow
Authors:
Zonghao Chen,
Aratrika Mustafi,
Pierre Glaser,
Anna Korba,
Arthur Gretton,
Bharath K. Sriperumbudur
Abstract:
We introduce a (de)-regularization of the Maximum Mean Discrepancy (DrMMD) and its Wasserstein gradient flow. Existing gradient flows that transport samples from source distribution to target distribution with only target samples, either lack tractable numerical implementation ($f$-divergence flows) or require strong assumptions, and modifications such as noise injection, to ensure convergence (Ma…
▽ More
We introduce a (de)-regularization of the Maximum Mean Discrepancy (DrMMD) and its Wasserstein gradient flow. Existing gradient flows that transport samples from source distribution to target distribution with only target samples, either lack tractable numerical implementation ($f$-divergence flows) or require strong assumptions, and modifications such as noise injection, to ensure convergence (Maximum Mean Discrepancy flows). In contrast, DrMMD flow can simultaneously (i) guarantee near-global convergence for a broad class of targets in both continuous and discrete time, and (ii) be implemented in closed form using only samples. The former is achieved by leveraging the connection between the DrMMD and the $χ^2$-divergence, while the latter comes by treating DrMMD as MMD with a de-regularized kernel. Our numerical scheme uses an adaptive de-regularization schedule throughout the flow to optimally trade off between discretization errors and deviations from the $χ^2$ regime. The potential application of the DrMMD flow is demonstrated across several numerical experiments, including a large-scale setting of training student/teacher networks.
△ Less
Submitted 23 September, 2024;
originally announced September 2024.
-
Mining Frequent Structures in Conceptual Models
Authors:
Mattia Fumagalli,
Tiago Prince Sales,
Pedro Paulo F. Barcelos,
Giovanni Micale,
Philipp-Lorenz Glaser,
Dominik Bork,
Vadim Zaytsev,
Diego Calvanese,
Giancarlo Guizzardi
Abstract:
The problem of using structured methods to represent knowledge is well-known in conceptual modeling and has been studied for many years. It has been proven that adopting modeling patterns represents an effective structural method. Patterns are, indeed, generalizable recurrent structures that can be exploited as solutions to design problems. They aid in understanding and improving the process of cr…
▽ More
The problem of using structured methods to represent knowledge is well-known in conceptual modeling and has been studied for many years. It has been proven that adopting modeling patterns represents an effective structural method. Patterns are, indeed, generalizable recurrent structures that can be exploited as solutions to design problems. They aid in understanding and improving the process of creating models. The undeniable value of using patterns in conceptual modeling was demonstrated in several experimental studies. However, discovering patterns in conceptual models is widely recognized as a highly complex task and a systematic solution to pattern identification is currently lacking. In this paper, we propose a general approach to the problem of discovering frequent structures, as they occur in conceptual modeling languages. As proof of concept, we implement our approach by focusing on two widely-used conceptual modeling languages. This implementation includes an exploratory tool that integrates a frequent subgraph mining algorithm with graph manipulation techniques. The tool processes multiple conceptual models and identifies recurrent structures based on various criteria. We validate the tool using two state-of-the-art curated datasets: one consisting of models encoded in OntoUML and the other in ArchiMate. The primary objective of our approach is to provide a support tool for language engineers. This tool can be used to identify both effective and ineffective modeling practices, enabling the refinement and evolution of conceptual modeling languages. Furthermore, it facilitates the reuse of accumulated expertise, ultimately supporting the creation of higher-quality models in a given language.
△ Less
Submitted 25 December, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
Maximum Likelihood Learning of Unnormalized Models for Simulation-Based Inference
Authors:
Pierre Glaser,
Michael Arbel,
Samo Hromadka,
Arnaud Doucet,
Arthur Gretton
Abstract:
We introduce two synthetic likelihood methods for Simulation-Based Inference (SBI), to conduct either amortized or targeted inference from experimental observations when a high-fidelity simulator is available. Both methods learn a conditional energy-based model (EBM) of the likelihood using synthetic data generated by the simulator, conditioned on parameters drawn from a proposal distribution. The…
▽ More
We introduce two synthetic likelihood methods for Simulation-Based Inference (SBI), to conduct either amortized or targeted inference from experimental observations when a high-fidelity simulator is available. Both methods learn a conditional energy-based model (EBM) of the likelihood using synthetic data generated by the simulator, conditioned on parameters drawn from a proposal distribution. The learned likelihood can then be combined with any prior to obtain a posterior estimate, from which samples can be drawn using MCMC. Our methods uniquely combine a flexible Energy-Based Model and the minimization of a KL loss: this is in contrast to other synthetic likelihood methods, which either rely on normalizing flows, or minimize score-based objectives; choices that come with known pitfalls. We demonstrate the properties of both methods on a range of synthetic datasets, and apply them to a neuroscience model of the pyloric network in the crab, where our method outperforms prior art for a fraction of the simulation budget.
△ Less
Submitted 18 April, 2023; v1 submitted 26 October, 2022;
originally announced October 2022.
-
KALE Flow: A Relaxed KL Gradient Flow for Probabilities with Disjoint Support
Authors:
Pierre Glaser,
Michael Arbel,
Arthur Gretton
Abstract:
We study the gradient flow for a relaxed approximation to the Kullback-Leibler (KL) divergence between a moving source and a fixed target distribution. This approximation, termed the KALE (KL approximate lower-bound estimator), solves a regularized version of the Fenchel dual problem defining the KL over a restricted class of functions. When using a Reproducing Kernel Hilbert Space (RKHS) to defin…
▽ More
We study the gradient flow for a relaxed approximation to the Kullback-Leibler (KL) divergence between a moving source and a fixed target distribution. This approximation, termed the KALE (KL approximate lower-bound estimator), solves a regularized version of the Fenchel dual problem defining the KL over a restricted class of functions. When using a Reproducing Kernel Hilbert Space (RKHS) to define the function class, we show that the KALE continuously interpolates between the KL and the Maximum Mean Discrepancy (MMD). Like the MMD and other Integral Probability Metrics, the KALE remains well defined for mutually singular distributions. Nonetheless, the KALE inherits from the limiting KL a greater sensitivity to mismatch in the support of the distributions, compared with the MMD. These two properties make the KALE gradient flow particularly well suited when the target distribution is supported on a low-dimensional manifold. Under an assumption of sufficient smoothness of the trajectories, we show the global convergence of the KALE flow. We propose a particle implementation of the flow given initial samples from the source and the target distribution, which we use to empirically confirm the KALE's properties.
△ Less
Submitted 29 October, 2021; v1 submitted 16 June, 2021;
originally announced June 2021.