Accelerating Markov Chain Monte Carlo sampling with diffusion models

Authors: N. T. Hunt-Smith, W. Melnitchouk, F. Ringer, N. Sato, A. W. Thomas, M. J. White

Abstract: Global fits of physics models require efficient methods for exploring high-dimensional and/or multimodal posterior functions. We introduce a novel method for accelerating Markov Chain Monte Carlo (MCMC) sampling by pairing a Metropolis-Hastings algorithm with a diffusion model that can draw global samples with the aim of approximating the posterior. We briefly review diffusion models in the context of image synthesis before providing a streamlined diffusion model tailored towards low-dimensional data arrays. We then present our adapted Metropolis-Hastings algorithm which combines local proposals with global proposals taken from a diffusion model that is regularly trained on the samples produced during the MCMC run. Our approach leads to a significant reduction in the number of likelihood evaluations required to obtain an accurate representation of the Bayesian posterior across several analytic functions, as well as for a physical example based on a global analysis of parton distribution functions. Our method is extensible to other MCMC techniques, and we briefly compare our method to similar approaches based on normalizing flows. A code implementation can be found at https://github.com/NickHunt-Smith/MCMC-diffusion.

Submitted 4 September, 2023; originally announced September 2023.

Comments: 21 pages, 8 figures, 1 table
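
The abstract above describes a Metropolis-Hastings sampler that mixes local proposals with global ones. The following is a minimal sketch of that mixing idea only, not the authors' implementation: the diffusion model is replaced by a fixed wide Gaussian used as an independence proposal, and the bimodal toy target, the function names and all parameter values are assumptions made for illustration.

```python
import math
import random


def log_target(x):
    """Unnormalised log-density of a toy bimodal target (assumed for illustration):
    an equal mixture of unit-variance Gaussians centred at -4 and +4."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_global_density(x, scale):
    """Log-density, up to a constant, of the stand-in global proposal N(0, scale^2).
    In the paper this role is played by a diffusion model; here it is a fixed Gaussian."""
    return -0.5 * (x / scale) ** 2


def mixed_metropolis_hastings(n_steps, p_global=0.2, local_step=0.5,
                              global_scale=5.0, seed=0):
    """Run a 1D Metropolis-Hastings chain that picks, at each step, either a
    global independence proposal (probability p_global) or a local random walk."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_steps):
        if rng.random() < p_global:
            # Global move: the proposal does not depend on x, so the acceptance
            # ratio must include the proposal densities q(x) / q(y).
            y = rng.gauss(0.0, global_scale)
            log_ratio = (log_target(y) - log_target(x)
                         + log_global_density(x, global_scale)
                         - log_global_density(y, global_scale))
        else:
            # Local move: symmetric random walk, so the proposal densities cancel.
            y = x + rng.gauss(0.0, local_step)
            log_ratio = log_target(y) - log_target(x)
        # Standard Metropolis-Hastings accept/reject step.
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = y
        samples.append(x)
    return samples


if __name__ == "__main__":
    chain = mixed_metropolis_hastings(20000)
    frac_right = sum(1 for s in chain if s > 0) / len(chain)
    print(f"fraction of samples in the right-hand mode: {frac_right:.2f}")
```

With only the local random walk, a chain of this step size would rarely cross between the two modes; the occasional global proposal is what lets it visit both. Each move type uses its own acceptance ratio, so the mixture of the two kernels still leaves the target distribution invariant.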

arXiv:2210.15155 [pdf, other]

Maximum likelihood estimation for left-truncated log-logistic distributions with a given truncation point

Authors: Markus Kreer, Ayse Kizilersu, Jake Guscott, Lukas Christopher Schmitz, Anthony W. Thomas

Abstract: The maximum likelihood estimation of the left-truncated log-logistic distribution with a given truncation point is analyzed in detail from both mathematical and numerical perspectives. These maximum likelihood equations often do not possess a solution, even for small truncations. A simple criterion is provided for the existence of a regular maximum likelihood solution. In this case a profile likelihood function can be constructed and the optimisation problem is reduced to one dimension. When the maximum likelihood equations do not admit a solution for certain data samples, it is shown that the Pareto distribution is the $L^1$-limit of the degenerated left-truncated log-logistic distribution. Using this mathematical information, a highly efficient Monte Carlo simulation is performed to obtain critical values for some goodness-of-fit tests. The confidence tables and an interpolation formula are provided and several applications to real-world data are presented.

Submitted 26 October, 2022; originally announced October 2022.

Comments: 27 pages, 4 figures
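
As a hedged illustration of the quantity the abstract above maximises, the sketch below evaluates the log-likelihood of a left-truncated log-logistic sample. It assumes the standard scale/shape parameterisation, with density f(x) = (beta/alpha)(x/alpha)^(beta-1) / (1 + (x/alpha)^beta)^2 and survival function S(x) = 1 / (1 + (x/alpha)^beta); the paper may use a different parameterisation, and the function name is hypothetical.

```python
import math


def truncated_loglogistic_loglik(data, alpha, beta, x_trunc):
    """Log-likelihood of observations from a log-logistic distribution with
    scale alpha and shape beta, left-truncated at the known point x_trunc.

    Each observation x >= x_trunc contributes log f(x) - log S(x_trunc).
    The parameterisation is an assumption, not taken from the paper.
    """
    if alpha <= 0.0 or beta <= 0.0 or x_trunc < 0.0:
        raise ValueError("alpha and beta must be positive, x_trunc non-negative")
    # log S(x_trunc) = -log(1 + (x_trunc / alpha)^beta)
    log_surv_trunc = -math.log1p((x_trunc / alpha) ** beta)
    total = 0.0
    for x in data:
        if x <= 0.0 or x < x_trunc:
            raise ValueError("observations must be positive and not below x_trunc")
        z = (x / alpha) ** beta
        log_pdf = (math.log(beta) - math.log(alpha)
                   + (beta - 1.0) * math.log(x / alpha)
                   - 2.0 * math.log1p(z))
        total += log_pdf - log_surv_trunc
    return total


if __name__ == "__main__":
    sample = [1.2, 1.5, 2.0, 3.7]  # made-up values, for illustration only
    print(truncated_loglogistic_loglik(sample, alpha=1.0, beta=2.0, x_trunc=1.0))
```

A numerical optimiser can be run on the negative of this function over (alpha, beta). As the abstract notes, a regular maximum does not always exist, so a failure to converge is not necessarily a bug in the optimiser.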

arXiv:2108.06896 [pdf]

Challenges for cognitive decoding using deep learning methods

Authors: Armin W. Thomas, Christopher Ré, Russell A. Poldrack

Abstract: In cognitive decoding, researchers aim to characterize a brain region's representations by identifying the cognitive states (e.g., accepting/rejecting a gamble) that can be identified from the region's activity. Deep learning (DL) methods are highly promising for cognitive decoding, with their unmatched ability to learn versatile representations of complex data. Yet, their widespread application in cognitive decoding is hindered by their general lack of interpretability as well as difficulties in applying them to small datasets and in ensuring their reproducibility and robustness. We propose to approach these challenges by leveraging recent advances in explainable artificial intelligence and transfer learning, while also providing specific recommendations on how to improve the reproducibility and robustness of DL modeling results.

Submitted 16 August, 2021; originally announced August 2021.

arXiv:1907.01953 [pdf, other]

Deep Transfer Learning For Whole-Brain fMRI Analyses

Authors: Armin W. Thomas, Klaus-Robert Müller, Wojciech Samek

Abstract: The application of deep learning (DL) models to the decoding of cognitive states from whole-brain functional Magnetic Resonance Imaging (fMRI) data is often hindered by the small sample size and high dimensionality of these datasets, especially in clinical settings, where patient data are scarce. In this work, we demonstrate that transfer learning represents a solution to this problem. In particular, we show that a DL model that has been previously trained on a large, openly available fMRI dataset of the Human Connectome Project outperforms a model variant with the same architecture that is trained from scratch, when both are applied to the data of a new, unrelated fMRI task. Furthermore, the pre-trained DL model variant is already able to correctly decode 67.51% of the cognitive states from a test dataset with 100 individuals when fine-tuned on a dataset of the size of only three subjects.

Submitted 2 July, 2019; originally announced July 2019.

Comments: 8 pages, 3 figures

arXiv:1810.09945 [pdf, other]

Analyzing Neuroimaging Data Through Recurrent Deep Learning Models

Authors: Armin W. Thomas, Hauke R. Heekeren, Klaus-Robert Müller, Wojciech Samek

Abstract: The application of deep learning (DL) models to neuroimaging data poses several challenges, due to the high dimensionality, low sample size and complex temporo-spatial dependency structure of these datasets. Furthermore, DL models act as black-box models, impeding insight into the association of cognitive state and brain activity. To approach these challenges, we introduce the DeepLight framework, which utilizes long short-term memory (LSTM) based DL models to analyze whole-brain functional Magnetic Resonance Imaging (fMRI) data. To decode a cognitive state (e.g., seeing the image of a house), DeepLight separates the fMRI volume into a sequence of axial brain slices, which is then sequentially processed by an LSTM. To maintain interpretability, DeepLight adapts the layer-wise relevance propagation (LRP) technique, thereby decomposing its decoding decision into the contributions of the single input voxels to this decision. Importantly, the decomposition is performed on the level of single fMRI volumes, enabling DeepLight to study the associations between cognitive state and brain activity on several levels of data granularity, from the level of the group down to the level of single time points. To demonstrate the versatility of DeepLight, we apply it to a large fMRI dataset of the Human Connectome Project. We show that DeepLight outperforms conventional approaches of uni- and multivariate fMRI analysis in decoding the cognitive states and in identifying the physiologically appropriate brain regions associated with these states. We further demonstrate DeepLight's ability to study the fine-grained temporo-spatial variability of brain activity over sequences of single fMRI samples.

Submitted 5 April, 2019; v1 submitted 23 October, 2018; originally announced October 2018.

Comments: 36 pages, 9 figures

arXiv:1310.3161 [pdf, ps, other]

doi 10.1016/j.spl.2013.09.028

Fractional Poisson processes and their representation by infinite systems of ordinary differential equations

Authors: Markus Kreer, Ayse Kizilersu, Anthony W. Thomas

Abstract: Fractional Poisson processes, a rapidly growing area of non-Markovian stochastic processes, are useful in statistics to describe data from counting processes when waiting times are not exponentially distributed. We show that the fractional Kolmogorov-Feller equations for the probabilities at time t can be represented by an infinite linear system of ordinary differential equations of first order in a transformed time variable. These new equations resemble a linear version of the discrete coagulation-fragmentation equations, well-known from the non-equilibrium theory of gelation, cluster-dynamics and phase transitions in physics and chemistry.

Submitted 11 October, 2013; originally announced October 2013.

Comments: 15 pages

Report number: ADP-13-16/T836

Journal ref: Statistics and Probability Letters 84 (2014), pp. 27-32

Showing 1–6 of 6 results for author: Thomas, A W