-
Exact Bayesian inference for Markov switching diffusions
Authors:
Timothée Stumpf-Fétizon,
Krzysztof Łatuszyński,
Jan Palczewski,
Gareth Roberts
Abstract:
We give the first exact Bayesian methodology for the problem of inference in discretely observed regime switching diffusions. We design an MCMC and an MCEM algorithm that target the exact posterior of diffusion parameters and the latent regime process. The algorithms are exact in the sense that they target the correct posterior distribution of the continuous model, so that the errors are due to Monte Carlo only. Switching diffusion models extend ordinary diffusions by allowing for jumps in instantaneous drift and volatility. The jumps are driven by a latent, continuous time Markov switching process. We illustrate the method on numerical examples, including an empirical analysis of the method's scalability in the length of the time series, and find that it is comparable in computational cost with discrete approximations while avoiding their shortcomings.
Submitted 13 February, 2025;
originally announced February 2025.
-
MCMC for multi-modal distributions
Authors:
Krzysztof Łatuszyński,
Matthew T. Moores,
Timothée Stumpf-Fétizon
Abstract:
We explain the fundamental challenges of sampling from multimodal distributions, particularly for high-dimensional problems. We present the major types of MCMC algorithms that are designed for this purpose, including parallel tempering, mode jumping and Wang-Landau, as well as several state-of-the-art approaches that have recently been proposed. We demonstrate these methods using both synthetic and real-world examples of multimodal distributions with discrete or continuous state spaces.
Submitted 10 January, 2025;
originally announced January 2025.
-
Adaptive Stereographic MCMC
Authors:
Cameron Bell,
Krzysztof Łatuszyński,
Gareth O. Roberts
Abstract:
In order to tackle the problem of sampling from heavy tailed, high dimensional distributions via Markov Chain Monte Carlo (MCMC) methods, Yang, Latuszyński, and Roberts (2022) (arXiv:2205.12112) introduce the stereographic projection as a tool to compactify $\mathbb{R}^d$ and transform the problem into sampling from a density on the unit sphere $\mathbb{S}^d$. However, the improvement in algorithmic efficiency, as well as the computational cost of the implementation, are still significantly impacted by the parameters used in this transformation.
To address this, we introduce adaptive versions of three stereographic MCMC algorithms - the Stereographic Random Walk (SRW), the Stereographic Slice Sampler (SSS), and the Stereographic Bouncy Particle Sampler (SBPS) - which automatically update the parameters of the algorithms as the run progresses. The adaptive setup makes it possible to better exploit the power of the stereographic projection, even when the target distribution is neither centered nor homogeneous. Unlike Hamiltonian Monte Carlo (HMC) and other off-the-shelf MCMC samplers, the resulting algorithms are robust to starting far from the mean in heavy-tailed, high-dimensional settings. To prove convergence properties, we develop a novel framework for the analysis of adaptive MCMC algorithms over collections of simultaneously uniformly ergodic Markov operators, which is applicable to continuous-time processes, such as SBPS. This framework allows us to obtain $\mathcal{L}^2$ and almost sure convergence results, and a CLT for our adaptive stereographic algorithms.
Submitted 15 May, 2025; v1 submitted 21 August, 2024;
originally announced August 2024.
-
Almost sure convergence rates of adaptive increasingly rare Markov chain Monte Carlo
Authors:
Julian Hofstadler,
Krzysztof Latuszynski,
Gareth O. Roberts,
Daniel Rudolf
Abstract:
We consider adaptive increasingly rare Markov chain Monte Carlo (AIR MCMC), which is an adaptive MCMC method, where the adaptation concerning the past happens less and less frequently over time. Under a contraction assumption for a Wasserstein-like function we deduce upper bounds of the convergence rate of Monte Carlo sums taking a renormalisation factor into account that is close to the one that appears in a law of the iterated logarithm. We demonstrate the applicability of our results by considering different settings, among which are those of simultaneous geometric and uniform ergodicity. All proofs are carried out on an augmented state space, including the classical non-augmented setting as a special case. In contrast to other adaptive MCMC limit theory, some technical assumptions, like diminishing adaptation, are not needed.
Submitted 19 February, 2024;
originally announced February 2024.
-
Bernoulli factories and duality in Wright-Fisher and Allen-Cahn models of population genetics
Authors:
Jere Koskela,
Krzysztof Łatuszyński,
Dario Spanò
Abstract:
Mathematical models of genetic evolution often come in pairs, connected by a so-called duality relation. The most seminal example is the pairing of the Wright-Fisher diffusion with the Kingman coalescent, where the former describes the stochastic evolution of neutral allele frequencies in a large population forwards in time, and the latter describes the genetic ancestry of randomly sampled individuals from the population backwards in time. As well as providing a richer description than either model in isolation, duality often yields equations satisfied by quantities of interest. We employ the so-called Bernoulli factory - a celebrated tool in simulation-based computing - to derive duality relations for broad classes of genetics models. As concrete examples, we present Wright-Fisher diffusions with general drift functions, and Allen-Cahn equations with general, nonlinear forcing terms. The drift and forcing functions can be interpreted as the action of frequency-dependent selection. To our knowledge, this work is the first time a connection has been drawn between Bernoulli factories and duality in models of population genetics.
Submitted 1 February, 2024; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Solidarity of Gibbs Samplers: the spectral gap
Authors:
Iwona Chlebicka,
Krzysztof Łatuszyński,
Błażej Miasojedow
Abstract:
Gibbs samplers are preeminent Markov chain Monte Carlo algorithms used in computational physics and statistical computing. Yet, their most fundamental properties, such as relations between convergence characteristics of their various versions, are not well understood.
In this paper we prove the solidarity of their spectral gaps: if any of the random scan or $d!$ deterministic scans has a spectral gap, then all of them do. Our methods rely on a geometric interpretation of the Gibbs samplers as alternating projection algorithms and on an analysis of the rate of convergence in the von Neumann--Halperin method of cyclic alternating projections.
In addition, we provide a quantitative result: if the spectral gap of the random scan Gibbs sampler scales polynomially with dimension, so does the spectral gap of any of the deterministic scans.
Submitted 9 July, 2024; v1 submitted 4 April, 2023;
originally announced April 2023.
-
Stereographic Markov Chain Monte Carlo
Authors:
Jun Yang,
Krzysztof Łatuszyński,
Gareth O. Roberts
Abstract:
High-dimensional distributions, especially those with heavy tails, are notoriously difficult for off-the-shelf MCMC samplers: the combination of unbounded state spaces, diminishing gradient information, and local moves results in empirically observed ``stickiness'' and poor theoretical mixing properties -- lack of geometric ergodicity. In this paper, we introduce a new class of MCMC samplers that map the original high-dimensional problem in Euclidean space onto a sphere and remedy these notorious mixing problems. In particular, we develop random-walk Metropolis type algorithms as well as versions of the Bouncy Particle Sampler that are uniformly ergodic for a large class of light and heavy-tailed distributions and also empirically exhibit rapid convergence in high dimensions. In the best scenario, the proposed samplers can enjoy the ``blessings of dimensionality'', in the sense that convergence is faster in higher dimensions.
Submitted 20 February, 2024; v1 submitted 24 May, 2022;
originally announced May 2022.
-
Optimal Scaling of MCMC Beyond Metropolis
Authors:
Sanket Agrawal,
Dootika Vats,
Krzysztof Łatuszyński,
Gareth O. Roberts
Abstract:
The problem of optimally scaling the proposal distribution in a Markov chain Monte Carlo algorithm is critical to the quality of the generated samples. Much work has gone into obtaining such results for various Metropolis-Hastings (MH) algorithms. Recently, acceptance probabilities other than MH have been employed in problems with intractable target distributions. Little guidance is available on tuning the Gaussian proposal distributions for this situation. We obtain optimal scaling results for a general class of acceptance functions, which includes Barker's and Lazy-MH. In particular, optimal values for Barker's algorithm are derived and found to be significantly different from those obtained for the MH algorithm. Our theoretical conclusions are supported by numerical simulations indicating that when the optimal proposal variance is unknown, tuning to the optimal acceptance probability remains an effective strategy.
Submitted 4 February, 2022; v1 submitted 5 April, 2021;
originally announced April 2021.
-
Exact Bayesian inference for diffusion driven Cox processes
Authors:
Flavio B. Gonçalves,
Krzysztof G. Łatuszyński,
Gareth O. Roberts
Abstract:
In this paper, we present a novel methodology to perform Bayesian inference for Cox processes in which the intensity function is driven by a diffusion process. The novelty lies in the fact that no discretization error is involved, despite the non-tractability of both the likelihood function and the transition density of the diffusion. The methodology is based on an MCMC algorithm and its exactness is built on retrospective sampling techniques. The efficiency of the methodology is investigated in some simulated examples and its applicability is illustrated in some real data analyses.
Submitted 2 June, 2023; v1 submitted 11 July, 2020;
originally announced July 2020.
-
Efficient Bernoulli factory MCMC for intractable posteriors
Authors:
Dootika Vats,
Flávio Gonçalves,
Krzysztof Łatuszyński,
Gareth O. Roberts
Abstract:
Accept-reject based Markov chain Monte Carlo (MCMC) algorithms have traditionally utilised acceptance probabilities that can be explicitly written as a function of the ratio of the target density at the two contested points. This feature is rendered almost useless in Bayesian posteriors with unknown functional forms. We introduce a new family of MCMC acceptance probabilities that has the distinguishing feature of not being a function of the ratio of the target density at the two points. We present two stable Bernoulli factories that generate events within this class of acceptance probabilities. The efficiency of our methods relies on obtaining reasonable local upper or lower bounds on the target density and we present two classes of problems where such bounds are viable: Bayesian inference for diffusions and MCMC on constrained spaces. The resulting portkey Barker's algorithms are exact and computationally more efficient than the current state-of-the-art.
Submitted 23 April, 2021; v1 submitted 16 April, 2020;
originally announced April 2020.
-
From the Bernoulli Factory to a Dice Enterprise via Perfect Sampling of Markov Chains
Authors:
Giulio Morina,
Krzysztof Latuszynski,
Piotr Nayar,
Alex Wendland
Abstract:
Given a $p$-coin that lands heads with unknown probability $p$, we wish to produce an $f(p)$-coin for a given function $f: (0,1) \rightarrow (0,1)$. This problem is commonly known as the Bernoulli Factory and results on its solvability and complexity have already been obtained. Nevertheless, generic ways to design a practical Bernoulli Factory for a given function $f$ exist only in a few special cases. We present a constructive way to build an efficient Bernoulli Factory when $f(p)$ is a rational function with coefficients in $\mathbb{R}$. Moreover, we extend the Bernoulli Factory problem to a more general setting where we have access to an $m$-sided die and we wish to roll a $v$-sided one; i.e., we consider rational functions between open probability simplices. Our construction consists of rephrasing the original problem as simulating from the stationary distribution of a certain class of Markov chains - a task that we show can be achieved using perfect simulation techniques with the original $m$-sided die as the only source of randomness. In the Bernoulli Factory case, the number of tosses needed by the algorithm has exponential tails and its expected value can be bounded uniformly in $p$. En route to optimizing the algorithm we show a fact of independent interest: every finite, integer valued, random variable will eventually become log-concave after convolving with enough Bernoulli trials.
Submitted 28 September, 2020; v1 submitted 19 December, 2019;
originally announced December 2019.
-
A Framework for Adaptive MCMC Targeting Multimodal Distributions
Authors:
Emilia Pompe,
Chris Holmes,
Krzysztof Łatuszyński
Abstract:
We propose a new Monte Carlo method for sampling from multimodal distributions. The idea of this technique is based on splitting the task into two: finding the modes of a target distribution $π$ and sampling, given the knowledge of the locations of the modes. The sampling algorithm relies on steps of two types: local ones, preserving the mode; and jumps to regions associated with different modes. Besides, the method learns the optimal parameters of the algorithm while it runs, without requiring user intervention. Our technique should be considered as a flexible framework, in which the design of moves can follow various strategies known from the broad MCMC literature.
In order to design an adaptive scheme that facilitates both local and jump moves, we introduce an auxiliary variable representing each mode and we define a new target distribution $\tilde{π}$ on an augmented state space $\mathcal{X} \times \mathcal{I}$, where $\mathcal{X}$ is the original state space of $π$ and $\mathcal{I}$ is the set of the modes. As the algorithm runs and updates its parameters, the target distribution $\tilde{π}$ also keeps being modified. This motivates a new class of algorithms, Auxiliary Variable Adaptive MCMC. We prove general ergodic results for the whole class before specialising to the case of our algorithm.
Submitted 11 January, 2019; v1 submitted 6 December, 2018;
originally announced December 2018.
-
Air Markov Chain Monte Carlo
Authors:
Cyril Chimisov,
Krzysztof Latuszynski,
Gareth Roberts
Abstract:
We introduce a class of Adapted Increasingly Rarely Markov Chain Monte Carlo (AirMCMC) algorithms where the underlying Markov kernel is allowed to be changed based on the whole available chain output but only at specific time points separated by an increasing number of iterations. The main motivation is the ease of analysis of such algorithms. Under the assumption of either a simultaneous or (weaker) local simultaneous geometric drift condition, or simultaneous polynomial drift, we prove $L_2$-convergence, Weak and Strong Laws of Large Numbers (WLLN, SLLN), and a Central Limit Theorem (CLT), and discuss how our approach extends the existing results. We argue that many of the known Adaptive MCMC algorithms may be transformed into the corresponding Air versions, and provide empirical evidence that the performance of the Air version stays virtually the same.
Submitted 28 January, 2018;
originally announced January 2018.
-
Adapting The Gibbs Sampler
Authors:
Cyril Chimisov,
Krzysztof Latuszynski,
Gareth Roberts
Abstract:
The popularity of Adaptive MCMC has been fueled on the one hand by its success in applications, and on the other hand, by mathematically appealing and computationally straightforward optimisation criteria for the Metropolis algorithm acceptance rate (and, equivalently, proposal scale). Similarly principled and operational criteria for optimising the selection probabilities of the Random Scan Gibbs Sampler have not been devised to date.
In the present work, we close this gap and develop a general purpose Adaptive Random Scan Gibbs Sampler that adapts the selection probabilities. The adaptation is guided by optimising the $L_2$-spectral gap for the target's Gaussian analogue, gradually, as the target's global covariance is learned by the sampler. The additional computational cost of the adaptation represents a small fraction of the total simulation effort.
We present a number of moderately- and high-dimensional examples, including truncated Gaussians, Bayesian Hierarchical Models and Hidden Markov Models, where significant computational gains are empirically observed for both the Adaptive Gibbs and the Adaptive Metropolis within Adaptive Gibbs versions of the algorithm. We argue that Adaptive Random Scan Gibbs Samplers can be routinely implemented and substantial computational gains will be observed across many typical Gibbs sampling problems.
We shall give conditions under which ergodicity of the adaptive algorithms can be established.
Submitted 28 January, 2018;
originally announced January 2018.
-
Continuous-time Importance Sampling: Monte Carlo Methods which Avoid Time-discretisation Error
Authors:
Paul Fearnhead,
Krzysztof Latuszynski,
Gareth O. Roberts,
Giorgos Sermaidis
Abstract:
In this paper we develop a continuous-time sequential importance sampling (CIS) algorithm which eliminates time-discretisation errors and provides online unbiased estimation for continuous time Markov processes, in particular for diffusions. Our work removes the strong conditions imposed by the Exact Algorithm (EA) and thus significantly extends the class of discretisation error-free MC methods for diffusions. The reason that CIS can be applied more generally than EA is that it no longer works on the path space of the SDE. Instead it uses proposal distributions for the transition density of the diffusion, and proposal distributions that are absolutely continuous with respect to the true transition density exist for general SDEs.
Submitted 17 December, 2017;
originally announced December 2017.
-
Barker's algorithm for Bayesian inference with intractable likelihoods
Authors:
Flavio B. Gonçalves,
Krzysztof Łatuszyński,
Gareth O. Roberts
Abstract:
In this expository paper we abstract and describe a simple MCMC scheme for sampling from intractable target densities. The approach was introduced in Gonçalves et al. (2017a) in the specific context of jump-diffusions, and is based on Barker's algorithm paired with a simple Bernoulli factory type scheme, the so-called 2-coin algorithm. In many settings it is an alternative to the standard Metropolis-Hastings pseudo-marginal method for simulating from intractable target densities. Although Barker's is well-known to be slightly less efficient than Metropolis-Hastings, the key advantage of our approach is that it allows one to implement the "marginal Barker's" instead of the extended state space pseudo-marginal Metropolis-Hastings, owing to the special form of the accept/reject probability. We shall illustrate our methodology in the context of Bayesian inference for the discretely observed Wright-Fisher family of diffusions.
Submitted 22 September, 2017;
originally announced September 2017.
-
In Search of Lost (Mixing) Time: Adaptive Markov chain Monte Carlo schemes for Bayesian variable selection with very large p
Authors:
Jim Griffin,
Krys Latuszynski,
Mark Steel
Abstract:
The availability of data sets with large numbers of variables is rapidly increasing. The effective application of Bayesian variable selection methods for regression with these data sets has proved difficult since available Markov chain Monte Carlo methods do not perform well in typical problem sizes of interest. The current paper proposes new adaptive Markov chain Monte Carlo algorithms to address this shortcoming. The adaptive design of these algorithms exploits the observation that in large $p$ small $n$ settings, the majority of the $p$ variables will be approximately uncorrelated a posteriori. The algorithms adaptively build suitable non-local proposals that result in moves with squared jumping distance significantly larger than standard methods. Their performance is studied empirically in high-dimensional problems (with both simulated and actual data) and speedups of up to 4 orders of magnitude are observed. The proposed algorithms are easily implementable on multi-core architectures and are well suited for parallel tempering or sequential Monte Carlo implementations.
Submitted 7 May, 2019; v1 submitted 18 August, 2017;
originally announced August 2017.
-
Exact Monte Carlo likelihood-based inference for jump-diffusion processes
Authors:
Flávio B. Gonçalves,
Krzysztof G. Łatuszyński,
Gareth O. Roberts
Abstract:
Statistical inference for discretely observed jump-diffusion processes is a complex problem which motivates new methodological challenges. Thus existing approaches invariably resort to time-discretisations which inevitably lead to approximations in inference. In this paper, we give the first general collection of methodologies for exact (in this context meaning discretisation-free) likelihood-based inference for discretely observed finite activity jump-diffusions. The only sources of error involved are Monte Carlo error and convergence of EM or MCMC algorithms. We shall introduce both frequentist and Bayesian approaches, illustrating the methodology through simulated and real examples.
Submitted 1 March, 2023; v1 submitted 2 July, 2017;
originally announced July 2017.
-
Discussion of "Sequential Quasi-Monte-Carlo Sampling" by M. Gerber and N. Chopin
Authors:
M. Pollock,
A. M. Johansen,
K. Łatuszyński,
G. O. Roberts
Abstract:
In this comment we consider whether QMC methods can be further embedded within SMC schemes in settings in which the transition density of the latent process is intractable and pseudo-marginal methods are deployed.
Submitted 4 February, 2015;
originally announced February 2015.
-
Bayesian computation: a perspective on the current state, and sampling backwards and forwards
Authors:
Peter J. Green,
Krzysztof Łatuszyński,
Marcelo Pereyra,
Christian P. Robert
Abstract:
The past decades have seen enormous improvements in computational inference based on statistical models, with continual enhancement in a wide range of computational tools, in competition. In Bayesian inference, first and foremost, MCMC techniques continue to evolve, moving from random walk proposals to Langevin drift, to Hamiltonian Monte Carlo, and so on, with both theoretical and algorithmic inputs opening wider access to practitioners. However, this impressive evolution in capacity is confronted by an even steeper increase in the complexity of the models and datasets to be addressed. The difficulties of modelling and then handling ever more complex datasets most likely call for a new type of tool for computational inference that dramatically reduces the dimension and size of the raw data while capturing its essential aspects. Approximate models and algorithms may thus be at the core of the next computational revolution.
Submitted 9 May, 2015; v1 submitted 4 February, 2015;
originally announced February 2015.
-
Individual adaptation: an adaptive MCMC scheme for variable selection problems
Authors:
Jim Griffin,
Krzysztof Latuszynski,
Mark Steel
Abstract:
The increasing size of data sets has led to variable selection in regression becoming increasingly important. Bayesian approaches are attractive since they allow uncertainty about the choice of variables to be formally included in the analysis. The application of fully Bayesian variable selection methods to large data sets is computationally challenging. We describe an adaptive Markov chain Monte Carlo approach called Individual Adaptation which adjusts a general proposal to the data. We show that the algorithm is ergodic and discuss its use within parallel tempering and sequential Monte Carlo approaches. We illustrate the use of the method on two data sets including a gene expression analysis with 22 577 variables.
Submitted 29 December, 2014; v1 submitted 21 December, 2014;
originally announced December 2014.
-
Convergence of hybrid slice sampling via spectral gap
Authors:
Krzysztof Łatuszyński,
Daniel Rudolf
Abstract:
It is known that the simple slice sampler has robust convergence properties; however, the class of problems where it can be implemented is limited. In contrast, we consider hybrid slice samplers which are easily implementable and where another Markov chain approximately samples the uniform distribution on each slice. Under appropriate assumptions on the Markov chain on the slice we show a lower bound and an upper bound of the spectral gap of the hybrid slice sampler in terms of the spectral gap of the simple slice sampler. An immediate consequence of this is that spectral gap and geometric ergodicity of the hybrid slice sampler can be concluded from spectral gap and geometric ergodicity of its simple version, which is very well understood. These results indicate that robustness properties of the simple slice sampler are inherited by (appropriately designed) easily implementable hybrid versions. We apply the developed theory and analyse a number of specific algorithms such as the stepping-out shrinkage slice sampling, hit-and-run slice sampling on a class of multivariate targets and an easily implementable combination of both procedures on multidimensional bimodal densities.
Submitted 9 February, 2024; v1 submitted 9 September, 2014;
originally announced September 2014.
-
Perfect simulation using atomic regeneration with application to Sequential Monte Carlo
Authors:
Anthony Lee,
Arnaud Doucet,
Krzysztof Łatuszyński
Abstract:
Consider an irreducible, Harris recurrent Markov chain of transition kernel Π and invariant probability measure π. If Π satisfies a minorization condition, then the split chain allows the identification of regeneration times which may be exploited to obtain perfect samples from π. Unfortunately, many transition kernels associated with complex Markov chain Monte Carlo algorithms are analytically intractable, so establishing a minorization condition and simulating the split chain is challenging, if not impossible. For uniformly ergodic Markov chains with intractable transition kernels, we propose two efficient perfect simulation procedures of similar expected running time which are instances of the multigamma coupler and an imputation scheme. These algorithms overcome the intractability of the kernel by introducing an artificial atom and using a Bernoulli factory. We detail an application of these procedures when Π is the recently introduced iterated conditional Sequential Monte Carlo kernel. We additionally provide results on the general applicability of the methodology, and how Sequential Monte Carlo methods may be used to facilitate perfect simulation and/or unbiased estimation of expectations with respect to the stationary distribution of a non-uniformly ergodic Markov chain.
Submitted 22 July, 2014;
originally announced July 2014.
-
Stability of adversarial Markov chains, with an application to adaptive MCMC algorithms
Authors:
Radu V. Craiu,
Lawrence Gray,
Krzysztof Łatuszyński,
Neal Madras,
Gareth O. Roberts,
Jeffrey S. Rosenthal
Abstract:
We consider whether ergodic Markov chains with bounded step size remain bounded in probability when their transitions are modified by an adversary on a bounded subset. We provide counterexamples to show that the answer is no in general, and prove theorems to show that the answer is yes under various additional assumptions. We then use our results to prove convergence of various adaptive Markov chain Monte Carlo algorithms.
Submitted 5 November, 2015; v1 submitted 16 March, 2014;
originally announced March 2014.
-
The Containment Condition and AdapFail algorithms
Authors:
Krzysztof Latuszynski,
Jeffrey S. Rosenthal
Abstract:
This short note investigates convergence of adaptive MCMC algorithms, i.e. algorithms which modify the Markov chain update probabilities on the fly. We focus on the Containment condition introduced in \cite{roberts2007coupling}. We show that if the Containment condition is not satisfied, then the algorithm will perform very poorly. Specifically, with positive probability, the adaptive algorithm will be asymptotically less efficient than any nonadaptive ergodic MCMC algorithm. We call such algorithms AdapFail, and conclude that they should not be used.
Submitted 28 December, 2013; v1 submitted 6 July, 2013;
originally announced July 2013.
-
Variance bounding and geometric ergodicity of Markov chain Monte Carlo kernels for approximate Bayesian computation
Authors:
Anthony Lee,
Krzysztof Latuszynski
Abstract:
Approximate Bayesian computation has emerged as a standard computational tool when dealing with the increasingly common scenario of completely intractable likelihood functions in Bayesian inference. We show that many common Markov chain Monte Carlo kernels used to facilitate inference in this setting can fail to be variance bounding, and hence geometrically ergodic, which can have consequences for the reliability of estimates in practice. This phenomenon is typically independent of the choice of tolerance in the approximation. We then prove that a recently introduced Markov kernel in this setting can inherit variance bounding and geometric ergodicity from its intractable Metropolis--Hastings counterpart, under reasonably weak and manageable conditions. We show that the computational cost of this alternative kernel is bounded whenever the prior is proper, and present indicative results on an example where spectral gaps and asymptotic variances can be computed, as well as an example involving inference for a partially and discretely observed, time-homogeneous, pure jump Markov process. We also supply two general theorems, one of which provides a simple sufficient condition for lack of variance bounding for reversible kernels and the other provides a positive result concerning inheritance of variance bounding and geometric ergodicity for mixtures of reversible kernels.
Submitted 25 March, 2014; v1 submitted 24 October, 2012;
originally announced October 2012.
-
Nonasymptotic bounds on the estimation error of MCMC algorithms
Authors:
Krzysztof Łatuszyński,
Błażej Miasojedow,
Wojciech Niemiro
Abstract:
We address the problem of upper bounding the mean square error of MCMC estimators. Our analysis is nonasymptotic. We first establish a general result valid for essentially all ergodic Markov chains encountered in Bayesian computation and a possibly unbounded target function $f$. The bound is sharp in the sense that the leading term is exactly $σ_{\mathrm {as}}^2(P,f)/n$, where $σ_{\mathrm{as}}^2(P,f)$ is the CLT asymptotic variance. Next, we proceed to specific additional assumptions and give explicit computable bounds for geometrically and polynomially ergodic Markov chains under quantitative drift conditions. As a corollary, we provide results on confidence estimation.
Submitted 11 December, 2013; v1 submitted 23 June, 2011;
originally announced June 2011.
-
CLTs and asymptotic variance of time-sampled Markov chains
Authors:
Krzysztof Latuszynski,
Gareth O. Roberts
Abstract:
For a Markov transition kernel $P$ and a probability distribution $μ$ on nonnegative integers, a time-sampled Markov chain evolves according to the transition kernel $P_μ = \sum_k μ(k)P^k$. In this note we obtain CLT conditions for time-sampled Markov chains and derive a spectral formula for the asymptotic variance. Using these results we compare efficiency of Barker's and Metropolis algorithms in terms of asymptotic variance.
Submitted 3 June, 2011; v1 submitted 10 February, 2011;
originally announced February 2011.
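As an illustration of the time-sampled kernel $P_μ = \sum_k μ(k)P^k$ from the abstract above, here is a minimal sketch for a finite state space. The two-state matrix `P`, the weights `mu`, and the helper names are assumptions chosen for the example; they do not come from the paper, and `mu` is assumed to have finite support.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def time_sampled_kernel(p, mu):
    """Return P_mu = sum_k mu[k] * P^k, where mu[k] is the weight on k steps.

    Assumes mu has finite support and sums to one (illustrative helper).
    """
    n = len(p)
    # Start from P^0, the identity matrix.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [[0.0] * n for _ in range(n)]
    for weight in mu:
        for i in range(n):
            for j in range(n):
                result[i][j] += weight * power[i][j]
        power = mat_mul(power, p)  # advance to the next power of P
    return result


# Hypothetical two-state chain; mu puts mass 1/2 on one step and 1/2 on two steps.
P = [[0.9, 0.1], [0.2, 0.8]]
mu = [0.0, 0.5, 0.5]
P_mu = time_sampled_kernel(P, mu)
```

Each row of `P_mu` is again a probability vector, since it is a convex combination of the stochastic matrices $P^k$.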
-
Adaptive Gibbs samplers and related MCMC methods
Authors:
Krzysztof Łatuszyński,
Gareth O. Roberts,
Jeffrey S. Rosenthal
Abstract:
We consider various versions of adaptive Gibbs and Metropolis-within-Gibbs samplers, which update their selection probabilities (and perhaps also their proposal distributions) on the fly during a run by learning as they go in an attempt to optimize the algorithm. We present a cautionary example of how even a simple-seeming adaptive Gibbs sampler may fail to converge. We then present various positive results guaranteeing convergence of adaptive Gibbs samplers under certain conditions.
Submitted 27 February, 2013; v1 submitted 30 January, 2011;
originally announced January 2011.
-
Nonasymptotic bounds on the mean square error for MCMC estimates via renewal techniques
Authors:
Krzysztof Latuszynski,
Blazej Miasojedow,
Wojciech Niemiro
Abstract:
Nummelin's split chain construction allows one to decompose a Markov chain Monte Carlo (MCMC) trajectory into i.i.d. "excursions". Regenerative MCMC algorithms based on this technique use a random number of samples. They have been proposed as a promising alternative to the usual fixed-length simulation [25, 33, 14]. In this note we derive nonasymptotic bounds on the mean square error (MSE) of regenerative MCMC estimates via techniques of renewal theory and sequential statistics. These results are applied to construct confidence intervals. We then focus on two cases of particular interest: chains satisfying the Doeblin condition and a geometric drift condition. Available explicit nonasymptotic results are compared for different schemes of MCMC simulation.
Submitted 12 May, 2011; v1 submitted 30 January, 2011;
originally announced January 2011.
-
Adaptive Gibbs samplers
Authors:
Krzysztof Latuszynski,
Jeffrey S. Rosenthal
Abstract:
We consider various versions of adaptive Gibbs and Metropolis-within-Gibbs samplers, which update their selection probabilities (and perhaps also their proposal distributions) on the fly during a run, by learning as they go in an attempt to optimise the algorithm. We present a cautionary example of how even a simple-seeming adaptive Gibbs sampler may fail to converge. We then present various positive results guaranteeing convergence of adaptive Gibbs samplers under certain conditions.
Submitted 15 January, 2010;
originally announced January 2010.
-
Rigorous confidence bounds for MCMC under a geometric drift condition
Authors:
Krzysztof Latuszynski,
Wojciech Niemiro
Abstract:
We assume a drift condition towards a small set and bound the mean square error of estimators obtained by taking averages along a single trajectory of a Markov chain Monte Carlo algorithm. We use these bounds to construct fixed-width nonasymptotic confidence intervals. For a possibly unbounded function $f:\stany \to R,$ let $I=\int_{\stany} f(x) π(x) dx$ be the value of interest and $\hat{I}_{t,n}=(1/n)\sum_{i=t}^{t+n-1}f(X_i)$ its MCMC estimate. Precisely, we derive lower bounds for the length of the trajectory $n$ and burn-in time $t$ which ensure that $$P(|\hat{I}_{t,n}-I|\leq \varepsilon)\geq 1-α.$$ The bounds depend only and explicitly on drift parameters, on the $V-$norm of $f,$ where $V$ is the drift function and on precision and confidence parameters $\varepsilon, α.$ Next we analyse an MCMC estimator based on the median of multiple shorter runs that allows for sharper bounds for the required total simulation cost. In particular the methodology can be applied for computing Bayesian estimators in practically relevant models. We illustrate our bounds numerically in a simple example.
Submitted 14 August, 2009;
originally announced August 2009.
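The estimator "based on the median of multiple shorter runs" mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the AR(1) toy chain, the parameter values, and the function names are hypothetical and not taken from the paper, and the paper's explicit bounds on $n$ and $t$ are not reproduced here.

```python
import random
import statistics


def run_average(f, n, burn_in, rng, rho=0.5):
    """Average f along one trajectory: (1/n) * sum_{i=t}^{t+n-1} f(X_i).

    The chain is a toy AR(1) process X_{i+1} = rho * X_i + N(0, 1),
    used here only as a stand-in for an MCMC sampler (assumption).
    """
    x = 0.0
    for _ in range(burn_in):  # discard the first t = burn_in steps
        x = rho * x + rng.gauss(0.0, 1.0)
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = rho * x + rng.gauss(0.0, 1.0)
    return total / n


def median_of_runs(f, m, n, burn_in, seed=0):
    """Median of m independent run averages, each of length n after burn-in."""
    rng = random.Random(seed)
    averages = [run_average(f, n, burn_in, rng) for _ in range(m)]
    return statistics.median(averages)


# The stationary law of this toy chain is N(0, 1 / (1 - rho^2)), so the
# target values are 0 for f(x) = x and 4/3 for f(x) = x^2 when rho = 0.5.
mean_estimate = median_of_runs(lambda x: x, m=15, n=2000, burn_in=100)
second_moment_estimate = median_of_runs(lambda x: x * x, m=15, n=2000, burn_in=100)
```

Taking the median rather than the mean of the run averages makes the final estimate insensitive to a few runs that land far from the target, which is the mechanism behind confidence bounds of the form $P(|\hat{I}-I|\leq \varepsilon)\geq 1-α$.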
-
Nonasymptotic bounds on the estimation error for regenerative MCMC algorithms
Authors:
Krzysztof Latuszynski,
Blazej Miasojedow,
Wojciech Niemiro
Abstract:
MCMC methods are used in Bayesian statistics not only to sample from posterior distributions but also to estimate expectations. Underlying functions are most often defined on a continuous state space and can be unbounded. We consider a regenerative setting and Monte Carlo estimators based on i.i.d. blocks of a Markov chain trajectory. The main result is an inequality for the mean square error. We also consider confidence bounds. We first derive the results in terms of the asymptotic variance and then bound the asymptotic variance for both uniformly ergodic and geometrically ergodic Markov chains.
Submitted 28 July, 2009;
originally announced July 2009.
-
Regeneration and Fixed-Width Analysis of Markov Chain Monte Carlo Algorithms
Authors:
Krzysztof Latuszynski
Abstract:
In the thesis we take the split chain approach to analyzing Markov chains and use it to establish fixed-width results for estimators obtained via Markov chain Monte Carlo procedures (MCMC). Theoretical results include necessary and sufficient conditions in terms of regeneration for central limit theorems for ergodic Markov chains and a regenerative proof of a CLT version for uniformly ergodic Markov chains with $E_πf^2< \infty.$ To obtain asymptotic confidence intervals for MCMC estimators, strongly consistent estimators of the asymptotic variance are essential. We relax assumptions required to obtain such estimators. Moreover, under a drift condition, nonasymptotic fixed-width results for MCMC estimators for a general state space setting (not necessarily compact) and not necessarily bounded target function $f$ are obtained. The last chapter is devoted to the idea of adaptive Monte Carlo simulation and provides convergence results and law of large numbers for adaptive procedures under path-stability condition for transition kernels.
Submitted 27 July, 2009;
originally announced July 2009.
-
Simulating Events of Unknown Probabilities via Reverse Time Martingales
Authors:
Krzysztof Latuszynski,
Ioannis Kosmidis,
Omiros Papaspiliopoulos,
Gareth O. Roberts
Abstract:
Assume that one aims to simulate an event of unknown probability $s\in (0,1)$ which is uniquely determined; however, only its approximations can be obtained using a finite computational effort. Such settings are often encountered in statistical simulations. We consider two specific examples: first, the exact simulation of non-linear diffusions; second, the celebrated Bernoulli factory problem of generating an $f(p)-$coin given a sequence $X_1,X_2,...$ of independent tosses of a $p-$coin (with known $f$ and unknown $p$). We describe a general framework and provide algorithms where problems of this kind can be fitted and solved. The algorithms are straightforward to implement and thus allow for effective simulation of desired events of probability $s.$ In the case of diffusions, we obtain the algorithm of \cite{BeskosRobertsEA1} as a specific instance of the generic framework developed here. In the case of the Bernoulli factory, our work offers a statistical understanding of the Nacu-Peres algorithm for $f(p) = \min\{2p, 1-2\varepsilon\}$ (which is central to the general question) and allows for its immediate implementation that avoids algorithmic difficulties of the original version.
Submitted 21 November, 2009; v1 submitted 23 July, 2009;
originally announced July 2009.