Search | arXiv e-print repository

arXiv:2005.05584 [pdf, other]

Non-reversible guided Metropolis kernel

Abstract: We construct a class of non-reversible Metropolis kernels as a multivariate extension of the guided-walk kernel proposed by Gustafson 1998. The main idea of our method is to introduce a projection that maps a state space to a totally ordered group. By using Haar measure, we construct a novel Markov kernel termed Haar-mixture kernel, which is of interest in its own right. This is achieved by inducing a topological structure to the totally ordered group. Our proposed method, the Delta-guided Metropolis--Haar kernel, is constructed by using the Haar-mixture kernel as a proposal kernel. The proposed non-reversible kernel is at least 10 times better than the random-walk Metropolis kernel and Hamiltonian Monte Carlo kernel for the logistic regression and a discretely observed stochastic process in terms of effective sample size per second.

Submitted 12 March, 2021; v1 submitted 12 May, 2020; originally announced May 2020.

Comments: 27 pages, 5 figures

arXiv:1807.11358 [pdf, other]

High-dimensional scaling limits of piecewise deterministic sampling algorithms

Authors: Joris Bierkens, Kengo Kamatani, Gareth O. Roberts

Abstract: Piecewise deterministic Markov processes are an important new tool in the design of Markov Chain Monte Carlo algorithms. Two examples of fundamental importance are the Bouncy Particle Sampler (BPS) and the Zig-Zag process (ZZ). In this paper scaling limits for both algorithms are determined. Here the dimensionality of the space tends towards infinity and the target distribution is the multivariate standard normal distribution. For several quantities of interest (angular momentum, first coordinate, and negative log-density) the scaling limits show qualitatively very different and rich behaviour. Based on these scaling limits the performance of the two algorithms in high dimensions can be compared. Although for angular momentum both processes require only a computational effort of $O(d)$ to obtain approximately independent samples, the computational effort for negative log-density and first coordinate differ: for these BPS requires $O(d^2)$ computational effort whereas ZZ requires $O(d)$. Finally we provide a criterion for the choice of the refreshment rate of BPS.

Submitted 30 July, 2019; v1 submitted 30 July, 2018; originally announced July 2018.

Comments: 51 pages, 10 figures

MSC Class: 60F05; 65C05

arXiv:1711.10065 [pdf, ps, other]

doi 10.1214/18-AAP1431

On One-Dimensional Riccati Diffusions

Authors: Adrian N. Bishop, Pierre Del Moral, Kengo Kamatani, Bruno Remillard

Abstract: This article is concerned with the fluctuation analysis and the stability properties of a class of one-dimensional Riccati diffusions. These one-dimensional stochastic differential equations exhibit a quadratic drift function and a non-Lipschitz continuous diffusion function. We present a novel approach, combining tangent process techniques, Feynman-Kac path integration, and exponential change of measures, to derive sharp exponential decays to equilibrium. We also provide uniform estimates with respect to the time horizon, quantifying with some precision the fluctuations of these diffusions around a limiting deterministic Riccati differential equation. These results provide a stronger and almost sure version of the conventional central limit theorem. We illustrate these results in the context of ensemble Kalman-Bucy filtering. To the best of our knowledge, the exponential stability and the fluctuation analysis developed in this work are the first results of this kind for this class of nonlinear diffusions.

Submitted 1 December, 2018; v1 submitted 27 November, 2017; originally announced November 2017.

Journal ref: The Annals of Applied Probability, Volume 29, No. 2, pages: 1127-1187, 2019

arXiv:1707.08788 [pdf, other]

Bayesian inference for Stable Levy driven Stochastic Differential Equations with high-frequency data

Authors: Ajay Jasra, Kengo Kamatani, Hiroki Masuda

Abstract: In this article we consider parametric Bayesian inference for stochastic differential equations (SDE) driven by a pure-jump stable Levy process, which is observed at high frequency. In most cases of practical interest, the likelihood function is not available, so we use a quasi-likelihood and place an associated prior on the unknown parameters. It is shown under regularity conditions that there is a Bernstein-von Mises theorem associated to the posterior. We then develop a Markov chain Monte Carlo (MCMC) algorithm for Bayesian inference and assisted by our theoretical results, we show how to scale Metropolis-Hastings proposals when the frequency of the data grows, in order to prevent the acceptance ratio going to zero in the large data limit. Our algorithm is presented on numerical examples that help to verify our theoretical findings.

Submitted 27 July, 2017; originally announced July 2017.

arXiv:1701.05892 [pdf, other]

Bayesian Static Parameter Estimation for Partially Observed Diffusions via Multilevel Monte Carlo

Authors: Ajay Jasra, Kengo Kamatani, Kody J. H. Law, Yan Zhou

Abstract: In this article we consider static Bayesian parameter estimation for partially observed diffusions that are discretely observed. We work under the assumption that one must resort to discretizing the underlying diffusion process, for instance using the Euler-Maruyama method. Given this assumption, we show how one can use Markov chain Monte Carlo (MCMC) and particularly particle MCMC [Andrieu, C., Doucet, A. and Holenstein, R. (2010). Particle Markov chain Monte Carlo methods (with discussion). J. R. Statist. Soc. Ser. B, 72, 269--342] to implement a new approximation of the multilevel (ML) Monte Carlo (MC) collapsing sum identity. Our approach comprises constructing an approximate coupling of the posterior density of the joint distribution over parameter and hidden variables at two different discretization levels and then correcting by an importance sampling method. The variance of the weights is independent of the length of the observed data set. The utility of such a method is that, for a prescribed level of mean square error, the cost of this MLMC method is provably less than i.i.d. sampling from the posterior associated to the most precise discretization. However the method here comprises using only known and efficient simulation methodologies. The theoretical results are illustrated by inference of the parameters of two prototypical processes given noisy partial observations of the process: the first is an Ornstein-Uhlenbeck process and the second is a more general Langevin equation.

Submitted 20 January, 2017; originally announced January 2017.

arXiv:1602.02889 [pdf, other]

Ergodicity of Markov chain Monte Carlo with reversible proposal

Authors: Kengo Kamatani

Abstract: We describe ergodic properties of some Metropolis-Hastings (MH) algorithms for heavy-tailed target distributions. The analysis usually falls into sub-geometric ergodicity framework but we prove that the mixed preconditioned Crank-Nicolson (MpCN) algorithm has geometric ergodicity even for heavy-tailed target distributions. This useful property comes from the fact that the MpCN algorithm becomes a random-walk Metropolis algorithm under suitable transformation.

Submitted 9 February, 2016; originally announced February 2016.

Comments: 14 pages

MSC Class: 65C05; 65C40; 60J05

arXiv:1510.04977 [pdf, other]

Multilevel particle filter

Authors: Ajay Jasra, Kengo Kamatani, Kody J. H. Law, Yan Zhou

Abstract: In this paper the filtering of partially observed diffusions, with discrete-time observations, is considered. It is assumed that only biased approximations of the diffusion can be obtained, for choice of an accuracy parameter indexed by $l$. A multilevel estimator is proposed, consisting of a telescopic sum of increment estimators associated to the successive levels. The work associated to $\mathcal{O}(\varepsilon^2)$ mean-square error between the multilevel estimator and average with respect to the filtering distribution is shown to scale optimally, for example as $\mathcal{O}(\varepsilon^{-2})$ for optimal rates of convergence of the underlying diffusion approximation. The method is illustrated on some toy examples as well as estimation of interest rate based on real S&P 500 stock price data.

Submitted 16 October, 2015; originally announced October 2015.

arXiv:1406.5392 [pdf, ps, other]

Rate optimality of Random walk Metropolis algorithm in high-dimension with heavy-tailed target distribution

Authors: Kengo Kamatani

Abstract: The choice of the increment distribution is crucial for the random-walk Metropolis-Hastings (RWM) algorithm. In this paper we study the optimal choice in the high-dimensional setting among all possible increment distributions. The conclusion is rather counterintuitive, but the optimal rate of convergence is attained by the usual choice, the normal distribution as the increment distribution. In particular, no heavy-tailed increment distribution can improve the rate.

Submitted 20 May, 2016; v1 submitted 20 June, 2014; originally announced June 2014.

Comments: 12 pages

arXiv:1108.2477 [pdf, other]

doi 10.1051/ps/2014004

Local degeneracy of Markov chain Monte Carlo methods

Authors: Kengo Kamatani

Abstract: We study the asymptotic behavior of Monte Carlo methods. Local consistency is an ideal property of a Monte Carlo method. However, local consistency may fail to hold for several reasons. In fact, in practice, it is more important to study such non-ideal behavior. We call one such non-ideal behavior of Monte Carlo methods local degeneracy. We show some equivalent conditions for local degeneracy. As an application we study a Gibbs sampler (data augmentation) for the cumulative logit model with or without marginal augmentation. It is well known that the natural Gibbs sampler does not work well for this model. In the sense of local consistency and degeneracy, marginal augmentation is shown to improve the asymptotic property. However, when the number of categories is large, neither method is locally consistent.

Submitted 11 January, 2012; v1 submitted 6 August, 2011; originally announced August 2011.

Comments: 30 pages, 3 figures

arXiv:1103.5679 [pdf, other]

Weak consistency of Markov chain Monte Carlo methods

Authors: Kengo Kamatani

Abstract: Markov chain Monte Carlo (MCMC) methods are commonly used in Bayesian statistics. In the last twenty years, many results have been established for the calculation of the exact convergence rate of MCMC methods. We introduce another rate of convergence for MCMC methods by approximation techniques. This rate can be obtained by the convergence of the Markov chain to a diffusion process. We apply it to a simple mixture model and obtain its convergence rate. Numerical simulations are performed to illustrate the effect of the rate.

Submitted 25 September, 2013; v1 submitted 29 March, 2011; originally announced March 2011.

Comments: 14 pages

Journal ref: Bulletin of Informatics and Cybernetics, 45 (2013) 103-123

arXiv:1012.0996 [pdf, other]

doi 10.1007/s10463-013-0403-3

Local Consistency of Markov Chain Monte Carlo Methods

Authors: Kengo Kamatani

Abstract: In this paper, we introduce the notion of efficiency (consistency) and examine some asymptotic properties of Markov chain Monte Carlo methods. We apply these results to the data augmentation (DA) procedure for independent and identically distributed observations. More precisely, we show that if both the sample size and the running time of the DA procedure tend to infinity the empirical distribution of the DA procedure tends to the posterior distribution. This is a local property of the DA procedure, which may be, in some cases, more helpful than the global properties to describe its behavior. The advantages of using the local properties are the simplicity and the generality of the results. The local properties provide useful insight into the problem of how to construct efficient algorithms.

Submitted 25 September, 2013; v1 submitted 5 December, 2010; originally announced December 2010.

Comments: 12 pages

Journal ref: Ann. Inst. Statist. Math. 66(1) (2014) 63-74

Showing 1–11 of 11 results for author: Kamatani, K