-
Convergence of Statistical Estimators via Mutual Information Bounds
Authors:
El Mahdi Khribch,
Pierre Alquier
Abstract:
Recent advances in statistical learning theory have revealed profound connections between mutual information (MI) bounds, PAC-Bayesian theory, and Bayesian nonparametrics. This work introduces a novel mutual information bound for statistical models. The derived bound has wide-ranging applications in statistical inference. It yields improved contraction rates for fractional posteriors in Bayesian n…
▽ More
Recent advances in statistical learning theory have revealed profound connections between mutual information (MI) bounds, PAC-Bayesian theory, and Bayesian nonparametrics. This work introduces a novel mutual information bound for statistical models. The derived bound has wide-ranging applications in statistical inference. It yields improved contraction rates for fractional posteriors in Bayesian nonparametrics. It can also be used to study a wide range of estimation methods, such as variational inference or Maximum Likelihood Estimation (MLE). By bridging these diverse areas, this work advances our understanding of the fundamental limits of statistical inference and the role of information in learning from data. We hope that these results will not only clarify connections between statistical inference and information theory but also help to develop a new toolbox to study a wide range of estimators.
△ Less
Submitted 24 December, 2024;
originally announced December 2024.
-
On importance sampling and independent Metropolis-Hastings with an unbounded weight function
Authors:
George Deligiannidis,
Pierre E. Jacob,
El Mahdi Khribch,
Guanyang Wang
Abstract:
Importance sampling and independent Metropolis-Hastings (IMH) are among the fundamental building blocks of Monte Carlo methods. Both require a proposal distribution that globally approximates the target distribution. The Radon-Nikodym derivative of the target distribution relative to the proposal is called the weight function. Under the weak assumption that the weight is unbounded but has a number…
▽ More
Importance sampling and independent Metropolis-Hastings (IMH) are among the fundamental building blocks of Monte Carlo methods. Both require a proposal distribution that globally approximates the target distribution. The Radon-Nikodym derivative of the target distribution relative to the proposal is called the weight function. Under the weak assumption that the weight is unbounded but has a number of finite moments under the proposal distribution, we obtain new results on the approximation error of importance sampling and of the particle independent Metropolis-Hastings algorithm (PIMH), which includes IMH as a special case. For IMH and PIMH, we show that the common random numbers coupling is maximal. Using that coupling we derive bounds on the total variation distance of a PIMH chain to the target distribution. The bounds are sharp with respect to the number of particles and the number of iterations. Our results allow a formal comparison of the finite-time biases of importance sampling and IMH. We further consider bias removal techniques using couplings of PIMH, and provide conditions under which the resulting unbiased estimators have finite moments. We compare the asymptotic efficiency of regular and unbiased importance sampling estimators as the number of particles goes to infinity.
△ Less
Submitted 14 November, 2024;
originally announced November 2024.
-
On Mixing Times of Metropolized Algorithm With Optimization Step (MAO) : A New Framework
Authors:
EL Mahdi Khribch,
George Deligiannidis,
Daniel Paulin
Abstract:
In this paper, we consider sampling from a class of distributions with thin tails supported on $\mathbb{R}^d$ and make two primary contributions. First, we propose a new Metropolized Algorithm With Optimization Step (MAO), which is well suited for such targets. Our algorithm is capable of sampling from distributions where the Metropolis-adjusted Langevin algorithm (MALA) is not converging or lacki…
▽ More
In this paper, we consider sampling from a class of distributions with thin tails supported on $\mathbb{R}^d$ and make two primary contributions. First, we propose a new Metropolized Algorithm With Optimization Step (MAO), which is well suited for such targets. Our algorithm is capable of sampling from distributions where the Metropolis-adjusted Langevin algorithm (MALA) is not converging or lacking in theoretical guarantees. Second, we derive upper bounds on the mixing time of MAO. Our results are supported by simulations on multiple target distributions.
△ Less
Submitted 1 December, 2021;
originally announced December 2021.