-
Estimation When Both Covariance And Precision Matrices Are Sparse
Authors:
Shev Macnamara,
Erik Schlögl,
Zdravko I. Botev
Abstract:
We offer a method to estimate a covariance matrix in the special case that both the covariance matrix and the precision matrix are sparse, a constraint we call double sparsity. The estimation method is maximum likelihood, subject to the double sparsity constraint. In our method, only a particular class of sparsity pattern is allowed: both the matrix and its inverse must be subordinate to the same chordal graph. Compared to a naive enforcement of double sparsity, our chordal graph approach exploits a special algebraic local inverse formula. This local inverse property makes computations that would usually involve an inverse (of either precision matrix or covariance matrix) much faster. In the context of estimation of covariance matrices, our proposal appears to be the first to find such special pairs of covariance and precision matrices.
Submitted 14 August, 2021;
originally announced August 2021.
-
Sampling Conditionally on a Rare Event via Generalized Splitting
Authors:
Zdravko I. Botev,
Pierre L'Ecuyer
Abstract:
We propose and analyze a generalized splitting method to sample approximately from a distribution conditional on the occurrence of a rare event. This has important applications in a variety of contexts in operations research, engineering, and computational statistics. The method uses independent trials starting from a single particle. We exploit this independence to obtain asymptotic and non-asymptotic bounds on the total variation error of the sampler. Our main finding is that the approximation error depends crucially on the relative variability of the number of points produced by the splitting algorithm in one run, and that this relative variability can be readily estimated via simulation. We illustrate the relevance of the proposed method on an application in which one needs to sample (approximately) from an intractable posterior density in Bayesian inference.
Submitted 8 September, 2019;
originally announced September 2019.
-
Regenerative Simulation for the Bayesian Lasso
Authors:
Y. -L. Chen,
Z. I. Botev
Abstract:
The Gibbs sampler of Park and Casella is one of the most popular MCMC methods for sampling from the posterior density of the Bayesian Lasso regression. As with many Markov chain samplers, their Gibbs sampler lacks a theoretically sound method of output analysis, that is, a method for estimating the variance of a given ergodic average and for estimating how closely the chain is sampling from the stationary distribution (the burn-in).
In this paper, we address this shortcoming by identifying regenerative structure in the sampler of Park and Casella, thus providing a theoretically sound method of assessing its performance. The regenerative structure provides both a strongly consistent variance estimator and an estimator of (an upper bound on) the total variation distance from the target posterior density. The result is a simple and theoretically sound way to assess the stationarity of the Park and Casella sampler and, more generally, of other MCMC samplers for which regenerative simulation is possible.
We perform a numerical study in which we validate the standard errors calculated by our regenerative method by comparing them with the standard errors calculated by an AR(1) heuristic approximation. Thus, we show that for the Bayesian Lasso model, the regenerative method is a viable and theoretically justified alternative to the existing ad-hoc MCMC convergence diagnostics.
Submitted 5 June, 2018;
originally announced June 2018.
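The abstract above compares regenerative standard errors against an AR(1) heuristic. As a rough illustration of what such a heuristic can look like (the abstract does not spell out the exact form used in the paper), the sketch below uses the common approximation in which the variance of an ergodic average is inflated by the factor (1 + rho) / (1 - rho), where rho is the lag-1 sample autocorrelation of the chain. The function names `ar1_standard_error` and `simulate_ar1` are hypothetical, and the toy chain stands in for real sampler output.

```python
import math
import random


def ar1_standard_error(chain):
    """Standard error of the chain mean under an AR(1) approximation.

    Assumption (not taken from the paper): the chain is treated as a
    stationary AR(1) process, so Var(mean) ~ (c0 / n) * (1 + rho) / (1 - rho),
    with c0 the sample variance and rho the lag-1 sample autocorrelation.
    """
    n = len(chain)
    mean = sum(chain) / n
    c0 = sum((x - mean) ** 2 for x in chain) / n
    c1 = sum((chain[i] - mean) * (chain[i + 1] - mean) for i in range(n - 1)) / n
    rho = c1 / c0
    return math.sqrt(c0 / n * (1.0 + rho) / (1.0 - rho)), rho


def simulate_ar1(phi, n, seed=0):
    """Hypothetical toy chain: x_t = phi * x_{t-1} + standard normal noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    chain = simulate_ar1(phi=0.5, n=20_000, seed=1)
    se, rho = ar1_standard_error(chain)
    print(f"lag-1 autocorrelation: {rho:.3f}, AR(1) standard error: {se:.4f}")
```

Because this rests on a single fitted autocorrelation, it is only a heuristic; the regenerative estimator described in the abstract is the theoretically justified alternative.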
-
Reliability Estimation for Networks with Minimal Flow Demand and Random Link Capacities
Authors:
Zdravko I. Botev,
Pierre L'Ecuyer,
Bruno Tuffin
Abstract:
We consider a network whose links have random capacities and in which a certain target amount of flow must be carried from some source nodes to some destination nodes. Each destination node has a fixed demand that must be satisfied and each source node has a given supply. We want to estimate the unreliability of the network, defined as the probability that the network cannot carry the required amount of flow to meet the demand at all destination nodes. When this unreliability is very small, which is our main interest in this paper, standard Monte Carlo estimators become useless because failure to meet the demand is a rare event. We propose and compare two different methods to handle this situation, one based on a conditional Monte Carlo approach and the other based on generalized splitting. We find that the first is more effective when the network is highly reliable and not too large, whereas for a larger network and/or moderate reliability, the second is more effective.
Submitted 8 May, 2018;
originally announced May 2018.
-
The Normal Law Under Linear Restrictions: Simulation and Estimation via Minimax Tilting
Authors:
Z. I. Botev
Abstract:
Simulation from the truncated multivariate normal distribution in high dimensions is a recurrent problem in statistical computing, and is typically only feasible using approximate MCMC sampling. In this article we propose a minimax tilting method for exact iid simulation from the truncated multivariate normal distribution. The new methodology provides both a method for simulation and an efficient estimator of hitherto intractable Gaussian integrals. We prove that the estimator possesses a rare vanishing relative error asymptotic property. Numerical experiments suggest that the proposed scheme is accurate in a wide range of setups for which competing estimation schemes fail. We give an application to exact iid simulation from the Bayesian posterior of the probit regression model.
Submitted 14 March, 2016;
originally announced March 2016.
-
Rare-event Probability Estimation via Empirical Likelihood Maximization
Authors:
A. Huang,
Z. I. Botev
Abstract:
We explore past and recent developments in rare-event probability estimation, with a particular focus on a novel Monte Carlo technique, Empirical Likelihood Maximization (ELM). This is a versatile method that involves sampling from a sequence of densities using MCMC and maximizing an empirical likelihood. The quantity of interest, the probability of a given rare event, is estimated by solving a convex optimization program related to likelihood maximization. Numerical experiments are performed using this new technique, and benchmarks are given against existing robust algorithms and estimators.
Submitted 10 December, 2013;
originally announced December 2013.
-
Semiparametric Cross Entropy for rare-event simulation
Authors:
Z. I. Botev,
A. Ridder,
L. Rojas-Nandayapa
Abstract:
The Cross Entropy method is a well-known adaptive importance sampling method for rare-event probability estimation, which requires estimating an optimal importance sampling density within a parametric class. In this article we estimate an optimal importance sampling density within a wider semiparametric class of distributions. We show that this semiparametric version of the Cross Entropy method frequently yields efficient estimators. We illustrate the excellent practical performance of the method with numerical experiments and show that for the problems we consider it typically outperforms alternative schemes by orders of magnitude.
Submitted 14 October, 2013;
originally announced October 2013.
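For readers unfamiliar with the baseline that this abstract builds on, the sketch below shows the standard parametric Cross Entropy method on a toy problem, not the semiparametric version proposed in the paper. It estimates the tail probability P(X >= gamma) for X exponential with mean 1, using importance sampling densities restricted to the exponential family with mean v. The function name `ce_exponential_tail`, the sample size, and the elite fraction `rho` are illustrative assumptions.

```python
import math
import random


def ce_exponential_tail(gamma, n=10_000, rho=0.1, seed=0, max_iter=50):
    """Parametric Cross Entropy sketch for P(X >= gamma), X ~ Exp(mean 1).

    The importance sampling density is Exp(mean v); v is updated from the
    likelihood-ratio-weighted elite samples at each level.
    """
    rng = random.Random(seed)

    def weight(x, v):
        # Likelihood ratio f(x; 1) / f(x; v) = v * exp(-x * (1 - 1/v)).
        return v * math.exp(-x * (1.0 - 1.0 / v))

    v = 1.0  # start from the nominal density
    for _ in range(max_iter):
        xs = sorted(rng.expovariate(1.0 / v) for _ in range(n))
        # Intermediate level: the (1 - rho) sample quantile, capped at gamma.
        level = min(gamma, xs[int((1.0 - rho) * n)])
        elite = [x for x in xs if x >= level]
        ws = [weight(x, v) for x in elite]
        # Closed-form CE update for the exponential mean parameter.
        v = sum(w * x for w, x in zip(ws, elite)) / sum(ws)
        if level >= gamma:
            break

    # Final importance sampling estimate under the fitted density.
    xs = [rng.expovariate(1.0 / v) for _ in range(n)]
    estimate = sum(weight(x, v) for x in xs if x >= gamma) / n
    return estimate, v


if __name__ == "__main__":
    gamma = 20.0
    estimate, v = ce_exponential_tail(gamma)
    print(f"CE estimate: {estimate:.3e}, exact: {math.exp(-gamma):.3e}, fitted mean: {v:.2f}")
```

In this toy case the exact answer exp(-gamma) is known, which makes it easy to check the estimate; the abstract's point is that restricting the search to a parametric class like this one can be limiting, which motivates the wider semiparametric class.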
-
Spatial Process Generation
Authors:
Dirk P. Kroese,
Zdravko I. Botev
Abstract:
The generation of random spatial data on a computer is an important tool for understanding the behavior of spatial processes. In this paper we describe how to generate realizations from the main types of spatial processes, including Gaussian and Markov random fields, point processes, spatial Wiener processes, and Lévy fields. Concrete MATLAB code is provided.
Submitted 1 August, 2013;
originally announced August 2013.