-
Approximate Bayesian Computation with Statistical Distances for Model Selection
Authors:
Christian Angelopoulos,
Clara Grazian
Abstract:
Model selection is a key task in statistics, playing a critical role across various scientific disciplines. While no model can fully capture the complexities of a real-world data-generating process, identifying the model that best approximates it can provide valuable insights. Bayesian statistics offers a flexible framework for model selection by updating prior beliefs as new data becomes available, allowing for ongoing refinement of candidate models. This is typically achieved by calculating posterior probabilities, which quantify the support for each model given the observed data. However, in cases where likelihood functions are intractable, exact computation of these posterior probabilities becomes infeasible. Approximate Bayesian Computation (ABC) has emerged as a likelihood-free method and is traditionally used with summary statistics to reduce data dimensionality; however, this often results in information loss that is difficult to quantify, particularly in model selection contexts. Recent advancements propose the use of full data approaches based on statistical distances, offering a promising alternative that bypasses the need for summary statistics and potentially allows recovery of the exact posterior distribution. Despite these developments, full data ABC approaches have not yet been widely applied to model selection problems. This paper seeks to address this gap by investigating the performance of ABC with statistical distances in model selection. Through simulation studies and an application to toad movement models, this work explores whether full data approaches can overcome the limitations of summary statistic-based ABC for model choice.
Submitted 30 October, 2024; v1 submitted 28 October, 2024;
originally announced October 2024.
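To make the idea of full data ABC for model choice concrete, here is a minimal rejection-ABC sketch. It is an illustration only, not the authors' algorithm: the 1-Wasserstein distance is just one possible statistical distance, and the function names, the two toy models, the priors and the tuning values (`n_sims`, `keep`) are all assumptions made for this example.

```python
import numpy as np


def wasserstein_1d(x, y):
    """1-Wasserstein distance between two equal-size univariate samples."""
    return np.mean(np.abs(np.sort(x) - np.sort(y)))


def abc_model_choice(y_obs, simulators, prior_samplers,
                     n_sims=5000, keep=0.02, seed=0):
    """Rejection ABC over models, comparing full data sets (no summaries).

    Returns the estimated posterior probability of each model, i.e. the
    share of each model index among the closest simulations.
    """
    rng = np.random.default_rng(seed)
    n_models = len(simulators)
    # Uniform prior over model indices (an assumption of this sketch).
    models = rng.integers(n_models, size=n_sims)
    dists = np.empty(n_sims)
    for i, m in enumerate(models):
        theta = prior_samplers[m](rng)                  # draw from the prior
        y_sim = simulators[m](theta, len(y_obs), rng)   # simulate a data set
        dists[i] = wasserstein_1d(y_obs, y_sim)
    n_keep = max(1, int(keep * n_sims))
    accepted = models[np.argsort(dists)[:n_keep]]       # closest simulations
    return np.bincount(accepted, minlength=n_models) / n_keep


# Toy example: model 0 is Normal(theta, 1), model 1 is Exponential(scale=theta).
simulators = [
    lambda theta, n, rng: rng.normal(theta, 1.0, size=n),
    lambda theta, n, rng: rng.exponential(scale=theta, size=n),
]
prior_samplers = [
    lambda rng: rng.normal(0.0, 3.0),
    lambda rng: rng.uniform(0.5, 5.0),
]
y_obs = np.random.default_rng(1).exponential(scale=2.0, size=200)
probs = abc_model_choice(y_obs, simulators, prior_samplers)
print(probs)  # estimated posterior model probabilities
```

Keeping a fixed fraction of the closest simulations, rather than a fixed tolerance, is a convenience choice here; the accepted share of each model index is the ABC estimate of its posterior probability.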
-
Novel Bayesian algorithms for ARFIMA long-memory processes: a comparison between MCMC and ABC approaches
Authors:
James Cohen Gabor,
Clara Grazian
Abstract:
This paper presents a comparative study of two Bayesian approaches - Markov Chain Monte Carlo (MCMC) and Approximate Bayesian Computation (ABC) - for estimating the parameters of autoregressive fractionally-integrated moving average (ARFIMA) models, which are widely used to capture long-memory in time series data. We propose a novel MCMC algorithm that filters the time series into distinct long-memory and ARMA components, and benchmark it against standard approaches. Additionally, a new ABC method is proposed, using three different summary statistics for posterior estimation. The methods are implemented and evaluated through an extensive simulation study, as well as applied to a real-world financial dataset, specifically the quarterly U.S. Gross National Product (GNP) series. The results demonstrate the effectiveness of the Bayesian methods in estimating long-memory and short-memory parameters, with the filtered MCMC showing superior performance in various metrics. This study enhances our understanding of Bayesian techniques in ARFIMA modeling, providing insights into their advantages and limitations when applied to complex time series data.
Submitted 17 October, 2024;
originally announced October 2024.
-
Scalable Expectation Propagation for Mixed-Effects Regression
Authors:
Jackson Zhou,
John T. Ormerod,
Clara Grazian
Abstract:
Mixed-effects regression models represent a useful subclass of regression models for grouped data; the introduction of random effects allows for the correlation between observations within each group to be conveniently captured when inferring the fixed effects. At a time when such regression models are being fit to increasingly large datasets with many groups, it is ideal if (a) the time it takes to make the inferences scales linearly with the number of groups and (b) the inference workload can be distributed across multiple computational nodes in a numerically stable way, if the dataset cannot be stored in one location. Current Bayesian inference approaches for mixed-effects regression models do not seem to account for both challenges simultaneously. To address this, we develop an expectation propagation (EP) framework in this setting that is both scalable and numerically stable when distributed for the case where there is only one grouping factor. The main technical innovations lie in the sparse reparameterisation of the EP algorithm, and a moment propagation (MP) based refinement for multivariate random effect factor approximations. Experiments are conducted to show that this EP framework achieves linear scaling, while having comparable accuracy to other scalable approximate Bayesian inference (ABI) approaches.
Submitted 24 September, 2024; v1 submitted 22 September, 2024;
originally announced September 2024.
-
Bayesian Consistency for Long Memory Processes: A Semiparametric Perspective
Authors:
Clara Grazian
Abstract:
In this work, we will investigate a Bayesian approach to estimating the parameters of long memory models. Long memory, characterized by the phenomenon of hyperbolic autocorrelation decay in time series, has garnered significant attention. This is because, in many situations, the assumption of short memory, such as the Markovianity assumption, can be deemed too restrictive. Applications for long memory models can be readily found in fields such as astronomy, finance, and environmental sciences. However, current parametric and semiparametric approaches to modeling long memory present challenges, particularly in the estimation process.
In this study, we will introduce various methods applied to this problem from a Bayesian perspective, along with a novel semiparametric approach for deriving the posterior distribution of the long memory parameter. Additionally, we will establish the asymptotic properties of the model. An advantage of this approach is that it allows the implementation of state-of-the-art efficient algorithms for nonparametric Bayesian models.
Submitted 18 June, 2024;
originally announced June 2024.
-
Stochastic Variational Inference for GARCH Models
Authors:
Hanwen Xuan,
Luca Maestrini,
Feng Chen,
Clara Grazian
Abstract:
Stochastic variational inference algorithms are derived for fitting various heteroskedastic time series models. We examine Gaussian, t, and skew-t response GARCH models and fit these using Gaussian variational approximating densities. We implement efficient stochastic gradient ascent procedures based on the use of control variates or the reparameterization trick and demonstrate that the proposed implementations provide a fast and accurate alternative to Markov chain Monte Carlo sampling. Additionally, we present sequential updating versions of our variational algorithms, which are suitable for efficient portfolio construction and dynamic asset allocation.
Submitted 28 August, 2023;
originally announced August 2023.
-
Clustering MIC data through Bayesian mixture models: an application to detect M. Tuberculosis resistance mutations
Authors:
Clara Grazian
Abstract:
Antimicrobial resistance is becoming a major threat to public health throughout the world. Researchers are attempting to counter it by developing both new antibiotics and patient-specific treatments. In the second case, whole-genome sequencing has had a huge impact in two ways: first, it is becoming cheaper and faster to perform whole-genome sequencing, and this makes it competitive with respect to standard phenotypic tests; second, it is possible to statistically associate the phenotypic patterns of resistance with specific mutations in the genome. Therefore, it is now possible to develop catalogues of genomic variants associated with resistance to specific antibiotics, in order to improve prediction of resistance and suggest treatments. It is essential to have robust methods for identifying mutations associated with resistance and continuously updating the available catalogues. This work proposes a general method to study minimal inhibitory concentration (MIC) distributions and to identify clusters of strains showing different levels of resistance to antimicrobials. Once the clusters are identified and strains allocated to each of them, it is possible to perform regression to identify with high statistical power the mutations associated with resistance. The method is applied to a new 96-well microtiter plate used for testing M. tuberculosis.
Submitted 24 July, 2023;
originally announced July 2023.
-
An application of copulas to OPEC's changing influence on fossil fuel prices
Authors:
Clara Grazian,
Alex McInnes
Abstract:
This work examines how the dependence structures between energy futures asset prices differ in two periods identified before and after the 2008 global financial crisis. These two periods were characterised by a difference in the number of extraordinary meetings of OPEC countries organised to announce a change of oil production. In the period immediately following the global financial crisis, the decrease in oil prices and oil and gas demand forced OPEC countries to make frequent adjustments to the production of oil, while, since the first quarter of 2010, the recovery led to more regular meetings, with only three organised extraordinary meetings. We propose to use a copula model to study how the dependence structure among energy prices changed between the two periods. The use of copula models allows flexible and realistic models to be introduced for the marginal time series; once marginal parameters are estimated, the estimates are used to fit several copula models for all asset combinations. Model selection techniques based on information criteria are implemented to choose the best models both for the univariate asset prices series and for the distribution of co-movements. The changes in the dependence structure of pairs of assets are investigated through copula functionals and their uncertainty estimated through a bootstrapping method. We find that the strength of dependence between asset combinations differs considerably between the two periods, showing a significant decrease for all the pairs of assets.
Submitted 30 March, 2023;
originally announced March 2023.
-
Advances in Bayesian random partition models: A comprehensive review
Authors:
Clara Grazian
Abstract:
Clustering is a crucial task in various domains of knowledge, including medicine, epidemiology, genomics, environmental science, economics, and visual sciences, among others. Methodologies for inferring the number of clusters have often been shown to be inconsistent, and incorporating a dependence structure among clusters introduces additional challenges in the estimation process. In a Bayesian framework, clustering is performed by treating the unknown partition as a random object and defining a prior distribution for it. This prior distribution can be induced by models assumed for the observations or directly defined on the partition itself. However, recent findings have revealed difficulties in consistently estimating the number of clusters and, consequently, the partition. Furthermore, summarizing the posterior distribution of the partition remains an open problem due to the high dimensionality of the partition space. This study aims to review Bayesian approaches for random partition models, highlighting the advantages and disadvantages of each method, and suggesting potential avenues for future research.
Submitted 22 May, 2025; v1 submitted 30 March, 2023;
originally announced March 2023.
-
Skew-Normal Posterior Approximations
Authors:
Jackson Zhou,
Clara Grazian,
John Ormerod
Abstract:
Many approximate Bayesian inference methods assume a particular parametric form for approximating the posterior distribution. A multivariate Gaussian distribution provides a convenient density for such approaches; examples include the Laplace, penalized quasi-likelihood, Gaussian variational, and expectation propagation methods. Unfortunately, these all ignore the potential skewness of the posterior distribution. We propose a modification that accounts for skewness, where key statistics of the posterior distribution are matched instead to a multivariate skew-normal distribution. A combination of simulation studies and benchmarking was conducted to compare the performance of this skew-normal matching method (both as a standalone approximation and as a post-hoc skewness adjustment) with existing Gaussian and skewed approximations. We show empirically that for small and moderate dimensional cases, skew-normal matching can be much more accurate than these other approaches. For post-hoc skewness adjustments, this comes at very little cost in additional computational time.
Submitted 16 February, 2023;
originally announced February 2023.
-
Bayesian Copula Directional Dependence for causal inference on gene expression data
Authors:
Vasiliki Vamvaka,
Clara Grazian
Abstract:
Modelling and understanding directional gene networks is a major challenge in biology as they play an important role in the architecture and function of genetic systems. Copula Directional Dependence (CDD) can measure the directed connectivity among variables without any strict requirements of distributional and linearity assumptions. Furthermore, copulas can achieve that by isolating the dependence structure of a joint distribution. In this work, a novel extension of the frequentist CDD in the Bayesian setting is introduced. The new method is compared against the frequentist CDD and validated on six gene interactions, three coming from a mouse scRNA-seq dataset and three coming from a bulk epigenome dataset. The results illustrate that the novel proposed Bayesian CDD was able to identify four out of six true interactions with increased robustness compared to the frequentist method. Therefore, the Bayesian CDD can be considered as an alternative way for modeling the information flow in gene networks.
Submitted 9 March, 2022;
originally announced March 2022.
-
Approximate Bayesian Conditional Copulas
Authors:
Clara Grazian,
Luciana Dalla Valle,
Brunero Liseo
Abstract:
Copula models are flexible tools to represent complex structures of dependence for multivariate random variables. According to Sklar's theorem (Sklar, 1959), any d-dimensional absolutely continuous density can be uniquely represented as the product of the marginal distributions and a copula function which captures the dependence structure among the vector components. In real data applications, the interest of the analyses often lies in specific functionals of the dependence, which quantify aspects of it in a few numerical values. A broad literature exists on such functionals; however, extensions to include covariates are still limited. This is mainly due to the lack of unbiased estimators of the copula function, especially when one does not have enough information to select the copula model. Recent advances in computational methodologies and algorithms have allowed inference in the presence of complicated likelihood functions, especially in the Bayesian approach, whose methods, despite being computationally intensive, allow us to better evaluate the uncertainty of the estimates. In this work, we present several Bayesian methods to approximate the posterior distribution of functionals of the dependence, using nonparametric models which avoid the selection of the copula function. These methods are compared in simulation studies and in two realistic applications, from civil engineering and astrophysics.
Submitted 4 March, 2021;
originally announced March 2021.
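As a reminder of the representation this abstract refers to, Sklar's theorem for a joint density can be written as below. The symbols $f$, $f_j$, $F_j$ and $c$ (joint density, marginal densities, marginal distribution functions and copula density) are notation introduced here for illustration and are not taken from the paper.

```latex
f(x_1, \dots, x_d) = c\bigl(F_1(x_1), \dots, F_d(x_d)\bigr) \prod_{j=1}^{d} f_j(x_j)
```

The copula density $c$ carries all of the dependence, while the marginals enter only through $F_j$ and $f_j$.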
-
A review of Approximate Bayesian Computation methods via density estimation: inference for simulator-models
Authors:
Clara Grazian,
Yanan Fan
Abstract:
This paper provides a review of Approximate Bayesian Computation (ABC) methods for carrying out Bayesian posterior inference, through the lens of density estimation. We describe several recent algorithms and make connections with traditional approaches. We show advantages and limitations of models based on parametric approaches and we then draw attention to developments in machine learning, which we believe have the potential to make ABC scalable to higher dimensions and may be the future direction for research in this area.
Submitted 6 September, 2019;
originally announced September 2019.
-
New formulation of the Logistic-Gaussian process to analyze trajectory tracking data
Authors:
Gianluca Mastrantonio,
Clara Grazian,
Sara Mancinelli,
Enrico Bibbona
Abstract:
Improved communication systems, shrinking battery sizes and the price drop of tracking devices have led to an increasing availability of trajectory tracking data. These data are often analyzed to understand animal behavior.
In this work, we propose a new model for interpreting the animal movement as a mixture of characteristic patterns, which we interpret as different behaviors. The probability that the animal is behaving according to a specific pattern, at each time instant, is non-parametrically estimated using the Logistic-Gaussian process. Owing to a new formalization and the way we specify the coregionalization matrix of the associated multivariate Gaussian process, our model is invariant with respect to the choice of the reference element and of the ordering of the probability vector components. We fit the model under a Bayesian framework, and show that the Markov chain Monte Carlo algorithm we propose is straightforward to implement.
We perform a simulation study with the aim of showing the ability of the estimation procedure to retrieve the model parameters. We also test the performance of the information criterion we used to select the number of behaviors. The model is then applied to a real dataset where a wolf has been observed before and after procreation. The results are easy to interpret, and clear differences emerge in the two phases.
Submitted 11 September, 2019; v1 submitted 1 August, 2018;
originally announced August 2018.
-
On a Loss-based prior for the number of components in mixture models
Authors:
Clara Grazian,
Cristiano Villa,
Brunero Liseo
Abstract:
We propose a prior distribution for the number of components of a finite mixture model. The novelty is that the prior distribution is obtained by considering the loss one would incur if the true value representing the number of components were not considered. The prior has an elegant and easy to implement structure, which makes it possible to naturally include any prior information one may have as well as to opt for a default solution in cases where this information is not available. The performance of the prior, and comparison with existing alternatives, is studied through the analysis of both real and simulated data.
Submitted 4 September, 2018; v1 submitted 20 July, 2018;
originally announced July 2018.
-
Approximating the Likelihood in Approximate Bayesian Computation
Authors:
Christopher C Drovandi,
Clara Grazian,
Kerrie Mengersen,
Christian Robert
Abstract:
This chapter will appear in the forthcoming Handbook of Approximate Bayesian Computation (2018).
The conceptual and methodological framework that underpins approximate Bayesian computation (ABC) is targeted primarily towards problems in which the likelihood is either challenging or missing. ABC uses a simulation-based non-parametric estimate of the likelihood of a summary statistic and assumes that the generation of data from the model is computationally cheap. This chapter reviews two alternative approaches for estimating the intractable likelihood, with the goal of reducing the necessary model simulations to produce an approximate posterior. The first of these is a Bayesian version of the synthetic likelihood (SL), initially developed by Wood (2010), which uses a multivariate normal approximation to the summary statistic likelihood. Using the parametric approximation as opposed to the non-parametric approximation of ABC, it is possible to reduce the number of model simulations required. The second likelihood approximation method we consider in this chapter is based on the empirical likelihood (EL), which is a non-parametric technique and involves maximising a likelihood constructed empirically under a set of moment constraints. Mengersen et al. (2013) adapt the EL framework so that it can be used to form an approximate posterior for problems where ABC can be applied, that is, for models with intractable likelihoods. However, unlike ABC and the Bayesian SL (BSL), the Bayesian EL (BCel) approach can be used to completely avoid model simulations in some cases. The BSL and BCel methods are illustrated on models of varying complexity.
Submitted 18 March, 2018;
originally announced March 2018.
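The synthetic likelihood idea mentioned in this abstract, a multivariate normal approximation to the likelihood of the summary statistic, can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's implementation: the function `synthetic_loglik`, the simulator `simulate_summaries`, the toy Gaussian model and the choice of summaries (sample mean and variance) are all hypothetical.

```python
import numpy as np


def synthetic_loglik(theta, s_obs, simulate_summaries, n_sims=200, rng=None):
    """Gaussian (synthetic) log-likelihood of the observed summaries at theta.

    The mean and covariance of the summaries are estimated from n_sims
    model simulations at theta, then plugged into a normal log-density.
    """
    rng = np.random.default_rng() if rng is None else rng
    sims = np.array([simulate_summaries(theta, rng) for _ in range(n_sims)])
    mu = sims.mean(axis=0)
    cov = np.atleast_2d(np.cov(sims, rowvar=False))
    diff = np.asarray(s_obs, dtype=float) - mu
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (diff.size * np.log(2.0 * np.pi) + logdet + quad)


def simulate_summaries(theta, rng, n=50):
    """Toy simulator: n draws from Normal(theta, 1), summarised by mean and variance."""
    y = rng.normal(theta, 1.0, size=n)
    return np.array([y.mean(), y.var(ddof=1)])


rng = np.random.default_rng(0)
s_obs = np.array([1.0, 1.0])  # made-up observed summaries
ll_near = synthetic_loglik(1.0, s_obs, simulate_summaries, rng=rng)
ll_far = synthetic_loglik(3.0, s_obs, simulate_summaries, rng=rng)
print(ll_near, ll_far)
```

In a Bayesian version, this noisy log-likelihood estimate would replace the exact log-likelihood inside an MCMC acceptance ratio; that outer sampler is omitted here.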
-
Jeffreys priors for mixture estimation: properties and alternatives
Authors:
Clara Grazian,
Christian P. Robert
Abstract:
While Jeffreys priors usually are well-defined for the parameters of mixtures of distributions, they are not available in closed form. Furthermore, they often are improper priors. Hence, they have never been used to draw inference on the mixture parameters. The implementation and the properties of Jeffreys priors in several mixture settings are studied. It is shown that the associated posterior distributions most often are improper. Nevertheless, the Jeffreys prior for the mixture weights conditionally on the parameters of the mixture components will be shown to have the property of conservativeness with respect to the number of components in the case of overfitted mixtures, and it can therefore be used as a default prior in this context.
Submitted 12 December, 2017; v1 submitted 6 June, 2017;
originally announced June 2017.
-
Modelling Preference Data with the Wallenius Distribution
Authors:
Clara Grazian,
Fabrizio Leisen,
Brunero Liseo
Abstract:
The Wallenius distribution is a generalisation of the Hypergeometric distribution where weights are assigned to balls of different colours. This naturally defines a model for ranking categories which can be used for classification purposes. Since, in general, the resulting likelihood is not analytically available, we adopt an approximate Bayesian computational (ABC) approach for estimating the importance of the categories. We illustrate the performance of the estimation procedure on simulated datasets. Finally, we use the new model for analysing two datasets about movie ratings and Italian academic statisticians' journal preferences. The latter is a novel dataset collected by the authors.
Submitted 28 June, 2018; v1 submitted 27 January, 2017;
originally announced January 2017.
-
Jeffreys priors for mixture estimation
Authors:
Clara Grazian,
Christian Robert
Abstract:
While Jeffreys priors usually are well-defined for the parameters of mixtures of distributions, they are not available in closed form. Furthermore, they often are improper priors. Hence, they have never been used to draw inference on the mixture parameters. We study in this paper the implementation and the properties of Jeffreys priors in several mixture settings, show that the associated posterior distributions most often are improper, and then propose a noninformative alternative for the analysis of mixtures.
Submitted 20 December, 2015; v1 submitted 10 November, 2015;
originally announced November 2015.
-
Approximate Bayesian inference in semiparametric copula models
Authors:
Clara Grazian,
Brunero Liseo
Abstract:
We describe a simple method for making inference on a functional of a multivariate distribution. The method is based on a copula representation of the multivariate distribution and on the properties of an Approximate Bayesian Monte Carlo algorithm, where the proposed values of the functional of interest are weighted in terms of their empirical likelihood. This method is particularly useful when the "true" likelihood function associated with the working model is too costly to evaluate or when the working model is only partially specified.
Submitted 16 July, 2017; v1 submitted 10 March, 2015;
originally announced March 2015.
-
Accelerating Metropolis-Hastings algorithms by Delayed Acceptance
Authors:
Marco Banterle,
Clara Grazian,
Anthony Lee,
Christian P. Robert
Abstract:
MCMC algorithms such as Metropolis-Hastings algorithms are slowed down by the computation of complex target distributions as exemplified by huge datasets. We offer in this paper a useful generalisation of the Delayed Acceptance approach, devised to reduce the computational costs of such algorithms by a simple and universal divide-and-conquer strategy. The idea behind the generic acceleration is to divide the acceptance step into several parts, aiming at a major reduction in computing time that outranks the corresponding reduction in acceptance probability. Each of the components can be sequentially compared with a uniform variate, the first rejection signalling that the proposed value is considered no further. We moreover develop theoretical bounds for the variance of associated estimators with respect to the variance of the standard Metropolis-Hastings algorithm and detail some results on optimal scaling and general optimisation of the procedure. We illustrate those accelerating features on a series of examples.
Submitted 5 March, 2015; v1 submitted 3 March, 2015;
originally announced March 2015.
-
A discussion of "Bayesian model selection based on proper scoring rules" by A.P. Dawid and M. Musio
Authors:
Clara Grazian,
Ilaria Masiani,
Christian P. Robert
Abstract:
This note is a discussion of the article "Bayesian model selection based on proper scoring rules" by A.P. Dawid and M. Musio, to appear in Bayesian Analysis. While appreciating the concepts behind the use of proper scoring rules, including the inclusion of improper priors, we point out here some possible practical difficulties with the advocated approach.
Submitted 26 February, 2015;
originally announced February 2015.
-
Accelerating Metropolis-Hastings algorithms: Delayed acceptance with prefetching
Authors:
Marco Banterle,
Clara Grazian,
Christian P. Robert
Abstract:
MCMC algorithms such as Metropolis-Hastings algorithms are slowed down by the computation of complex target distributions as exemplified by huge datasets. We offer in this paper an approach to reduce the computational costs of such algorithms by a simple and universal divide-and-conquer strategy. The idea behind the generic acceleration is to divide the acceptance step into several parts, aiming at a major reduction in computing time that outranks the corresponding reduction in acceptance probability. The division decomposes the "prior x likelihood" term into a product such that some of its components are much cheaper to compute than others. Each of the components can be sequentially compared with a uniform variate, the first rejection signalling that the proposed value is considered no further. This approach can in turn be accelerated as part of a prefetching algorithm taking advantage of the parallel abilities of the computer at hand. We illustrate those accelerating features on a series of toy and realistic examples.
Submitted 10 June, 2014;
originally announced June 2014.
-
Approximate Integrated Likelihood via ABC methods
Authors:
Clara Grazian,
Brunero Liseo
Abstract:
We propose a novel use of a recent computational tool for Bayesian inference, namely the Approximate Bayesian Computation (ABC) methodology. ABC is a way to handle models for which the likelihood function may be intractable, unavailable, or too costly to evaluate; in particular, we consider the problem of eliminating the nuisance parameters from a complex statistical model in order to produce a likelihood function depending on the quantity of interest only. Given a proper prior for the entire parameter vector, we propose to approximate the integrated likelihood by the ratio of kernel estimators of the marginal posterior and prior for the quantity of interest. We present several examples.
Submitted 3 March, 2014;
originally announced March 2014.