-
A Standardization Procedure to Incorporate Variance Partitioning Based Priors in Latent Gaussian Models
Authors:
Luisa Ferrari,
Massimo Ventrucci
Abstract:
Latent Gaussian Models (LGMs) are a subset of Bayesian Hierarchical models where Gaussian priors, conditional on variance parameters, are assigned to all effects in the model. LGMs are employed in many fields for their flexibility and computational efficiency. However, practitioners find prior elicitation on the variance parameters challenging because of a lack of intuitive interpretation for them. Recently, several papers have tackled this issue by rethinking the model in terms of variance partitioning (VP) and assigning priors to parameters reflecting the relative contribution of each effect to the total variance. So far, the class of priors based on VP has been mainly deployed for random effects and fixed effects separately. This work presents a novel standardization procedure that expands the applicability of VP priors to a broader class of LGMs, including both fixed and random effects. We describe the steps required for standardization through various examples, with a particular focus on the popular class of intrinsic Gaussian Markov random fields (IGMRFs). The practical advantages of standardization are demonstrated with simulated data and a real dataset on survival analysis.
Submitted 27 January, 2025;
originally announced January 2025.
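The variance partitioning idea described in the abstract above can be illustrated with a minimal sketch. It assumes a total variance `total_var` and a vector of proportions `weights` that sum to one; these names and both helper functions are hypothetical and are not taken from the paper, whose standardization procedure is what makes such proportions interpretable across fixed and random effects.

```python
def partition_variance(total_var, weights):
    """Map a total variance and proportions to per-effect variances."""
    if total_var <= 0:
        raise ValueError("total_var must be positive")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")
    return [total_var * w for w in weights]


def to_partition(variances):
    """Inverse map: per-effect variances to (total variance, proportions)."""
    total = sum(variances)
    return total, [v / total for v in variances]


# Illustrative values only: three effects sharing a total variance of 4.
variances = partition_variance(4.0, [0.5, 0.25, 0.25])
total, weights = to_partition(variances)
```

Under this parametrization a prior can be placed on `weights` (the relative contribution of each effect) and on `total_var` separately, rather than on each variance in isolation.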
-
Informed Bayesian Finite Mixture Models via Asymmetric Dirichlet Priors
Authors:
Garritt L. Page,
Massimo Ventrucci,
Maria Franco-Villoria
Abstract:
Finite mixture models are flexible methods that are commonly used for model-based clustering. A recent focus in the model-based clustering literature is to highlight the difference between the number of components in a mixture model and the number of clusters. The number of clusters is more relevant from a practical standpoint, but to date, the focus of prior distribution formulation has been on the number of components. In light of this, we develop a finite mixture methodology that permits eliciting prior information directly on the number of clusters in an intuitive way. This is done by employing an asymmetric Dirichlet distribution as a prior on the weights of a finite mixture. Further, a penalized complexity motivated prior is employed for the Dirichlet shape parameter. We illustrate the ease with which prior information can be elicited via our construction and the flexibility of the resulting induced prior on the number of clusters. We also demonstrate the utility of our approach using numerical experiments and two real-world data sets.
Submitted 1 August, 2023;
originally announced August 2023.
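The distinction between components and clusters in the abstract above can be explored with a small Monte Carlo sketch. It assumes that a "cluster" is a mixture component with at least one of `n` units allocated to it; the function names and the particular Dirichlet parameter vectors are hypothetical choices for illustration, not the parametrization used in the paper.

```python
import random
from collections import Counter


def sample_num_clusters(alpha, n, rng):
    """Draw weights ~ Dirichlet(alpha), allocate n units, count occupied components."""
    # A Dirichlet draw is a vector of independent gamma draws, normalised.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    weights = [g / total for g in gammas]
    labels = rng.choices(range(len(alpha)), weights=weights, k=n)
    return len(set(labels))


def induced_cluster_prior(alpha, n, draws=2000, seed=1):
    """Monte Carlo estimate of the induced prior on the number of occupied components."""
    rng = random.Random(seed)
    counts = Counter(sample_num_clusters(alpha, n, rng) for _ in range(draws))
    return {k: counts[k] / draws for k in sorted(counts)}


def prior_mean(prior):
    """Expected number of clusters under an estimated prior."""
    return sum(k * p for k, p in prior.items())


# Five components in both cases; only the Dirichlet shape vector differs.
symmetric = induced_cluster_prior([5.0] * 5, n=100)
asymmetric = induced_cluster_prior([5.0, 0.01, 0.01, 0.01, 0.01], n=100)
```

Comparing `prior_mean(symmetric)` with `prior_mean(asymmetric)` shows how an asymmetric shape vector can concentrate prior mass on fewer clusters while the number of components stays fixed.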
-
A comparison of priors for variance parameters in Bayesian basket trials
Authors:
Massimo Ventrucci,
Alessandro Vagheggini
Abstract:
Phase II basket trials are popular tools to evaluate the efficacy of a new treatment targeting a genetic alteration common to a set of different cancer histologies. Efficient designs are obtained by pooling data from the different arms (e.g., cancer histologies) via Bayesian hierarchical modelling, with a variance parameter controlling the strength of shrinkage of each arm treatment effect to the overall treatment effect. One critical aspect of this approach is that prior choice on the variance plays a major role in determining the strength of shrinkage and impacts the operating characteristics of the design. We review the priors most commonly adopted in previous works and compare them with the recently introduced penalized complexity (PC) priors. Our simulation study shows comparable behaviour for the PC prior and the gold-standard half-t prior, with the former performing better in the homogeneous scenario where all histologies respond similarly to the treatment. We argue that PC priors offer advantages over other priors because they allow the user to control the degree of shrinkage by means of only one parameter and can be elicited based on clinical opinion when available.
Submitted 29 October, 2022;
originally announced October 2022.
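The single-parameter elicitation mentioned in the abstract above can be sketched as follows. The sketch assumes the standard PC prior for the standard deviation of a Gaussian random effect, which is an exponential distribution on the standard deviation whose rate is set by a tail statement P(sigma > upper) = tail_prob. The function names and the numerical values are illustrative assumptions, not the settings used in the paper.

```python
import math


def pc_prior_rate(upper, tail_prob):
    """Rate of the exponential prior on sigma such that P(sigma > upper) = tail_prob."""
    if upper <= 0 or not 0 < tail_prob < 1:
        raise ValueError("need upper > 0 and 0 < tail_prob < 1")
    return -math.log(tail_prob) / upper


def pc_prior_density(sigma, rate):
    """Exponential density on the standard deviation, with its mode at sigma = 0."""
    return rate * math.exp(-rate * sigma) if sigma >= 0 else 0.0


def pc_prior_tail(sigma, rate):
    """Prior probability that the standard deviation exceeds sigma."""
    return math.exp(-rate * sigma)


# Illustrative elicitation: a standard deviation above 1 is judged unlikely (5%).
rate = pc_prior_rate(upper=1.0, tail_prob=0.05)
```

Because the density peaks at zero, the prior shrinks towards complete pooling of the arms, and a larger `rate` strengthens that shrinkage.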
-
Variance partitioning in spatio-temporal disease mapping models
Authors:
Maria Franco-Villoria,
Massimo Ventrucci,
Håvard Rue
Abstract:
Bayesian disease mapping, although undeniably useful for describing variation in risk over time and space, comes with the hurdle of prior elicitation on hard-to-interpret random effect precision parameters. We introduce a reparametrized version of the popular spatio-temporal interaction models, based on Kronecker product intrinsic Gaussian Markov Random Fields, that we name the variance partitioning (VP) model. The VP model includes a mixing parameter that balances the contribution of the main and interaction effects to the total (generalized) variance and enhances interpretability. The use of a penalized complexity prior on the mixing parameter aids in coding prior information in an intuitive way. We illustrate the advantages of the VP model using two case studies.
Submitted 17 May, 2022; v1 submitted 27 September, 2021;
originally announced September 2021.
-
A spectral adjustment for spatial confounding
Authors:
Yawen Guan,
Garritt L. Page,
Brian J Reich,
Massimo Ventrucci,
Shu Yang
Abstract:
Adjusting for an unmeasured confounder is generally an intractable problem, but in the spatial setting it may be possible under certain conditions. In this paper, we derive necessary conditions on the coherence between the treatment variable of interest and the unmeasured confounder that ensure the causal effect of the treatment is estimable. We specify our model and assumptions in the spectral domain to allow for different degrees of confounding at different spatial resolutions. The key assumption that ensures identifiability is that confounding present at global scales dissipates at local scales. We show that this assumption in the spectral domain is equivalent to adjusting for global-scale confounding in the spatial domain by adding a spatially smoothed version of the treatment variable to the mean of the response variable. Within this general framework, we propose a sequence of confounder adjustment methods that range from parametric adjustments based on the Matérn coherence function to more robust semi-parametric methods that use smoothing splines. These ideas are applied to areal and geostatistical data for both simulated and real datasets.
Submitted 21 December, 2020;
originally announced December 2020.
-
PC priors for residual correlation parameters in one-factor mixed models
Authors:
Massimo Ventrucci,
Daniela Cocchi,
Gemma Burgazzi,
Alex Laini
Abstract:
Lack of independence in the residuals from linear regression motivates the use of random effect models in many applied fields. We start from the one-way ANOVA model and extend it to a general class of one-factor Bayesian mixed models, discussing several correlation structures for the within-group residuals. All the considered group models are parametrized in terms of a single correlation (hyper-)parameter, controlling the shrinkage towards the case of independent residuals (iid). We derive a penalized complexity (PC) prior for the correlation parameter of a generic group model. This prior has desirable properties from a practical point of view: i) it ensures appropriate shrinkage to the iid case; ii) it depends on a scaling parameter whose choice only requires a prior guess on the proportion of total variance explained by the grouping factor; iii) it is defined on a distance scale common to all group models, thus the scaling parameter can be chosen in the same manner regardless of the adopted group model. We show the benefit of using these PC priors in a case study in community ecology where different group models are compared.
Submitted 3 December, 2019; v1 submitted 23 February, 2019;
originally announced February 2019.
-
A unified view on Bayesian varying coefficient models
Authors:
Maria Franco-Villoria,
Massimo Ventrucci,
Håvard Rue
Abstract:
Varying coefficient models are useful in applications where the effect of the covariate might depend on some other covariate such as time or location. Various applications of these models often give rise to case-specific prior distributions for the parameter(s) describing how much the coefficients vary. In this work, we introduce a unified view of varying coefficient models, arguing for a way of specifying these prior distributions that is coherent across various applications, avoids overfitting and has a coherent interpretation. We do this by considering varying coefficient models as a flexible extension of the natural simpler model and capitalising on the recently proposed framework of penalized complexity (PC) priors. We illustrate our approach in two spatial examples where varying coefficient models are relevant.
Submitted 4 December, 2019; v1 submitted 6 June, 2018;
originally announced June 2018.
-
P-spline smoothing for spatial data collected worldwide
Authors:
Fedele Greco,
Massimo Ventrucci,
Elisa Castelli
Abstract:
Spatial data collected worldwide at a huge number of locations are frequently used in environmental and climate studies. Spatial modelling for this type of data presents both methodological and computational challenges. In this work we illustrate a computationally efficient nonparametric framework to model and estimate the spatial field while accounting for geodesic distances between locations. The spatial field is modelled via penalized splines (P-splines) using intrinsic Gaussian Markov Random Field (GMRF) priors for the spline coefficients. The key idea is to use the sphere as a surrogate for the globe, then build the basis of B-spline functions on a geodesic grid system. The basis matrix is sparse and so is the precision matrix of the GMRF prior, thus computational efficiency is gained by construction. We illustrate the approach on a real climate study, where the goal is to identify the Intertropical Convergence Zone using high-resolution remote sensing data.
Submitted 15 November, 2017;
originally announced November 2017.
-
A note on intrinsic Conditional Autoregressive models for disconnected graphs
Authors:
Anna Freni-Sterrantino,
Massimo Ventrucci,
Håvard Rue
Abstract:
In this note we discuss (Gaussian) intrinsic conditional autoregressive (CAR) models for disconnected graphs, with the aim of providing practical guidelines for how these models should be defined, scaled and implemented. We show how these suggestions can be implemented in two examples on disease mapping.
Submitted 13 May, 2017;
originally announced May 2017.
-
Penalized complexity priors for degrees of freedom in Bayesian P-splines
Authors:
Massimo Ventrucci,
Håvard Rue
Abstract:
Bayesian P-splines assume an intrinsic Gaussian Markov random field prior on the spline coefficients, conditional on a precision hyper-parameter $τ$. Prior elicitation of $τ$ is difficult. To overcome this issue we aim to build priors on an interpretable property of the model, indicating the complexity of the smooth function to be estimated. Following this idea, we propose Penalized Complexity (PC) priors for the number of effective degrees of freedom. We present the general ideas behind the construction of these new PC priors, describe their properties and show how to implement them in P-splines for Gaussian data.
Submitted 27 April, 2016; v1 submitted 18 November, 2015;
originally announced November 2015.
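The intrinsic Gaussian Markov random field prior mentioned in the abstract above can be made concrete with a short sketch. It assumes a second-order random-walk penalty on the spline coefficients, so that the prior precision is $τ$ times a structure matrix K = DᵀD built from second differences. The abstract does not state the penalty order, so `order=2` and all function names here are assumptions made for illustration.

```python
def difference_matrix(n, order):
    """Rows are order-th differences of a length-n coefficient vector."""
    rows = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(order):
        # Each pass replaces consecutive rows by their difference.
        rows = [[b - a for a, b in zip(r0, r1)] for r0, r1 in zip(rows, rows[1:])]
    return rows


def structure_matrix(n, order=2):
    """K = D^T D, the rank-deficient structure matrix of a random-walk prior."""
    d = difference_matrix(n, order)
    return [[sum(row[i] * row[j] for row in d) for j in range(n)] for i in range(n)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Six spline coefficients, second-order penalty (illustrative sizes).
K = structure_matrix(6, order=2)
constant = [1.0] * 6
linear = [float(i) for i in range(6)]
```

Both `matvec(K, constant)` and `matvec(K, linear)` are zero vectors, which is what makes the prior intrinsic: constant and linear trends are left unpenalized, and $τ$ only controls deviations from them.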