-
Algorithmic Bayesian Group Gibbs Selection
Authors:
Alan Lenarcic,
William Valdar
Abstract:
Bayesian model selection, with precedents in George and McCulloch (1993) and Abramovich et al. (1998), supports credibility measures that relate model uncertainty, but computation can be costly when sparse priors are approximate. We design an exact selection engine suitable for Gaussian noise, t-distributed noise, and logistic learning, benefiting from data structures derived from coordinate descent lasso. Gibbs sampler chains are stored in a compressed binary format compatible with Equi-Energy (Kou et al., 2006) tempering. We achieve a grouped-effects selection model, similar to the setting for group lasso, to determine co-entry of coefficients into the model. We derive a functional integrand for group inclusion, and introduce a new MCMC switching step to avoid numerical integration. Theorems show this step has exponential convergence to the target distribution. We demonstrate a role for group selection to inform on genetic decomposition in a diallel experiment, and identify potential quantitative trait loci in p > 40K Heterogeneous Stock haplotype/phenotype studies.
Submitted 10 March, 2023; v1 submitted 9 January, 2019;
originally announced January 2019.
-
Bayesian Manifold-Constrained-Prior Model for an Experiment to Locate Xce
Authors:
Alan B. Lenarcic,
John D. Calaway,
Fernando Pardo-Manuel de Villena,
William Valdar
Abstract:
We propose an analysis for a novel experiment intended to locate the genetic locus Xce (X-chromosome controlling element), which biases the stochastic process of X-inactivation in the mouse. X-inactivation bias is a phenomenon where cells in the embryo randomly choose one parental chromosome to inactivate, but show an average bias towards one parental strain. Allele-specific gene expression was measured by pyrosequencing in mouse crosses of an uncharacterized parent with known carriers. Our Bayesian analysis is suitable for this adaptive experimental design, accounting for the biases and differences in precision among genes. Model identifiability is facilitated by priors constrained to a manifold. We show that reparameterized slice-sampling can suitably tackle a general class of constrained priors. We demonstrate a physical model, based upon a "weighted-coin" hypothesis, that predicts X-inactivation ratios in untested crosses. This model suggests that Xce alleles differ due to a process known as copy number variation, where stronger Xce alleles are shorter sequences.
Submitted 20 December, 2018;
originally announced December 2018.
-
Permutation tests of non-exchangeable null models
Authors:
Jeffrey Roach,
William Valdar
Abstract:
Generalizations to the permutation test are introduced to allow for situations in which the null model is not exchangeable. It is shown that the generalized permutation tests are exact, and a partial converse is established: any test function that is exact on all probability densities coincides with a generalized permutation test on a particular region. A most powerful generalized permutation test is derived in closed form. Approximations to the most powerful generalized permutation test are proposed to reduce the computational burden required to compute the complete test. In particular, an explicit form for the approximate test is derived in terms of a multinomial Bernstein polynomial approximation, and its convergence to the most powerful generalized permutation test is demonstrated. In the case where the determination of p-values is of greater interest than testing of hypotheses, two approaches to estimation of significance are analyzed. Bounds on the deviation from significance of the exact most powerful test are given in terms of sample size. For both estimators, as sample size approaches infinity, the estimator converges to the significance of the most powerful generalized permutation test under mild conditions. Applications of generalized permutation testing to linear mixed models are provided.
Submitted 30 August, 2018;
originally announced August 2018.
-
Joint Estimation of Multiple Dependent Gaussian Graphical Models with Applications to Mouse Genomics
Authors:
Yuying Xie,
Yufeng Liu,
William Valdar
Abstract:
Gaussian graphical models are widely used to represent conditional dependence among random variables. In this paper, we propose a novel estimator for data arising from a group of Gaussian graphical models that are themselves dependent. A motivating example is that of modeling gene expression collected on multiple tissues from the same individual: here the multivariate outcome is affected by dependencies acting not only at the level of the specific tissues, but also at the level of the whole body; existing methods that assume independence among graphs are not applicable in this case. To estimate multiple dependent graphs, we decompose the problem into two graphical layers: the systemic layer, which affects all outcomes and thereby induces cross-graph dependence, and the category-specific layer, which represents graph-specific variation. We propose a graphical EM technique that estimates both layers jointly, establish estimation consistency and selection sparsistency of the proposed estimator, and confirm by simulation that the EM method is superior to a simple one-step method. We apply our technique to mouse genomics data and obtain biologically plausible results.
Submitted 30 August, 2016;
originally announced August 2016.
-
A Permutation Approach for Selecting the Penalty Parameter in Penalized Model Selection
Authors:
Jeremy Sabourin,
William Valdar,
Andrew Nobel
Abstract:
We describe a simple, efficient, permutation-based procedure for selecting the penalty parameter in the LASSO. The procedure, which is intended for applications where variable selection is the primary focus, can be applied in a variety of structural settings, including generalized linear models. We briefly discuss connections between permutation selection and existing theory for the LASSO. In addition, we present a simulation study and an analysis of three real data sets in which permutation selection is compared with cross-validation (CV), the Bayesian information criterion (BIC), and a selection method based on recently developed testing procedures for the LASSO.
Submitted 8 April, 2014;
originally announced April 2014.