-
Review and Demonstration of a Mixture Representation for Simulation from Densities Involving Sums of Powers
Authors:
Maryclare Griffin
Abstract:
Penalized and robust regression, especially when approached from a Bayesian perspective, can involve the problem of simulating a random variable $\boldsymbol z$ from a posterior distribution that includes a term proportional to a sum of powers, $\|\boldsymbol z \|^q_q$, on the log scale. However, many popular gradient-based methods for Markov Chain Monte Carlo simulation from such posterior distributions use Hamiltonian Monte Carlo and accordingly require conditions on the differentiability of the unnormalized posterior distribution that do not hold when $q \leq 1$ (Plummer, 2023). This is limiting; the setting where $q \leq 1$ includes widely used sparsity-inducing penalized regression models and heavy-tailed robust regression models. In the special case where $q = 1$, a latent variable representation that facilitates simulation from such a posterior distribution is well known. However, the setting where $q < 1$ has not been treated as thoroughly. In this note, we review the availability of a latent variable representation described in Devroye (2009), show how it can be used to simulate from such posterior distributions when $0 < q < 2$, and demonstrate its utility in the context of estimating the parameters of a Bayesian penalized regression model.
Submitted 2 August, 2024;
originally announced August 2024.
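As a rough illustration of the $q = 1$ special case mentioned in the abstract above, the sketch below uses the standard normal scale-mixture construction of the Laplace distribution: if $s$ is exponential with rate $\lambda^2/2$ and $z \mid s$ is normal with mean 0 and variance $s$, then $z$ has density proportional to $\exp(-\lambda |z|)$. It is an assumption that this is the representation the abstract refers to, and it is not the Devroye (2009) representation for general $q$. The function name `sample_laplace_via_mixture` is hypothetical.

```python
import math
import random


def sample_laplace_via_mixture(lam, n, rng):
    """Draw n values with density (lam / 2) * exp(-lam * |z|).

    Uses a latent variance s ~ Exponential(rate = lam**2 / 2) and then
    z | s ~ Normal(0, variance = s). Illustrative helper, not from the paper.
    """
    draws = []
    for _ in range(n):
        s = rng.expovariate(lam ** 2 / 2.0)         # latent variance
        draws.append(rng.gauss(0.0, math.sqrt(s)))  # conditionally normal draw
    return draws


rng = random.Random(0)
z = sample_laplace_via_mixture(lam=2.0, n=100_000, rng=rng)
# For a Laplace variable with rate lam, E|z| = 1 / lam.
mean_abs = sum(abs(v) for v in z) / len(z)
```

Conditional on the latent variances, the term $\|\boldsymbol z\|_1$ is replaced by a quadratic form, which is what makes Gibbs-style updates convenient in this case.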
-
A Simple Approach for Local and Global Variable Importance in Nonlinear Regression Models
Authors:
Emily T. Winn-Nuñez,
Maryclare Griffin,
Lorin Crawford
Abstract:
The ability to interpret machine learning models has become increasingly important as their usage in data science continues to rise. Most current interpretability methods are optimized to work on either (\textit{i}) a global scale, where the goal is to rank features based on their contributions to overall variation in an observed population, or (\textit{ii}) the local level, which aims to detail how important a feature is to a particular individual in the data set. In this work, a new operator is proposed called the "GlObal And Local Score" (GOALS): a simple \textit{post hoc} approach to simultaneously assess local and global feature variable importance in nonlinear models. Motivated by problems in biomedicine, the approach is demonstrated using Gaussian process regression where the task of understanding how genetic markers are associated with disease progression both within individuals and across populations is of high interest. Detailed simulations and real data analyses illustrate the flexible and efficient utility of GOALS over state-of-the-art variable importance strategies.
Submitted 10 August, 2023; v1 submitted 3 February, 2023;
originally announced February 2023.
-
Improved Pathwise Coordinate Descent for Power Penalties
Authors:
Maryclare Griffin
Abstract:
Pathwise coordinate descent algorithms have been used to compute entire solution paths for lasso and other penalized regression problems quickly with great success. They improve upon cold start algorithms by solving the problems that make up the solution path sequentially for an ordered set of tuning parameter values, instead of solving each problem separately. However, extending pathwise coordinate descent algorithms to the more general bridge or power family of $\ell_q$ penalties is challenging. Faster algorithms for computing solution paths for these penalties are needed because $\ell_q$ penalized regression problems can be nonconvex and especially burdensome to solve. In this paper, we show that a reparameterization of $\ell_q$ penalized regression problems is more amenable to pathwise coordinate descent algorithms. This allows us to improve computation of the mode-thresholding function for $\ell_q$ penalized regression problems in practice and introduce two separate pathwise algorithms. We show that either pathwise algorithm is faster than the corresponding cold-start alternative, and demonstrate that different pathwise algorithms may be more likely to reach better solutions.
Submitted 14 August, 2023; v1 submitted 4 March, 2022;
originally announced March 2022.
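The pathwise (warm start) idea described in the abstract above can be sketched for the ordinary lasso, the $q = 1$ case, where the thresholding function is the familiar soft-thresholding operator. This is a minimal sketch under that assumption and not the paper's reparameterized $\ell_q$ algorithm; the objective $\frac{1}{2n}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1$, the function names, and the toy data are all illustrative choices.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator used in lasso coordinate updates."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(X, y, lam, beta_init, n_sweeps=100, tol=1e-10):
    """Coordinate descent for (1/(2n))||y - X b||^2 + lam * ||b||_1.

    X is a list of rows; every column is assumed to be nonzero.
    """
    n, p = len(X), len(X[0])
    beta = list(beta_init)
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            col_ss = sum(X[i][j] ** 2 for i in range(n)) / n
            # Correlation of column j with the partial residual (b_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss * beta[j]
            new_bj = soft_threshold(rho, lam) / col_ss
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


def lasso_path(X, y, lambdas):
    """Solve from the largest lam to the smallest, warm starting each fit."""
    beta = [0.0] * len(X[0])
    path = []
    for lam in sorted(lambdas, reverse=True):
        beta = lasso_cd(X, y, lam, beta)  # previous solution is the warm start
        path.append((lam, list(beta)))
    return path


# Toy data with orthogonal columns and y = 2 * (first column).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
y = [2.0, 0.0, 2.0, 2.0]
path = lasso_path(X, y, [0.0, 0.75, 1.5])
```

A cold start alternative would call `lasso_cd` with a zero vector for every value of `lam`; the warm start typically needs fewer sweeps because neighboring solutions along the path are close.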
-
Log-Gaussian Cox Process Modeling of Large Spatial Lightning Data using Spectral and Laplace Approximations
Authors:
Megan L. Gelsinger,
Maryclare Griffin,
David S. Matteson,
Joseph Guinness
Abstract:
Lightning is a destructive and highly visible product of severe storms, yet there is still much to be learned about the conditions under which lightning is most likely to occur. The GOES-16 and GOES-17 satellites, launched in 2016 and 2018 by NOAA and NASA, collect a wealth of data regarding individual lightning strike occurrence and potentially related atmospheric variables. The acute nature and inherent spatial correlation in lightning data render standard regression analyses inappropriate. Further, computational considerations are foregrounded by the desire to analyze the immense and rapidly increasing volume of lightning data. We present a new computationally feasible method that combines spectral and Laplace approximations in an EM algorithm, denoted SLEM, to fit the widely popular log-Gaussian Cox process model to large spatial point pattern datasets. In simulations, we find SLEM is competitive with contemporary techniques in terms of speed and accuracy. When applied to two lightning datasets, SLEM provides better out-of-sample prediction scores and quicker runtimes, suggesting its particular usefulness for analyzing lightning data, which tend to have sparse signals.
Submitted 30 November, 2021;
originally announced November 2021.
-
Likelihood Inference for Possibly Non-Stationary Processes via Adaptive Overdifferencing
Authors:
Maryclare Griffin,
Gennady Samorodnitsky,
David S. Matteson
Abstract:
We make an observation that facilitates exact likelihood-based inference for the parameters of the popular ARFIMA model without requiring stationarity by allowing the upper bound $\bar{d}$ for the memory parameter $d$ to exceed $0.5$: estimating the parameters of a single non-stationary ARFIMA model is equivalent to estimating the parameters of a sequence of stationary ARFIMA models. This allows for the use of existing methods for evaluating the likelihood for an invertible and stationary ARFIMA model. This enables improved inference because many standard methods perform poorly when estimates are close to the boundary of the parameter space. It also allows us to leverage the wealth of likelihood approximations that have been introduced for estimating the parameters of a stationary process. We explore how estimation of the memory parameter $d$ depends on the upper bound $\bar{d}$ and introduce adaptive procedures for choosing $\bar{d}$. We show via simulation how our adaptive procedures estimate the memory parameter well, relative to existing alternatives, when the true value is as large as 2.5.
Submitted 9 January, 2025; v1 submitted 8 November, 2020;
originally announced November 2020.
-
Modeling a Nonlinear Biophysical Trend Followed by Long-Memory Equilibrium with Unknown Change Point
Authors:
Wenyu Zhang,
Maryclare Griffin,
David S. Matteson
Abstract:
Measurements of many biological processes are characterized by an initial trend period followed by an equilibrium period. Scientists may wish to quantify features of the two periods, as well as the timing of the change point. Specifically, we are motivated by problems in the study of electrical cell-substrate impedance sensing (ECIS) data. ECIS is a popular new technology which measures cell behavior non-invasively. Previous studies using ECIS data have found that different cell types can be classified by their equilibrium behavior. However, it can be challenging to identify when equilibrium has been reached, and to quantify the relevant features of cells' equilibrium behavior. In this paper, we assume that measurements during the trend period are independent deviations from a smooth nonlinear function of time, and that measurements during the equilibrium period are characterized by a simple long memory model. We propose a method to simultaneously estimate the parameters of the trend and equilibrium processes and locate the change point between the two. We find that this method performs well in simulations and in practice. When applied to ECIS data, it produces estimates of change points and measures of cell equilibrium behavior which offer improved classification of infected and uninfected cells.
Submitted 19 September, 2020; v1 submitted 18 July, 2020;
originally announced July 2020.
-
Structured Shrinkage Priors
Authors:
Maryclare Griffin,
Peter D. Hoff
Abstract:
In many regression settings the unknown coefficients may have some known structure, for instance they may be ordered in space or correspond to a vectorized matrix or tensor. At the same time, the unknown coefficients may be sparse, with many nearly or exactly equal to zero. However, many commonly used priors and corresponding penalties for coefficients do not encourage simultaneously structured and sparse estimates. In this paper we develop structured shrinkage priors that generalize multivariate normal, Laplace, exponential power and normal-gamma priors. These priors allow the regression coefficients to be correlated a priori without sacrificing elementwise sparsity or shrinkage. The primary challenges in working with these structured shrinkage priors are computational, as the corresponding penalties are intractable integrals and the full conditional distributions that are needed to approximate the posterior mode or simulate from the posterior distribution may be non-standard. We overcome these issues using a flexible elliptical slice sampling procedure, and demonstrate that these priors can be used to introduce structure while preserving sparsity.
Submitted 26 April, 2023; v1 submitted 13 February, 2019;
originally announced February 2019.
-
Testing Sparsity-Inducing Penalties
Authors:
Maryclare Griffin,
Peter D. Hoff
Abstract:
Many penalized maximum likelihood estimators correspond to posterior mode estimators under specific prior distributions. Appropriateness of a particular class of penalty functions can therefore be interpreted as the appropriateness of a prior for the parameters. For example, the appropriateness of a lasso penalty for regression coefficients depends on the extent to which the empirical distribution of the regression coefficients resembles a Laplace distribution. We give a procedure for testing whether or not a Laplace prior is appropriate and, accordingly, whether or not using a lasso penalized estimate is appropriate. This testing procedure is designed to have power against exponential power priors which correspond to $\ell_q$ penalties. Via simulations, we show that this testing procedure achieves the desired level and has enough power to detect violations of the Laplace assumption when the numbers of observations and unknown regression coefficients are large. We then introduce an adaptive procedure that chooses a more appropriate prior and corresponding penalty from the class of exponential power priors when the null hypothesis is rejected. We show that this can improve estimation of the regression coefficients both when they are drawn from an exponential power distribution and when they are drawn from a spike-and-slab distribution.
Submitted 8 September, 2018; v1 submitted 17 December, 2017;
originally announced December 2017.
-
Lasso ANOVA Decompositions for Matrix and Tensor Data
Authors:
Maryclare Griffin,
Peter D. Hoff
Abstract:
Consider the problem of estimating the entries of an unknown mean matrix or tensor given a single noisy realization. In the matrix case, this problem can be addressed by decomposing the mean matrix into a component that is additive in the rows and columns, i.e.\ the additive ANOVA decomposition of the mean matrix, plus a matrix of elementwise effects, and assuming that the elementwise effects may be sparse. Accordingly, the mean matrix can be estimated by solving a penalized regression problem, applying a lasso penalty to the elementwise effects. Although solving this penalized regression problem is straightforward, specifying appropriate values of the penalty parameters is not. Leveraging the posterior mode interpretation of the penalized regression problem, moment-based empirical Bayes estimators of the penalty parameters can be defined. Estimation of the mean matrix using these moment-based empirical Bayes estimators can be called LANOVA penalization, and the corresponding estimate of the mean matrix can be called the LANOVA estimate. The empirical Bayes estimators are shown to be consistent. Additionally, LANOVA penalization is extended to accommodate sparsity of row and column effects and to estimate an unknown mean tensor. The behavior of the LANOVA estimate is examined under misspecification of the distribution of the elementwise effects, and LANOVA penalization is applied to several datasets, including a matrix of microarray data, a three-way tensor of fMRI data and a three-way tensor of wheat infection data.
Submitted 8 February, 2019; v1 submitted 24 March, 2017;
originally announced March 2017.