Search | arXiv e-print repository

The Lasso Distribution: Properties, Sampling Methods, and Applications in Bayesian Lasso Regression

Authors: Mohammad Javad Davoudabadi, Jonathon Tidswell, Samuel Muller, Garth Tarr, John T. Ormerod

Abstract: In this paper, we introduce a new probability distribution, the Lasso distribution. We derive several fundamental properties of the distribution, including closed-form expressions for its moments and moment-generating function. Additionally, we present an efficient and numerically stable algorithm for generating random samples from the distribution, facilitating its use in both theoretical and applied settings. We establish that the Lasso distribution belongs to the exponential family. A direct application of the Lasso distribution arises in the context of an existing Gibbs sampler, where the full conditional distribution of each regression coefficient follows this distribution. This leads to a more computationally efficient and theoretically grounded sampling scheme. To facilitate the adoption of our methodology, we provide an R package implementing the proposed methods. Our findings offer new insights into the probabilistic structure underlying the Lasso penalty and provide practical improvements in Bayesian inference for high-dimensional regression problems.

Submitted 12 June, 2025; v1 submitted 8 June, 2025; originally announced June 2025.

Comments: 15 pages, 2 figures

arXiv:2409.14646

Scalable Expectation Propagation for Mixed-Effects Regression

Authors: Jackson Zhou, John T. Ormerod, Clara Grazian

Abstract: Mixed-effects regression models represent a useful subclass of regression models for grouped data; the introduction of random effects allows for the correlation between observations within each group to be conveniently captured when inferring the fixed effects. At a time when such regression models are being fit to increasingly large datasets with many groups, it is ideal if (a) the time it takes to make the inferences scales linearly with the number of groups and (b) the inference workload can be distributed across multiple computational nodes in a numerically stable way, if the dataset cannot be stored in one location. Current Bayesian inference approaches for mixed-effects regression models do not seem to account for both challenges simultaneously. To address this, we develop an expectation propagation (EP) framework in this setting that is both scalable and numerically stable when distributed for the case where there is only one grouping factor. The main technical innovations lie in the sparse reparameterisation of the EP algorithm, and a moment propagation (MP) based refinement for multivariate random effect factor approximations. Experiments are conducted to show that this EP framework achieves linear scaling, while having comparable accuracy to other scalable approximate Bayesian inference (ABI) approaches.

Submitted 24 September, 2024; v1 submitted 22 September, 2024; originally announced September 2024.

Comments: Changed author name from John Ormerod to John T. Ormerod to help with citations

arXiv:1812.06605

Variational Discriminant Analysis with Variable Selection

Authors: Weichang Yu, John T. Ormerod, Michael Stewart

Abstract: A fast Bayesian method that seamlessly fuses classification and hypothesis testing via discriminant analysis is developed. Building upon the original discriminant analysis classifier, modelling components are added to identify discriminative variables. A combination of cake priors and a novel form of variational Bayes we call reverse collapsed variational Bayes gives rise to variable selection that can be directly posed as a multiple hypothesis testing approach using likelihood ratio statistics. Some theoretical arguments are presented showing that Chernoff-consistency (asymptotically zero type I and type II error) is maintained across all hypotheses. We apply our method to some publicly available genomics datasets and show that our method performs well in practice for its computational cost. An R package, VaDA, has also been made available on GitHub.

Submitted 27 August, 2019; v1 submitted 16 December, 2018; originally announced December 2018.

Comments: Under Review

arXiv:1812.03648

Variational Nonparametric Discriminant Analysis

Authors: Weichang Yu, Lamiae Azizi, John T. Ormerod

Abstract: Variable selection and classification are common objectives in the analysis of high-dimensional data. Most such methods make distributional assumptions that may not be compatible with the diverse families of distributions data can take. A novel Bayesian nonparametric discriminant analysis model that performs both variable selection and classification within a seamless framework is proposed. Pólya tree priors are assigned to the unknown group-conditional distributions to account for their uncertainty, and allow prior beliefs about the distributions to be incorporated simply as hyperparameters. The adoption of collapsed variational Bayes inference in combination with a chain of functional approximations led to an algorithm with low computational cost. The resultant decision rules carry heuristic interpretations and are related to an existing two-sample Bayesian nonparametric hypothesis test. In an application to some simulated and publicly available real datasets, the proposed method exhibits good performance when compared to current state-of-the-art approaches.

Submitted 27 August, 2019; v1 submitted 10 December, 2018; originally announced December 2018.

Journal ref: Computational Statistics and Data Analysis, 142 (106817) (2020)

arXiv:1807.01422

Diagonal Discriminant Analysis with Feature Selection for High Dimensional Data

Authors: Sarah Elizabeth Romanes, John Thomas Ormerod, Jean YH Yang

Abstract: We introduce a new method of performing high dimensional discriminant analysis, which we call multiDA. We achieve this by constructing a hybrid model that seamlessly integrates a multiclass diagonal discriminant analysis model and feature selection components. Our feature selection component naturally simplifies to weights which are simple functions of likelihood ratio statistics, allowing natural comparisons with traditional hypothesis testing methods. We provide heuristic arguments suggesting desirable asymptotic properties of our algorithm with regard to feature selection. We compare our method with several other approaches, showing marked improvements in regard to prediction accuracy, interpretability of chosen features, and algorithm run time. We demonstrate such strengths of our model by showing strong classification performance on publicly available high dimensional datasets, as well as through multiple simulation studies. We make an R package available implementing our approach.

Submitted 3 July, 2018; originally announced July 2018.

arXiv:1805.08423

Fast and Accurate Binary Response Mixed Model Analysis via Expectation Propagation

Authors: P. Hall, I. M. Johnstone, J. T. Ormerod, M. P. Wand, J. C. F. Yu

Abstract: Expectation propagation is a general prescription for approximation of integrals in statistical inference problems. Its literature is mainly concerned with Bayesian inference scenarios. However, expectation propagation can also be used to approximate integrals arising in frequentist statistical inference. We focus on likelihood-based inference for binary response mixed models and show that fast and accurate quadrature-free inference can be realized for the probit link case with multivariate random effects and higher levels of nesting. The approach is supported by asymptotic theory in which expectation propagation is seen to provide consistent estimation of the exact likelihood surface. Numerical studies reveal the availability of fast, highly accurate and scalable methodology for binary mixed model analysis.

Submitted 22 May, 2018; originally announced May 2018.

Comments: 35 pages, 5 figures

arXiv:1710.09146

Bayesian hypothesis tests with diffuse priors: Can we have our cake and eat it too?

Authors: John T. Ormerod, Michael Stewart, Weichang Yu, Sarah E. Romanes

Abstract: We introduce a new class of priors for Bayesian hypothesis testing, which we name "cake priors". These priors circumvent Bartlett's paradox (also called the Jeffreys-Lindley paradox), the problem associated with the use of diffuse priors leading to nonsensical statistical inferences. Cake priors allow the use of diffuse priors (having one's cake) while achieving theoretically justified inferences (eating it too). We demonstrate this methodology for Bayesian hypothesis tests for scenarios under which the one- and two-sample t-tests and linear models are typically derived. The resulting Bayesian test statistic takes the form of a penalized likelihood ratio test statistic. By considering the sampling distribution under the null and alternative hypotheses we show for independent identically distributed regular parametric models that Bayesian hypothesis tests using cake priors are Chernoff-consistent, i.e., achieve zero type I and II errors asymptotically. Lindley's paradox is also discussed. We argue that a true Lindley's paradox will only occur with small probability for large sample sizes.

Submitted 25 October, 2017; originally announced October 2017.

arXiv:1305.2667

Mean field variational Bayesian inference for support vector machine classification

Authors: Jan Luts, John T. Ormerod

Abstract: A mean field variational Bayes approach to support vector machines (SVMs) using the latent variable representation of Polson & Scott (2012) is presented. This representation allows circumvention of many of the shortcomings associated with classical SVMs, enabling automatic penalty parameter selection and the ability to handle dependent samples, missing data and variable selection. We demonstrate on simulated and real datasets that our approach is easily extendable to non-standard situations and outperforms the classical SVM approach whilst remaining computationally efficient.

Submitted 12 May, 2013; originally announced May 2013.

Comments: 18 pages, 4 figures

arXiv:0707.0143

On semiparametric regression with O'Sullivan penalised splines

Authors: M. P. Wand, J. T. Ormerod

Abstract: This is an exposé on the use of O'Sullivan penalised splines in contemporary semiparametric regression, including mixed model and Bayesian formulations. O'Sullivan penalised splines are similar to P-splines, but have the advantage of being a direct generalisation of smoothing splines. Exact expressions for the O'Sullivan penalty matrix are obtained. Comparisons between the two reveal that O'Sullivan penalised splines more closely mimic the natural boundary behaviour of smoothing splines. Implementation in modern computing environments such as Matlab, R and BUGS is discussed.

Submitted 2 July, 2007; originally announced July 2007.

Comments: 19 pages with 9 figures

Showing 1–9 of 9 results for author: Ormerod, J T