Showing 1–14 of 14 results for author: van Wieringen, W N

  1. arXiv:2411.02396  [pdf, other]

    stat.ME

    Fusion of Tree-induced Regressions for Clinico-genomic Data

    Authors: Jeroen M. Goedhart, Mark A. van de Wiel, Wessel N. van Wieringen, Thomas Klausch

    Abstract: Cancer prognosis is often based on a set of omics covariates and a set of established clinical covariates such as age and tumor stage. Combining these two sets poses challenges. First, dimension difference: clinical covariates should be favored because they are low-dimensional and usually have stronger prognostic ability than high-dimensional omics covariates. Second, interactions: genetic profile…

    Submitted 4 November, 2024; originally announced November 2024.

    Comments: 27 pages, 3 figures, 1 table

  2. arXiv:2405.04917  [pdf, other]

    stat.ME stat.ML

    Guiding adaptive shrinkage by co-data to improve regression-based prediction and feature selection

    Authors: Mark A. van de Wiel, Wessel N. van Wieringen

    Abstract: The high dimensional nature of genomics data complicates feature selection, in particular in low sample size studies - not uncommon in clinical prediction settings. It is widely recognized that complementary data on the features, 'co-data', may improve results. Examples are prior feature groups or p-values from a related study. Such co-data are ubiquitous in genomics settings due to the availabili…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 19 pages, 7 figures. Including Supplementary Material

  3. arXiv:2010.05619  [pdf, other]

    stat.ME stat.CO stat.ML

    rags2ridges: A One-Stop-Shop for Graphical Modeling of High-Dimensional Precision Matrices

    Authors: Carel F. W. Peeters, Anders Ellern Bilgrau, Wessel N. van Wieringen

    Abstract: A graphical model is an undirected network representing the conditional independence properties between random variables. Graphical modeling has become part and parcel of systems or network approaches to multivariate data, in particular when the variable dimension exceeds the observation dimension. rags2ridges is an R package for graphical modeling of high-dimensional precision matrices. It provid…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: 30 pages, 10 figures

    Journal ref: Journal of Statistical Software, 102 (2022): 1-32

  4. arXiv:2007.02117  [pdf, ps, other]

    stat.ME

    Transfer learning of regression models from a sequence of datasets by penalized estimation

    Authors: Wessel N. van Wieringen, Harald Binder

    Abstract: Transfer learning refers to the promising idea of initializing model fits based on pre-training on other data. We particularly consider regression modeling settings where parameter estimates from previous data can be used as anchoring points, yet may not be available for all parameters, thus covariance information cannot be reused. A procedure that updates through targeted penalized estimation, wh…

    Submitted 4 July, 2020; originally announced July 2020.

  5. arXiv:1901.10217  [pdf, other]

    stat.ME

    Incorporating prior information and borrowing information in high-dimensional sparse regression using the horseshoe and variational Bayes

    Authors: Gino B. Kpogbezan, Mark A. van de Wiel, Wessel N. van Wieringen, Aad W. van der Vaart

    Abstract: We introduce a sparse high-dimensional regression approach that can incorporate prior information on the regression parameters and can borrow information across a set of similar datasets. Prior information may for instance come from previous studies or genomic databases, and information borrowed across a set of genes or genomic networks. The approach is based on prior modelling of the regression p…

    Submitted 29 January, 2019; originally announced January 2019.

  6. arXiv:1812.02401  [pdf, other]

    stat.ME

    A parallel algorithm for penalized learning of the multivariate exponential family from data of mixed types

    Authors: Diederik S. Laman Trip, Wessel N. van Wieringen

    Abstract: Computationally efficient evaluation of penalized estimators of multivariate exponential family distributions is sought. These distributions encompass, among others, Markov random fields with variates of mixed type (e.g. binary and continuous) as a special case of interest. The model parameter is estimated by maximization of the pseudo-likelihood augmented with a convex penalty. The estimator is shown t…

    Submitted 25 December, 2020; v1 submitted 6 December, 2018; originally announced December 2018.

  7. The Spectral Condition Number Plot for Regularization Parameter Determination

    Authors: Carel F. W. Peeters, Mark A. van de Wiel, Wessel N. van Wieringen

    Abstract: Many modern statistical applications ask for the estimation of a covariance (or precision) matrix in settings where the number of variables is larger than the number of observations. There exists a broad class of ridge-type estimators that employs regularization to cope with the subsequent singularity of the sample covariance matrix. These estimators depend on a penalty parameter and choosing its…

    Submitted 14 August, 2016; originally announced August 2016.

    Comments: 41 pages, 7 figures, includes supplementary material

    Journal ref: Computational Statistics, 35(2):629-646, 2020

  8. arXiv:1605.07514  [pdf, other]

    stat.ME

    An empirical Bayes approach to network recovery using external knowledge

    Authors: Gino B. Kpogbezan, Aad W. van der Vaart, Wessel N. van Wieringen, Gwenaël G. R. Leday, Mark A. van de Wiel

    Abstract: Reconstruction of a high-dimensional network may benefit substantially from the inclusion of prior knowledge on the network topology. In the case of gene interaction networks such knowledge may come for instance from pathway repositories like KEGG, or be inferred from data of a pilot study. The Bayesian framework provides a natural means of including such prior knowledge. Based on a Bayesian Simul…

    Submitted 24 May, 2016; originally announced May 2016.

  9. arXiv:1510.03771  [pdf, other]

    stat.ME

    Gene network reconstruction using global-local shrinkage priors

    Authors: Gwenaël G. R. Leday, Mathisca C. M. de Gunst, Gino B. Kpogbezan, Aad W. van der Vaart, Wessel N. van Wieringen, Mark A. van de Wiel

    Abstract: Reconstructing a gene network from high-throughput molecular data is often a challenging task, as the number of parameters to estimate is easily much larger than the sample size. A conventional remedy is to regularize or penalize the model likelihood. In network models, this is often done locally in the neighbourhood of each node or gene. However, estimation of the many regularization parameters i…

    Submitted 13 October, 2015; originally announced October 2015.

    Comments: 27 pages, 5 figures

  10. arXiv:1509.09169  [pdf, other]

    stat.ME

    Lecture notes on ridge regression

    Authors: Wessel N. van Wieringen

    Abstract: The linear regression model cannot be fitted to high-dimensional data, as the high-dimensionality brings about empirical non-identifiability. Penalized regression overcomes this non-identifiability by augmentation of the loss function by a penalty (i.e. a function of regression coefficients). The ridge penalty is the sum of squared regression coefficients, giving rise to ridge regression. Here man…

    Submitted 27 June, 2023; v1 submitted 30 September, 2015; originally announced September 2015.

  11. arXiv:1509.07982  [pdf, other]

    stat.ME q-bio.MN stat.ML

    Targeted Fused Ridge Estimation of Inverse Covariance Matrices from Multiple High-Dimensional Data Classes

    Authors: Anders Ellern Bilgrau, Carel F. W. Peeters, Poul Svante Eriksen, Martin Bøgsted, Wessel N. van Wieringen

    Abstract: We consider the problem of jointly estimating multiple inverse covariance matrices from high-dimensional data consisting of distinct classes. An $\ell_2$-penalized maximum likelihood approach is employed. The suggested approach is flexible and generic, incorporating several other $\ell_2$-penalized estimators as special cases. In addition, the approach allows specification of target matrices throu…

    Submitted 26 March, 2020; v1 submitted 26 September, 2015; originally announced September 2015.

    Comments: 52 pages, 11 figures

    Journal ref: Journal of Machine Learning Research, 21(26):1-52, 2020

  12. arXiv:1411.3496  [pdf, other]

    stat.ME

    Better prediction by use of co-data: Adaptive group-regularized ridge regression

    Authors: Mark A. van de Wiel, Tonje G. Lien, Wina Verlaat, Wessel N. van Wieringen, Saskia M. Wilting

    Abstract: For many high-dimensional studies, additional information on the variables, like (genomic) annotation or external p-values, is available. In the context of binary and continuous prediction, we develop a method for adaptive group-regularized (logistic) ridge regression, which makes structural use of such 'co-data'. Here, 'groups' refer to a partition of the variables according to the co-data. We de…

    Submitted 18 May, 2015; v1 submitted 13 November, 2014; originally announced November 2014.

    Comments: 15 pages, 2 figures. Supplementary Information available on first author's web site

    MSC Class: 62J07

  13. Ridge Estimation of Inverse Covariance Matrices from High-Dimensional Data

    Authors: Wessel N. van Wieringen, Carel F. W. Peeters

    Abstract: We study ridge estimation of the precision matrix in the high-dimensional setting where the number of variables is large relative to the sample size. We first review two archetypal ridge estimators and note that their utilized penalties do not coincide with common ridge penalties. Subsequently, starting from a common ridge penalty, analytic expressions are derived for two alternative ridge estimat…

    Submitted 24 September, 2015; v1 submitted 4 March, 2014; originally announced March 2014.

    Journal ref: Computational Statistics & Data Analysis, 103 (2016): 284-303

  14. Modeling association between DNA copy number and gene expression with constrained piecewise linear regression splines

    Authors: Gwenaël G. R. Leday, Aad W. van der Vaart, Wessel N. van Wieringen, Mark A. van de Wiel

    Abstract: DNA copy number and mRNA expression are widely used data types in cancer studies, which combined provide more insight than separately. Whereas in existing literature the form of the relationship between these two types of markers is fixed a priori, in this paper we model their association. We employ piecewise linear regression splines (PLRS), which combine good interpretation with sufficient flexi…

    Submitted 6 December, 2013; originally announced December 2013.

    Comments: Published in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/12-AOAS605

    Report number: IMS-AOAS-AOAS605

    Journal ref: Annals of Applied Statistics 2013, Vol. 7, No. 2, 823-845