Showing 1–25 of 25 results for author: van de Geer, S

  1. arXiv:2105.02083  [pdf, other]

    math.ST cs.IT stat.ML

    AdaBoost and robust one-bit compressed sensing

    Authors: Geoffrey Chinot, Felix Kuchelmeister, Matthias Löffler, Sara van de Geer

    Abstract: This paper studies binary classification in robust one-bit compressed sensing with adversarial errors. It is assumed that the model is overparameterized and that the parameter of interest is effectively sparse. AdaBoost is considered, and, through its relation to the max-$\ell_1$-margin-classifier, prediction error bounds are derived. The developed theory is general and allows for heavy-tailed fea…

    Submitted 8 December, 2021; v1 submitted 5 May, 2021; originally announced May 2021.

    Comments: 40 pages, 4 figures, code available at https://github.com/Felix-127/Adaboost-and-robust-one-bit-compressed-sensing, extended results to features that satisfy weak-moment and anti-concentration assumption

    MSC Class: 62H30 (Primary); 94A12 (Secondary)

  2. arXiv:2012.00807  [pdf, ps, other]

    math.ST cs.IT math.NA stat.ML

    On the robustness of minimum norm interpolators and regularized empirical risk minimizers

    Authors: Geoffrey Chinot, Matthias Löffler, Sara van de Geer

    Abstract: This article develops a general theory for minimum norm interpolating estimators and regularized empirical risk minimizers (RERM) in linear models in the presence of additive, potentially adversarial, errors. In particular, no conditions on the errors are imposed. A quantitative bound for the prediction error is given, relating it to the Rademacher complexity of the covariates, the norm of the min…

    Submitted 7 October, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: 35 pages

    MSC Class: 62J05

  3. arXiv:1911.07231  [pdf, other]

    math.ST stat.ML

    Adaptive Rates for Total Variation Image Denoising

    Authors: Francesco Ortelli, Sara van de Geer

    Abstract: We study the theoretical properties of image denoising via total variation penalized least-squares. We define the total variation in terms of the two-dimensional total discrete derivative of the image and show that it gives rise to denoised images that are piecewise constant on rectangular sets. We prove that, if the true image is piecewise constant on just a few rectangular sets, the denoised ima…

    Submitted 26 January, 2021; v1 submitted 17 November, 2019; originally announced November 2019.

    Comments: 38 pages, 6 figures

    Journal ref: Journal of Machine Learning Research, 21(247), 2020

  4. arXiv:1904.10871  [pdf, ps, other]

    math.ST stat.ML

    Prediction bounds for higher order total variation regularized least squares

    Authors: Francesco Ortelli, Sara van de Geer

    Abstract: We establish adaptive results for trend filtering: least squares estimation with a penalty on the total variation of $(k-1)^{\rm th}$ order differences. Our approach is based on combining a general oracle inequality for the $\ell_1$-penalized least squares estimator with "interpolating vectors" to upper-bound the "effective sparsity". This allows one to show that the $\ell_1$-penalty on the…

    Submitted 17 July, 2020; v1 submitted 24 April, 2019; originally announced April 2019.

    Comments: 28 pages

    MSC Class: 62J07

  5. arXiv:1902.11192  [pdf, ps, other]

    math.ST stat.ML

    Oracle inequalities for square root analysis estimators with application to total variation penalties

    Authors: Francesco Ortelli, Sara van de Geer

    Abstract: Through the direct study of the analysis estimator we derive oracle inequalities with fast and slow rates by adapting the arguments involving projections by Dalalyan, Hebiri and Lederer (2017). We then extend the theory to the square root analysis estimator. Finally, we focus on (square root) total variation regularized estimators on graphs and obtain constant-friendly rates, which, up to log-term…

    Submitted 14 December, 2019; v1 submitted 28 February, 2019; originally announced February 2019.

    Journal ref: Information and Inference: A Journal of the IMA, iaaa002, 2020

  6. arXiv:1811.10443  [pdf, other]

    math.ST stat.ME

    Sparse spectral estimation with missing and corrupted measurements

    Authors: Andreas Elsener, Sara van de Geer

    Abstract: Supervised learning methods with missing data have been extensively studied not just due to the techniques related to low-rank matrix completion. Also in unsupervised learning one often relies on imputation methods. As a matter of fact, missing values induce a bias in various estimators such as the sample covariance matrix. In the present paper, a convex method for sparse subspace estimation is ex…

    Submitted 26 November, 2018; originally announced November 2018.

    Comments: 32 pages, 4 figures

  7. arXiv:1806.01918  [pdf, other]

    stat.ML cs.LG

    A Framework for the construction of upper bounds on the number of affine linear regions of ReLU feed-forward neural networks

    Authors: Peter Hinz, Sara van de Geer

    Abstract: We present a framework to derive upper bounds on the number of regions that feed-forward neural networks with ReLU activation functions are affine linear on. It is based on an inductive analysis that keeps track of the number of such regions per dimensionality of their images within the layers. More precisely, the information about the number of regions per dimensionality is pushed through the layers…

    Submitted 9 March, 2020; v1 submitted 5 June, 2018; originally announced June 2018.

  8. arXiv:1806.01009  [pdf, ps, other]

    math.ST stat.ML

    On the total variation regularized estimator over a class of tree graphs

    Authors: Francesco Ortelli, Sara van de Geer

    Abstract: We generalize to tree graphs obtained by connecting path graphs an oracle result obtained for the Fused Lasso over the path graph. Moreover we show that it is possible to substitute in the oracle inequality the minimum of the distances between jumps by their harmonic mean. In doing so we prove a lower bound on the compatibility constant for the total variation penalty. Our analysis leverages insig…

    Submitted 16 June, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

    Comments: 42 pages

    Journal ref: Electronic Journal of Statistics, 12, 2018, 4517-4570

  9. arXiv:1801.10567  [pdf, other]

    stat.ME math.ST

    De-biased sparse PCA: Inference and testing for eigenstructure of large covariance matrices

    Authors: Jana Janková, Sara van de Geer

    Abstract: Sparse principal component analysis (sPCA) has become one of the most widely used techniques for dimensionality reduction in high-dimensional datasets. The main challenge underlying sPCA is to estimate the first vector of loadings of the population covariance matrix, provided that only a certain number of loadings are non-zero. In this paper, we propose confidence intervals for individual loadings…

    Submitted 31 January, 2018; originally announced January 2018.

    Comments: 41 pages

  10. arXiv:1801.08512  [pdf, other]

    math.ST stat.ME

    Inference in high-dimensional graphical models

    Authors: Jana Jankova, Sara van de Geer

    Abstract: We provide a selected overview of methodology and theory for estimation and inference on the edge weights in high-dimensional directed and undirected Gaussian graphical models. For undirected graphical models, two main explicit constructions are provided: one based on a global method that maximizes the joint likelihood (the graphical Lasso) and one based on a local (nodewise) method that sequentia…

    Submitted 25 January, 2018; originally announced January 2018.

    Comments: 29 pages

  11. Asymptotic Confidence Regions for High-dimensional Structured Sparsity

    Authors: Benjamin Stucky, Sara van de Geer

    Abstract: In the setting of high-dimensional linear regression models, we propose two frameworks for constructing pointwise and group confidence sets for penalized estimators which incorporate prior knowledge about the organization of the non-zero coefficients. This is done by desparsifying the estimator as in van de Geer et al. [18] and van de Geer and Stucky [17], then using an appropriate estimator for t…

    Submitted 28 June, 2017; originally announced June 2017.

    Comments: 28 pages, 4 figures, 1 table

  12. arXiv:1610.01353  [pdf, other]

    stat.ME math.ST

    Confidence regions for high-dimensional generalized linear models under sparsity

    Authors: Jana Janková, Sara van de Geer

    Abstract: We study asymptotically normal estimation and confidence regions for low-dimensional parameters in high-dimensional sparse models. Our approach is based on the $\ell_1$-penalized M-estimator which is used for construction of a bias corrected estimator. We show that the proposed estimator is asymptotically normal, under a sparsity assumption on the high-dimensional parameter, smoothness conditions…

    Submitted 5 October, 2016; originally announced October 2016.

    Comments: 40 pages

  13. arXiv:1601.00815  [pdf, ps, other]

    math.ST stat.ME

    Semi-parametric efficiency bounds for high-dimensional models

    Authors: Jana Jankova, Sara van de Geer

    Abstract: Asymptotic lower bounds for estimation play a fundamental role in assessing the quality of statistical procedures. In this paper we propose a framework for obtaining semi-parametric efficiency bounds for sparse high-dimensional models, where the dimension of the parameter is larger than the sample size. We adopt a semi-parametric point of view: we concentrate on one dimensional functions of a high…

    Submitted 12 October, 2017; v1 submitted 5 January, 2016; originally announced January 2016.

    Comments: 68 pages

  14. High-dimensional inference in misspecified linear models

    Authors: Peter Bühlmann, Sara van de Geer

    Abstract: We consider high-dimensional inference when the assumed linear model is misspecified. We describe some correct interpretations and corresponding sufficient assumptions for valid asymptotic inference of the model parameters, which still have a useful meaning when the model is misspecified. We largely focus on the de-sparsified Lasso procedure but we also indicate some implications for (multiple) sa…

    Submitted 22 March, 2015; originally announced March 2015.

    Comments: 24 pages, 4 figures

    MSC Class: 62J05; 62J07

    Journal ref: Electronic Journal of Statistics 2015, Vol. 9, 1449-1473

  15. arXiv:1403.6752  [pdf, other]

    math.ST stat.ME

    Confidence intervals for high-dimensional inverse covariance estimation

    Authors: Jana Jankova, Sara van de Geer

    Abstract: We propose methodology for statistical inference for low-dimensional parameters of sparse precision matrices in a high-dimensional setting. Our method leads to a non-sparse estimator of the precision matrix whose entries have a Gaussian limiting distribution. Asymptotic properties of the novel estimator are analyzed for the case of sub-Gaussian observations under a sparsity assumption on the entri…

    Submitted 11 August, 2015; v1 submitted 26 March, 2014; originally announced March 2014.

    Comments: 26 pages

    Journal ref: Electronic Journal of Statistics 2015, Vol. 9, No. 1, 1205 - 1229

  16. arXiv:1307.1067  [pdf, other]

    math.ST stat.AP

    The partial linear model in high dimensions

    Authors: Patric Müller, Sara van de Geer

    Abstract: Partial linear models have been widely used as a flexible method for modelling linear components in conjunction with non-parametric ones. Despite the presence of the non-parametric part, the linear, parametric part can under certain conditions be estimated with parametric rate. In this paper, we consider a high-dimensional linear part. We show that it can be estimated with oracle rates, using the LA…

    Submitted 3 July, 2013; originally announced July 2013.

    Comments: 48 pages, 16 figures, submitted to Scandinavian Journal of Statistics

  17. Correlated variables in regression: clustering and sparse estimation

    Authors: Peter Bühlmann, Philipp Rütimann, Sara van de Geer, Cun-Hui Zhang

    Abstract: We consider estimation in a high-dimensional linear model with strongly correlated variables. We propose to cluster the variables first and do subsequent sparse estimation such as the Lasso for cluster-representatives or the group Lasso based on the structure from the clusters. Regarding the first step, we present a novel and bottom-up agglomerative clustering algorithm based on canonical correlat…

    Submitted 26 September, 2012; originally announced September 2012.

    Comments: 40 pages, 6 figures

    MSC Class: 62J07; 62H30

    Journal ref: Journal of Statistical Planning and Inference 2013, Vol. 143, 1835-1858

  18. arXiv:1206.6721  [pdf, ps, other]

    math.ST stat.ME

    Quasi-Likelihood and/or Robust Estimation in High Dimensions

    Authors: Sara van de Geer, Patric Müller

    Abstract: We consider the theory for the high-dimensional generalized linear model with the Lasso. After a short review on theoretical results in literature, we present an extension of the oracle results to the case of quasi-likelihood loss. We prove bounds for the prediction error and $\ell_1$-error. The results are derived under fourth moment conditions on the error distribution. The case of robust loss i…

    Submitted 4 January, 2013; v1 submitted 28 June, 2012; originally announced June 2012.

    Comments: Published in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org), available at http://dx.doi.org/10.1214/12-STS397

    Report number: IMS-STS-STS397

    Journal ref: Statistical Science 2012, Vol. 27, No. 4, 469-480

  19. L1-Penalization for Mixture Regression Models

    Authors: Nicolas Städler, Peter Bühlmann, Sara van de Geer

    Abstract: We consider a finite mixture of regressions (FMR) model for high-dimensional inhomogeneous data where the number of covariates may be much larger than sample size. We propose an l1-penalized maximum likelihood estimator in an appropriate parameterization. This kind of estimation belongs to a class of problems where optimization and theory for non-convex functions is needed. This distinguishes itse…

    Submitted 27 February, 2012; originally announced February 2012.

    Comments: This is the author's version of the work (published as a discussion paper in TEST, 2010, Volume 19, 209--285). The final publication is available at http://www.springerlink.com

    Journal ref: TEST, 2010, Volume 19, 209--285

  20. arXiv:1107.0189  [pdf, other]

    stat.ME

    The Lasso, correlated design, and improved oracle inequalities

    Authors: Sara van de Geer, Johannes Lederer

    Abstract: We study high-dimensional linear models and the $\ell_1$-penalized least squares estimator, also known as the Lasso estimator. In literature, oracle inequalities have been derived under restricted eigenvalue or compatibility conditions. In this paper, we complement this with entropy conditions which allow one to improve the dual norm bound, and demonstrate how this leads to new oracle inequalities…

    Submitted 1 July, 2011; originally announced July 2011.

    Comments: 18 pages, 3 figures

    MSC Class: 62J05

  21. Estimation for High-Dimensional Linear Mixed-Effects Models Using $\ell_1$-Penalization

    Authors: Jürg Schelldorfer, Peter Bühlmann, Sara van de Geer

    Abstract: We propose an $\ell_1$-penalized estimation procedure for high-dimensional linear mixed-effects models. The models are useful whenever there is a grouping structure among high-dimensional observations, i.e. for clustered data. We prove a consistency and an oracle optimality result and we develop an algorithm with provable numerical convergence. Furthermore, we demonstrate the performance of the me…

    Submitted 25 November, 2010; v1 submitted 19 February, 2010; originally announced February 2010.

    Journal ref: Scandinavian Journal of Statistics 2011, 38: 197-214

  22. arXiv:0910.0722  [pdf, other]

    math.ST stat.ML

    On the conditions used to prove oracle results for the Lasso

    Authors: Sara A. van de Geer, Peter Bühlmann

    Abstract: Oracle inequalities and variable selection properties for the Lasso in linear models have been established under a variety of different assumptions on the design matrix. We show in this paper how the different conditions and concepts relate to each other. The restricted eigenvalue condition (Bickel et al., 2009) or the slightly weaker compatibility condition (van de Geer, 2007) are sufficient fo…

    Submitted 5 October, 2009; originally announced October 2009.

    Comments: 33 pages, 1 figure

    Journal ref: Electronic Journal of Statistics, 3, (2009), 1360-1392

  23. arXiv:0903.2515  [pdf, ps, other]

    math.ST stat.ML

    Adaptive Lasso for High Dimensional Regression and Gaussian Graphical Modeling

    Authors: Shuheng Zhou, Sara van de Geer, Peter Bühlmann

    Abstract: We show that the two-stage adaptive Lasso procedure (Zou, 2006) is consistent for high-dimensional model selection in linear and Gaussian graphical models. Our conditions for consistency cover more general situations than those accomplished in previous work: we prove that restricted eigenvalue conditions (Bickel et al., 2008) are also sufficient for sparse structure estimation.

    Submitted 13 March, 2009; originally announced March 2009.

    Comments: 30 pages

  24. arXiv:0903.1468  [pdf, ps, other]

    stat.ML math.ST

    Taking Advantage of Sparsity in Multi-Task Learning

    Authors: Karim Lounici, Massimiliano Pontil, Alexandre B. Tsybakov, Sara van de Geer

    Abstract: We study the problem of estimating multiple linear regression equations for the purpose of both prediction and variable selection. Following recent work on multi-task learning Argyriou et al. [2008], we assume that the regression vectors share the same sparsity pattern. This means that the set of relevant predictor variables is the same across the different equations. This assumption leads us to…

    Submitted 8 March, 2009; originally announced March 2009.

    Journal ref: 10 pages, 1 figure, Proc. Computational Learning Theory Conference (COLT 2009)

  25. High-dimensional additive modeling

    Authors: Lukas Meier, Sara van de Geer, Peter Bühlmann

    Abstract: We propose a new sparsity-smoothness penalty for high-dimensional generalized additive models. The combination of sparsity and smoothness is crucial for mathematical theory as well as performance for finite-sample data. We present a computationally efficient algorithm, with provable numerical convergence properties, for optimizing the penalized likelihood. Furthermore, we provide oracle results…

    Submitted 18 November, 2009; v1 submitted 25 June, 2008; originally announced June 2008.

    Comments: Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org), available at http://dx.doi.org/10.1214/09-AOS692

    Report number: IMS-AOS-AOS692

    MSC Class: 62G08; 62F12 (Primary); 62J07 (Secondary)

    Journal ref: Annals of Statistics 2009, Vol. 37, No. 6B, 3779-3821