Showing 1–11 of 11 results for author: Villa, S

  1. arXiv:2412.16765  [pdf, other]

    cs.LG math.OC stat.ML

    Optimization Insights into Deep Diagonal Linear Networks

    Authors: Hippolyte Labarrière, Cesare Molinari, Lorenzo Rosasco, Silvia Villa, Cristian Vega

    Abstract: Overparameterized models trained with (stochastic) gradient descent are ubiquitous in modern machine learning. These large models achieve unprecedented performance on test data, but their theoretical understanding is still limited. In this paper, we take a step towards filling this gap by adopting an optimization perspective. More precisely, we study the implicit regularization properties of the g…

    Submitted 1 April, 2025; v1 submitted 21 December, 2024; originally announced December 2024.

  2. arXiv:2212.12675  [pdf, other]

    stat.ML cs.LG math.OC

    Iterative regularization in classification via hinge loss diagonal descent

    Authors: Vassilis Apidopoulos, Tomaso Poggio, Lorenzo Rosasco, Silvia Villa

    Abstract: Iterative regularization is a classic idea in regularization theory that has recently become popular in machine learning. On the one hand, it allows one to design efficient algorithms that control numerical and statistical accuracy at the same time. On the other hand, it sheds light on the learning curves observed while training neural networks. In this paper, we focus on iterative regularizat…

    Submitted 9 October, 2024; v1 submitted 24 December, 2022; originally announced December 2022.

  3. arXiv:2202.00420  [pdf, other]

    math.OC stat.ML

    Iterative regularization for low complexity regularizers

    Authors: Cesare Molinari, Mathurin Massias, Lorenzo Rosasco, Silvia Villa

    Abstract: Iterative regularization exploits the implicit bias of an optimization algorithm to regularize ill-posed problems. Constructing algorithms with such built-in regularization mechanisms is a classic challenge in inverse problems but also in modern machine learning, where it provides both a new perspective on algorithm analysis and significant speed-ups compared to explicit regularization. In this…

    Submitted 1 February, 2022; originally announced February 2022.

  4. arXiv:2106.08598  [pdf, other]

    cs.LG stat.ML

    Ada-BKB: Scalable Gaussian Process Optimization on Continuous Domains by Adaptive Discretization

    Authors: Marco Rando, Luigi Carratino, Silvia Villa, Lorenzo Rosasco

    Abstract: Gaussian process optimization is a successful class of algorithms (e.g. GP-UCB) to optimize a black-box function through sequential evaluations. However, for functions with continuous domains, Gaussian process optimization has to rely on either a fixed discretization of the space, or the solution of a non-convex optimization subproblem at each evaluation. The first approach can negatively affect pe…

    Submitted 11 March, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

  5. arXiv:2006.09859  [pdf, other]

    stat.ML cs.LG

    Iterative regularization for convex regularizers

    Authors: Cesare Molinari, Mathurin Massias, Lorenzo Rosasco, Silvia Villa

    Abstract: We study iterative regularization for linear models, when the bias is convex but not necessarily strongly convex. We characterize the stability properties of a primal-dual gradient based approach, analyzing its convergence in the presence of worst case deterministic noise. As a main example, we specialize and illustrate the results for the problem of robust sparse recovery. Key to our analysis is…

    Submitted 29 October, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

  6. Convergence of the Forward-Backward Algorithm: Beyond the Worst Case with the Help of Geometry

    Authors: Guillaume Garrigos, Lorenzo Rosasco, Silvia Villa

    Abstract: We provide a comprehensive study of the convergence of the forward-backward algorithm under suitable geometric conditions, such as conditioning or Łojasiewicz properties. These geometrical notions are usually local by nature, and may fail to describe the fine geometry of objective functions relevant in inverse problems and signal processing, that have a nice behaviour on manifolds, or sets open wi…

    Submitted 13 November, 2020; v1 submitted 28 March, 2017; originally announced March 2017.

    Comments: After peer-review, the paper has been significantly modified: i) Section 3.3 has been completely rewritten, and contains a new sum rule (Theorem 3.15) ii) The end of Section 4.2 and Section 5.2 have been rewritten to include mirror-stratifiable problems iii) The Annex contains new proofs for small-but-not-trivial claims made throughout the paper iv) Theorems, Examples etc have been renumbered

    Journal ref: Math. Program. 198, 937-996 (2023)

  7. arXiv:1405.0042  [pdf, other]

    stat.ML cs.LG math.OC math.PR

    Learning with incremental iterative regularization

    Authors: Lorenzo Rosasco, Silvia Villa

    Abstract: Within a statistical learning setting, we propose and study an iterative regularization algorithm for least squares defined by an incremental gradient method. In particular, we show that, if all other parameters are fixed a priori, the number of passes over the data (epochs) acts as a regularization parameter, and prove strong universal consistency, i.e. almost sure convergence of the risk, as wel…

    Submitted 15 June, 2015; v1 submitted 30 April, 2014; originally announced May 2014.

    Comments: 30 pages

  8. arXiv:1303.5976  [pdf, ps, other]

    stat.ML cs.LG

    On Learnability, Complexity and Stability

    Authors: Silvia Villa, Lorenzo Rosasco, Tomaso Poggio

    Abstract: We consider the fundamental question of learnability of a hypotheses class in the supervised learning setting and in the general learning setting introduced by Vladimir Vapnik. We survey classic results characterizing learnability in terms of suitable notions of complexity, as well as more recent results that establish the connection between learnability and stability of a learning algorithm.

    Submitted 24 March, 2013; originally announced March 2013.

  9. arXiv:1209.0368  [pdf, other]

    math.OC cs.LG stat.ML

    Proximal methods for the latent group lasso penalty

    Authors: Silvia Villa, Lorenzo Rosasco, Sofia Mosci, Alessandro Verri

    Abstract: We consider a regularized least squares problem, with regularization by structured sparsity-inducing norms, which extend the usual $\ell_1$ and the group lasso penalty, by allowing the subsets to overlap. Such regularizations lead to nonsmooth problems that are difficult to optimize, and we propose in this paper a suitable version of an accelerated proximal method to solve them. We prove convergen…

    Submitted 3 September, 2012; originally announced September 2012.

    Comments: 4 figures

    MSC Class: 65K10; 90C25

  10. arXiv:1208.2572  [pdf, other]

    stat.ML cs.LG math.OC

    Nonparametric sparsity and regularization

    Authors: Lorenzo Rosasco, Silvia Villa, Sofia Mosci, Matteo Santoro, Alessandro Verri

    Abstract: In this work we are interested in the problems of supervised learning and variable selection when the input-output dependence is described by a nonlinear function depending on a few variables. Our goal is to consider a sparse nonparametric model, hence avoiding linear or additive models. The key idea is to measure the importance of each variable in the model by making use of partial derivatives. B…

    Submitted 13 August, 2012; originally announced August 2012.

    Comments: 45 pages, 11 figures

  11. arXiv:1011.3728  [pdf, other]

    cs.LG cs.IT stat.ML

    PADDLE: Proximal Algorithm for Dual Dictionaries LEarning

    Authors: Curzio Basso, Matteo Santoro, Alessandro Verri, Silvia Villa

    Abstract: Recently, considerable research efforts have been devoted to the design of methods to learn overcomplete dictionaries for sparse coding from data. However, learned dictionaries require the solution of an optimization problem for coding new data. In order to overcome this drawback, we propose an algorithm aimed at learning both a dictionary and its dual: a linear mapping directly performing the cod…

    Submitted 16 November, 2010; originally announced November 2010.

    Report number: DISI-TR-2010-06