Showing 1–14 of 14 results for author: Fornasier, M

Searching in archive cs.
  1. arXiv:2405.20836  [pdf, other]

    math.NA cs.LG

    Solving partial differential equations with sampled neural networks

    Authors: Chinmay Datar, Taniya Kapoor, Abhishek Chandra, Qing Sun, Iryna Burak, Erik Lien Bolager, Anna Veselovska, Massimo Fornasier, Felix Dietrich

    Abstract: Approximation of solutions to partial differential equations (PDE) is an important problem in computational science and engineering. Using neural networks as an ansatz for the solution has proven a challenge in terms of training time and approximation accuracy. In this contribution, we discuss how sampling the hidden weights and biases of the ansatz network from data-agnostic and data-dependent pr…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 16 pages, 15 figures

  2. arXiv:2310.19548  [pdf, other]

    math.OC cs.LG math.FA

    Approximation Theory, Computing, and Deep Learning on the Wasserstein Space

    Authors: Massimo Fornasier, Pascal Heid, Giacomo Enrico Sodini

    Abstract: The challenge of approximating functions in infinite-dimensional spaces from finite samples is widely regarded as formidable. We delve into the challenging problem of the numerical approximation of Sobolev-smooth functions defined on probability spaces. Our particular focus centers on the Wasserstein distance function, which serves as a relevant example. In contrast to the existing body of literat…

    Submitted 10 October, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Revised version

    MSC Class: 49Q22; 33F05; 46E36; 28A33; 68T07

  3. arXiv:2307.02279  [pdf, other]

    math.OC cs.LG eess.SY

    From NeurODEs to AutoencODEs: a mean-field control framework for width-varying Neural Networks

    Authors: Cristina Cipriani, Massimo Fornasier, Alessandro Scagliotti

    Abstract: The connection between Residual Neural Networks (ResNets) and continuous-time control systems (known as NeurODEs) has led to a mathematical analysis of neural networks which has provided interesting results of both theoretical and practical significance. However, by construction, NeurODEs have been limited to describing constant-width layers, making them unsuitable for modeling deep learning archi…

    Submitted 10 August, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: 35 pages, 11 figures. Minor adjustments and new bibliographical references

    Journal ref: Eur. J. Appl. Math 36 (2025) 188-230

  4. arXiv:2306.09778  [pdf, other]

    cs.LG math.NA math.OC stat.ML

    Gradient is All You Need?

    Authors: Konstantin Riedl, Timo Klock, Carina Geldhauser, Massimo Fornasier

    Abstract: In this paper we provide a novel analytical perspective on the theoretical understanding of gradient-based learning algorithms by interpreting consensus-based optimization (CBO), a recently proposed multi-particle derivative-free optimization method, as a stochastic relaxation of gradient descent. Remarkably, we observe that through communication of the particles, CBO exhibits a stochastic gradien…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: 38 pages, 4 figures

  5. arXiv:2211.04589  [pdf, other]

    cs.LG stat.ML

    Finite Sample Identification of Wide Shallow Neural Networks with Biases

    Authors: Massimo Fornasier, Timo Klock, Marco Mondelli, Michael Rauchensteiner

    Abstract: Artificial neural networks are functions depending on a finite number of parameters typically encoded as weights and biases. The identification of the parameters of the network from finite samples of input-output pairs is often referred to as the \emph{teacher-student model}, and this model has represented a popular framework for understanding training and generalization. Even if the problem is NP…

    Submitted 8 November, 2022; originally announced November 2022.

    MSC Class: 65D15; 68T07; 90C26

  6. arXiv:2101.07150  [pdf, other]

    cs.LG

    Stable Recovery of Entangled Weights: Towards Robust Identification of Deep Neural Networks from Minimal Samples

    Authors: Christian Fiedler, Massimo Fornasier, Timo Klock, Michael Rauchensteiner

    Abstract: In this paper we approach the problem of unique and stable identifiability of generic deep artificial neural networks with pyramidal shape and smooth activation functions from a finite number of input-output samples. More specifically we introduce the so-called entangled weights, which compose weights of successive layers intertwined with suitable diagonal and invertible matrices depending on the…

    Submitted 18 January, 2021; originally announced January 2021.

    MSC Class: 65D15; 68T07; 90C26

  7. arXiv:2001.11994  [pdf, other]

    math.AP cs.LG math.OC

    Consensus-Based Optimization on Hypersurfaces: Well-Posedness and Mean-Field Limit

    Authors: Massimo Fornasier, Hui Huang, Lorenzo Pareschi, Philippe Sünnen

    Abstract: We introduce a new stochastic differential model for global optimization of nonconvex functions on compact hypersurfaces. The model is inspired by the stochastic Kuramoto-Vicsek system and belongs to the class of Consensus-Based Optimization methods. In fact, particles move on the hypersurface driven by a drift towards an instantaneous consensus point, computed as a convex combination of the parti…

    Submitted 7 December, 2020; v1 submitted 31 January, 2020; originally announced January 2020.

  8. arXiv:2001.11988  [pdf, other]

    cs.LG math.AP math.NA math.OC stat.ML

    Consensus-Based Optimization on the Sphere: Convergence to Global Minimizers and Machine Learning

    Authors: Massimo Fornasier, Hui Huang, Lorenzo Pareschi, Philippe Sünnen

    Abstract: We investigate the implementation of a new stochastic Kuramoto-Vicsek-type model for global optimization of nonconvex functions on the sphere. This model belongs to the class of Consensus-Based Optimization. In fact, particles move on the sphere driven by a drift towards an instantaneous consensus point, which is computed as a convex combination of particle locations, weighted by the cost function…

    Submitted 28 July, 2021; v1 submitted 31 January, 2020; originally announced January 2020.

  9. arXiv:1911.00298  [pdf, other]

    cs.LG math.AP math.DS stat.ML

    Data-driven Evolutions of Critical Points

    Authors: Stefano Almi, Massimo Fornasier, Richard Huber

    Abstract: In this paper we are concerned with the learnability of energies from data obtained by observing time evolutions of their critical points starting at random initial equilibria. As a byproduct of our theoretical framework we introduce the novel concept of mean-field limit of critical point evolutions and of their energy balance as a new form of transport. We formulate the energy learning as a varia…

    Submitted 1 November, 2019; originally announced November 2019.

  10. arXiv:1907.00485  [pdf, other]

    cs.LG cs.IT stat.ML

    Robust and Resource Efficient Identification of Two Hidden Layer Neural Networks

    Authors: Massimo Fornasier, Timo Klock, Michael Rauchensteiner

    Abstract: We address the structure identification and the uniform approximation of two fully nonlinear layer neural networks of the type $f(x)=1^T h(B^T g(A^T x))$ on $\mathbb R^d$ from a small number of query samples. We approach the problem by sampling actively finite difference approximations to Hessians of the network. Gathering several approximate Hessians allows reliably to approximate the matrix subs…

    Submitted 30 June, 2019; originally announced July 2019.

  11. arXiv:1804.01592  [pdf, other]

    stat.ML cs.LG

    Robust and Resource Efficient Identification of Shallow Neural Networks by Fewest Samples

    Authors: Massimo Fornasier, Jan Vybíral, Ingrid Daubechies

    Abstract: We address the structure identification and the uniform approximation of sums of ridge functions $f(x)=\sum_{i=1}^m g_i(a_i\cdot x)$ on ${\mathbb R}^d$, representing a general form of a shallow feed-forward neural network, from a small number of query samples. Higher order differentiation, as used in our constructive approximations, of sums of ridge functions or of their compositions, as in deeper…

    Submitted 6 May, 2021; v1 submitted 4 April, 2018; originally announced April 2018.

  12. arXiv:1311.1642  [pdf, other]

    math.NA cs.IT

    Quasi-Linear Compressed Sensing

    Authors: Martin Ehler, Massimo Fornasier, Juliane Sigl

    Abstract: Inspired by significant real-life applications, in particular, sparse phase retrieval and sparse pulsation frequency detection in Asteroseismology, we investigate a general framework for compressed sensing, where the measurements are quasi-linear. We formulate natural generalizations of the well-known Restricted Isometry Property (RIP) towards nonlinear measurements, which allow us to prove both u…

    Submitted 7 November, 2013; originally announced November 2013.

  13. arXiv:1307.5725  [pdf, other]

    math.NA cs.IT

    Damping Noise-Folding and Enhanced Support Recovery in Compressed Sensing

    Authors: Marco Artina, Massimo Fornasier, Steffen Peter

    Abstract: The practice of compressed sensing suffers importantly in terms of the efficiency/accuracy trade-off when acquiring noisy signals prior to measurement. It is rather common to find results treating the noise affecting the measurements, avoiding in this way to face the so-called $\textit{noise-folding}$ phenomenon, related to the noise in the signal, eventually amplified by the measurement procedure…

    Submitted 24 November, 2014; v1 submitted 22 July, 2013; originally announced July 2013.

    Comments: 33 pages

    MSC Class: 94A12; 94A20; 65F22; 90C05; 90C30

  14. arXiv:1008.3043  [pdf, ps, other]

    math.NA cs.CC cs.LG stat.ML

    Learning Functions of Few Arbitrary Linear Parameters in High Dimensions

    Authors: Massimo Fornasier, Karin Schnass, Jan Vybiral

    Abstract: Let us assume that $f$ is a continuous function defined on the unit ball of $\mathbb R^d$, of the form $f(x) = g (A x)$, where $A$ is a $k \times d$ matrix and $g$ is a function of $k$ variables for $k \ll d$. We are given a budget $m \in \mathbb N$ of possible point evaluations $f(x_i)$, $i=1,...,m$, of $f$, which we are allowed to query in order to construct a uniform approximating function. Und…

    Submitted 17 January, 2012; v1 submitted 18 August, 2010; originally announced August 2010.

    Comments: 31 pages, this version was accepted to Foundations of Computational Mathematics, the final publication will be available on http://www.springerlink.com