Showing 1–15 of 15 results for author: Peluchetti, S

Searching in archive cs.
  1. arXiv:2409.09376  [pdf, other]

    cs.LG stat.ML

    BM$^2$: Coupled Schrödinger Bridge Matching

    Authors: Stefano Peluchetti

    Abstract: A Schrödinger bridge establishes a dynamic transport map between two target distributions via a reference process, simultaneously solving an associated entropic optimal transport problem. We consider the setting where samples from the target distributions are available, and the reference diffusion process admits tractable dynamics. We thus introduce Coupled Bridge Matching (BM$^2$), a simple non-i…

    Submitted 18 January, 2025; v1 submitted 14 September, 2024; originally announced September 2024.

    Comments: Archival of: TMLR, 12/2024, https://openreview.net/forum?id=fqkq1MgONB

  2. arXiv:2408.14325  [pdf, other]

    cs.LG stat.ML

    Function-Space MCMC for Bayesian Wide Neural Networks

    Authors: Lucia Pezzetti, Stefano Favaro, Stefano Peluchetti

    Abstract: Bayesian Neural Networks represent a fascinating confluence of deep learning and probabilistic reasoning, offering a compelling framework for understanding uncertainty in complex predictive models. In this paper, we investigate the use of the preconditioned Crank-Nicolson algorithm and its Langevin version to sample from a reparametrised posterior distribution of the neural network's weights, as t…

    Submitted 9 March, 2025; v1 submitted 26 August, 2024; originally announced August 2024.

  3. arXiv:2312.14589  [pdf, other]

    cs.LG stat.ML

    Non-Denoising Forward-Time Diffusions

    Authors: Stefano Peluchetti

    Abstract: The scope of this paper is generative modeling through diffusion processes. An approach falling within this paradigm is the work of Song et al. (2021), which relies on a time-reversal argument to construct a diffusion process targeting the desired data distribution. We show that the time-reversal argument, common to all denoising diffusion probabilistic modeling proposals, is not necessary. We obt…

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: original date: 18 Nov 2021; archival of ICLR submission (https://openreview.net/forum?id=oVfIKuhqfC); no differences

  4. arXiv:2304.00917  [pdf, other]

    stat.ML cs.LG

    Diffusion Bridge Mixture Transports, Schrödinger Bridge Problems and Generative Modeling

    Authors: Stefano Peluchetti

    Abstract: The dynamic Schrödinger bridge problem seeks a stochastic process that defines a transport between two target probability measures, while optimally satisfying the criteria of being closest, in terms of Kullback-Leibler divergence, to a reference process. We propose a novel sampling-based iterative algorithm, the iterated diffusion bridge mixture (IDBM) procedure, aimed at solving the dynamic Schrö…

    Submitted 22 December, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

    Journal ref: Journal of Machine Learning Research 24(374):1-51, 2023

  5. arXiv:2206.08065  [pdf, other]

    cs.LG math.ST stat.ML

    Large-width asymptotics for ReLU neural networks with $α$-Stable initializations

    Authors: Stefano Favaro, Sandra Fortini, Stefano Peluchetti

    Abstract: There is a recent and growing literature on large-width asymptotic properties of Gaussian neural networks (NNs), namely NNs whose weights are initialized as Gaussian distributions. Two popular problems are: i) the study of the large-width distributions of NNs, which characterizes the infinitely wide limit of a rescaled NN in terms of a Gaussian stochastic process; ii) the study of the large-width…

    Submitted 4 January, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: 29 pages

  6. arXiv:2108.02316  [pdf, other]

    cs.LG math.PR math.ST stat.ML

    Deep Stable neural networks: large-width asymptotics and convergence rates

    Authors: Stefano Favaro, Sandra Fortini, Stefano Peluchetti

    Abstract: In modern deep learning, there is a recent and growing literature on the interplay between large-width asymptotic properties of deep Gaussian neural networks (NNs), i.e. deep NNs with Gaussian-distributed weights, and Gaussian stochastic processes (SPs). Such an interplay has proved to be critical in Bayesian inference under Gaussian SP priors, kernel regression for infinitely wide deep NNs traine…

    Submitted 23 June, 2022; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: Improve the proof of the main result in arXiv:2003.00394, and study convergence rates

  7. arXiv:2106.03780  [pdf, other]

    cs.LG cs.AI eess.SY math.OC

    Learning Stochastic Optimal Policies via Gradient Descent

    Authors: Stefano Massaroli, Michael Poli, Stefano Peluchetti, Jinkyoo Park, Atsushi Yamashita, Hajime Asama

    Abstract: We systematically develop a learning-based treatment of stochastic optimal control (SOC), relying on direct optimization of parametric control policies. We propose a derivation of adjoint sensitivity results for stochastic differential equations through direct application of variational calculus. Then, given an objective function for a predetermined task specifying the desiderata for the controlle…

    Submitted 7 June, 2021; originally announced June 2021.

    Journal ref: IEEE Control Systems Letters, 2021

  8. arXiv:2102.10307  [pdf, other]

    math.PR cs.LG stat.ML

    Large-width functional asymptotics for deep Gaussian neural networks

    Authors: Daniele Bracale, Stefano Favaro, Sandra Fortini, Stefano Peluchetti

    Abstract: In this paper, we consider fully connected feed-forward deep neural networks where weights and biases are independent and identically distributed according to Gaussian distributions. Extending previous results (Matthews et al., 2018a;b; Yang, 2019) we adopt a function-space perspective, i.e. we look at neural networks as infinite-dimensional random elements on the input space $\mathbb{R}^I$. Under…

    Submitted 20 February, 2021; originally announced February 2021.

    Journal ref: International Conference on Learning Representations (ICLR) 2021

  9. arXiv:2102.04462  [pdf, other]

    stat.ML cs.LG math.ST

    Learning-augmented count-min sketches via Bayesian nonparametrics

    Authors: Emanuele Dolera, Stefano Favaro, Stefano Peluchetti

    Abstract: The count-min sketch (CMS) is a time and memory efficient randomized data structure that provides estimates of tokens' frequencies in a data stream of tokens, i.e. point queries, based on random hashed data. A learning-augmented version of the CMS, referred to as CMS-DP, has been proposed by Cai, Mitzenmacher and Adams (NeurIPS 2018), and it relies on Bayesian nonparametric (BNP) modeling…

    Submitted 13 September, 2022; v1 submitted 8 February, 2021; originally announced February 2021.

    Comments: 47 pages

  10. arXiv:2102.03743  [pdf, ps, other]

    stat.ML cs.LG

    A Bayesian nonparametric approach to count-min sketch under power-law data streams

    Authors: Emanuele Dolera, Stefano Favaro, Stefano Peluchetti

    Abstract: The count-min sketch (CMS) is a randomized data structure that provides estimates of tokens' frequencies in a large data stream using a compressed representation of the data by random hashing. In this paper, we rely on a recent Bayesian nonparametric (BNP) view on the CMS to develop a novel learning-augmented CMS under power-law data streams. We assume that tokens in the stream are drawn from an u…

    Submitted 11 February, 2021; v1 submitted 7 February, 2021; originally announced February 2021.

  11. arXiv:2102.03739  [pdf, ps, other]

    stat.ML cs.LG

    Infinite-channel deep stable convolutional neural networks

    Authors: Daniele Bracale, Stefano Favaro, Sandra Fortini, Stefano Peluchetti

    Abstract: The interplay between infinite-width neural networks (NNs) and classes of Gaussian processes (GPs) has been well known since the seminal work of Neal (1996). While numerous theoretical refinements have been proposed in recent years, the interplay between NNs and GPs relies on two critical distributional assumptions on the NN's parameters: A1) finite variance; A2) independent and identical distributi…

    Submitted 25 April, 2022; v1 submitted 7 February, 2021; originally announced February 2021.

    Comments: 20 pages, 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

    Journal ref: Neural Information Processing Systems (NeurIPS 2021)

  12. arXiv:2007.03253  [pdf, other]

    stat.ML cs.LG math.ST

    Doubly infinite residual neural networks: a diffusion process approach

    Authors: Stefano Peluchetti, Stefano Favaro

    Abstract: Modern neural networks (NNs) featuring a large number of layers (depth) and units per layer (width) have achieved remarkable performance across many domains. While there exists a vast literature on the interplay between infinitely wide NNs and Gaussian processes, little is known about analogous interplays with respect to infinitely deep NNs. NNs with independent and identically distributed (i.i…

    Submitted 18 September, 2021; v1 submitted 7 July, 2020; originally announced July 2020.

    Journal ref: Journal of Machine Learning Research 22(175):1-48, 2021

  13. arXiv:2003.00394  [pdf, other]

    stat.ML cs.LG

    Stable behaviour of infinitely wide deep neural networks

    Authors: Stefano Favaro, Sandra Fortini, Stefano Peluchetti

    Abstract: We consider fully connected feed-forward deep neural networks (NNs) where weights and biases are independent and identically distributed as symmetric centered stable distributions. Then, we show that the infinitely wide limit of the NN, under suitable scaling on the weights, is a stochastic process whose finite-dimensional distributions are multivariate stable distributions. The limiting process is…

    Submitted 29 February, 2020; originally announced March 2020.

    Comments: 25 pages, 3 figures

  14. arXiv:1910.01319  [pdf, other]

    cs.LG stat.ML

    An empirical study of pretrained representations for few-shot classification

    Authors: Tiago Ramalho, Thierry Sousbie, Stefano Peluchetti

    Abstract: Recent algorithms with state-of-the-art few-shot classification results start their procedure by computing data features output by a large pretrained model. In this paper we systematically investigate which models provide the best representations for a few-shot image classification task when pretrained on the Imagenet dataset. We test their representations when used as the starting point for diffe…

    Submitted 3 October, 2019; originally announced October 2019.

  15. arXiv:1905.11065  [pdf, other]

    stat.ML cs.LG

    Infinitely deep neural networks as diffusion processes

    Authors: Stefano Peluchetti, Stefano Favaro

    Abstract: When the parameters are independently and identically distributed (initialized), neural networks exhibit undesirable properties that emerge as the number of layers increases, e.g. a vanishing dependency on the input and a concentration on restrictive families of functions including constant functions. We consider parameter distributions that shrink as the number of layers increases in order to reco…

    Submitted 29 February, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: 16 pages, 9 figures