Showing 1–6 of 6 results for author: Munn, M

  1. arXiv:2505.13397  [pdf, other]

    cs.LG math.NA stat.ML

    Learning by solving differential equations

    Authors: Benoit Dherin, Michael Munn, Hanna Mazzawi, Michael Wunder, Sourabh Medapati, Javier Gonzalvo

    Abstract: Modern deep learning algorithms use variations of gradient descent as their main learning methods. Gradient descent can be understood as the simplest Ordinary Differential Equation (ODE) solver; namely, the Euler method applied to the gradient flow differential equation. Since Euler, many ODE solvers have been devised that follow the gradient flow equation more precisely and more stably. Runge-Kut…

    Submitted 19 May, 2025; originally announced May 2025.

  2. arXiv:2502.01557  [pdf, other]

    cs.LG math.DS stat.ML

    Training in reverse: How iteration order influences convergence and stability in deep learning

    Authors: Benoit Dherin, Benny Avelin, Anders Karlsson, Hanna Mazzawi, Javier Gonzalvo, Michael Munn

    Abstract: Despite exceptional achievements, training neural networks remains computationally expensive and is often plagued by instabilities that can degrade convergence. While learning rate schedules can help mitigate these issues, finding optimal schedules is time-consuming and resource-intensive. This work explores theoretical issues concerning training stability in the constant-learning-rate (i.e., with…

    Submitted 3 February, 2025; originally announced February 2025.

  3. arXiv:2405.18590  [pdf, other]

    stat.ML cs.LG

    A Margin-based Multiclass Generalization Bound via Geometric Complexity

    Authors: Michael Munn, Benoit Dherin, Javier Gonzalvo

    Abstract: There has been considerable effort to better understand the generalization capabilities of deep neural networks both as a means to unlock a theoretical understanding of their success as well as providing directions for further improvements. In this paper, we investigate margin-based multiclass generalization bounds for neural networks which rely on a recent complexity measure, the geometric comple…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Accepted as an ICML 2023 workshop paper (Topology, Algebra and Geometry in Machine Learning)

    Journal ref: Proceedings of 2nd Annual Workshop on Topology, Algebra, and Geometry in Machine Learning (TAG-ML), PMLR 221:189-205, 2023

  4. arXiv:2209.13083  [pdf, other]

    cs.LG stat.ML

    Why neural networks find simple solutions: the many regularizers of geometric complexity

    Authors: Benoit Dherin, Michael Munn, Mihaela Rosca, David G. T. Barrett

    Abstract: In many contexts, simpler models are preferable to more complex models and the control of this model complexity is the goal for many methods in machine learning such as regularization, hyperparameter tuning and architecture design. In deep learning, it has been difficult to understand the underlying mechanisms of complexity control, since many traditional measures are not naturally suitable for de…

    Submitted 23 December, 2022; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: Accepted as a NeurIPS 2022 paper

  5. arXiv:2111.15090  [pdf, other]

    cs.LG stat.ML

    The Geometric Occam's Razor Implicit in Deep Learning

    Authors: Benoit Dherin, Michael Munn, David G. T. Barrett

    Abstract: In over-parameterized deep neural networks there can be many possible parameter configurations that fit the training data exactly. However, the properties of these interpolating solutions are poorly understood. We argue that over-parameterized neural networks trained with stochastic gradient descent are subject to a Geometric Occam's Razor; that is, these networks are implicitly regularized by the…

    Submitted 30 November, 2021; v1 submitted 29 November, 2021; originally announced November 2021.

    Comments: Accepted as a NeurIPS 2021 workshop paper (OPT2021)

  6. arXiv:2006.08571  [pdf, other]

    stat.ML cs.LG

    COT-GAN: Generating Sequential Data via Causal Optimal Transport

    Authors: Tianlin Xu, Li K. Wenliang, Michael Munn, Beatrice Acciaio

    Abstract: We introduce COT-GAN, an adversarial algorithm to train implicit generative models optimized for producing sequential data. The loss function of this algorithm is formulated using ideas from Causal Optimal Transport (COT), which combines classic optimal transport methods with an additional temporal causality constraint. Remarkably, we find that this causality condition provides a natural framework…

    Submitted 21 October, 2020; v1 submitted 15 June, 2020; originally announced June 2020.