Showing 1–6 of 6 results for author: Keller-Ressel, M

Searching in archive cs.
  1. arXiv:2402.01382  [pdf, other]

    stat.ML cs.LG

    Emergence of heavy tails in homogenized stochastic gradient descent

    Authors: Zhe Jiao, Martin Keller-Ressel

    Abstract: It has repeatedly been observed that loss minimization by stochastic gradient descent (SGD) leads to heavy-tailed distributions of neural network parameters. Here, we analyze a continuous diffusion approximation of SGD, called homogenized stochastic gradient descent, show that it behaves asymptotically heavy-tailed, and give explicit upper and lower bounds on its tail-index. We validate these boun…

    Submitted 2 February, 2024; originally announced February 2024.

    MSC Class: 60H30; 68Txx ACM Class: G.3; I.2.6

  2. arXiv:2305.06611  [pdf, other]

    cs.CV

    Hyperbolic Deep Learning in Computer Vision: A Survey

    Authors: Pascal Mettes, Mina Ghadimi Atigh, Martin Keller-Ressel, Jeffrey Gu, Serena Yeung

    Abstract: Deep representation learning is a ubiquitous part of modern computer vision. While Euclidean space has been the de facto standard manifold for learning visual representations, hyperbolic space has recently gained rapid traction for learning in computer vision. Specifically, hyperbolic learning has shown a strong potential to embed hierarchical structures, learn from limited samples, quantify uncer…

    Submitted 11 May, 2023; originally announced May 2023.

  3. arXiv:2207.06775  [pdf, other]

    stat.CO cs.LG math.MG

    Strain-Minimizing Hyperbolic Network Embeddings with Landmarks

    Authors: Martin Keller-Ressel, Stephanie Nargang

    Abstract: We introduce L-hydra (landmarked hyperbolic distance recovery and approximation), a method for embedding network- or distance-based data into hyperbolic space, which requires only the distance measurements to a few 'landmark nodes'. This landmark heuristic makes L-hydra applicable to large-scale graphs and improves upon previously introduced methods. As a mathematical justification, we show that a…

    Submitted 14 July, 2022; originally announced July 2022.

    MSC Class: 68R12; 91C15; 51M09 ACM Class: G.3

  4. arXiv:2106.14472  [pdf, other]

    cs.LG

    Hyperbolic Busemann Learning with Ideal Prototypes

    Authors: Mina Ghadimi Atigh, Martin Keller-Ressel, Pascal Mettes

    Abstract: Hyperbolic space has become a popular choice of manifold for representation learning of various datatypes from tree-like structures and text to graphs. Building on the success of deep learning with prototypes in Euclidean and hyperspherical spaces, a few recent works have proposed hyperbolic prototypes for classification. Such approaches enable effective learning in low-dimensional output spaces a…

    Submitted 23 November, 2021; v1 submitted 28 June, 2021; originally announced June 2021.

    Comments: accepted at NeurIPS 2021 (35th Conference on Neural Information Processing Systems)

  5. arXiv:2010.07744  [pdf, other]

    stat.ML cs.LG

    A Theory of Hyperbolic Prototype Learning

    Authors: Martin Keller-Ressel

    Abstract: We introduce Hyperbolic Prototype Learning, a type of supervised learning, where class labels are represented by ideal points (points at infinity) in hyperbolic space. Learning is achieved by minimizing the 'penalized Busemann loss', a new loss function based on the Busemann function of hyperbolic geometry. We discuss several theoretical features of this setup. In particular, Hyperbolic Prototype…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: 6 pages

    MSC Class: 68T07; 62J02 ACM Class: G.3; I.5

  6. arXiv:1903.08977  [pdf, other]

    stat.CO cs.LG math.MG

    Hydra: A method for strain-minimizing hyperbolic embedding of network- and distance-based data

    Authors: Martin Keller-Ressel, Stephanie Nargang

    Abstract: We introduce hydra (hyperbolic distance recovery and approximation), a new method for embedding network- or distance-based data into hyperbolic space. We show mathematically that hydra satisfies a certain optimality guarantee: It minimizes the 'hyperbolic strain' between original and embedded data points. Moreover, it recovers points exactly, when they are located on a hyperbolic submanifold of th…

    Submitted 3 September, 2019; v1 submitted 21 March, 2019; originally announced March 2019.

    MSC Class: 68Wxx; 51M10 ACM Class: G.2.2; G.3