Showing 1–3 of 3 results for author: Griniasty, I

  1. arXiv:2505.08915  [pdf, ps, other]

    cs.LG cond-mat.dis-nn cond-mat.stat-mech

    An Analytical Characterization of Sloppiness in Neural Networks: Insights from Linear Models

    Authors: Jialin Mao, Itay Griniasty, Yan Sun, Mark K. Transtrum, James P. Sethna, Pratik Chaudhari

    Abstract: Recent experiments have shown that training trajectories of multiple deep neural networks with different architectures, optimization algorithms, hyper-parameter settings, and regularization methods evolve on a remarkably low-dimensional "hyper-ribbon-like" manifold in the space of probability distributions. Inspired by the similarities in the training trajectories of deep networks and linear netwo…

    Submitted 13 May, 2025; originally announced May 2025.

  2. arXiv:2305.01604  [pdf, other]

    cs.LG cond-mat.dis-nn

    The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold

    Authors: Jialin Mao, Itay Griniasty, Han Kheng Teoh, Rahul Ramesh, Rubing Yang, Mark K. Transtrum, James P. Sethna, Pratik Chaudhari

    Abstract: We develop information-geometric techniques to analyze the trajectories of the predictions of deep networks during training. By examining the underlying high-dimensional probabilistic models, we reveal that the training process explores an effectively low-dimensional manifold. Networks with a wide range of architectures, sizes, trained using different optimization methods, regularization technique…

    Submitted 19 March, 2024; v1 submitted 2 May, 2023; originally announced May 2023.

    Journal ref: Proceedings of the National Academy of Sciences 121.12 (2024)

  3. arXiv:2210.17011  [pdf, other]

    cs.LG

    A picture of the space of typical learnable tasks

    Authors: Rahul Ramesh, Jialin Mao, Itay Griniasty, Rubing Yang, Han Kheng Teoh, Mark Transtrum, James P. Sethna, Pratik Chaudhari

    Abstract: We develop information geometric techniques to understand the representations learned by deep networks when they are trained on different tasks using supervised, meta-, semi-supervised and contrastive learning. We shed light on the following phenomena that relate to the structure of the space of tasks: (1) the manifold of probabilistic models trained on different tasks using different representati…

    Submitted 21 July, 2023; v1 submitted 30 October, 2022; originally announced October 2022.