Showing 1–2 of 2 results for author: Riabinin, A

  1. arXiv:2505.13416

    cs.LG math.OC stat.ML

    Gluon: Making Muon & Scion Great Again! (Bridging Theory and Practice of LMO-based Optimizers for LLMs)

    Authors: Artem Riabinin, Egor Shulgin, Kaja Gruntkowska, Peter Richtárik

    Abstract: Recent developments in deep learning optimization have brought about radically new algorithms based on the Linear Minimization Oracle (LMO) framework, such as $\sf Muon$ and $\sf Scion$. After over a decade of $\sf Adam$'s dominance, these LMO-based methods are emerging as viable replacements, offering several practical advantages such as improved memory efficiency, better hyperparameter transfera…

    Submitted 19 May, 2025; originally announced May 2025.

  2. arXiv:2502.12329

    cs.LG cs.AI math.OC stat.ML

    A Novel Unified Parametric Assumption for Nonconvex Optimization

    Authors: Artem Riabinin, Ahmed Khaled, Peter Richtárik

    Abstract: Nonconvex optimization is central to modern machine learning, but the general framework of nonconvex optimization yields weak convergence guarantees that are too pessimistic compared to practice. On the other hand, while convexity enables efficient optimization, it is of limited applicability to many practical problems. To bridge this gap and better understand the practical success of optimization…

    Submitted 17 February, 2025; originally announced February 2025.