Showing 1–2 of 2 results for author: Xiao, K L

  1. arXiv:2411.12135  [pdf, other]

    stat.ML cs.LG

    Exact Risk Curves of signSGD in High-Dimensions: Quantifying Preconditioning and Noise-Compression Effects

    Authors: Ke Liang Xiao, Noah Marshall, Atish Agarwala, Elliot Paquette

    Abstract: In recent years, signSGD has garnered interest as both a practical optimizer as well as a simple model to understand adaptive optimizers like Adam. Though there is a general consensus that signSGD acts to precondition optimization and reshapes noise, quantitatively understanding these effects in theoretically solvable settings remains difficult. We present an analysis of signSGD in a high dimensio…

    Submitted 21 February, 2025; v1 submitted 18 November, 2024; originally announced November 2024.

  2. arXiv:2406.11733  [pdf, other]

    stat.ML cs.LG

    To Clip or not to Clip: the Dynamics of SGD with Gradient Clipping in High-Dimensions

    Authors: Noah Marshall, Ke Liang Xiao, Atish Agarwala, Elliot Paquette

    Abstract: The success of modern machine learning is due in part to the adaptive optimization methods that have been developed to deal with the difficulties of training large models over complex datasets. One such method is gradient clipping: a practical procedure with limited theoretical underpinnings. In this work, we study clipping in a least squares problem under streaming SGD. We develop a theoretical a…

    Submitted 6 October, 2024; v1 submitted 17 June, 2024; originally announced June 2024.