
Showing 1–3 of 3 results for author: Giannou, A

  1. arXiv:2403.03183

    cs.LG cs.AI math.OC stat.ML

    How Well Can Transformers Emulate In-context Newton's Method?

    Authors: Angeliki Giannou, Liu Yang, Tianhao Wang, Dimitris Papailiopoulos, Jason D. Lee

    Abstract: Transformer-based models have demonstrated remarkable in-context learning capabilities, prompting extensive research into its underlying mechanisms. Recent studies have suggested that Transformers can implement first-order optimization algorithms for in-context learning and even second order ones for the case of linear regression. In this work, we study whether Transformers can perform higher orde…

    Submitted 5 March, 2024; originally announced March 2024.

  2. arXiv:2306.16502

    stat.ML cs.LG math.OC

    Stochastic Methods in Variational Inequalities: Ergodicity, Bias and Refinements

    Authors: Emmanouil-Vasileios Vlatakis-Gkaragkounis, Angeliki Giannou, Yudong Chen, Qiaomin Xie

    Abstract: For min-max optimization and variational inequalities problems (VIP) encountered in diverse machine learning tasks, Stochastic Extragradient (SEG) and Stochastic Gradient Descent Ascent (SGDA) have emerged as preeminent algorithms. Constant step-size variants of SEG/SGDA have gained popularity, with appealing benefits such as easy tuning and rapid forgiveness of initial conditions, but their conve…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: 37 pages, 6 main figures

  3. arXiv:2302.07937

    cs.LG cs.AI stat.ML

    The Expressive Power of Tuning Only the Normalization Layers

    Authors: Angeliki Giannou, Shashank Rajput, Dimitris Papailiopoulos

    Abstract: Feature normalization transforms such as Batch and Layer-Normalization have become indispensable ingredients of state-of-the-art deep neural networks. Recent studies on fine-tuning large pretrained models indicate that just tuning the parameters of these affine transforms can achieve high accuracy for downstream tasks. These findings open the questions about the expressive power of tuning the norm…

    Submitted 4 July, 2023; v1 submitted 15 February, 2023; originally announced February 2023.