Showing 1–5 of 5 results for author: Grosse, R

Searching in archive math.
  1. arXiv:2203.00089  [pdf, other]

    cs.LG math.OC stat.ML

    Amortized Proximal Optimization

    Authors: Juhan Bae, Paul Vicol, Jeff Z. HaoChen, Roger Grosse

    Abstract: We propose a framework for online meta-optimization of parameters that govern optimization, called Amortized Proximal Optimization (APO). We first interpret various existing neural network optimizers as approximate stochastic proximal point methods which trade off the current-batch loss with proximity terms in both function space and weight space. The idea behind APO is to amortize the minimizatio…

    Submitted 28 February, 2022; originally announced March 2022.

    Comments: 37 pages, 30 figures

  2. arXiv:2102.09468  [pdf, other]

    cs.LG math.OC stat.ML

    Near-optimal Local Convergence of Alternating Gradient Descent-Ascent for Minimax Optimization

    Authors: Guodong Zhang, Yuanhao Wang, Laurent Lessard, Roger Grosse

    Abstract: Smooth minimax games often proceed by simultaneous or alternating gradient updates. Although algorithms with alternating updates are commonly used in practice, the majority of existing theoretical analyses focus on simultaneous algorithms for convenience of analysis. In this paper, we study alternating gradient descent-ascent (Alt-GDA) in minimax games and show that Alt-GDA is superior to its simu…

    Submitted 12 February, 2022; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: AISTATS 2022

  3. arXiv:2009.11359  [pdf, other]

    math.OC cs.LG stat.ML

    A Unified Analysis of First-Order Methods for Smooth Games via Integral Quadratic Constraints

    Authors: Guodong Zhang, Xuchan Bao, Laurent Lessard, Roger Grosse

    Abstract: The theory of integral quadratic constraints (IQCs) allows the certification of exponential convergence of interconnected systems containing nonlinear or uncertain elements. In this work, we adapt the IQC theory to study first-order methods for smooth and strongly-monotone games and show how to design tailored quadratic constraints to get tight upper bounds of convergence rates. Using this framewo…

    Submitted 26 April, 2021; v1 submitted 23 September, 2020; originally announced September 2020.

    Comments: Journal of Machine Learning Research

  4. arXiv:1808.10340  [pdf, ps, other]

    cs.LG math.DG math.OC stat.ML

    A Coordinate-Free Construction of Scalable Natural Gradient

    Authors: Kevin Luk, Roger Grosse

    Abstract: Most neural networks are trained using first-order optimization methods, which are sensitive to the parameterization of the model. Natural gradient descent is invariant to smooth reparameterizations because it is defined in a coordinate-free way, but tractable approximations are typically defined in terms of coordinate systems, and hence may lose the invariance properties. We analyze the invarianc…

    Submitted 30 August, 2018; originally announced August 2018.

  5. arXiv:1804.00325  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    Aggregated Momentum: Stability Through Passive Damping

    Authors: James Lucas, Shengyang Sun, Richard Zemel, Roger Grosse

    Abstract: Momentum is a simple and widely used trick which allows gradient-based optimizers to pick up speed along low curvature directions. Its performance depends crucially on a damping coefficient $β$. Large $β$ values can potentially deliver much larger speedups, but are prone to oscillations and instability; hence one typically resorts to small values such as 0.5 or 0.9. We propose Aggregated Momentum…

    Submitted 1 May, 2019; v1 submitted 1 April, 2018; originally announced April 2018.

    Comments: 11 primary pages, 11 supplementary pages, 12 figures total

    Journal ref: International Conference on Learning Representations, 2019