Showing 1–4 of 4 results for author: Maile, K

  1. arXiv:2308.06103

    cs.LG

    Composable Function-preserving Expansions for Transformer Architectures

    Authors: Andrea Gesmundo, Kaitlin Maile

    Abstract: Training state-of-the-art neural networks requires a high cost in terms of compute and time. Model scale is recognized to be a critical factor to achieve and improve the state-of-the-art. Increasing the scale of a neural network normally requires restarting from scratch by randomly initializing all the parameters of the model, as this implies a change of architecture's parameters that does not all…

    Submitted 11 August, 2023; originally announced August 2023.

  2. arXiv:2210.05484

    cs.LG

    Equivariance-aware Architectural Optimization of Neural Networks

    Authors: Kaitlin Maile, Dennis G. Wilson, Patrick Forré

    Abstract: Incorporating equivariance to symmetry groups as a constraint during neural network training can improve performance and generalization for tasks exhibiting those symmetries, but such symmetries are often not perfectly nor explicitly present. This motivates algorithmically optimizing the architectural constraints imposed by equivariance. We propose the equivariance relaxation morphism, which prese…

    Submitted 7 February, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

  3. arXiv:2202.08539

    cs.LG

    When, where, and how to add new neurons to ANNs

    Authors: Kaitlin Maile, Emmanuel Rachelson, Hervé Luga, Dennis G. Wilson

    Abstract: Neurogenesis in ANNs is an understudied and difficult problem, even compared to other forms of structural learning like pruning. By decomposing it into triggers and initializations, we introduce a framework for studying the various facets of neurogenesis: when, where, and how to add neurons during the learning process. We present the Neural Orthogonality (NORTH*) suite of neurogenesis strategies,…

    Submitted 20 May, 2022; v1 submitted 17 February, 2022; originally announced February 2022.

    Comments: Accepted at the 1st AutoML conference, 2022

  4. arXiv:2106.11655

    cs.LG cs.AI

    DARTS-PRIME: Regularization and Scheduling Improve Constrained Optimization in Differentiable NAS

    Authors: Kaitlin Maile, Erwan Lecarpentier, Hervé Luga, Dennis G. Wilson

    Abstract: Differentiable Architecture Search (DARTS) is a recent neural architecture search (NAS) method based on a differentiable relaxation. Due to its success, numerous variants analyzing and improving parts of the DARTS framework have recently been proposed. By considering the problem as a constrained bilevel optimization, we present and analyze DARTS-PRIME, a variant including improvements to architect…

    Submitted 18 October, 2021; v1 submitted 22 June, 2021; originally announced June 2021.