
Showing 1–12 of 12 results for author: Brock, A

Searching in archive stat.
  1. arXiv:2302.10322  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation

    Authors: Bobby He, James Martens, Guodong Zhang, Aleksandar Botev, Andrew Brock, Samuel L Smith, Yee Whye Teh

    Abstract: Skip connections and normalisation layers form two standard architectural components that are ubiquitous for the training of Deep Neural Networks (DNNs), but whose precise roles are poorly understood. Recent approaches such as Deep Kernel Shaping have made progress towards reducing our reliance on them, using insights from wide NN kernel theory to improve signal propagation in vanilla DNNs (which…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: ICLR 2023

  2. arXiv:2102.06171  [pdf, other]

    cs.CV cs.LG stat.ML

    High-Performance Large-Scale Image Recognition Without Normalization

    Authors: Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan

    Abstract: Batch normalization is a key component of most image classification models, but it has many undesirable properties stemming from its dependence on the batch size and interactions between examples. Although recent work has succeeded in training deep ResNets without normalization layers, these models do not match the test accuracies of the best batch-normalized networks, and are often unstable for l…

    Submitted 11 February, 2021; originally announced February 2021.

  3. arXiv:2101.08692  [pdf, other]

    cs.LG cs.CV stat.ML

    Characterizing signal propagation to close the performance gap in unnormalized ResNets

    Authors: Andrew Brock, Soham De, Samuel L. Smith

    Abstract: Batch Normalization is a key component in almost all state-of-the-art image classifiers, but it also introduces practical challenges: it breaks the independence between training examples within a batch, can incur compute and memory overhead, and often results in unexpected bugs. Building on recent theoretical analyses of deep ResNets at initialization, we propose a simple set of analysis tools to…

    Submitted 27 January, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

    Comments: Published as a conference paper at ICLR 2021

  4. arXiv:2010.15040  [pdf, other]

    stat.ML cs.LG

    Training Generative Adversarial Networks by Solving Ordinary Differential Equations

    Authors: Chongli Qin, Yan Wu, Jost Tobias Springenberg, Andrew Brock, Jeff Donahue, Timothy P. Lillicrap, Pushmeet Kohli

    Abstract: The instability of Generative Adversarial Network (GAN) training has frequently been attributed to gradient descent. Consequently, recent methods have aimed to tailor the models and training procedures to stabilise the discrete updates. In contrast, we study the continuous-time dynamics induced by GAN training. Both theory and toy experiments suggest that these dynamics are in fact surprisingly st…

    Submitted 28 November, 2020; v1 submitted 28 October, 2020; originally announced October 2020.

  5. arXiv:2010.10241  [pdf, ps, other]

    stat.ML cs.CV cs.LG

    BYOL works even without batch statistics

    Authors: Pierre H. Richemond, Jean-Bastien Grill, Florent Altché, Corentin Tallec, Florian Strub, Andrew Brock, Samuel Smith, Soham De, Razvan Pascanu, Bilal Piot, Michal Valko

    Abstract: Bootstrap Your Own Latent (BYOL) is a self-supervised learning approach for image representation. From an augmented view of an image, BYOL trains an online network to predict a target network representation of a different augmented view of the same image. Unlike contrastive methods, BYOL does not explicitly use a repulsion term built from negative pairs in its training objective. Yet, it avoids co…

    Submitted 20 October, 2020; originally announced October 2020.

  6. arXiv:2004.02967  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Evolving Normalization-Activation Layers

    Authors: Hanxiao Liu, Andrew Brock, Karen Simonyan, Quoc V. Le

    Abstract: Normalization layers and activation functions are fundamental components in deep networks and typically co-locate with each other. Here we propose to design them using an automated approach. Instead of designing them separately, we unify them into a single tensor-to-tensor computation graph, and evolve its structure starting from basic mathematical functions. Examples of such mathematical function…

    Submitted 17 July, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

  7. arXiv:1902.00465  [pdf, other]

    cs.LG cs.AI cs.DC stat.ML

    TF-Replicator: Distributed Machine Learning for Researchers

    Authors: Peter Buchlovsky, David Budden, Dominik Grewe, Chris Jones, John Aslanides, Frederic Besse, Andy Brock, Aidan Clark, Sergio Gómez Colmenarejo, Aedan Pope, Fabio Viola, Dan Belov

    Abstract: We describe TF-Replicator, a framework for distributed machine learning designed for DeepMind researchers and implemented as an abstraction over TensorFlow. TF-Replicator simplifies writing data-parallel and model-parallel research code. The same models can be effortlessly deployed to different cluster architectures (i.e. one or many machines containing CPUs, GPUs or TPU accelerators) using synchr…

    Submitted 1 February, 2019; originally announced February 2019.

  8. arXiv:1809.11096  [pdf, other]

    cs.LG stat.ML

    Large Scale GAN Training for High Fidelity Natural Image Synthesis

    Authors: Andrew Brock, Jeff Donahue, Karen Simonyan

    Abstract: Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. To this end, we train Generative Adversarial Networks at the largest scale yet attempted, and study the instabilities specific to such scale. We find that applying orthogonal regularization to the generator renders it amenabl…

    Submitted 25 February, 2019; v1 submitted 28 September, 2018; originally announced September 2018.

  9. arXiv:1711.01297  [pdf, other]

    stat.ML cs.LG

    Implicit Weight Uncertainty in Neural Networks

    Authors: Nick Pawlowski, Andrew Brock, Matthew C. H. Lee, Martin Rajchl, Ben Glocker

    Abstract: Modern neural networks tend to be overconfident on unseen, noisy or incorrectly labelled data and do not produce meaningful uncertainty measures. Bayesian deep learning aims to address this shortcoming with variational approximations (such as Bayes by Backprop or Multiplicative Normalising Flows). However, current approaches have limitations regarding flexibility and scalability. We introduce Baye…

    Submitted 25 May, 2018; v1 submitted 3 November, 2017; originally announced November 2017.

    Comments: Submitted to NIPS 2018, under review

  10. arXiv:1706.04983  [pdf, other]

    stat.ML cs.LG

    FreezeOut: Accelerate Training by Progressively Freezing Layers

    Authors: Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston

    Abstract: The early layers of a deep neural net have the fewest parameters, but take up the most computation. In this extended abstract, we propose to only train the hidden layers for a set portion of the training run, freezing them out one-by-one and excluding them from the backward pass. Through experiments on CIFAR, we empirically demonstrate that FreezeOut yields savings of up to 20% wall-clock time dur…

    Submitted 18 June, 2017; v1 submitted 15 June, 2017; originally announced June 2017.

    Comments: Extended Abstract

  11. arXiv:1609.07093  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Neural Photo Editing with Introspective Adversarial Networks

    Authors: Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston

    Abstract: The increasingly photorealistic sample quality of generative image models suggests their feasibility in applications beyond image generation. We present the Neural Photo Editor, an interface that leverages the power of generative neural networks to make large, semantically coherent changes to existing images. To tackle the challenge of achieving accurate reconstructions without loss of feature qua…

    Submitted 6 February, 2017; v1 submitted 22 September, 2016; originally announced September 2016.

    Comments: 10 pages, 7 figures, 3 tables

  12. arXiv:1608.04236  [pdf, other]

    cs.CV cs.HC cs.LG stat.ML

    Generative and Discriminative Voxel Modeling with Convolutional Neural Networks

    Authors: Andrew Brock, Theodore Lim, J. M. Ritchie, Nick Weston

    Abstract: When working with three-dimensional data, choice of representation is key. We explore voxel-based models, and present evidence for the viability of voxellated representations in applications including shape modeling and object classification. Our key contributions are methods for training voxel-based variational autoencoders, a user interface for exploring the latent space learned by the autoencod…

    Submitted 16 August, 2016; v1 submitted 15 August, 2016; originally announced August 2016.

    Comments: 9 pages, 5 figures, 2 tables