
Showing 1–13 of 13 results for author: Osindero, S

  1. arXiv:2106.03517  [pdf, other]

    cs.LG stat.ML

    Top-KAST: Top-K Always Sparse Training

    Authors: Siddhant M. Jayakumar, Razvan Pascanu, Jack W. Rae, Simon Osindero, Erich Elsen

    Abstract: Sparse neural networks are becoming increasingly important as the field seeks to improve the performance of existing models by scaling them up, while simultaneously trying to reduce power consumption and computational footprint. Unfortunately, most existing methods for inducing performant sparse models still entail the instantiation of dense parameters, or dense gradients in the backward-pass, dur…

    Submitted 7 June, 2021; originally announced June 2021.

    Journal ref: Advances in Neural Information Processing Systems, 33, 20744-20754

  2. arXiv:2011.03535  [pdf]

    cs.NE cs.LG stat.ML

    Contrastive Topographic Models: Energy-based density models applied to the understanding of sensory coding and cortical topography

    Authors: Simon Osindero

    Abstract: We address the problem of building theoretical models that help elucidate the function of the visual brain at computational/algorithmic and structural/mechanistic levels. We seek to understand how the receptive fields and topographic maps found in visual cortical areas relate to underlying computational desiderata. We view the development of sensory systems from the popular perspective of probabil…

    Submitted 5 November, 2020; originally announced November 2020.

  3. arXiv:2009.12583  [pdf, other]

    cs.LG stat.ML

    Small Data, Big Decisions: Model Selection in the Small-Data Regime

    Authors: Jorg Bornschein, Francesco Visin, Simon Osindero

    Abstract: Highly overparametrized neural networks can display curiously strong generalization performance - a phenomenon that has recently garnered a wealth of theoretical and empirical research in order to better understand it. In contrast to most previous work, which typically considers the performance as a function of the model size, in this paper we empirically study the generalization performance as th…

    Submitted 26 September, 2020; originally announced September 2020.

    Journal ref: Proceedings of the International Conference on Machine Learning (ICML 2020)

  4. arXiv:2006.07360  [pdf, other]

    cs.LG stat.ML

    AlgebraNets

    Authors: Jordan Hoffmann, Simon Schmitt, Simon Osindero, Karen Simonyan, Erich Elsen

    Abstract: Neural networks have historically been built layerwise from the set of functions in ${f: \mathbb{R}^n \to \mathbb{R}^m }$, i.e. with activations and weights/parameters represented by real numbers, $\mathbb{R}$. Our work considers a richer set of objects for activations and weights, and undertakes a comprehensive study of alternative algebras as number representations by studying their performance…

    Submitted 16 June, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

  5. arXiv:2006.07232  [pdf, other]

    cs.LG cs.NE stat.ML

    A Practical Sparse Approximation for Real Time Recurrent Learning

    Authors: Jacob Menick, Erich Elsen, Utku Evci, Simon Osindero, Karen Simonyan, Alex Graves

    Abstract: Current methods for training recurrent neural networks are based on backpropagation through time, which requires storing a complete history of network states, and prohibits updating the weights `online' (after every timestep). Real Time Recurrent Learning (RTRL) eliminates the need for history storage and allows for online weight updates, but does so at the expense of computational costs that are…

    Submitted 12 June, 2020; originally announced June 2020.

  6. arXiv:1912.07559  [pdf, other]

    cs.LG stat.ML

    A Deep Neural Network's Loss Surface Contains Every Low-dimensional Pattern

    Authors: Wojciech Marian Czarnecki, Simon Osindero, Razvan Pascanu, Max Jaderberg

    Abstract: The work "Loss Landscape Sightseeing with Multi-Point Optimization" (Skorokhodov and Burtsev, 2019) demonstrated that one can empirically find arbitrary 2D binary patterns inside loss surfaces of popular neural networks. In this paper we prove that: (i) this is a general property of deep universal approximators; and (ii) this property holds for arbitrary smooth patterns, for other dimensionalities…

    Submitted 2 January, 2020; v1 submitted 16 December, 2019; originally announced December 2019.

  7. arXiv:1912.06910  [pdf, other]

    cs.LG cs.AI stat.ML

    Adapting Behaviour for Learning Progress

    Authors: Tom Schaul, Diana Borsa, David Ding, David Szepesvari, Georg Ostrovski, Will Dabney, Simon Osindero

    Abstract: Determining what experience to generate to best facilitate learning (i.e. exploration) is one of the distinguishing features and open challenges in reinforcement learning. The advent of distributed agents that interact with parallel instances of the environment has enabled larger scales and greater flexibility, but has not removed the need to tune exploration to the task, because the ideal data fo…

    Submitted 14 December, 2019; originally announced December 2019.

  8. arXiv:1910.02720  [pdf, other]

    stat.ML cs.LG cs.NE

    Meta-Learning Deep Energy-Based Memory Models

    Authors: Sergey Bartunov, Jack W Rae, Simon Osindero, Timothy P Lillicrap

    Abstract: We study the problem of learning associative memory -- a system which is able to retrieve a remembered pattern based on its distorted or incomplete version. Attractor networks provide a sound model of associative memory: patterns are stored as attractors of the network dynamics and associative retrieval is performed by running the dynamics starting from a query pattern until it converges to an att…

    Submitted 20 April, 2021; v1 submitted 7 October, 2019; originally announced October 2019.

    Comments: ICLR 2020

  9. arXiv:1905.03030  [pdf, other]

    cs.LG cs.AI stat.ML

    Meta-learning of Sequential Strategies

    Authors: Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg

    Abstract: In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal pred…

    Submitted 18 July, 2019; v1 submitted 8 May, 2019; originally announced May 2019.

    Comments: DeepMind Technical Report (15 pages, 6 figures). Version V1.1

  10. arXiv:1902.02186  [pdf, other]

    cs.LG cs.AI stat.ML

    Distilling Policy Distillation

    Authors: Wojciech Marian Czarnecki, Razvan Pascanu, Simon Osindero, Siddhant M. Jayakumar, Grzegorz Swirszcz, Max Jaderberg

    Abstract: The transfer of knowledge from one policy to another is an important tool in Deep Reinforcement Learning. This process, referred to as distillation, has been used to great success, for example, by enhancing the optimisation of agents, leading to stronger performance faster, on harder domains [26, 32, 5, 8]. Despite the widespread use and conceptual simplicity of distillation, many different formul…

    Submitted 6 February, 2019; originally announced February 2019.

    Comments: Accepted at AISTATS 2019

  11. arXiv:1807.05960  [pdf, other]

    cs.LG cs.CV stat.ML

    Meta-Learning with Latent Embedding Optimization

    Authors: Andrei A. Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, Raia Hadsell

    Abstract: Gradient-based meta-learning techniques are both widely applicable and proficient at solving challenging few-shot learning and fast adaptation problems. However, they have practical difficulties when operating on high-dimensional parameter spaces in extreme low-data regimes. We show that it is possible to bypass these limitations by learning a data-dependent latent generative representation of mod…

    Submitted 26 March, 2019; v1 submitted 16 July, 2018; originally announced July 2018.

  12. arXiv:1806.01780  [pdf, other]

    cs.LG stat.ML

    Mix&Match - Agent Curricula for Reinforcement Learning

    Authors: Wojciech Marian Czarnecki, Siddhant M. Jayakumar, Max Jaderberg, Leonard Hasenclever, Yee Whye Teh, Simon Osindero, Nicolas Heess, Razvan Pascanu

    Abstract: We introduce Mix&Match (M&M) - a training framework designed to facilitate rapid and effective learning in RL agents, especially those that would be too slow or too challenging to train otherwise. The key innovation is a procedure that allows us to automatically form a curriculum over agents. Through such a curriculum we can progressively train more complex agents by, effectively, bootstrapping fr…

    Submitted 5 June, 2018; originally announced June 2018.

    Comments: ICML 2018

  13. arXiv:1411.1784  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Conditional Generative Adversarial Nets

    Authors: Mehdi Mirza, Simon Osindero

    Abstract: Generative Adversarial Nets [8] were recently introduced as a novel way to train generative models. In this work we introduce the conditional version of generative adversarial nets, which can be constructed by simply feeding the data, y, we wish to condition on to both the generator and discriminator. We show that this model can generate MNIST digits conditioned on class labels. We also illustrate…

    Submitted 6 November, 2014; originally announced November 2014.