Showing 1–4 of 4 results for author: Swirszcz, G

  1. arXiv:2001.06232  [pdf, other]

    cs.LG cs.CV stat.ML

    Sideways: Depth-Parallel Training of Video Models

    Authors: Mateusz Malinowski, Grzegorz Swirszcz, Joao Carreira, Viorica Patraucean

    Abstract: We propose Sideways, an approximate backpropagation scheme for training video models. In standard backpropagation, the gradients and activations at every computation step through the model are temporally synchronized. The forward activations need to be stored until the backward pass is executed, preventing inter-layer (depth) parallelization. However, can we leverage smooth, redundant input stream…

    Submitted 30 March, 2020; v1 submitted 17 January, 2020; originally announced January 2020.

    Comments: Accepted at CVPR'20

  2. arXiv:1902.09592  [pdf, other]

    cs.LG stat.ML

    Verification of Non-Linear Specifications for Neural Networks

    Authors: Chongli Qin, Krishnamurthy Dvijotham, Brendan O'Donoghue, Rudy Bunel, Robert Stanforth, Sven Gowal, Jonathan Uesato, Grzegorz Swirszcz, Pushmeet Kohli

    Abstract: Prior work on neural network verification has focused on specifications that are linear functions of the output of the network, e.g., invariance of the classifier output under adversarial perturbations of the input. In this paper, we extend verification algorithms to be able to certify richer properties of neural networks. To do this we introduce the class of convex-relaxable specifications, which…

    Submitted 25 February, 2019; originally announced February 2019.

    Comments: ICLR conference paper

  3. arXiv:1902.02186  [pdf, other]

    cs.LG cs.AI stat.ML

    Distilling Policy Distillation

    Authors: Wojciech Marian Czarnecki, Razvan Pascanu, Simon Osindero, Siddhant M. Jayakumar, Grzegorz Swirszcz, Max Jaderberg

    Abstract: The transfer of knowledge from one policy to another is an important tool in Deep Reinforcement Learning. This process, referred to as distillation, has been used to great success, for example, by enhancing the optimisation of agents, leading to stronger performance faster, on harder domains [26, 32, 5, 8]. Despite the widespread use and conceptual simplicity of distillation, many different formul…

    Submitted 6 February, 2019; originally announced February 2019.

    Comments: Accepted at AISTATS 2019

  4. arXiv:1611.06310  [pdf, other]

    stat.ML cs.LG cs.NE

    Local minima in training of neural networks

    Authors: Grzegorz Swirszcz, Wojciech Marian Czarnecki, Razvan Pascanu

    Abstract: There has been a lot of recent interest in trying to characterize the error surface of deep models. This stems from a long standing question. Given that deep networks are highly nonlinear systems optimized by local gradient methods, why do they not seem to be affected by bad local minima? It is widely believed that training of deep models using gradient methods works so well because the error surf…

    Submitted 17 February, 2017; v1 submitted 19 November, 2016; originally announced November 2016.