Showing 1–5 of 5 results for author: Ruderman, A

Searching in archive stat.
  1. arXiv:1812.01647  [pdf, other]

    cs.LG cs.CR stat.ML

    Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures

    Authors: Jonathan Uesato, Ananya Kumar, Csaba Szepesvari, Tom Erez, Avraham Ruderman, Keith Anderson, Krishnamurthy Dvijotham, Nicolas Heess, Pushmeet Kohli

    Abstract: This paper addresses the problem of evaluating learning systems in safety critical domains such as autonomous driving, where failures can have catastrophic consequences. We focus on two problems: searching for scenarios when learned agents fail and assessing their probability of failure. The standard method for agent evaluation in reinforcement learning, Vanilla Monte Carlo, can miss failures enti…

    Submitted 4 December, 2018; originally announced December 2018.

  2. arXiv:1807.01281  [pdf, other]

    cs.LG cs.AI stat.ML

    Human-level performance in first-person multiplayer games with population-based deep reinforcement learning

    Authors: Max Jaderberg, Wojciech M. Czarnecki, Iain Dunning, Luke Marris, Guy Lever, Antonio Garcia Castaneda, Charles Beattie, Neil C. Rabinowitz, Ari S. Morcos, Avraham Ruderman, Nicolas Sonnerat, Tim Green, Louise Deason, Joel Z. Leibo, David Silver, Demis Hassabis, Koray Kavukcuoglu, Thore Graepel

    Abstract: Recent progress in artificial intelligence through reinforcement learning (RL) has shown great success on increasingly complex single-agent environments and two-player turn-based games. However, the real-world contains multiple agents, each learning and acting independently to cooperate and compete with other agents, and environments reflecting this degree of complexity remain an open challenge. I…

    Submitted 3 July, 2018; originally announced July 2018.

  3. arXiv:1804.04438  [pdf, other]

    cs.CV cs.LG stat.ML

    Pooling is neither necessary nor sufficient for appropriate deformation stability in CNNs

    Authors: Avraham Ruderman, Neil C. Rabinowitz, Ari S. Morcos, Daniel Zoran

    Abstract: Many of our core assumptions about how neural networks operate remain empirically untested. One common assumption is that convolutional neural networks need to be stable to small translations and deformations to solve image recognition tasks. For many years, this stability was baked into CNN architectures by incorporating interleaved pooling layers. Recently, however, interleaved pooling has large…

    Submitted 25 May, 2018; v1 submitted 12 April, 2018; originally announced April 2018.

    Comments: NIPS 2018 submission

  4. arXiv:1606.04460  [pdf, other]

    stat.ML cs.LG q-bio.NC

    Model-Free Episodic Control

    Authors: Charles Blundell, Benigno Uria, Alexander Pritzel, Yazhe Li, Avraham Ruderman, Joel Z Leibo, Jack Rae, Daan Wierstra, Demis Hassabis

    Abstract: State of the art deep reinforcement learning algorithms take many millions of interactions to attain human-level performance. Humans, on the other hand, can very quickly exploit highly rewarding nuances of an environment upon first discovery. In the brain, such rapid learning is thought to depend on the hippocampus and its capacity for episodic memory. Here we investigate whether a simple model of…

    Submitted 14 June, 2016; originally announced June 2016.

  5. arXiv:1206.4664  [pdf]

    cs.LG stat.ML

    Tighter Variational Representations of f-Divergences via Restriction to Probability Measures

    Authors: Avraham Ruderman, Mark Reid, Dario Garcia-Garcia, James Petterson

    Abstract: We show that the variational representations for f-divergences currently used in the literature can be tightened. This has implications to a number of methods recently proposed based on this representation. As an example application we use our tighter representation to derive a general f-divergence estimator based on two i.i.d. samples and derive the dual program for this estimator that performs w…

    Submitted 18 June, 2012; originally announced June 2012.

    Comments: ICML 2012