Showing 1–3 of 3 results for author: Buckman, J

Searching in archive stat.
  1. arXiv:1906.02736  [pdf, other]

    cs.LG stat.ML

    DeepMDP: Learning Continuous Latent Space Models for Representation Learning

    Authors: Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, Marc G. Bellemare

    Abstract: Many reinforcement learning (RL) tasks provide the agent with high-dimensional observations that can be simplified into low-dimensional continuous states. To formalize this process, we introduce the concept of a DeepMDP, a parameterized latent space model that is trained via the minimization of two tractable losses: prediction of rewards and prediction of the distribution over next latent states.…

    Submitted 6 June, 2019; originally announced June 2019.

    Comments: 13 pages main text, 16 pages appendix. ICML 2019

  2. arXiv:1807.01675  [pdf, other]

    cs.LG cs.AI stat.ML

    Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion

    Authors: Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee

    Abstract: Integrating model-free and model-based approaches in reinforcement learning has the potential to achieve the high performance of model-free algorithms with low sample complexity. However, this is difficult because an imperfect dynamics model can degrade the performance of the learning algorithm, and in sufficiently complex environments, the dynamics model will almost always be imperfect. As a resu…

    Submitted 7 June, 2019; v1 submitted 4 July, 2018; originally announced July 2018.

    Journal ref: Advances in Neural Information Processing Systems, 2018 (pp. 8224-8234)

  3. arXiv:1802.08768  [pdf, other]

    stat.ML cs.LG

    Is Generator Conditioning Causally Related to GAN Performance?

    Authors: Augustus Odena, Jacob Buckman, Catherine Olsson, Tom B. Brown, Christopher Olah, Colin Raffel, Ian Goodfellow

    Abstract: Recent work (Pennington et al, 2017) suggests that controlling the entire distribution of Jacobian singular values is an important design consideration in deep learning. Motivated by this, we study the distribution of singular values of the Jacobian of the generator in Generative Adversarial Networks (GANs). We find that this Jacobian generally becomes ill-conditioned at the beginning of training.…

    Submitted 18 June, 2018; v1 submitted 23 February, 2018; originally announced February 2018.