Showing 1–4 of 4 results for author: Kroiss, M

Searching in archive cs.
  1. arXiv:2106.04516  [pdf, other]

    cs.DC cs.AI cs.LG

    Launchpad: A Programming Model for Distributed Machine Learning Research

    Authors: Fan Yang, Gabriel Barth-Maron, Piotr Stańczyk, Matthew Hoffman, Siqi Liu, Manuel Kroiss, Aedan Pope, Alban Rrustemi

    Abstract: A major driver behind the success of modern machine learning algorithms has been their ability to process ever-larger amounts of data. As a result, the use of distributed systems in both research and production has become increasingly prevalent as a means to scale to this growing data. At the same time, however, distributing the learning process can drastically complicate the implementation of eve…

    Submitted 7 June, 2021; originally announced June 2021.

  2. arXiv:2104.06272  [pdf, other]

    cs.LG

    Podracer architectures for scalable Reinforcement Learning

    Authors: Matteo Hessel, Manuel Kroiss, Aidan Clark, Iurii Kemaev, John Quan, Thomas Keck, Fabio Viola, Hado van Hasselt

    Abstract: Supporting state-of-the-art AI research requires balancing rapid prototyping, ease of use, and quick iteration, with the ability to deploy experiments at a scale traditionally associated with production systems. Deep learning frameworks such as TensorFlow, PyTorch and JAX allow users to transparently make use of accelerators, such as TPUs and GPUs, to offload the more computationally intensive part…

    Submitted 13 April, 2021; originally announced April 2021.

  3. arXiv:2102.04736  [pdf, other]

    cs.LG cs.AI cs.DC

    Reverb: A Framework For Experience Replay

    Authors: Albin Cassirer, Gabriel Barth-Maron, Eugene Brevdo, Sabela Ramos, Toby Boyd, Thibault Sottiaux, Manuel Kroiss

    Abstract: A central component of training in Reinforcement Learning (RL) is Experience: the data used for training. The mechanisms used to generate and consume this data have an important effect on the performance of RL algorithms. In this paper, we introduce Reverb: an efficient, extensible, and easy to use system designed specifically for experience replay in RL. Reverb is designed to work efficiently i…

    Submitted 9 February, 2021; originally announced February 2021.

    Comments: 11 pages

  4. arXiv:1912.05500  [pdf, other]

    cs.AI cs.LG

    What Can Learned Intrinsic Rewards Capture?

    Authors: Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh

    Abstract: The objective of a reinforcement learning agent is to behave so as to maximise the sum of a suitable scalar function of state: the reward. These rewards are typically given and immutable. In this paper, we instead consider the proposition that the reward function itself can be a good locus of learned knowledge. To investigate this, we propose a scalable meta-gradient framework for learning useful…

    Submitted 21 August, 2020; v1 submitted 11 December, 2019; originally announced December 2019.

    Comments: ICML 2020. The first two authors contributed equally