Showing 1–6 of 6 results for author: Trochim, P

  1. arXiv:2207.13131  [pdf, other]

    cs.AI cs.LG cs.RO

    Semi-analytical Industrial Cooling System Model for Reinforcement Learning

    Authors: Yuri Chervonyi, Praneet Dutta, Piotr Trochim, Octavian Voicu, Cosmin Paduraru, Crystal Qian, Emre Karagozler, Jared Quincy Davis, Richard Chippendale, Gautam Bajaj, Sims Witherspoon, Jerry Luo

    Abstract: We present a hybrid industrial cooling system model that embeds analytical solutions within a multi-physics simulation. This model is designed for reinforcement learning (RL) applications and balances simplicity with simulation fidelity and interpretability. The model's fidelity is evaluated against real world data from a large scale cooling system. This is followed by a case study illustrating ho…

    Submitted 26 July, 2022; originally announced July 2022.

    Comments: 27 pages, 13 figures

  2. arXiv:2110.03363  [pdf, other]

    cs.RO cs.AI cs.LG

    Evaluating model-based planning and planner amortization for continuous control

    Authors: Arunkumar Byravan, Leonard Hasenclever, Piotr Trochim, Mehdi Mirza, Alessandro Davide Ialongo, Yuval Tassa, Jost Tobias Springenberg, Abbas Abdolmaleki, Nicolas Heess, Josh Merel, Martin Riedmiller

    Abstract: There is a widespread intuition that model-based control methods should be able to surpass the data efficiency of model-free approaches. In this paper we attempt to evaluate this intuition on various challenging locomotion tasks. We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning; the learned policy serves as a proposal for MPC.…

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: 9 pages main text, 30 pages with references and appendix including several ablations and additional experiments. Submitted to ICLR 2022

  3. arXiv:2109.14311  [pdf, other]

    cs.LG cs.RO

    Learning Dynamics Models for Model Predictive Agents

    Authors: Michael Lutter, Leonard Hasenclever, Arunkumar Byravan, Gabriel Dulac-Arnold, Piotr Trochim, Nicolas Heess, Josh Merel, Yuval Tassa

    Abstract: Model-Based Reinforcement Learning involves learning a dynamics model from data, and then using this model to optimise behaviour, most often with an online planner. Much of the recent research along these lines presents a particular set of design choices, involving problem definition, model learning and planning. Given the multiple contributions, it is difficult to evaluate the e…

    Submitted 29 September, 2021; originally announced September 2021.

  4. arXiv:2011.09294  [pdf, other]

    cs.AI cs.LG

    Using Unity to Help Solve Intelligence

    Authors: Tom Ward, Andrew Bolt, Nik Hemmings, Simon Carter, Manuel Sanchez, Ricardo Barreira, Seb Noury, Keith Anderson, Jay Lemmon, Jonathan Coe, Piotr Trochim, Tom Handley, Adrian Bolton

    Abstract: In the pursuit of artificial general intelligence, our most significant measurement of progress is an agent's ability to achieve goals in a wide range of environments. Existing platforms for constructing such environments are typically constrained by the technologies they are founded on, and are therefore only able to provide a subset of scenarios necessary to evaluate progress. To overcome these…

    Submitted 18 November, 2020; originally announced November 2020.

  5. dm_control: Software and Tasks for Continuous Control

    Authors: Yuval Tassa, Saran Tunyasuvunakool, Alistair Muldal, Yotam Doron, Piotr Trochim, Siqi Liu, Steven Bohez, Josh Merel, Tom Erez, Timothy Lillicrap, Nicolas Heess

    Abstract: The dm_control software package is a collection of Python libraries and task suites for reinforcement learning agents in an articulated-body simulation. A MuJoCo wrapper provides convenient bindings to functions and data structures. The PyMJCF and Composer libraries enable procedural model manipulation and task authoring. The Control Suite is a fixed set of tasks with standardised structure, inten…

    Submitted 7 September, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: arXiv admin note: text overlap with arXiv:1801.00690

  6. arXiv:1910.00528  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Augmenting learning using symmetry in a biologically-inspired domain

    Authors: Shruti Mishra, Abbas Abdolmaleki, Arthur Guez, Piotr Trochim, Doina Precup

    Abstract: Invariances to translation, rotation and other spatial transformations are a hallmark of the laws of motion, and have widespread use in the natural sciences to reduce the dimensionality of systems of equations. In supervised learning, such as in image classification tasks, rotation, translation and scale invariances are used to augment training datasets. In this work, we use data augmentation in a…

    Submitted 1 October, 2019; originally announced October 2019.