Showing 1–10 of 10 results for author: Sukhatme, G

Searching in archive stat.
  1. arXiv:2308.03882  [pdf, other]

    cs.LG cs.AI stat.ML

    Exploiting Generalization in Offline Reinforcement Learning via Unseen State Augmentations

    Authors: Nirbhay Modhe, Qiaozi Gao, Ashwin Kalyan, Dhruv Batra, Govind Thattai, Gaurav Sukhatme

    Abstract: Offline reinforcement learning (RL) methods strike a balance between exploration and exploitation by conservative value estimation -- penalizing values of unseen states and actions. Model-free methods penalize values at all unseen actions, while model-based methods are able to further exploit unseen states via model rollouts. However, such methods are handicapped in their ability to find unseen st…

    Submitted 24 September, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

  2. arXiv:2006.11751  [pdf, other]

    cs.LG cs.AI stat.ML

    Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning

    Authors: Aleksei Petrenko, Zhehui Huang, Tushar Kumar, Gaurav Sukhatme, Vladlen Koltun

    Abstract: Increasing the scale of reinforcement learning experiments has allowed researchers to achieve unprecedented results in both training sophisticated agents for video games, and in sim-to-real transfer for robotics. Typically such experiments rely on large distributed systems and require expensive hardware setups, limiting wider access to this exciting area of research. In this work we aim to solve t…

    Submitted 22 June, 2020; v1 submitted 21 June, 2020; originally announced June 2020.

    Comments: Paper published in ICML2020. Visualizations of trained policies can be found at https://sites.google.com/view/sample-factory

  3. arXiv:2004.14567  [pdf, other]

    cs.LG cs.RO stat.ML

    Plan-Space State Embeddings for Improved Reinforcement Learning

    Authors: Max Pflueger, Gaurav S. Sukhatme

    Abstract: Robot control problems are often structured with a policy function that maps state values into control values, but in many dynamic problems the observed state can have a difficult to characterize relationship with useful policy actions. In this paper we present a new method for learning state embeddings from plans or other forms of demonstrations such that the embedding space has a specified geome…

    Submitted 29 April, 2020; originally announced April 2020.

    Comments: Submitted to IROS 2020

  4. arXiv:2004.10190  [pdf, other]

    cs.LG cs.CV cs.RO stat.ML

    Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning

    Authors: Ryan Julian, Benjamin Swanson, Gaurav S. Sukhatme, Sergey Levine, Chelsea Finn, Karol Hausman

    Abstract: One of the great promises of robot learning systems is that they will be able to learn from their mistakes and continuously adapt to ever-changing environments. Despite this potential, most of the robot learning systems today are deployed as a fixed policy and they are not being adapted after their deployment. Can we efficiently adapt previously learned behaviors to new environments, objects and p…

    Submitted 31 July, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

    Comments: 8.5 pages, 9 figures. See video overview and experiments at https://youtu.be/pPDVewcSpdc and project website at https://ryanjulian.me/continual-fine-tuning

  5. arXiv:1906.05374  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Meta-Learning via Learned Loss

    Authors: Sarah Bechtle, Artem Molchanov, Yevgen Chebotar, Edward Grefenstette, Ludovic Righetti, Gaurav Sukhatme, Franziska Meier

    Abstract: Typically, loss functions, regularization mechanisms and other important aspects of training parametric models are chosen heuristically from a limited set of options. In this paper, we take the first step towards automating this process, with the view of producing models which train faster and more robustly. Concretely, we present a meta-learning method for learning parametric loss functions that…

    Submitted 19 January, 2021; v1 submitted 12 June, 2019; originally announced June 2019.

    Comments: Project website with code and video at https://sites.google.com/view/mlthree

  6. arXiv:1905.10706  [pdf, other]

    cs.LG cs.RO eess.SY stat.ML

    Interactive Differentiable Simulation

    Authors: Eric Heiden, David Millard, Hejia Zhang, Gaurav S. Sukhatme

    Abstract: Intelligent agents need a physical understanding of the world to predict the impact of their actions in the future. While learning-based models of the environment dynamics have contributed to significant improvements in sample efficiency compared to model-free reinforcement learning algorithms, they typically fail to generalize to system states beyond the training data, while often grounding their…

    Submitted 18 May, 2020; v1 submitted 25 May, 2019; originally announced May 2019.

  7. arXiv:1901.01977  [pdf, other]

    cs.LG cs.AI stat.ML

    Accelerating Goal-Directed Reinforcement Learning by Model Characterization

    Authors: Shoubhik Debnath, Gaurav Sukhatme, Lantao Liu

    Abstract: We propose a hybrid approach aimed at improving the sample efficiency in goal-directed reinforcement learning. We do this via a two-step mechanism where firstly, we approximate a model from Model-Free reinforcement learning. Then, we leverage this approximate model along with a notion of reachability using Mean First Passage Times to perform Model-Based reinforcement learning. Built on such a nove…

    Submitted 4 January, 2019; originally announced January 2019.

    Comments: The paper was published in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

  8. arXiv:1810.02422  [pdf, other]

    cs.RO cs.AI cs.LG stat.ML

    Simulator Predictive Control: Using Learned Task Representations and MPC for Zero-Shot Generalization and Sequencing

    Authors: Zhanpeng He, Ryan Julian, Eric Heiden, Hejia Zhang, Stefan Schaal, Joseph J. Lim, Gaurav Sukhatme, Karol Hausman

    Abstract: Simulation-to-real transfer is an important strategy for making reinforcement learning practical with real robots. Successful sim-to-real transfer systems have difficulty producing policies which generalize across tasks, despite training for thousands of hours equivalent real robot time. To address this shortcoming, we present a novel approach to efficiently learning new robotic skills directly on…

    Submitted 27 January, 2021; v1 submitted 4 October, 2018; originally announced October 2018.

    Comments: Presented at NeurIPS 2018 Workshop: Deep Reinforcement Learning. See https://youtu.be/te4JWe7LPKw for supplemental video

  9. arXiv:1809.10253  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Scaling simulation-to-real transfer by learning composable robot skills

    Authors: Ryan Julian, Eric Heiden, Zhanpeng He, Hejia Zhang, Stefan Schaal, Joseph J. Lim, Gaurav Sukhatme, Karol Hausman

    Abstract: We present a novel solution to the problem of simulation-to-real transfer, which builds on recent advances in robot skill decomposition. Rather than focusing on minimizing the simulation-reality gap, we learn a set of diverse policies that are parameterized in a way that makes them easily reusable. This diversity and parameterization of low-level skills allows us to find a transferable policy that…

    Submitted 13 November, 2018; v1 submitted 26 September, 2018; originally announced September 2018.

    Comments: Presented at ISER 2018. See https://www.youtube.com/watch?v=Syr2RQTHqTs for supplemental video

  10. arXiv:1609.07560  [pdf, other]

    cs.RO cs.AI cs.LG stat.ML

    Informative Planning and Online Learning with Sparse Gaussian Processes

    Authors: Kai-Chieh Ma, Lantao Liu, Gaurav S. Sukhatme

    Abstract: A big challenge in environmental monitoring is the spatiotemporal variation of the phenomena to be observed. To enable persistent sensing and estimation in such a setting, it is beneficial to have a time-varying underlying environmental model. Here we present a planning and learning method that enables an autonomous marine vehicle to perform persistent ocean monitoring tasks by learning and refini…

    Submitted 23 September, 2016; originally announced September 2016.