Showing 1–3 of 3 results for author: Ranchod, P

  1. arXiv:2407.14516

    cs.RO cs.LG

    RobocupGym: A challenging continuous control benchmark in Robocup

    Authors: Michael Beukman, Branden Ingram, Geraud Nangue Tasse, Benjamin Rosman, Pravesh Ranchod

    Abstract: Reinforcement learning (RL) has progressed substantially over the past decade, with much of this progress being driven by benchmarks. Many benchmarks are focused on video or board games, and a large number of robotics benchmarks lack diversity and real-world applicability. In this paper, we aim to simplify the process of applying reinforcement learning in the 3D simulation league of Robocup, a rob…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2001.06793

    cs.LG stat.ML

    Learning Options from Demonstration using Skill Segmentation

    Authors: Matthew Cockcroft, Shahil Mawjee, Steven James, Pravesh Ranchod

    Abstract: We present a method for learning options from segmented demonstration trajectories. The trajectories are first segmented into skills using nonparametric Bayesian clustering, and a reward function for each segment is then learned using inverse reinforcement learning. From this, a set of inferred trajectories for the demonstration is generated. Option initiation sets and termination conditions are l…

    Submitted 19 January, 2020; originally announced January 2020.

    Comments: To be published in SAUPEC/RobMech/PRASA 2020. Consists of 6 pages, 5 figures

  3. arXiv:1509.01644

    cs.AI cs.LG

    Reinforcement Learning with Parameterized Actions

    Authors: Warwick Masson, Pravesh Ranchod, George Konidaris

    Abstract: We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions, that is, discrete actions with continuous parameters. At each step the agent must select both which action to use and which parameters to use with that action. We introduce the Q-PAMDP algorithm for learning in these domains, show that it converges to a local optimum, and compare it to direct policy sea…

    Submitted 26 November, 2015; v1 submitted 4 September, 2015; originally announced September 2015.

    Comments: Accepted for AAAI 2016