
Showing 1–3 of 3 results for author: Konidaris, G D

Searching in archive cs.
  1. arXiv:2002.01883  [pdf, other]

    cs.LG cs.AI stat.ML

    Deep Radial-Basis Value Functions for Continuous Control

    Authors: Kavosh Asadi, Neev Parikh, Ronald E. Parr, George D. Konidaris, Michael L. Littman

    Abstract: A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned value function. This operation is often challenging when the learned value function takes continuous actions as input. We introduce deep radial-basis value functions (RBVFs): value functions learned using a deep network with a radial-basis function (RBF) output layer. We show that the max…

    Submitted 13 March, 2021; v1 submitted 5 February, 2020; originally announced February 2020.

    Comments: In Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI)

  2. arXiv:1905.04388  [pdf, other]

    cs.LG stat.ML

    Multi-Pass Q-Networks for Deep Reinforcement Learning with Parameterised Action Spaces

    Authors: Craig J. Bester, Steven D. James, George D. Konidaris

    Abstract: Parameterised actions in reinforcement learning are composed of discrete actions with continuous action-parameters. This provides a framework for solving complex domains that require combining high-level actions with flexible control. The recent P-DQN algorithm extends deep Q-networks to learn over such action spaces. However, it treats all action-parameters as a single joint input to the Q-networ…

    Submitted 10 May, 2019; originally announced May 2019.

    Comments: 8 pages, 4 figures

  3. arXiv:1402.2871  [pdf, other]

    cs.RO cs.AI cs.MA

    Planning for Decentralized Control of Multiple Robots Under Uncertainty

    Authors: Christopher Amato, George D. Konidaris, Gabriel Cruz, Christopher A. Maynor, Jonathan P. How, Leslie P. Kaelbling

    Abstract: We describe a probabilistic framework for synthesizing control policies for general multi-robot systems, given environment and sensor models and a cost function. Decentralized, partially observable Markov decision processes (Dec-POMDPs) are a general model of decision processes where a team of agents must cooperate to optimize some objective (specified by a shared reward or cost function) in the p…

    Submitted 12 February, 2014; originally announced February 2014.

    ACM Class: I.2.9; I.2.11