Showing 1–3 of 3 results for author: Way, E

Searching in archive cs.
  1. Multi-Agent Learning of Numerical Methods for Hyperbolic PDEs with Factored Dec-MDP

    Authors: Yiwei Fu, Dheeraj S. K. Kapilavai, Elliot Way

    Abstract: Factored decentralized Markov decision process (Dec-MDP) is a framework for modeling sequential decision making problems in multi-agent systems. In this paper, we formalize the learning of numerical methods for hyperbolic partial differential equations (PDEs), specifically the Weighted Essentially Non-Oscillatory (WENO) scheme, as a factored Dec-MDP problem. We show that different reward formulati…

    Submitted 31 May, 2022; originally announced May 2022.

    Comments: Submitted to 20th International Conference on Practical Applications of Agents and Multi-Agent Systems (PAAMS 2022)

  2. arXiv:2203.08937  [pdf, other]

    cs.LG cs.MA physics.comp-ph

    Backpropagation through Time and Space: Learning Numerical Methods with Multi-Agent Reinforcement Learning

    Authors: Elliot Way, Dheeraj S. K. Kapilavai, Yiwei Fu, Lei Yu

    Abstract: We introduce Backpropagation Through Time and Space (BPTTS), a method for training a recurrent spatio-temporal neural network, that is used in a homogeneous multi-agent reinforcement learning (MARL) setting to learn numerical methods for hyperbolic conservation laws. We treat the numerical schemes underlying partial differential equations (PDEs) as a Partially Observable Markov Game (POMG) in Rein…

    Submitted 28 March, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

  3. arXiv:1902.03633  [pdf, other]

    cs.LG stat.ML

    Diverse Exploration via Conjugate Policies for Policy Gradient Methods

    Authors: Andrew Cohen, Xingye Qiao, Lei Yu, Elliot Way, Xiangrong Tong

    Abstract: We address the challenge of effective exploration while maintaining good performance in policy gradient methods. As a solution, we propose diverse exploration (DE) via conjugate policies. DE learns and deploys a set of conjugate policies which can be conveniently generated as a byproduct of conjugate gradient descent. We provide both theoretical and empirical results showing the effectiveness of D…

    Submitted 10 February, 2019; originally announced February 2019.

    Comments: AAAI 2019