Showing 1–5 of 5 results for author: Fatemi, M

  1. arXiv:2010.14680  [pdf, other]

    cs.LG stat.ML

    Learning to Represent Action Values as a Hypergraph on the Action Vertices

    Authors: Arash Tavakoli, Mehdi Fatemi, Petar Kormushev

    Abstract: Action-value estimation is a critical component of many reinforcement learning (RL) methods whereby sample complexity relies heavily on how fast a good estimator for action value can be learned. By viewing this problem through the lens of representation learning, good representations of both state and action can facilitate action-value estimation. While advances in deep learning have seamlessly dr…

    Submitted 20 June, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: ICLR 2021, code: https://github.com/atavakol/action-hypergraph-networks

  2. arXiv:1906.00572  [pdf, other]

    cs.LG stat.ML

    Using a Logarithmic Mapping to Enable Lower Discount Factors in Reinforcement Learning

    Authors: Harm van Seijen, Mehdi Fatemi, Arash Tavakoli

    Abstract: In an effort to better understand the different ways in which the discount factor affects the optimization process in reinforcement learning, we designed a set of experiments to study each effect in isolation. Our analysis reveals that the common perception that poor performance of low discount factors is caused by (too) small action-gaps requires revision. We propose an alternative hypothesis tha…

    Submitted 23 December, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

    Comments: NeurIPS 2019, code: https://github.com/microsoft/logrl

  3. Poisson Multi-Bernoulli Mapping Using Gibbs Sampling

    Authors: Maryam Fatemi, Karl Granström, Lennart Svensson, Francisco J. R. Ruiz, Lars Hammarstrand

    Abstract: This paper addresses the mapping problem. Using a conjugate prior form, we derive the exact theoretical batch multi-object posterior density of the map given a set of measurements. The landmarks in the map are modeled as extended objects, and the measurements are described as a Poisson process, conditioned on the map. We use a Poisson process prior on the map and prove that the posterior distribut…

    Submitted 7 November, 2018; originally announced November 2018.

    Comments: 14 pages, 6 figures

    Journal ref: IEEE Transactions on Signal Processing, Vol. 65, Issue 11, June 2017

  4. arXiv:1704.00756  [pdf, other]

    cs.LG cs.AI stat.ML

    Multi-Advisor Reinforcement Learning

    Authors: Romain Laroche, Mehdi Fatemi, Joshua Romoff, Harm van Seijen

    Abstract: We consider tackling a single-agent RL problem by distributing it to $n$ learners. These learners, called advisors, endeavour to solve the problem from a different focus. Their advice, taking the form of action values, is then communicated to an aggregator, which is in control of the system. We show that the local planning method for the advisors is critical and that none of the ones found in the…

    Submitted 14 November, 2017; v1 submitted 3 April, 2017; originally announced April 2017.

    Comments: Submitted at ICLR2018

  5. arXiv:1605.06311  [pdf, other]

    stat.CO cs.CV eess.SY

    Poisson multi-Bernoulli conjugate prior for multiple extended object filtering

    Authors: Karl Granström, Maryam Fatemi, Lennart Svensson

    Abstract: This paper presents a Poisson multi-Bernoulli mixture (PMBM) conjugate prior for multiple extended object filtering. A Poisson point process is used to describe the existence of yet undetected targets, while a multi-Bernoulli mixture describes the distribution of the targets that have been detected. The prediction and update equations are presented for the standard transition density and measureme…

    Submitted 6 December, 2019; v1 submitted 20 May, 2016; originally announced May 2016.