
Showing 1–7 of 7 results for author: Cicek, D C

  1. arXiv:2209.00532

    cs.LG cs.AI

    Actor Prioritized Experience Replay

    Authors: Baturay Saglam, Furkan B. Mutlu, Dogan C. Cicek, Suleyman S. Kozat

    Abstract: A widely-studied deep reinforcement learning (RL) technique known as Prioritized Experience Replay (PER) allows agents to learn from transitions sampled with non-uniform probability proportional to their temporal-difference (TD) error. Although it has been shown that PER is one of the most crucial components for the overall performance of deep RL methods in discrete action domains, many empirical…

    Submitted 1 September, 2022; originally announced September 2022.

    Comments: 21 pages, 5 figures, 4 tables

  2. arXiv:2208.00755

    cs.LG cs.AI

    Mitigating Off-Policy Bias in Actor-Critic Methods with One-Step Q-learning: A Novel Correction Approach

    Authors: Baturay Saglam, Dogan C. Cicek, Furkan B. Mutlu, Suleyman S. Kozat

    Abstract: Compared to on-policy counterparts, off-policy model-free deep reinforcement learning can improve data efficiency by repeatedly using the previously gathered data. However, off-policy learning becomes challenging when the discrepancy between the underlying distributions of the agent's policy and collected data increases. Although the well-studied importance sampling and off-policy policy gradient…

    Submitted 25 September, 2023; v1 submitted 1 August, 2022; originally announced August 2022.

  3. arXiv:2207.13453

    cs.LG cs.AI

    Safe and Robust Experience Sharing for Deterministic Policy Gradient Algorithms

    Authors: Baturay Saglam, Dogan C. Cicek, Furkan B. Mutlu, Suleyman S. Kozat

    Abstract: Learning in high dimensional continuous tasks is challenging, mainly when the experience replay memory is very limited. We introduce a simple yet effective experience sharing mechanism for deterministic policies in continuous action domains for the future off-policy deep reinforcement learning applications in which the allocated memory for the experience replay buffer is limited. To overcome the e…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: ICML 2022 Workshop on Responsible Decision Making in Dynamic Environments (poster: http://responsibledecisionmaking.github.io/assets/poster/19.pdf , presentation: http://drive.google.com/file/d/1vjjMh_z51xdOjsQCcGfU5ojAcrrf3dOS/view?usp=sharing )

  4. arXiv:2111.06780

    cs.LG cs.AI

    AWD3: Dynamic Reduction of the Estimation Bias

    Authors: Dogan C. Cicek, Enes Duran, Baturay Saglam, Kagan Kaya, Furkan B. Mutlu, Suleyman S. Kozat

    Abstract: Value-based deep Reinforcement Learning (RL) algorithms suffer from the estimation bias primarily caused by function approximation and temporal difference (TD) learning. This problem induces faulty state-action value estimates and therefore harms the performance and robustness of the learning algorithms. Although several techniques were proposed to tackle, learning algorithms still suffer from thi…

    Submitted 12 November, 2021; originally announced November 2021.

    Comments: Accepted at The 33rd IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2021)

  5. arXiv:2111.01865

    cs.LG cs.AI

    Off-Policy Correction for Deep Deterministic Policy Gradient Algorithms via Batch Prioritized Experience Replay

    Authors: Dogan C. Cicek, Enes Duran, Baturay Saglam, Furkan B. Mutlu, Suleyman S. Kozat

    Abstract: The experience replay mechanism allows agents to use the experiences multiple times. In prior works, the sampling probability of the transitions was adjusted according to their importance. Reassigning sampling probabilities for every transition in the replay buffer after each iteration is highly inefficient. Therefore, experience replay prioritization algorithms recalculate the significance of a t…

    Submitted 12 November, 2021; v1 submitted 2 November, 2021; originally announced November 2021.

    Comments: Accepted at The 33rd IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2021)

  6. arXiv:2109.11788

    cs.LG cs.AI stat.ML

    Parameter-free Reduction of the Estimation Bias in Deep Reinforcement Learning for Deterministic Policy Gradients

    Authors: Baturay Saglam, Furkan Burak Mutlu, Dogan Can Cicek, Suleyman Serdar Kozat

    Abstract: Approximation of the value functions in value-based deep reinforcement learning induces overestimation bias, resulting in suboptimal policies. We show that when the reinforcement signals received by the agents have a high variance, deep actor-critic approaches that overcome the overestimation bias lead to a substantial underestimation bias. We first address the detrimental issues in the existing a…

    Submitted 19 May, 2022; v1 submitted 24 September, 2021; originally announced September 2021.

  7. Estimation Error Correction in Deep Reinforcement Learning for Deterministic Actor-Critic Methods

    Authors: Baturay Saglam, Enes Duran, Dogan C. Cicek, Furkan B. Mutlu, Suleyman S. Kozat

    Abstract: In value-based deep reinforcement learning methods, approximation of value functions induces overestimation bias and leads to suboptimal policies. We show that in deep actor-critic methods that aim to overcome the overestimation bias, if the reinforcement signals received by the agent have a high variance, a significant underestimation bias arises. To minimize the underestimation, we introduce a p…

    Submitted 23 September, 2021; v1 submitted 22 September, 2021; originally announced September 2021.