Showing 1–3 of 3 results for author: Sarafian, E

  1. arXiv:2305.00303  [pdf, other]

    cs.LG stat.ML

    A Coupled Flow Approach to Imitation Learning

    Authors: Gideon Freund, Elad Sarafian, Sarit Kraus

    Abstract: In reinforcement learning and imitation learning, an object of central importance is the state distribution induced by the policy. It plays a crucial role in the policy gradient theorem, and references to it--along with the related state-action distribution--can be found all across the literature. Despite its importance, the state distribution is mostly discussed indirectly and theoretically, rath…

    Submitted 29 April, 2023; originally announced May 2023.

    Comments: Accepted at ICML 2023

  2. arXiv:2006.08711  [pdf, other]

    cs.LG math.OC stat.ML

    Explicit Gradient Learning

    Authors: Mor Sinay, Elad Sarafian, Yoram Louzoun, Noa Agmon, Sarit Kraus

    Abstract: Black-Box Optimization (BBO) methods can find optimal policies for systems that interact with complex environments with no analytical representation. As such, they are of interest in many Artificial Intelligence (AI) domains. Yet classical BBO methods fall short in high-dimensional non-convex problems. They are thus often overlooked in real-world AI tasks. Here we present a BBO method, termed Expl…

    Submitted 9 June, 2020; originally announced June 2020.

  3. arXiv:1805.07805  [pdf, other]

    cs.LG cs.AI stat.ML

    Constrained Policy Improvement for Safe and Efficient Reinforcement Learning

    Authors: Elad Sarafian, Aviv Tamar, Sarit Kraus

    Abstract: We propose a policy improvement algorithm for Reinforcement Learning (RL) which is called Rerouted Behavior Improvement (RBI). RBI is designed to take into account the evaluation errors of the Q-function. Such errors are common in RL when learning the $Q$-value from finite past experience data. Greedy policies or even constrained policy optimization algorithms which ignore these errors may suffer…

    Submitted 10 July, 2019; v1 submitted 20 May, 2018; originally announced May 2018.