Showing 1–6 of 6 results for author: Gadot, U

  1. arXiv:2506.07054  [pdf, ps, other]

    cs.LG cs.AI

    Policy Gradient with Tree Search: Avoiding Local Optimas through Lookahead

    Authors: Uri Koren, Navdeep Kumar, Uri Gadot, Giorgia Ramponi, Kfir Yehuda Levy, Shie Mannor

    Abstract: Classical policy gradient (PG) methods in reinforcement learning frequently converge to suboptimal local optima, a challenge exacerbated in large or complex environments. This work investigates Policy Gradient with Tree Search (PGTS), an approach that integrates an $m$-step lookahead mechanism to enhance policy optimization. We provide theoretical analysis demonstrating that increasing the tree se…

    Submitted 8 June, 2025; originally announced June 2025.

  2. arXiv:2505.21478  [pdf, ps, other]

    cs.CV cs.AI

    Policy Optimized Text-to-Image Pipeline Design

    Authors: Uri Gadot, Rinon Gal, Yftah Ziser, Gal Chechik, Shie Mannor

    Abstract: Text-to-image generation has evolved beyond single monolithic models to complex multi-component pipelines. These combine fine-tuned generators, adapters, upscaling blocks and even editing steps, leading to significant improvements in image quality. However, their effective design requires substantial expertise. Recent approaches have shown promise in automating this process through large language…

    Submitted 27 May, 2025; originally announced May 2025.

  3. arXiv:2502.11537  [pdf, other]

    cs.LG cs.AI

    Uncovering Untapped Potential in Sample-Efficient World Model Agents

    Authors: Lior Cohen, Kaixin Wang, Bingyi Kang, Uri Gadot, Shie Mannor

    Abstract: World model (WM) agents enable sample-efficient reinforcement learning by learning policies entirely from simulated experience. However, existing token-based world models (TBWMs) are limited to visual inputs and discrete actions, restricting their adoption and applicability. Moreover, although both intrinsic motivation and prioritized WM replay have shown promise in improving WM performance and ge…

    Submitted 20 May, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

  4. arXiv:2501.12216  [pdf, other]

    cs.LG cs.CV eess.IV

    RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression

    Authors: Uri Gadot, Assaf Shocher, Shie Mannor, Gal Chechik, Assaf Hallak

    Abstract: Video encoders optimize compression for human perception by minimizing reconstruction error under bit-rate constraints. In many modern applications such as autonomous driving, an overwhelming majority of videos serve as input for AI systems performing tasks like object recognition or segmentation, rather than being watched by humans. It is therefore useful to optimize the encoder for a downstream…

    Submitted 25 March, 2025; v1 submitted 21 January, 2025; originally announced January 2025.

  5. arXiv:2309.01107  [pdf, other]

    cs.LG

    Solving Non-Rectangular Reward-Robust MDPs via Frequency Regularization

    Authors: Uri Gadot, Esther Derman, Navdeep Kumar, Maxence Mohamed Elfatihi, Kfir Levy, Shie Mannor

    Abstract: In robust Markov decision processes (RMDPs), it is assumed that the reward and the transition dynamics lie in a given uncertainty set. By targeting maximal return under the most adversarial model from that set, RMDPs address performance sensitivity to misspecified environments. Yet, to preserve computational tractability, the uncertainty set is traditionally independently structured for each state…

    Submitted 12 February, 2024; v1 submitted 3 September, 2023; originally announced September 2023.

    Comments: Accepted at AAAI 2024

  6. arXiv:2306.05859  [pdf, other]

    cs.LG

    Bring Your Own (Non-Robust) Algorithm to Solve Robust MDPs by Estimating The Worst Kernel

    Authors: Kaixin Wang, Uri Gadot, Navdeep Kumar, Kfir Levy, Shie Mannor

    Abstract: Robust Markov Decision Processes (RMDPs) provide a framework for sequential decision-making that is robust to perturbations on the transition kernel. However, current RMDP methods are often limited to small-scale problems, hindering their use in high-dimensional domains. To bridge this gap, we present EWoK, a novel online approach to solve RMDP that Estimates the Worst transition Kernel to learn r…

    Submitted 12 February, 2024; v1 submitted 9 June, 2023; originally announced June 2023.