Showing 1–24 of 24 results for author: Lelis, L H S

  1. arXiv:2506.14162  [pdf, ps, other]

    cs.LG

    Common Benchmarks Undervalue the Generalization Power of Programmatic Policies

    Authors: Amirhossein Rajabpour, Kiarash Aghakasiri, Sandra Zilles, Levi H. S. Lelis

    Abstract: Algorithms for learning programmatic representations for sequential decision-making problems are often evaluated on out-of-distribution (OOD) problems, with the common conclusion that programmatic policies generalize better than neural policies on OOD problems. In this position paper, we argue that commonly used benchmarks undervalue the generalization capabilities of programmatic representations.…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: 17 pages, 5 figures

  2. arXiv:2506.07255  [pdf, ps, other]

    cs.AI

    Subgoal-Guided Policy Heuristic Search with Learned Subgoals

    Authors: Jake Tuero, Michael Buro, Levi H. S. Lelis

    Abstract: Policy tree search is a family of tree search algorithms that use a policy to guide the search. These algorithms provide guarantees on the number of expansions required to solve a given problem that are based on the quality of the policy. While these algorithms have shown promising results, the process in which they are trained requires complete solution trajectories to train the policy. Search tr…

    Submitted 8 June, 2025; originally announced June 2025.

    Comments: Accepted to ICML-25

  3. arXiv:2505.12508  [pdf, ps, other]

    cs.LG

    InnateCoder: Learning Programmatic Options with Foundation Models

    Authors: Rubens O. Moraes, Quazi Asif Sadmine, Hendrik Baier, Levi H. S. Lelis

    Abstract: Outside of transfer learning settings, reinforcement learning agents start their learning process from a clean slate. As a result, such agents have to go through a slow process to learn even the most obvious skills required to solve a problem. In this paper, we present InnateCoder, a system that leverages human knowledge encoded in foundation models to provide programmatic policies that encode "in…

    Submitted 18 May, 2025; originally announced May 2025.

    Comments: Accepted at IJCAI 2025

  4. arXiv:2412.05196  [pdf, other]

    cs.AI

    Exponential Speedups by Rerooting Levin Tree Search

    Authors: Laurent Orseau, Marcus Hutter, Levi H. S. Lelis

    Abstract: Levin Tree Search (LTS) (Orseau et al., 2018) is a search algorithm for deterministic environments that uses a user-specified policy to guide the search. It comes with a formal guarantee on the number of search steps (node visits) for finding a solution node that depends on the quality of the policy. In this paper, we introduce a new algorithm, called $\sqrt{\text{LTS}}$ (pronounced root-LTS), whic…

    Submitted 11 March, 2025; v1 submitted 6 December, 2024; originally announced December 2024.

  5. arXiv:2410.12166  [pdf, other]

    cs.LG cs.AI

    Reclaiming the Source of Programmatic Policies: Programmatic versus Latent Spaces

    Authors: Tales H. Carvalho, Kenneth Tjhia, Levi H. S. Lelis

    Abstract: Recent works have introduced LEAPS and HPRL, systems that learn latent spaces of domain-specific languages, which are used to define programmatic policies for partially observable Markov decision processes (POMDPs). These systems induce a latent space while optimizing losses such as the behavior loss, which aim to achieve locality in program behavior, meaning that vectors close in the latent space…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: Published as a conference paper at ICLR 2024

  6. arXiv:2410.11262  [pdf, other]

    cs.LG cs.AI

    Unveiling Options with Neural Decomposition

    Authors: Mahdi Alikhasi, Levi H. S. Lelis

    Abstract: In reinforcement learning, agents often learn policies for specific tasks without the ability to generalize this knowledge to related tasks. This paper introduces an algorithm that attempts to address this limitation by decomposing neural networks encoding policies for Markov Decision Processes into reusable sub-policies, which are used to synthesize temporally extended actions, or options. We con…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: Published as a conference paper at ICLR 2024

  7. arXiv:2405.05431  [pdf, other]

    cs.LG cs.AI cs.PL

    Searching for Programmatic Policies in Semantic Spaces

    Authors: Rubens O. Moraes, Levi H. S. Lelis

    Abstract: Syntax-guided synthesis is commonly used to generate programs encoding policies. In this approach, the set of programs that can be written in a domain-specific language defines the search space, and an algorithm searches within this space for programs that encode strong policies. In this paper, we propose an alternative method for synthesizing programmatic policies, where we search within an appr…

    Submitted 12 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: Available code: https://github.com/rubensolv/Library-Induced-Semantic-Spaces

  8. arXiv:2311.06979  [pdf, ps, other]

    cs.AI cs.PL cs.SE

    Assessing the Interpretability of Programmatic Policies with Large Language Models

    Authors: Zahra Bashir, Michael Bowling, Levi H. S. Lelis

    Abstract: Although the synthesis of programs encoding policies often carries the promise of interpretability, systematic evaluations were never performed to assess the interpretability of these policies, likely because of the complexity of such an evaluation. In this paper, we introduce a novel metric that uses large language models (LLMs) to assess the interpretability of programmatic policies. For our metr…

    Submitted 20 January, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

    Comments: This paper is under-review for IJCAI. The main file is arxiv.tex and I have a supplementary_materials.tex file as well

  9. Program Synthesis with Best-First Bottom-Up Search

    Authors: Saqib Ameen, Levi H. S. Lelis

    Abstract: Cost-guided bottom-up search (BUS) algorithms use a cost function to guide the search to solve program synthesis tasks. In this paper, we show that current state-of-the-art cost-guided BUS algorithms suffer from a common problem: they can lose useful information given by the model and fail to perform the search in a best-first order according to a cost function. We introduce a novel best-first bot…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: Published at the Journal of Artificial Intelligence Research (JAIR)

  10. arXiv:2308.02729  [pdf, ps, other]

    cs.LG

    Synthesizing Programmatic Policies with Actor-Critic Algorithms and ReLU Networks

    Authors: Spyros Orfanos, Levi H. S. Lelis

    Abstract: Programmatically Interpretable Reinforcement Learning (PIRL) encodes policies in human-readable computer programs. Novel algorithms were recently introduced with the goal of handling the lack of gradient signal to guide the search in the space of programmatic policies. Most such PIRL algorithms first train a neural policy that is used as an oracle to guide the search in the programmatic space.…

    Submitted 4 August, 2023; originally announced August 2023.

  11. arXiv:2307.05603  [pdf, other]

    cs.SE cs.LG cs.PL

    Can You Improve My Code? Optimizing Programs with Local Search

    Authors: Fatemeh Abdollahi, Saqib Ameen, Matthew E. Taylor, Levi H. S. Lelis

    Abstract: This paper introduces a local search method for improving an existing program with respect to a measurable objective. Program Optimization with Locally Improving Search (POLIS) exploits the structure of a program, defined by its lines. POLIS improves a single line of the program while keeping the remaining lines fixed, using existing brute-force synthesis algorithms, and continues iterating until…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: International Joint Conference on Artificial Intelligence (IJCAI) 2023

  12. arXiv:2307.04893  [pdf, other]

    cs.LG cs.AI

    Choosing Well Your Opponents: How to Guide the Synthesis of Programmatic Strategies

    Authors: Rubens O. Moraes, David S. Aleixo, Lucas N. Ferreira, Levi H. S. Lelis

    Abstract: This paper introduces Local Learner (2L), an algorithm for providing a set of reference strategies to guide the search for programmatic strategies in two-player zero-sum games. Previous learning algorithms, such as Iterated Best Response (IBR), Fictitious Play (FP), and Double-Oracle (DO), can be computationally expensive or miss important information for guiding search algorithms. 2L actively sel…

    Submitted 23 July, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: International Joint Conference on Artificial Intelligence (IJCAI) 2023

  13. arXiv:2305.16945  [pdf, other]

    cs.LG cs.AI

    Levin Tree Search with Context Models

    Authors: Laurent Orseau, Marcus Hutter, Levi H. S. Lelis

    Abstract: Levin Tree Search (LTS) is a search algorithm that makes use of a policy (a probability distribution over actions) and comes with a theoretical guarantee on the number of expansions before reaching a goal node, depending on the quality of the policy. This guarantee can be used as a loss function, which we call the LTS loss, to optimize neural networks representing the policy (LTS+NN). In this work…

    Submitted 12 November, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

  14. arXiv:2208.05162  [pdf, ps, other]

    cs.SD cs.LG cs.MM eess.AS

    Controlling Perceived Emotion in Symbolic Music Generation with Monte Carlo Tree Search

    Authors: Lucas N. Ferreira, Lili Mou, Jim Whitehead, Levi H. S. Lelis

    Abstract: This paper presents a new approach for controlling emotion in symbolic music generation with Monte Carlo Tree Search. We use Monte Carlo Tree Search as a decoding mechanism to steer the probability distribution learned by a language model towards a given emotion. At every step of the decoding process, we use Predictor Upper Confidence for Trees (PUCT) to search for sequences that maximize the aver…

    Submitted 1 September, 2022; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: Accepted for publication at the 18th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE-22)

  15. arXiv:2203.11912  [pdf, other]

    cs.AI

    What can we Learn Even From the Weakest? Learning Sketches for Programmatic Strategies

    Authors: Leandro C. Medeiros, David S. Aleixo, Levi H. S. Lelis

    Abstract: In this paper we show that behavioral cloning can be used to learn effective sketches of programmatic strategies. We show that even the sketches learned by cloning the behavior of weak players can help the synthesis of programmatic strategies. This is because even weak players can provide helpful information, e.g., that a player must choose an action in their turn of the game. If behavioral clonin…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: Published at AAAI'22

  16. arXiv:2103.11505  [pdf, other]

    cs.AI cs.LG

    Policy-Guided Heuristic Search with Guarantees

    Authors: Laurent Orseau, Levi H. S. Lelis

    Abstract: The use of a policy and a heuristic function for guiding search can be quite effective in adversarial problems, as demonstrated by AlphaGo and its successors, which are based on the PUCT search algorithm. While PUCT can also be used to solve single-agent deterministic problems, it lacks guarantees on its search effort and it can be computationally inefficient in practice. Combining the A* algorith…

    Submitted 21 March, 2021; originally announced March 2021.

  17. arXiv:2008.07009  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Computer-Generated Music for Tabletop Role-Playing Games

    Authors: Lucas N. Ferreira, Levi H. S. Lelis, Jim Whitehead

    Abstract: In this paper we present Bardo Composer, a system to generate background music for tabletop role-playing games. Bardo Composer uses a speech recognition system to translate player speech into text, which is classified according to a model of emotion. Bardo Composer then uses Stochastic Bi-Objective Beam Search, a variant of Stochastic Beam Search that we introduce in this paper, with a neural mode…

    Submitted 16 August, 2020; originally announced August 2020.

    Comments: To be published in the 16th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment

  18. arXiv:2006.06054  [pdf, other]

    cs.AI

    Marginal Utility for Planning in Continuous or Large Discrete Action Spaces

    Authors: Zaheen Farraz Ahmad, Levi H. S. Lelis, Michael Bowling

    Abstract: Sample-based planning is a powerful family of algorithms for generating intelligent behavior from a model of the environment. Generating good candidate actions is critical to the success of sample-based planners, particularly in continuous or large action spaces. Typically, candidate action generation exhausts the action space, uses domain knowledge, or more recently, involves learning a stochasti…

    Submitted 17 June, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

  19. arXiv:2004.02289  [pdf, other]

    cs.LG cs.AI cs.HC

    Personalization in Human-AI Teams: Improving the Compatibility-Accuracy Tradeoff

    Authors: Jonathan Martinez, Kobi Gal, Ece Kamar, Levi H. S. Lelis

    Abstract: AI systems that model and interact with users can update their models over time to reflect new information and changes in the environment. Although these updates may improve the overall performance of the AI system, they may actually hurt the performance with respect to individual users. Prior work has studied the trade-off between improving the system's accuracy following an update and the compat…

    Submitted 19 August, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

    Comments: 6 pages, 7 figures

    ACM Class: I.2.1

  20. arXiv:1907.13062  [pdf, ps, other]

    cs.DS cs.AI

    Iterative Budgeted Exponential Search

    Authors: Malte Helmert, Tor Lattimore, Levi H. S. Lelis, Laurent Orseau, Nathan R. Sturtevant

    Abstract: We tackle two long-standing problems related to re-expansions in heuristic search algorithms. For graph search, A* can require $Ω(2^{n})$ expansions, where $n$ is the number of states within the final $f$ bound. Existing algorithms that address this problem like B and B' improve this bound to $Ω(n^2)$. For tree search, IDA* can also require $Ω(n^2)$ expansions. We describe a new algorithmic framew…

    Submitted 30 July, 2019; originally announced July 2019.

  21. arXiv:1907.02548  [pdf, other]

    cs.AI

    Procedural Generation of Initial States of Sokoban

    Authors: Dâmaris S. Bento, André G. Pereira, Levi H. S. Lelis

    Abstract: Procedural generation of initial states of state-space search problems has applications in human and machine learning as well as in the evaluation of planning systems. In this paper we deal with the task of generating hard and solvable initial states of Sokoban puzzles. We propose hardness metrics based on pattern database heuristics and the use of novelty to improve the exploration of search met…

    Submitted 4 July, 2019; originally announced July 2019.

    Comments: Accepted for publication at IJCAI'19

  22. arXiv:1906.03242  [pdf, other]

    cs.AI cs.DS

    Zooming Cautiously: Linear-Memory Heuristic Search With Node Expansion Guarantees

    Authors: Laurent Orseau, Levi H. S. Lelis, Tor Lattimore

    Abstract: We introduce and analyze two parameter-free linear-memory tree search algorithms. Under mild assumptions we prove our algorithms are guaranteed to perform only a logarithmic factor more node expansions than A* when the search space is a tree. Previously, the best guarantee for a linear-memory algorithm under similar assumptions was achieved by IDA*, which in the worst case expands quadratically mo…

    Submitted 7 June, 2019; originally announced June 2019.

    Comments: This paper and another independent IJCAI 2019 submission have been merged into a single paper that subsumes both of them (Helmert et al., 2019). This paper is placed here only for historical context. Please only cite the subsuming paper

  23. arXiv:1811.10928  [pdf, other]

    cs.AI

    Single-Agent Policy Tree Search With Guarantees

    Authors: Laurent Orseau, Levi H. S. Lelis, Tor Lattimore, Théophane Weber

    Abstract: We introduce two novel tree search algorithms that use a policy to guide search. The first algorithm is a best-first enumeration that uses a cost function that allows us to prove an upper bound on the number of nodes to be expanded before reaching a goal state. We show that this best-first algorithm is particularly well suited for `needle-in-a-haystack' problems. The second algorithm is based on s…

    Submitted 28 November, 2018; v1 submitted 27 November, 2018; originally announced November 2018.

    Journal ref: 32nd Conference on Neural Information Processing Systems (NIPS 2018), Montréal, Canada

  24. arXiv:1711.08101  [pdf, ps, other]

    cs.AI

    Asymmetric Action Abstractions for Multi-Unit Control in Adversarial Real-Time Games

    Authors: Rubens O. Moraes, Levi H. S. Lelis

    Abstract: Action abstractions restrict the number of legal actions available during search in multi-unit real-time adversarial games, thus allowing algorithms to focus their search on a set of promising actions. Optimal strategies derived from un-abstracted spaces are guaranteed to be no worse than optimal strategies derived from action-abstracted spaces. In practice, however, due to real-time constraints a…

    Submitted 21 November, 2017; originally announced November 2017.

    Comments: AAAI'18