Reclaiming the Source of Programmatic Policies: Programmatic versus Latent Spaces

Carvalho, Tales H.; Tjhia, Kenneth; Lelis, Levi H. S.

arXiv:2410.12166 (cs)

[Submitted on 16 Oct 2024]

Abstract: Recent works have introduced LEAPS and HPRL, systems that learn latent spaces of domain-specific languages, which are used to define programmatic policies for partially observable Markov decision processes (POMDPs). These systems induce a latent space while optimizing losses such as the behavior loss, which aim to achieve locality in program behavior, meaning that vectors close in the latent space should correspond to similarly behaving programs. In this paper, we show that the programmatic space, induced by the domain-specific language and requiring no training, presents values for the behavior loss similar to those observed in latent spaces presented in previous work. Moreover, algorithms searching in the programmatic space significantly outperform those in LEAPS and HPRL. To explain our results, we measured the "friendliness" of the two spaces to local search algorithms. We discovered that algorithms are more likely to stop at local maxima when searching in the latent space than when searching in the programmatic space. This implies that the optimization topology of the programmatic space, induced by the reward function in conjunction with the neighborhood function, is more conducive to search than that of the latent space. This result provides an explanation for the superior performance in the programmatic space.
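The abstract's point that a search topology is induced jointly by the reward function and the neighborhood function can be illustrated with a toy sketch. The code below is not the paper's friendliness measure, and it uses neither LEAPS, HPRL, nor any real domain-specific language. It assumes a hypothetical one-dimensional space of eight candidates with made-up rewards, and compares how often greedy hill climbing stops short of the best candidate under two hypothetical neighborhood functions (`step_neighbors` and `jump_neighbors`). All names and values here are illustrative assumptions.

```python
from typing import Callable, Iterable, List


def hill_climb(start: int,
               neighbors: Callable[[int], List[int]],
               reward: Callable[[int], float]) -> int:
    """Greedy local search: move to the best neighbor until none improves."""
    current = start
    while True:
        best = max(neighbors(current), key=reward, default=current)
        if reward(best) <= reward(current):
            return current  # no strictly better neighbor: a local maximum
        current = best


def local_maximum_rate(starts: Iterable[int],
                       neighbors: Callable[[int], List[int]],
                       reward: Callable[[int], float],
                       best_reward: float) -> float:
    """Fraction of starting points whose search ends below best_reward."""
    ends = [hill_climb(s, neighbors, reward) for s in starts]
    stuck = sum(1 for e in ends if reward(e) < best_reward)
    return stuck / len(ends)


# Hypothetical toy landscape (assumed values, not data from the paper).
REWARDS = [0.0, 0.2, 0.1, 0.3, 0.6, 1.0, 0.4, 0.5]


def toy_reward(i: int) -> float:
    return REWARDS[i]


def step_neighbors(i: int) -> List[int]:
    """Narrow neighborhood: candidates one index away."""
    return [j for j in (i - 1, i + 1) if 0 <= j < len(REWARDS)]


def jump_neighbors(i: int) -> List[int]:
    """Wider neighborhood: candidates up to two indices away."""
    return [j for j in (i - 2, i - 1, i + 1, i + 2) if 0 <= j < len(REWARDS)]


if __name__ == "__main__":
    all_starts = range(len(REWARDS))
    print(local_maximum_rate(all_starts, step_neighbors, toy_reward, 1.0))
    print(local_maximum_rate(all_starts, jump_neighbors, toy_reward, 1.0))
```

With the same reward values, the two neighborhood functions give different rates of stopping at a suboptimal local maximum. This is the sense in which a space can be more or less conducive to local search, independent of the reward function alone.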

Comments:	Published as a conference paper at ICLR 2024
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2410.12166 [cs.LG]
	(or arXiv:2410.12166v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2410.12166

Submission history

From: Levi Lelis
[v1] Wed, 16 Oct 2024 02:10:04 UTC (2,079 KB)
