Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach

Meli, Daniele; Castellini, Alberto; Farinelli, Alessandro

doi:10.1613/jair.1.15826

Computer Science > Artificial Intelligence

arXiv:2402.19265 (cs)

[Submitted on 29 Feb 2024]

Title: Learning Logic Specifications for Policy Guidance in POMDPs: an Inductive Logic Programming Approach

Authors: Daniele Meli, Alberto Castellini, Alessandro Farinelli

Abstract: Partially Observable Markov Decision Processes (POMDPs) are a powerful framework for planning under uncertainty. They model state uncertainty as a belief, i.e., a probability distribution over states. Approximate solvers based on Monte Carlo sampling have been very successful at reducing the computational demand and enabling online planning. However, scaling to complex realistic domains with many actions and long planning horizons is still a major challenge, and a key to good performance is guiding the action-selection process with policy heuristics tailored to the specific application domain. We propose to learn high-quality heuristics from POMDP execution traces generated by any solver. We convert the belief-action pairs to a logical semantics, and exploit data- and time-efficient Inductive Logic Programming (ILP) to generate interpretable belief-based policy specifications, which are then used as online heuristics. We thoroughly evaluate our methodology on two notoriously challenging POMDP problems involving large action spaces and long planning horizons, namely rocksample and pocman. Considering different state-of-the-art online POMDP solvers, including POMCP, DESPOT and AdaOPS, we show that learned heuristics expressed in Answer Set Programming (ASP) yield performance superior to neural networks and similar to optimal handcrafted task-specific heuristics, in less computational time. Moreover, they generalize well to more challenging scenarios not experienced in the training phase (e.g., increasing the number of rocks and the grid size in rocksample, or increasing the size of the map and the aggressiveness of ghosts in pocman).
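
As a rough illustration of what "converting belief-action pairs to a logical semantics" could look like, the following Python sketch discretizes a rocksample-like belief into ground atoms and applies one hand-written rule as an action heuristic. This is a sketch under stated assumptions, not the authors' encoding: the predicate names `guess` and `sample`, the percentage thresholds, and both helper functions are hypothetical, and the rule is written by hand here rather than learned by an ILP system.

```python
from typing import Dict, List, Tuple

# Hypothetical belief: probability that each rock is valuable (rocksample-like).
Belief = Dict[int, float]


def belief_to_atoms(belief: Belief, thresholds: Tuple[int, ...] = (30, 60, 90)) -> List[str]:
    """Map each rock's probability to ground atoms such as 'guess(1,60)'.

    An atom guess(R,T) is emitted when P(rock R is valuable) >= T percent.
    The predicate name and the thresholds are illustrative assumptions.
    """
    atoms = []
    for rock, prob in sorted(belief.items()):
        percent = round(prob * 100)
        for t in thresholds:
            if percent >= t:
                atoms.append(f"guess({rock},{t})")
    return atoms


def suggested_actions(atoms: List[str], sample_threshold: int = 90) -> List[str]:
    """Apply one hand-written rule in the spirit of an ASP specification:

        sample(R) :- guess(R,T), T >= sample_threshold.

    A real pipeline would learn such rules from traces and evaluate them
    with an ASP solver; this loop only mimics that single rule.
    """
    actions = []
    for atom in atoms:
        rock, t = atom[len("guess("):-1].split(",")
        if int(t) >= sample_threshold:
            actions.append(f"sample({rock})")
    return actions


belief = {1: 0.95, 2: 0.40}
atoms = belief_to_atoms(belief)
print(atoms)
print(suggested_actions(atoms))
```

In this toy setting, the suggested actions would be passed to the online solver as preferred actions during search, which is the role the abstract assigns to the learned specifications.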

Subjects:	Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Logic in Computer Science (cs.LO)
Cite as:	arXiv:2402.19265 [cs.AI]
	(or arXiv:2402.19265v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2402.19265
Journal reference:	Journal of Artificial Intelligence Research, volume 79 (2024), pp. 725-776
Related DOI:	https://doi.org/10.1613/jair.1.15826

Submission history

From: Daniele Meli [view email]
[v1] Thu, 29 Feb 2024 15:36:01 UTC (6,437 KB)
