Pessimistic Iterative Planning for Robust POMDPs

Galesloot, Maris F. L.; Suilen, Marnix; Simão, Thiago D.; Carr, Steven; Spaan, Matthijs T. J.; Topcu, Ufuk; Jansen, Nils

arXiv:2408.08770 (cs)

[Submitted on 16 Aug 2024 (v1), last revised 12 Nov 2024 (this version, v3)]

Abstract: Robust POMDPs extend classical POMDPs to handle model uncertainty. Specifically, robust POMDPs exhibit so-called uncertainty sets on the transition and observation models, effectively defining ranges of probabilities. Policies for robust POMDPs must be (1) memory-based to account for partial observability and (2) robust against model uncertainty to account for the worst-case instances from the uncertainty sets. To compute such robust memory-based policies, we propose the pessimistic iterative planning (PIP) framework, which alternates between two main steps: (1) selecting a pessimistic (non-robust) POMDP via worst-case probability instances from the uncertainty sets; and (2) computing a finite-state controller (FSC) for this pessimistic POMDP. We evaluate the performance of this FSC on the original robust POMDP and use this evaluation in step (1) to select the next pessimistic POMDP. Within PIP, we propose the rFSCNet algorithm. In each iteration, rFSCNet finds an FSC through a recurrent neural network by using supervision policies optimized for the pessimistic POMDP. The empirical evaluation in four benchmark environments showcases improved robustness against several baseline methods and competitive performance compared to a state-of-the-art robust POMDP solver.
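To make the notion of a "worst-case probability instance" from a range of probabilities concrete, the following is a minimal sketch and not the paper's procedure. It assumes an interval uncertainty set over the successors of a single state-action pair and a reward-maximizing agent, and it uses the standard greedy rule for interval models: start from the lower bounds, then give the remaining probability mass to the lowest-valued successors first. The function name `worst_case_distribution` and the example numbers are hypothetical; the abstract does not specify how step (1) of PIP selects its instances.

```python
def worst_case_distribution(lower, upper, values, tol=1e-9):
    """Return p with lower[i] <= p[i] <= upper[i] and sum(p) == 1
    that minimizes sum(p[i] * values[i]).

    Illustrative assumption: interval uncertainty over one
    state-action pair, with `values` scoring each successor.
    """
    if any(lo > hi for lo, hi in zip(lower, upper)):
        raise ValueError("each lower bound must not exceed its upper bound")
    if sum(lower) > 1.0 + tol or sum(upper) < 1.0 - tol:
        raise ValueError("intervals admit no probability distribution")

    p = list(lower)                       # every successor gets its minimum
    budget = max(0.0, 1.0 - sum(lower))   # mass still to be distributed

    # Pessimistic choice: fill the lowest-valued successors first.
    for i in sorted(range(len(values)), key=lambda i: values[i]):
        extra = min(upper[i] - lower[i], budget)
        p[i] += extra
        budget -= extra
    return p


# Hypothetical example: successor 1 has the lowest value,
# so it receives as much probability mass as its interval allows.
p = worst_case_distribution(
    lower=[0.1, 0.2, 0.3],
    upper=[0.5, 0.6, 0.7],
    values=[10.0, 0.0, 5.0],
)
print(p)
```

In an alternating scheme like the one the abstract describes, the values would come from evaluating the current controller on the robust model, so the selected instance changes from one iteration to the next.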

Subjects:	Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2408.08770 [cs.AI]
	(or arXiv:2408.08770v3 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2408.08770

Submission history

From: Maris Galesloot
[v1] Fri, 16 Aug 2024 14:25:20 UTC (187 KB)
[v2] Mon, 30 Sep 2024 15:30:10 UTC (1,980 KB)
[v3] Tue, 12 Nov 2024 13:50:05 UTC (2,247 KB)
