Offline Reinforcement Learning via Inverse Optimization

Dimanidis, Ioannis; Ok, Tolga; Esfahani, Peyman Mohajerin


arXiv:2502.20030 (cs)

[Submitted on 27 Feb 2025]



Abstract: Inspired by the recent successes of Inverse Optimization (IO) across various application domains, we propose a novel offline Reinforcement Learning (ORL) algorithm for continuous state and action spaces, leveraging the convex loss function called "sub-optimality loss" from the IO literature. To mitigate the distribution shift commonly observed in ORL problems, we further employ a robust and non-causal Model Predictive Control (MPC) expert steering a nominal model of the dynamics using in-hindsight information stemming from the model mismatch. Unlike the existing literature, our robust MPC expert enjoys an exact and tractable convex reformulation. In the second part of this study, we show that the IO hypothesis class, trained by the proposed convex loss function, enjoys ample expressiveness and achieves competitive performance compared with the state-of-the-art (SOTA) methods in the low-data regime of the MuJoCo benchmark while utilizing three orders of magnitude fewer parameters, thereby requiring significantly fewer computational resources. To facilitate the reproducibility of our results, we provide an open-source package implementing the proposed algorithms and the experiments.
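
For readers unfamiliar with the term, the sub-optimality loss in the IO literature is commonly defined as the gap between the cost of the observed decision and the optimal cost under the candidate parameters: F_θ(s, x̂) − min_x F_θ(s, x). When F_θ is linear in θ, this gap is convex in θ, because the subtracted term is a pointwise minimum of linear functions. The sketch below evaluates that gap for an assumed unconstrained quadratic cost F_θ(s, x) = xᵀQx + 2xᵀRs with θ = (Q, R) and Q positive definite. The cost form, the function name `suboptimality_loss`, and the numbers are illustrative assumptions, not the hypothesis class or the implementation from the paper.

```python
import numpy as np


def suboptimality_loss(Q, R, s, x_hat):
    """Gap F(s, x_hat) - min_x F(s, x) for the assumed cost
    F(s, x) = x'Qx + 2 x'Rs, with Q positive definite and x unconstrained."""

    def cost(x):
        return float(x @ Q @ x + 2.0 * x @ R @ s)

    # Unconstrained minimizer: the gradient 2Qx + 2Rs vanishes at x = -Q^{-1} R s.
    x_star = -np.linalg.solve(Q, R @ s)
    return cost(x_hat) - cost(x_star)


# Hypothetical parameters: action dimension 2, state dimension 3.
Q = np.array([[2.0, 0.0], [0.0, 1.0]])
R = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
s = np.array([1.0, -2.0, 0.5])

x_opt = -np.linalg.solve(Q, R @ s)            # action that is optimal under (Q, R)
loss_at_opt = suboptimality_loss(Q, R, s, x_opt)
loss_off_opt = suboptimality_loss(Q, R, s, x_opt + np.array([1.0, 1.0]))
```

In this unconstrained quadratic case the gap reduces to dᵀQd, where d is the deviation of the observed action from the minimizer, so it is zero exactly when the observed action is optimal under the candidate parameters.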

Comments:	preprint
Subjects:	Machine Learning (cs.LG); Systems and Control (eess.SY); Optimization and Control (math.OC)
Cite as:	arXiv:2502.20030 [cs.LG]
	(or arXiv:2502.20030v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2502.20030

Submission history

From: Tolga Ok
[v1] Thu, 27 Feb 2025 12:11:44 UTC (454 KB)
