Stackelberg POMDP: A Reinforcement Learning Approach for Economic Design

Brero, Gianluca; Eden, Alon; Chakrabarti, Darshan; Gerstgrasser, Matthias; Greenwald, Amy; Li, Vincent; Parkes, David C.

arXiv:2210.03852 (cs)

[Submitted on 7 Oct 2022 (v1), last revised 19 Jul 2024 (this version, v4)]

Abstract: We introduce a reinforcement learning framework for economic design where the interaction between the environment designer and the participants is modeled as a Stackelberg game. In this game, the designer (leader) sets up the rules of the economic system, while the participants (followers) respond strategically. We integrate algorithms for determining followers' response strategies into the leader's learning environment, providing a formulation of the leader's learning problem as a POMDP that we call the Stackelberg POMDP. We prove that the optimal leader's strategy in the Stackelberg game is the optimal policy in our Stackelberg POMDP under a limited set of possible policies, establishing a connection between solving POMDPs and Stackelberg games. We solve our POMDP under a limited set of policy options via the centralized training with decentralized execution framework. For the specific case of followers that are modeled as no-regret learners, we solve an array of increasingly complex settings, including problems of indirect mechanism design where there is turn-taking and limited communication by agents. We demonstrate the effectiveness of our training framework through ablation studies. We also give convergence results for no-regret learners to a Bayesian version of a coarse-correlated equilibrium, extending known results to correlated types.
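
To make the leader/follower structure in the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's method or code. It assumes a hypothetical posted-price setting in which the leader commits to a single price, each follower type runs Hedge (multiplicative weights, a standard no-regret algorithm) over the two actions "buy" and "pass" with full-information feedback, and the leader's payoff is the expected revenue once the follower's learning has run. A grid search over prices stands in for the leader's policy learning. All function names, the step size `eta`, and the number of rounds are illustrative assumptions.

```python
import math


def hedge_buy_probability(value, price, rounds=200, eta=0.5):
    """Run Hedge over the actions {buy, pass} against a fixed posted price
    and return the follower's final probability of buying.

    Assumption: full-information feedback, so the learner sees the utility
    of both actions every round.
    """
    # Track cumulative utilities (log-weights) instead of raw weights
    # so that the exponentials cannot overflow.
    cumulative = {"buy": 0.0, "pass": 0.0}
    for _ in range(rounds):
        cumulative["buy"] += value - price  # utility of buying at this price
        cumulative["pass"] += 0.0           # utility of walking away
    top = max(cumulative.values())
    weights = {a: math.exp(eta * (u - top)) for a, u in cumulative.items()}
    return weights["buy"] / sum(weights.values())


def leader_revenue(price, values):
    """Leader's payoff: average revenue over follower types after each
    type's learning dynamics have run inside the leader's environment."""
    return sum(price * hedge_buy_probability(v, price) for v in values) / len(values)


def best_price(values, grid):
    """Stand-in for leader policy learning: pick the price on a grid that
    maximizes revenue against the learned follower responses."""
    return max(grid, key=lambda p: leader_revenue(p, values))


if __name__ == "__main__":
    follower_values = [1.0, 2.0, 3.0]
    price_grid = [0.5, 0.9, 1.5, 1.9, 2.5, 2.9]
    print(best_price(follower_values, price_grid))
```

The point of the sketch is only the nesting: the follower's learning algorithm runs inside the function that evaluates the leader's choice, which is the idea the abstract describes as integrating followers' response algorithms into the leader's learning environment. The paper's settings involve multiple interacting followers and a learned leader policy rather than a grid search.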

Subjects:	Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA)
Cite as:	arXiv:2210.03852 [cs.GT]
	(or arXiv:2210.03852v4 [cs.GT] for this version)
	https://doi.org/10.48550/arXiv.2210.03852

Submission history

From: Gianluca Brero
[v1] Fri, 7 Oct 2022 23:55:51 UTC (2,857 KB)
[v2] Wed, 15 Feb 2023 17:18:32 UTC (3,326 KB)
[v3] Fri, 10 Nov 2023 00:36:59 UTC (4,073 KB)
[v4] Fri, 19 Jul 2024 11:35:43 UTC (375 KB)
