Don't lie to your friends: Learning what you know from collaborative self-play

Eisenstein, Jacob; Aghajani, Reza; Fisch, Adam; Dua, Dheeru; Huot, Fantine; Lapata, Mirella; Zayats, Vicky; Berant, Jonathan

Computer Science > Machine Learning

arXiv:2503.14481 (cs)

[Submitted on 18 Mar 2025 (v1), last revised 31 Mar 2025 (this version, v2)]

Abstract: To be helpful assistants, AI agents must be aware of their own capabilities and limitations. This includes knowing when to answer from parametric knowledge versus using tools, when to trust tool outputs, and when to abstain or hedge. Such capabilities are hard to teach through supervised fine-tuning because they require constructing examples that reflect the agent's specific capabilities. We therefore propose a radically new approach to teaching agents what they know: collaborative self-play. We construct multi-agent collaborations in which the group is rewarded for collectively arriving at correct answers. The desired meta-knowledge emerges from the incentives built into the structure of the interaction. We focus on small societies of agents that have access to heterogeneous tools (corpus-specific retrieval), and therefore must collaborate to maximize their success while minimizing their effort. Experiments show that group-level rewards for multi-agent communities can induce policies that transfer to improve tool use and selective prediction in settings where individual agents are deployed in isolation.
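
The group-level incentive described in the abstract can be illustrated with a minimal sketch. The abstract does not give the reward definition, so everything here is an assumption made for illustration: the `AgentTurn` structure, the exact-match correctness check, and the linear `tool_cost` penalty are hypothetical and are not taken from the paper. The sketch only shows the general idea that every agent shares one reward, which rises when the group answers correctly and falls with the effort (tool calls) spent.

```python
from dataclasses import dataclass


@dataclass
class AgentTurn:
    """One agent's contribution to a collaborative episode (hypothetical structure)."""
    agent_id: str
    used_tool: bool  # True if the agent called its retrieval tool this turn
    message: str     # What the agent told the rest of the group


def group_reward(turns, final_answer, gold_answer, tool_cost=0.1):
    """Shared reward: 1.0 for a correct group answer, minus a cost per tool call.

    The exact-match check and the linear tool_cost penalty are illustrative
    assumptions, not the reward used in the paper.
    """
    correct = 1.0 if final_answer.strip().lower() == gold_answer.strip().lower() else 0.0
    effort = tool_cost * sum(1 for t in turns if t.used_tool)
    return correct - effort


# A toy episode: one helper retrieves, the other honestly reports uncertainty.
episode = [
    AgentTurn("helper_a", used_tool=True, message="My corpus says Paris."),
    AgentTurn("helper_b", used_tool=False, message="My corpus does not cover this."),
]
reward = group_reward(episode, final_answer="Paris", gold_answer="paris")
```

Under this assumed reward, an agent that bluffs risks lowering the score for the whole group, and an agent that retrieves unnecessarily pays the effort cost, which is the kind of pressure the abstract describes as shaping tool use and abstention.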

Subjects:	Machine Learning (cs.LG); Computation and Language (cs.CL)
Cite as:	arXiv:2503.14481 [cs.LG]
	(or arXiv:2503.14481v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2503.14481

Submission history

From: Jacob Eisenstein
[v1] Tue, 18 Mar 2025 17:53:20 UTC (171 KB)
[v2] Mon, 31 Mar 2025 21:28:02 UTC (163 KB)
