Multilinguality in LLM-Designed Reward Functions for Restless Bandits: Effects on Task Performance and Fairness

Parthasarathy, Ambreesh; Subramanian, Chandrasekar; Senrayan, Ganesh; Adappanavar, Shreyash; Taneja, Aparna; Ravindran, Balaraman; Tambe, Milind

Computer Science > Computation and Language

arXiv:2501.13120 (cs)

[Submitted on 20 Jan 2025]

Title:Multilinguality in LLM-Designed Reward Functions for Restless Bandits: Effects on Task Performance and Fairness

Authors:Ambreesh Parthasarathy, Chandrasekar Subramanian, Ganesh Senrayan, Shreyash Adappanavar, Aparna Taneja, Balaraman Ravindran, Milind Tambe

View PDF HTML (experimental)

Abstract:Restless Multi-Armed Bandits (RMABs) have been successfully applied to resource allocation problems in a variety of settings, including public health. With the rapid development of powerful large language models (LLMs), they are increasingly used to design reward functions to better match human preferences. Recent work has shown that LLMs can be used to tailor automated allocation decisions to community needs using language prompts. However, this has been studied primarily for English prompts and with a focus on task performance only. This can be an issue since grassroots workers, especially in developing countries like India, prefer to work in local languages, some of which are low-resource. Further, given the nature of the problem, biases along population groups unintended by the user are also undesirable. In this work, we study the effects on both task performance and fairness when the DLM algorithm, a recent work on using LLMs to design reward functions for RMABs, is prompted with non-English language commands. Specifically, we run the model on a synthetic environment for various prompts translated into multiple languages. The prompts themselves vary in complexity. Our results show that the LLM-proposed reward functions are significantly better when prompted in English compared to other languages. We also find that the exact phrasing of the prompt impacts task performance. Further, as prompt complexity increases, performance worsens for all languages; however, it is more robust with English prompts than with lower-resource languages. On the fairness side, we find that low-resource languages and more complex prompts are both highly likely to create unfairness along unintended dimensions.

Comments:	Accepted at the AAAI-2025 Deployable AI Workshop
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
Cite as:	arXiv:2501.13120 [cs.CL]
	(or arXiv:2501.13120v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2501.13120

Submission history

From: Chandrasekar Subramanian [view email]
[v1] Mon, 20 Jan 2025 18:14:37 UTC (12,407 KB)

Computer Science > Computation and Language

Title:Multilinguality in LLM-Designed Reward Functions for Restless Bandits: Effects on Task Performance and Fairness

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computation and Language

Title:Multilinguality in LLM-Designed Reward Functions for Restless Bandits: Effects on Task Performance and Fairness

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators