Reinforcement Learning from Human Feedback: Whose Culture, Whose Values, Whose Perspectives?

Barman, Kristian González; Lohse, Simon; de Regt, Henk

doi:10.1007/s13347-025-00861-0

arXiv:2407.17482 (cs)

[Submitted on 2 Jul 2024 (v1), last revised 17 Jan 2025 (this version, v2)]

Authors: Kristian González Barman, Simon Lohse, Henk de Regt

Abstract: We argue for the epistemic and ethical advantages of pluralism in Reinforcement Learning from Human Feedback (RLHF) in the context of Large Language Models (LLMs). Drawing on social epistemology and pluralist philosophy of science, we suggest ways in which RLHF can be made more responsive to human needs and how we can address challenges along the way. The paper concludes with an agenda for change, i.e., concrete, actionable steps to improve LLM development.

Subjects:	Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2407.17482 [cs.CY]
	(or arXiv:2407.17482v2 [cs.CY] for this version)
	https://doi.org/10.48550/arXiv.2407.17482
Journal reference:	González Barman, K., Lohse, S. & de Regt, H.W. Reinforcement Learning from Human Feedback in LLMs: Whose Culture, Whose Values, Whose Perspectives?. Philos. Technol. 38, 35 (2025)
Related DOI:	https://doi.org/10.1007/s13347-025-00861-0

Submission history

From: Kristian Gonzalez Barman
[v1] Tue, 2 Jul 2024 08:07:27 UTC (417 KB)
[v2] Fri, 17 Jan 2025 09:17:30 UTC (547 KB)
