Showing 1–20 of 20 results for author: Kleiman-Weiner, M

Searching in archive cs.

  1. arXiv:2506.19202  [pdf, ps, other]

    cs.RO cs.HC

    Preserving Sense of Agency: User Preferences for Robot Autonomy and User Control across Household Tasks

    Authors: Claire Yang, Heer Patel, Max Kleiman-Weiner, Maya Cakmak

    Abstract: Roboticists often design with the assumption that assistive robots should be fully autonomous. However, it remains unclear whether users prefer highly autonomous robots, as prior work in assistive robotics suggests otherwise. High robot autonomy can reduce the user's sense of agency, which represents feeling in control of one's environment. How much control do users, in fact, want over the actions…

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: Accepted by the 2025 34th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)

  2. arXiv:2506.06166  [pdf, other]

    cs.LG cs.AI cs.CL cs.CY cs.HC

    The Lock-in Hypothesis: Stagnation by Algorithm

    Authors: Tianyi Alex Qiu, Zhonghao He, Tejasveer Chugh, Max Kleiman-Weiner

    Abstract: The training and deployment of large language models (LLMs) create a feedback loop with human users: models learn human beliefs from data, reinforce these beliefs with generated content, reabsorb the reinforced beliefs, and feed them back to users again and again. This dynamic resembles an echo chamber. We hypothesize that this feedback loop entrenches the existing values and beliefs of users, lea…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: ICML 2025, 46 pages

  3. arXiv:2505.21479  [pdf, ps, other]

    cs.CL

    Are Language Models Consequentialist or Deontological Moral Reasoners?

    Authors: Keenan Samway, Max Kleiman-Weiner, David Guzman Piedrahita, Rada Mihalcea, Bernhard Schölkopf, Zhijing Jin

    Abstract: As AI systems increasingly navigate applications in healthcare, law, and governance, understanding how they handle ethically complex scenarios becomes critical. Previous work has mainly examined the moral judgments in large language models (LLMs), rather than their underlying moral reasoning process. In contrast, we focus on a large-scale analysis of the moral reasoning traces provided by LLMs. Fu…

    Submitted 27 May, 2025; originally announced May 2025.

  4. arXiv:2504.12714  [pdf, other]

    cs.MA cs.AI cs.LG

    Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination

    Authors: Kunal Jha, Wilka Carvalho, Yancheng Liang, Simon S. Du, Max Kleiman-Weiner, Natasha Jaques

    Abstract: Zero-shot coordination (ZSC), the ability to adapt to a new partner in a cooperative task, is a critical component of human-compatible AI. While prior work has focused on training agents to cooperate on a single task, these specialized models do not generalize to new tasks, even if they are highly similar. Here, we study how reinforcement learning on a distribution of environments with a single pa…

    Submitted 20 April, 2025; v1 submitted 17 April, 2025; originally announced April 2025.

    Comments: Accepted to CogSci 2025, In-review for ICML 2025

  5. arXiv:2410.16665  [pdf, other]

    cs.CL cs.CY

    SafetyAnalyst: Interpretable, Transparent, and Steerable Safety Moderation for AI Behavior

    Authors: Jing-Jing Li, Valentina Pyatkin, Max Kleiman-Weiner, Liwei Jiang, Nouha Dziri, Anne G. E. Collins, Jana Schaich Borg, Maarten Sap, Yejin Choi, Sydney Levine

    Abstract: The ideal AI safety moderation system would be both structurally interpretable (so its decisions can be reliably explained) and steerable (to align to safety standards and reflect a community's values), which current systems fall short on. To address this gap, we present SafetyAnalyst, a novel AI safety moderation framework. Given an AI behavior, SafetyAnalyst uses chain-of-thought reasoning to an…

    Submitted 27 May, 2025; v1 submitted 21 October, 2024; originally announced October 2024.

    Comments: Accepted to ICML 2025

  6. arXiv:2407.14681  [pdf, other]

    cs.LG cs.AI cs.MA

    Value Internalization: Learning and Generalizing from Social Reward

    Authors: Frieda Rong, Max Kleiman-Weiner

    Abstract: Social rewards shape human behavior. During development, a caregiver guides a learner's behavior towards culturally aligned goals and values. How do these behaviors persist and generalize when the caregiver is no longer present, and the learner must continue autonomously? Here, we propose a model of value internalization where social feedback trains an internal social reward (ISR) model that gener…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: Reinforcement Learning Conference (RLC) 2024 & Cognitive Science Conference Oral

  7. arXiv:2407.02273  [pdf, other]

    cs.CL

    Language Model Alignment in Multilingual Trolley Problems

    Authors: Zhijing Jin, Max Kleiman-Weiner, Giorgio Piatti, Sydney Levine, Jiarui Liu, Fernando Gonzalez, Francesco Ortu, András Strausz, Mrinmaya Sachan, Rada Mihalcea, Yejin Choi, Bernhard Schölkopf

    Abstract: We evaluate the moral alignment of LLMs with human preferences in multilingual trolley problems. Building on the Moral Machine experiment, which captures over 40 million human judgments across 200+ countries, we develop a cross-lingual corpus of moral dilemma vignettes in over 100 languages called MultiTP. This dataset enables the assessment of LLMs' decision-making processes in diverse linguistic…

    Submitted 27 May, 2025; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: ICLR 2025 Spotlight, Best Paper @ NeurIPS 2024 Workshop on Pluralistic Alignment

  8. arXiv:2404.16698  [pdf, other]

    cs.CL

    Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents

    Authors: Giorgio Piatti, Zhijing Jin, Max Kleiman-Weiner, Bernhard Schölkopf, Mrinmaya Sachan, Rada Mihalcea

    Abstract: As AI systems pervade human life, ensuring that large language models (LLMs) make safe decisions remains a significant challenge. We introduce the Governance of the Commons Simulation (GovSim), a generative simulation platform designed to study strategic interactions and cooperative decision-making in LLMs. In GovSim, a society of AI agents must collectively balance exploiting a common resource wi…

    Submitted 8 December, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: NeurIPS 2024

  9. arXiv:2312.04350  [pdf, other]

    cs.CL cs.AI cs.LG

    CLadder: Assessing Causal Reasoning in Language Models

    Authors: Zhijing Jin, Yuen Chen, Felix Leeb, Luigi Gresele, Ojasv Kamal, Zhiheng Lyu, Kevin Blin, Fernando Gonzalez Adauto, Max Kleiman-Weiner, Mrinmaya Sachan, Bernhard Schölkopf

    Abstract: The ability to perform causal reasoning is widely considered a core feature of intelligence. In this work, we investigate whether large language models (LLMs) can coherently reason about causality. Much of the existing work in natural language processing (NLP) focuses on evaluating commonsense causal reasoning in LLMs, thus failing to assess whether a model can perform causal inference in accordan…

    Submitted 17 January, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023; updated with CLadder dataset v1.5

  10. arXiv:2201.12658  [pdf, other]

    cs.LG cs.AI cs.MA

    Learning Intuitive Policies Using Action Features

    Authors: Mingwei Ma, Jizhou Liu, Samuel Sokota, Max Kleiman-Weiner, Jakob Foerster

    Abstract: An unaddressed challenge in multi-agent coordination is to enable AI agents to exploit the semantic relationships between the features of actions and the features of observations. Humans take advantage of these relationships in highly intuitive ways. For instance, in the absence of a shared language, we might point to the object we desire or hold up our fingers to indicate how many objects we want…

    Submitted 5 June, 2023; v1 submitted 29 January, 2022; originally announced January 2022.

    Comments: ICML 2023

  11. When Is It Acceptable to Break the Rules? Knowledge Representation of Moral Judgement Based on Empirical Data

    Authors: Edmond Awad, Sydney Levine, Andrea Loreggia, Nicholas Mattei, Iyad Rahwan, Francesca Rossi, Kartik Talamadupula, Joshua Tenenbaum, Max Kleiman-Weiner

    Abstract: One of the most remarkable things about the human moral mind is its flexibility. We can make moral judgments about cases we have never seen before. We can decide that pre-established rules should be broken. We can invent novel rules on the fly. Capturing this flexibility is one of the central challenges in developing AI systems that can interpret and produce human-like moral judgment. This paper d…

    Submitted 19 January, 2022; originally announced January 2022.

    Journal ref: Journal of Autonomous Agents and Multi-Agent Systems 38, 35 (2024)

  12. arXiv:2106.02164  [pdf, other]

    cs.AI

    Modeling Communication to Coordinate Perspectives in Cooperation

    Authors: Stephanie Stacy, Chenfei Li, Minglu Zhao, Yiling Yun, Qingyi Zhao, Max Kleiman-Weiner, Tao Gao

    Abstract: Communication is highly overloaded. Despite this, even young children are good at leveraging context to understand ambiguous signals. We propose a computational account of overloaded signaling from a shared agency perspective which we call the Imagined We for Communication. Under this framework, communication helps cooperators coordinate their perspectives, allowing them to act together to achieve…

    Submitted 3 June, 2021; originally announced June 2021.

  13. arXiv:2007.10089  [pdf]

    cs.HC

    Antarjami: Exploring psychometric evaluation through a computer-based game

    Authors: Anirban Lahiri, Utanko Mitra, Sunreeta Sen, Mrinal Chakraborty, Max Kleiman-Weiner, Rajlakshmi Guha, Pabitra Mitra, Anupam Basu, Partha Pratim Chakraborty

    Abstract: A number of questionnaire based psychometric testing frameworks are globally for example OCEAN (Five factor) indicator, MBTI (Myers Brigg Type Indicator) etc. However, questionnaire based psychometric tests have some known shortcomings. This work explores whether these shortcomings can be mitigated through computer-based gaming platforms for evaluating psychometric parameters. A computer based psy…

    Submitted 16 July, 2020; originally announced July 2020.

    Comments: Submitted to CogSci 2020

  14. arXiv:2003.11778  [pdf, other]

    cs.AI cs.LG cs.MA

    Too many cooks: Bayesian inference for coordinating multi-agent collaboration

    Authors: Rose E. Wang, Sarah A. Wu, James A. Evans, Joshua B. Tenenbaum, David C. Parkes, Max Kleiman-Weiner

    Abstract: Collaboration requires agents to coordinate their behavior on the fly, sometimes cooperating to solve a single task together and other times dividing it up into sub-tasks to work on in parallel. Underlying the human ability to collaborate is theory-of-mind, the ability to infer the hidden mental states that drive others to act. Here, we develop Bayesian Delegation, a decentralized multi-agent lear…

    Submitted 5 July, 2020; v1 submitted 26 March, 2020; originally announced March 2020.

    Comments: Rose E. Wang and Sarah A. Wu contributed equally

  15. arXiv:1906.02330  [pdf, other]

    cs.LG cs.MA stat.ML

    Finding Friend and Foe in Multi-Agent Games

    Authors: Jack Serrino, Max Kleiman-Weiner, David C. Parkes, Joshua B. Tenenbaum

    Abstract: Recent breakthroughs in AI for multi-agent games like Go, Poker, and Dota, have seen great strides in recent years. Yet none of these games address the real-life challenge of cooperation in the presence of unknown and uncertain teammates. This challenge is a key game mechanism in hidden role games. Here we develop the DeepRole algorithm, a multi-agent reinforcement learning agent that we test on T…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: Jack Serrino and Max Kleiman-Weiner contributed equally

  16. arXiv:1901.06085  [pdf, other]

    cs.AI cs.MA

    Theory of Minds: Understanding Behavior in Groups Through Inverse Planning

    Authors: Michael Shum, Max Kleiman-Weiner, Michael L. Littman, Joshua B. Tenenbaum

    Abstract: Human social behavior is structured by relationships. We form teams, groups, tribes, and alliances at all scales of human life. These structures guide multi-agent cooperation and competition, but when we observe others these underlying relationships are typically unobservable and hence must be inferred. Humans make these inferences intuitively and flexibly, often making rapid generalizations about…

    Submitted 17 January, 2019; originally announced January 2019.

    Comments: published in AAAI 2019; Michael Shum and Max Kleiman-Weiner contributed equally

  17. arXiv:1810.05903  [pdf, ps, other]

    cs.AI cs.CY

    Towards Formal Definitions of Blameworthiness, Intention, and Moral Responsibility

    Authors: Joseph Y. Halpern, Max Kleiman-Weiner

    Abstract: We provide formal definitions of degree of blameworthiness and intention relative to an epistemic state (a probability over causal models and a utility function on outcomes). These, together with a definition of actual causality, provide the key ingredients for moral responsibility judgments. We show that these definitions give insight into commonsense intuitions in a variety of puzzling cases fro…

    Submitted 13 October, 2018; originally announced October 2018.

    Comments: Appears in AAAI-18

  18. arXiv:1808.02093  [pdf, other]

    cs.AI cs.IT cs.LG cs.MA stat.ML

    Learning to Share and Hide Intentions using Information Regularization

    Authors: DJ Strouse, Max Kleiman-Weiner, Josh Tenenbaum, Matt Botvinick, David Schwab

    Abstract: Learning to cooperate with friends and compete with foes is a key component of multi-agent reinforcement learning. Typically to do so, one requires access to either a model of or interaction with the other agent(s). Here we show how to learn effective strategies for cooperation and competition in an asymmetric information game with no such model or interaction. Our approach is to encourage an agen…

    Submitted 1 January, 2019; v1 submitted 6 August, 2018; originally announced August 2018.

    Comments: Presented at the 32nd Conference on Neural Information Processing Systems (NIPS 2018)

  19. arXiv:1803.07170  [pdf, other]

    cs.AI cs.CY

    Blaming humans in autonomous vehicle accidents: Shared responsibility across levels of automation

    Authors: Edmond Awad, Sydney Levine, Max Kleiman-Weiner, Sohan Dsouza, Joshua B. Tenenbaum, Azim Shariff, Jean-François Bonnefon, Iyad Rahwan

    Abstract: When a semi-autonomous car crashes and harms someone, how are blame and causal responsibility distributed across the human and machine drivers? In this article, we consider cases in which a pedestrian was hit and killed by a car being operated under shared control of a primary and a secondary driver. We find that when only one driver makes an error, that driver receives the blame and is considered…

    Submitted 21 March, 2018; v1 submitted 19 March, 2018; originally announced March 2018.

  20. arXiv:1801.04346  [pdf, other]

    cs.AI

    A Computational Model of Commonsense Moral Decision Making

    Authors: Richard Kim, Max Kleiman-Weiner, Andres Abeliuk, Edmond Awad, Sohan Dsouza, Josh Tenenbaum, Iyad Rahwan

    Abstract: We introduce a new computational model of moral decision making, drawing on a recent theory of commonsense moral learning via social dynamics. Our model describes moral dilemmas as a utility function that computes trade-offs in values over abstract moral dimensions, which provide interpretable parameter values when implemented in machine-led ethical decision-making. Moreover, characterizing the so…

    Submitted 12 January, 2018; originally announced January 2018.