
Showing 1–34 of 34 results for author: McKee, K

Searching in archive cs.
  1. arXiv:2505.24477  [pdf, ps, other]

    cs.CY cs.AI cs.LG

    Evaluating Gemini in an arena for learning

    Authors: LearnLM Team, Abhinit Modi, Aditya Srikanth Veerubhotla, Aliya Rysbek, Andrea Huber, Ankit Anand, Avishkar Bhoopchand, Brett Wiltshire, Daniel Gillick, Daniel Kasenberg, Eleni Sgouritsa, Gal Elidan, Hengrui Liu, Holger Winnemoeller, Irina Jurenka, James Cohan, Jennifer She, Julia Wilkowski, Kaiz Alarakyia, Kevin R. McKee, Komal Singh, Lisa Wang, Markus Kunesch, Miruna Pîslar, Niv Efron, et al. (12 additional authors not shown)

    Abstract: Artificial intelligence (AI) is poised to transform education, but the research community lacks a robust, general benchmark to evaluate AI models for learning. To assess state-of-the-art support for educational use cases, we ran an "arena for learning" where educators and pedagogy experts conduct blind, head-to-head, multi-turn comparisons of leading AI models. In particular, $N = 189$ educators d…

    Submitted 30 May, 2025; originally announced May 2025.

  2. arXiv:2503.19815  [pdf, other]

    cs.AI cs.NE

    Thinking agents for zero-shot generalization to qualitatively novel tasks

    Authors: Thomas Miconi, Kevin McKee, Yicong Zheng, Jed McCaleb

    Abstract: Intelligent organisms can solve truly novel problems which they have never encountered before, either in their lifetime or their evolution. An important component of this capacity is the ability to "think", that is, to mentally manipulate objects, concepts and behaviors in order to plan and evaluate possible solutions to novel problems, even without environment interaction. To generate problems…

    Submitted 25 March, 2025; originally announced March 2025.

  3. arXiv:2503.02831  [pdf, other]

    cs.LG

    Meta-Learning to Explore via Memory Density Feedback

    Authors: Kevin L. McKee

    Abstract: Exploration algorithms for reinforcement learning typically replace or augment the reward function with an additional "intrinsic" reward that trains the agent to seek previously unseen states of the environment. Here, we consider an exploration algorithm that exploits meta-learning, or learning to learn, such that the agent learns to maximize its exploration progress within a single episode, eve…

    Submitted 4 March, 2025; originally announced March 2025.

    Comments: 15 pages, 6 figures

  4. arXiv:2503.02303  [pdf, other]

    cs.NE cs.AI cs.LG

    Flexible Prefrontal Control over Hippocampal Episodic Memory for Goal-Directed Generalization

    Authors: Yicong Zheng, Nora Wolf, Charan Ranganath, Randall C. O'Reilly, Kevin L. McKee

    Abstract: Many tasks require flexibly modifying perception and behavior based on current goals. Humans can retrieve episodic memories from days to years ago, using them to contextualize and generalize behaviors across novel but structurally related situations. The brain's ability to control episodic memories based on task demands is often attributed to interactions between the prefrontal cortex (PFC) and hi…

    Submitted 18 May, 2025; v1 submitted 4 March, 2025; originally announced March 2025.

  5. arXiv:2502.21229  [pdf, other]

    cs.LG

    A Method of Selective Attention for Reservoir Based Agents

    Authors: Kevin McKee

    Abstract: Training of deep reinforcement learning agents is slowed considerably by the presence of input dimensions that do not usefully condition the reward function. Existing modules such as layer normalization can be trained with weight decay to act as a form of selective attention, i.e. an input mask, that shrinks the scale of unnecessary inputs, which in turn accelerates training of the policy. However…

    Submitted 28 February, 2025; originally announced February 2025.

    Comments: 6 pages, 2 figures

  6. arXiv:2502.07077  [pdf, other]

    cs.CL cs.CY cs.HC

    Multi-turn Evaluation of Anthropomorphic Behaviours in Large Language Models

    Authors: Lujain Ibrahim, Canfer Akbulut, Rasmi Elasmar, Charvi Rastogi, Minsuk Kahng, Meredith Ringel Morris, Kevin R. McKee, Verena Rieser, Murray Shanahan, Laura Weidinger

    Abstract: The tendency of users to anthropomorphise large language models (LLMs) is of growing interest to AI developers, researchers, and policy-makers. Here, we present a novel method for empirically evaluating anthropomorphic LLM behaviours in realistic and varied settings. Going beyond single-turn static benchmarks, we contribute three methodological advances in state-of-the-art (SOTA) LLM evaluation. F…

    Submitted 10 February, 2025; originally announced February 2025.

  7. arXiv:2412.16429  [pdf, other]

    cs.CY cs.AI cs.LG

    LearnLM: Improving Gemini for Learning

    Authors: LearnLM Team, Abhinit Modi, Aditya Srikanth Veerubhotla, Aliya Rysbek, Andrea Huber, Brett Wiltshire, Brian Veprek, Daniel Gillick, Daniel Kasenberg, Derek Ahmed, Irina Jurenka, James Cohan, Jennifer She, Julia Wilkowski, Kaiz Alarakyia, Kevin R. McKee, Lisa Wang, Markus Kunesch, Mike Schaekermann, Miruna Pîslar, Nikhil Joshi, Parsa Mahmoudieh, Paul Jhun, Sara Wiltberger, Shakir Mohamed, et al. (21 additional authors not shown)

    Abstract: Today's generative AI systems are tuned to present information by default rather than engage users in service of learning as a human tutor would. To address the wide range of potential education use cases for these systems, we reframe the challenge of injecting pedagogical behavior as one of pedagogical instruction following, where training and evaluation examples include system-level ins…

    Submitted 25 December, 2024; v1 submitted 20 December, 2024; originally announced December 2024.

  8. arXiv:2412.13093  [pdf, other]

    cs.LG

    Reservoir Computing for Fast, Simplified Reinforcement Learning on Memory Tasks

    Authors: Kevin McKee

    Abstract: Tasks in which rewards depend upon past information not available in the current observation set can only be solved by agents that are equipped with short-term memory. Usual choices for memory modules include trainable recurrent hidden layers, often with gated memory. Reservoir computing presents an alternative, in which a recurrent layer is not trained, but rather has a set of fixed, sparse recur…

    Submitted 17 December, 2024; originally announced December 2024.

    Comments: 9 pages, 6 figures

  9. arXiv:2407.16895  [pdf, other]

    cs.CY cs.AI

    (Unfair) Norms in Fairness Research: A Meta-Analysis

    Authors: Jennifer Chien, A. Stevie Bergman, Kevin R. McKee, Nenad Tomasev, Vinodkumar Prabhakaran, Rida Qadri, Nahema Marchal, William Isaac

    Abstract: Algorithmic fairness has emerged as a critical concern in artificial intelligence (AI) research. However, the development of fair AI systems is not an objective process. Fairness is an inherently subjective concept, shaped by the values, experiences, and identities of those involved in research and development. To better understand the norms and values embedded in current fairness research, we con…

    Submitted 17 June, 2024; originally announced July 2024.

  10. arXiv:2407.12687  [pdf, other]

    cs.CY cs.AI cs.LG

    Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach

    Authors: Irina Jurenka, Markus Kunesch, Kevin R. McKee, Daniel Gillick, Shaojian Zhu, Sara Wiltberger, Shubham Milind Phal, Katherine Hermann, Daniel Kasenberg, Avishkar Bhoopchand, Ankit Anand, Miruna Pîslar, Stephanie Chan, Lisa Wang, Jennifer She, Parsa Mahmoudieh, Aliya Rysbek, Wei-Jen Ko, Andrea Huber, Brett Wiltshire, Gal Elidan, Roni Rabin, Jasmin Rubinovitz, Amit Pitaru, Mac McAllister, et al. (49 additional authors not shown)

    Abstract: A major challenge facing the world is the provision of equitable and universal access to quality education. Recent advances in generative AI (gen AI) have created excitement about the potential of new technologies to offer a personal tutor for every learner and a teaching assistant for every teacher. The full extent of this dream, however, has not yet materialised. We argue that this is primarily…

    Submitted 19 July, 2024; v1 submitted 21 May, 2024; originally announced July 2024.

  11. A social path to human-like artificial intelligence

    Authors: Edgar A. Duéñez-Guzmán, Suzanne Sadedin, Jane X. Wang, Kevin R. McKee, Joel Z. Leibo

    Abstract: Traditionally, cognitive and computer scientists have viewed intelligence solipsistically, as a property of unitary agents devoid of social context. Given the success of contemporary learning algorithms, we argue that the bottleneck in artificial intelligence (AI) progress is shifting from data assimilation to novel data generation. We bring together evidence showing that natural intelligence emer…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 17 pages, 2 figures, 1 box

    MSC Class: 68T05 ACM Class: I.2.6

  12. arXiv:2403.14467  [pdf, other]

    cs.HC cs.CL cs.CY

    Recourse for reclamation: Chatting with generative language models

    Authors: Jennifer Chien, Kevin R. McKee, Jackie Kay, William Isaac

    Abstract: Researchers and developers increasingly rely on toxicity scoring to moderate generative language model outputs, in settings such as customer service, information retrieval, and content generation. However, toxicity scoring may render pertinent information inaccessible, rigidify or "value-lock" cultural norms, and prevent language reclamation processes, particularly for marginalized people. In this…

    Submitted 21 April, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI EA 2024)

  13. The illusion of artificial inclusion

    Authors: William Agnew, A. Stevie Bergman, Jennifer Chien, Mark Díaz, Seliem El-Sayed, Jaylen Pittman, Shakir Mohamed, Kevin R. McKee

    Abstract: Human participants play a central role in the development of modern artificial intelligence (AI) technology, in psychological science, and in user research. Recent advances in generative AI have attracted growing interest to the possibility of replacing human participants in these domains with AI surrogates. We survey several such "substitution proposals" to better understand the arguments for and…

    Submitted 5 February, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI 2024)

  14. Human participants in AI research: Ethics and transparency in practice

    Authors: Kevin R. McKee

    Abstract: In recent years, research involving human participants has been critical to advances in artificial intelligence (AI) and machine learning (ML), particularly in the areas of conversational, human-compatible, and cooperative AI. For example, roughly 9% of publications at recent AAAI and NeurIPS conferences indicate the collection of original human data. Yet AI and ML researchers lack guidelines for…

    Submitted 26 September, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: Accepted at IEEE Transactions on Technology and Society

  15. arXiv:2310.03051  [pdf, other]

    cs.CL cs.AI

    How FaR Are Large Language Models From Agents with Theory-of-Mind?

    Authors: Pei Zhou, Aman Madaan, Srividya Pranavi Potharaju, Aditya Gupta, Kevin R. McKee, Ari Holtzman, Jay Pujara, Xiang Ren, Swaroop Mishra, Aida Nematzadeh, Shyam Upadhyay, Manaal Faruqui

    Abstract: "Thinking is for Doing." Humans can infer other people's mental states from observations--an ability called Theory-of-Mind (ToM)--and subsequently act pragmatically on those inferences. Existing question answering benchmarks such as ToMi ask models questions to make inferences about beliefs of characters in a story, but do not test whether models can then use these inferences to guide their action…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: Preprint, 18 pages, 6 figures, 6 tables

  16. arXiv:2305.00768  [pdf, other]

    cs.MA stat.ML

    Heterogeneous Social Value Orientation Leads to Meaningful Diversity in Sequential Social Dilemmas

    Authors: Udari Madhushani, Kevin R. McKee, John P. Agapiou, Joel Z. Leibo, Richard Everett, Thomas Anthony, Edward Hughes, Karl Tuyls, Edgar A. Duéñez-Guzmán

    Abstract: In social psychology, Social Value Orientation (SVO) describes an individual's propensity to allocate resources between themself and others. In reinforcement learning, SVO has been instantiated as an intrinsic motivation that remaps an agent's rewards based on particular target distributions of group reward. Prior studies show that groups of agents endowed with heterogeneous SVO learn diverse poli…

    Submitted 1 May, 2023; originally announced May 2023.

  17. arXiv:2302.00797  [pdf, ps, other]

    cs.AI cs.GT cs.LG cs.MA

    Combining Deep Reinforcement Learning and Search with Generative Models for Game-Theoretic Opponent Modeling

    Authors: Zun Li, Marc Lanctot, Kevin R. McKee, Luke Marris, Ian Gemp, Daniel Hennes, Paul Muller, Kate Larson, Yoram Bachrach, Michael P. Wellman

    Abstract: Opponent modeling methods typically involve two crucial steps: building a belief distribution over opponents' strategies, and exploiting this opponent model by playing a best response. However, existing approaches typically require domain-specific heuristics to come up with such a model, and algorithms for approximating best responses are hard to scale in large, imperfect information domains. In…

    Submitted 13 June, 2025; v1 submitted 1 February, 2023; originally announced February 2023.

    Comments: Accepted by IJCAI'25 main track

  18. arXiv:2209.10958  [pdf, ps, other]

    cs.MA cs.AI

    Developing, Evaluating and Scaling Learning Agents in Multi-Agent Environments

    Authors: Ian Gemp, Thomas Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome Connor, Vibhavari Dasagi, Bart De Vylder, Edgar Duenez-Guzman, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, Siqi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Perolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, et al. (2 additional authors not shown)

    Abstract: The Game Theory & Multi-Agent team at DeepMind studies several aspects of multi-agent learning ranging from computing approximations to fundamental concepts in game theory to simulating social dilemmas in rich spatial environments and training 3-d humanoids in difficult team coordination tasks. A signature aim of our group is to use the resources and expertise made available to us at DeepMind in d…

    Submitted 22 September, 2022; originally announced September 2022.

    Comments: Published in AI Communications 2022

  19. arXiv:2208.09513  [pdf, other]

    cs.DC cs.AI

    Globus Automation Services: Research process automation across the space-time continuum

    Authors: Ryan Chard, Jim Pruyne, Kurt McKee, Josh Bryan, Brigitte Raumann, Rachana Ananthakrishnan, Kyle Chard, Ian Foster

    Abstract: Research process automation -- the reliable, efficient, and reproducible execution of linked sets of actions on scientific instruments, computers, data stores, and other resources -- has emerged as an essential element of modern science. We report here on new services within the Globus research data management platform that enable the specification of diverse research processes as reusable sets of…

    Submitted 6 December, 2022; v1 submitted 19 August, 2022; originally announced August 2022.

  20. arXiv:2205.13740  [pdf, other]

    cs.LG cs.AI cs.CY

    Subverting machines, fluctuating identities: Re-learning human categorization

    Authors: Christina Lu, Jackie Kay, Kevin R. McKee

    Abstract: Most machine learning systems that interact with humans construct some notion of a person's "identity," yet the default paradigm in AI research envisions identity with essential attributes that are discrete and static. In stark contrast, strands of thought within critical theory present a conception of identity as malleable and constructed entirely through interaction; a doing rather than a being.…

    Submitted 26 May, 2022; originally announced May 2022.

    Comments: In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT '22), June 21-24, 2022, Seoul, Republic of Korea. First two authors contributed equally to this work

  21. Warmth and competence in human-agent cooperation

    Authors: Kevin R. McKee, Xuechunzi Bai, Susan T. Fiske

    Abstract: Interaction and cooperation with humans are overarching aspirations of artificial intelligence (AI) research. Recent studies demonstrate that AI agents trained with deep reinforcement learning are capable of collaborating with humans. These studies primarily evaluate human compatibility through "objective" metrics such as task performance, obscuring potential variation in the levels of trust and s…

    Submitted 8 May, 2024; v1 submitted 31 January, 2022; originally announced January 2022.

    Comments: Accepted at Autonomous Agents and Multi-Agent Systems

  22. arXiv:2201.01816  [pdf, other]

    cs.AI cs.LG cs.MA

    Hidden Agenda: a Social Deduction Game with Diverse Learned Equilibria

    Authors: Kavya Kopparapu, Edgar A. Duéñez-Guzmán, Jayd Matyas, Alexander Sasha Vezhnevets, John P. Agapiou, Kevin R. McKee, Richard Everett, Janusz Marecki, Joel Z. Leibo, Thore Graepel

    Abstract: A key challenge in the study of multiagent cooperation is the need for individual agents not only to cooperate effectively, but to decide with whom to cooperate. This is particularly critical in situations when other agents have hidden, possibly misaligned motivations and goals. Social deduction games offer an avenue to study how individuals might learn to synthesize potentially unreliable informa…

    Submitted 5 January, 2022; originally announced January 2022.

  23. arXiv:2110.11404  [pdf, other]

    cs.LG cs.AI cs.GT cs.MA

    Statistical discrimination in learning agents

    Authors: Edgar A. Duéñez-Guzmán, Kevin R. McKee, Yiran Mao, Ben Coppin, Silvia Chiappa, Alexander Sasha Vezhnevets, Michiel A. Bakker, Yoram Bachrach, Suzanne Sadedin, William Isaac, Karl Tuyls, Joel Z. Leibo

    Abstract: Undesired bias afflicts both human and algorithmic decision making, and may be especially prevalent when information processing trade-offs incentivize the use of heuristics. One primary example is statistical discrimination -- selecting social partners based not on their underlying attributes, but on readily perceptible characteristics that covary with their suitability for the task at ha…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: 29 pages, 10 figures

    MSC Class: 68T07 (Primary) 91A26; 91-10; 93A16 (Secondary) ACM Class: I.2.11; I.2.0

  24. arXiv:2110.08176  [pdf, other]

    cs.LG cs.HC cs.MA

    Collaborating with Humans without Human Data

    Authors: DJ Strouse, Kevin R. McKee, Matt Botvinick, Edward Hughes, Richard Everett

    Abstract: Collaborating with humans requires rapidly adapting to their individual strengths, weaknesses, and preferences. Unfortunately, most standard multi-agent reinforcement learning techniques, such as self-play (SP) or population play (PP), produce agents that overfit to their training partners and do not generalize well to humans. Alternatively, researchers can collect human data, train a human model…

    Submitted 7 January, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: Accepted at NeurIPS 2021 (spotlight)

  25. arXiv:2103.04982  [pdf, other]

    cs.MA cs.AI cs.GT

    A multi-agent reinforcement learning model of reputation and cooperation in human groups

    Authors: Kevin R. McKee, Edward Hughes, Tina O. Zhu, Martin J. Chadwick, Raphael Koster, Antonio Garcia Castaneda, Charlie Beattie, Thore Graepel, Matt Botvinick, Joel Z. Leibo

    Abstract: Collective action demands that individuals efficiently coordinate how much, where, and when to cooperate. Laboratory experiments have extensively explored the first part of this process, demonstrating that a variety of social-cognitive mechanisms influence how much individuals choose to invest in group efforts. However, experimental research has been unable to shed light on how social cognitive me…

    Submitted 22 February, 2023; v1 submitted 8 March, 2021; originally announced March 2021.

  26. Quantifying the effects of environment and population diversity in multi-agent reinforcement learning

    Authors: Kevin R. McKee, Joel Z. Leibo, Charlie Beattie, Richard Everett

    Abstract: Generalization is a major challenge for multi-agent reinforcement learning. How well does an agent perform when placed in novel environments and in interactions with new co-players? In this paper, we investigate and quantify the relationship between generalization and diversity in the multi-agent domain. Across the range of multi-agent environments considered here, procedurally generating training…

    Submitted 4 March, 2022; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: Accepted at Autonomous Agents and Multi-Agent Systems

  27. arXiv:2102.04257  [pdf, other]

    cs.CY cs.AI cs.LG

    Fairness for Unobserved Characteristics: Insights from Technological Impacts on Queer Communities

    Authors: Nenad Tomasev, Kevin R. McKee, Jackie Kay, Shakir Mohamed

    Abstract: Advances in algorithmic fairness have largely omitted sexual orientation and gender identity. We explore queer concerns in privacy, censorship, language, online safety, health, and employment to study the positive and negative effects of artificial intelligence on queer communities. These issues underscore the need for new directions in fairness research that take into account a multiplicity of co…

    Submitted 28 April, 2021; v1 submitted 3 February, 2021; originally announced February 2021.

    Comments: Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society (AIES 2021)

  28. arXiv:2102.02274  [pdf, other]

    cs.LG cs.AI cs.MA

    Neural Recursive Belief States in Multi-Agent Reinforcement Learning

    Authors: Pol Moreno, Edward Hughes, Kevin R. McKee, Bernardo Avila Pires, Théophane Weber

    Abstract: In multi-agent reinforcement learning, the problem of learning to act is particularly difficult because the policies of co-players may be heavily conditioned on information only observed by them. On the other hand, humans readily form beliefs about the knowledge possessed by their peers and leverage beliefs to inform decision-making. Such abilities underlie individual success in a wide range of Ma…

    Submitted 3 February, 2021; originally announced February 2021.

  29. arXiv:2012.08630  [pdf, other]

    cs.AI cs.MA

    Open Problems in Cooperative AI

    Authors: Allan Dafoe, Edward Hughes, Yoram Bachrach, Tantum Collins, Kevin R. McKee, Joel Z. Leibo, Kate Larson, Thore Graepel

    Abstract: Problems of cooperation--in which agents seek ways to jointly improve their welfare--are ubiquitous and important. They can be found at scales ranging from our daily routines--such as driving on highways, scheduling meetings, and working collaboratively--to our global challenges--such as peace, commerce, and pandemic preparedness. Arguably, the success of the human species is rooted in our ability…

    Submitted 15 December, 2020; originally announced December 2020.

  30. arXiv:2010.09054  [pdf, other]

    cs.MA

    Model-free conventions in multi-agent reinforcement learning with heterogeneous preferences

    Authors: Raphael Köster, Kevin R. McKee, Richard Everett, Laura Weidinger, William S. Isaac, Edward Hughes, Edgar A. Duéñez-Guzmán, Thore Graepel, Matthew Botvinick, Joel Z. Leibo

    Abstract: Game theoretic views of convention generally rest on notions of common knowledge and hyper-rational models of individual behavior. However, decades of work in behavioral economics have questioned the validity of both foundations. Meanwhile, computational neuroscience has contributed a modernized 'dual process' account of decision-making where model-free (MF) reinforcement learning trades off with…

    Submitted 14 December, 2020; v1 submitted 18 October, 2020; originally announced October 2020.

  31. arXiv:2010.00575  [pdf, other]

    cs.MA cs.GT

    D3C: Reducing the Price of Anarchy in Multi-Agent Learning

    Authors: Ian Gemp, Kevin R. McKee, Richard Everett, Edgar A. Duéñez-Guzmán, Yoram Bachrach, David Balduzzi, Andrea Tacchetti

    Abstract: In multiagent systems, the complex interaction of fixed incentives can lead agents to outcomes that are poor (inefficient) not only for the group, but also for each individual. Price of anarchy is a technical, game-theoretic definition that quantifies the inefficiency arising in these scenarios -- it compares the welfare that can be achieved through perfect coordination against that achieved by se…

    Submitted 20 February, 2022; v1 submitted 1 October, 2020; originally announced October 2020.

    Comments: Published in AAMAS 2022

  32. arXiv:2002.02325  [pdf, other]

    cs.MA cs.AI

    Social diversity and social preferences in mixed-motive reinforcement learning

    Authors: Kevin R. McKee, Ian Gemp, Brian McWilliams, Edgar A. Duéñez-Guzmán, Edward Hughes, Joel Z. Leibo

    Abstract: Recent research on reinforcement learning in pure-conflict and pure-common interest games has emphasized the importance of population heterogeneity. In contrast, studies of reinforcement learning in mixed-motive games have primarily leveraged homogeneous approaches. Given the defining characteristic of mixed-motive games--the imperfect correlation of incentives between group members--we study the…

    Submitted 12 February, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2020)

  33. arXiv:1806.01203  [pdf, other]

    cs.LG cs.AI stat.ML

    Relational inductive bias for physical construction in humans and machines

    Authors: Jessica B. Hamrick, Kelsey R. Allen, Victor Bapst, Tina Zhu, Kevin R. McKee, Joshua B. Tenenbaum, Peter W. Battaglia

    Abstract: While current deep learning systems excel at tasks such as object classification, language processing, and gameplay, few can construct or modify a complex system such as a tower of blocks. We hypothesize that what these systems lack is a "relational inductive bias": a capacity for reasoning about inter-object relations and making choices over a structured description of a scene. To test this hypot…

    Submitted 4 June, 2018; originally announced June 2018.

    Comments: In Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci 2018)

  34. arXiv:1803.08884  [pdf, other]

    cs.NE cs.AI cs.GT cs.MA q-bio.PE

    Inequity aversion improves cooperation in intertemporal social dilemmas

    Authors: Edward Hughes, Joel Z. Leibo, Matthew G. Phillips, Karl Tuyls, Edgar A. Duéñez-Guzmán, Antonio García Castañeda, Iain Dunning, Tina Zhu, Kevin R. McKee, Raphael Koster, Heather Roff, Thore Graepel

    Abstract: Groups of humans are often able to find ways to cooperate with one another in complex, temporally extended social dilemmas. Models based on behavioral economics are only able to explain this phenomenon for unrealistic stateless matrix games. Recently, multi-agent reinforcement learning has been applied to generalize social dilemma problems to temporally and spatially extended Markov games. However…

    Submitted 27 September, 2018; v1 submitted 23 March, 2018; originally announced March 2018.

    Comments: 15 pages, 8 figures