
Showing 1–50 of 157 results for author: Griffiths, T

  1. arXiv:2505.18134  [pdf, ps, other]

    cs.AI cs.CL cs.CV

    VideoGameBench: Can Vision-Language Models complete popular video games?

    Authors: Alex L. Zhang, Thomas L. Griffiths, Karthik R. Narasimhan, Ofir Press

    Abstract: Vision-language models (VLMs) have achieved strong results on coding and math benchmarks that are challenging for humans, yet their ability to perform tasks that come naturally to humans--such as perception, spatial navigation, and memory management--remains understudied. Real video games are crafted to be intuitive for humans to learn and master by leveraging innate inductive biases, making them…

    Submitted 30 May, 2025; v1 submitted 23 May, 2025; originally announced May 2025.

    Comments: 9 pages, 33 pages including supplementary

  2. arXiv:2505.17968  [pdf, other]

    cs.LG cs.AI cs.CL

    Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems

    Authors: Jiayi Geng, Howard Chen, Dilip Arumugam, Thomas L. Griffiths

    Abstract: Using AI to create autonomous researchers has the potential to accelerate scientific discovery. A prerequisite for this vision is understanding how well an AI model can identify the underlying structure of a black-box system from its behavior. In this paper, we explore how well a large language model (LLM) learns to identify a black-box function from passively observed versus actively collected da…

    Submitted 23 May, 2025; originally announced May 2025.

    Comments: 30 pages

  3. arXiv:2505.17323  [pdf, ps, other]

    cs.AI cs.LG

    Partner Modelling Emerges in Recurrent Agents (But Only When It Matters)

    Authors: Ruaridh Mon-Williams, Max Taylor-Davies, Elizabeth Mieczkowski, Natalia Velez, Neil R. Bramley, Yanwei Wang, Thomas L. Griffiths, Christopher G. Lucas

    Abstract: Humans are remarkably adept at collaboration, able to infer the strengths and weaknesses of new partners in order to work successfully towards shared goals. To build AI systems with this capability, we must first understand its building blocks: does such flexibility require explicit, dedicated mechanisms for modelling others -- or can it emerge spontaneously from the pressures of open-ended cooper…

    Submitted 22 May, 2025; originally announced May 2025.

  4. arXiv:2505.13742  [pdf, ps, other]

    cs.LG cs.AI

    Understanding Task Representations in Neural Networks via Bayesian Ablation

    Authors: Andrew Nam, Declan Campbell, Thomas Griffiths, Jonathan Cohen, Sarah-Jane Leslie

    Abstract: Neural networks are powerful tools for cognitive modeling due to their flexibility and emergent properties. However, interpreting their learned representations remains challenging due to their sub-symbolic semantics. In this work, we introduce a novel probabilistic framework for interpreting latent task representations in neural networks. Inspired by Bayesian inference, our approach defines a dist…

    Submitted 19 May, 2025; originally announced May 2025.

  5. arXiv:2505.13737  [pdf, ps, other]

    cs.AI

    Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers

    Authors: Andrew Nam, Henry Conklin, Yukang Yang, Thomas Griffiths, Jonathan Cohen, Sarah-Jane Leslie

    Abstract: We present causal head gating (CHG), a scalable method for interpreting the functional roles of attention heads in transformer models. CHG learns soft gates over heads and assigns them a causal taxonomy - facilitating, interfering, or irrelevant - based on their impact on task performance. Unlike prior approaches in mechanistic interpretability, which are hypothesis-driven and require prompt templ…

    Submitted 19 May, 2025; originally announced May 2025.

    Comments: 10 pages, 5 figures, 2 tables

  6. arXiv:2505.11615  [pdf, ps, other]

    cs.CL cs.AI

    Steering Risk Preferences in Large Language Models by Aligning Behavioral and Neural Representations

    Authors: Jian-Qiao Zhu, Haijiang Yan, Thomas L. Griffiths

    Abstract: Changing the behavior of large language models (LLMs) can be as straightforward as editing the Transformer's residual streams using appropriately constructed "steering vectors." These modifications to internal neural activations, a form of representation engineering, offer an effective and targeted means of influencing model behavior without retraining or fine-tuning the model. But how can such st…

    Submitted 16 May, 2025; originally announced May 2025.

  7. arXiv:2505.11614  [pdf, ps, other]

    cs.AI cs.CL

    Using Reinforcement Learning to Train Large Language Models to Explain Human Decisions

    Authors: Jian-Qiao Zhu, Hanbo Xie, Dilip Arumugam, Robert C. Wilson, Thomas L. Griffiths

    Abstract: A central goal of cognitive modeling is to develop models that not only predict human behavior but also provide insight into the underlying cognitive mechanisms. While neural network models trained on large-scale behavioral data often achieve strong predictive performance, they typically fall short in offering interpretable explanations of the cognitive processes they capture. In this work, we exp…

    Submitted 16 May, 2025; originally announced May 2025.

  8. arXiv:2505.09855  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Predictability Shapes Adaptation: An Evolutionary Perspective on Modes of Learning in Transformers

    Authors: Alexander Y. Ku, Thomas L. Griffiths, Stephanie C. Y. Chan

    Abstract: Transformer models learn in two distinct modes: in-weights learning (IWL), encoding knowledge into model weights, and in-context learning (ICL), adapting flexibly to context without weight modification. To better understand the interplay between these learning modes, we draw inspiration from evolutionary biology's analogous adaptive strategies: genetic encoding (akin to IWL, adapting over generati…

    Submitted 14 May, 2025; originally announced May 2025.

  9. arXiv:2505.07883  [pdf, ps, other]

    cs.CL cs.AI

    Recovering Event Probabilities from Large Language Model Embeddings via Axiomatic Constraints

    Authors: Jian-Qiao Zhu, Haijiang Yan, Thomas L. Griffiths

    Abstract: Rational decision-making under uncertainty requires coherent degrees of belief in events. However, event probabilities generated by Large Language Models (LLMs) have been shown to exhibit incoherence, violating the axioms of probability theory. This raises the question of whether coherent event probabilities can be recovered from the embeddings used by the models. If so, those derived probabilitie…

    Submitted 10 May, 2025; originally announced May 2025.

  10. arXiv:2504.20997  [pdf, other]

    cs.LG cs.AI

    Toward Efficient Exploration by Large Language Model Agents

    Authors: Dilip Arumugam, Thomas L. Griffiths

    Abstract: A burgeoning area within reinforcement learning (RL) is the design of sequential decision-making agents centered around large language models (LLMs). While autonomous decision-making agents powered by modern LLMs could facilitate numerous real-world applications, such successes demand agents that are capable of data-efficient RL. One key obstacle to achieving data efficiency in RL is exploration,…

    Submitted 29 April, 2025; originally announced April 2025.

  11. arXiv:2504.12585  [pdf, other]

    cs.CL cs.AI cs.LG

    Identifying and Mitigating the Influence of the Prior Distribution in Large Language Models

    Authors: Liyi Zhang, Veniamin Veselovsky, R. Thomas McCoy, Thomas L. Griffiths

    Abstract: Large language models (LLMs) sometimes fail to respond appropriately to deterministic tasks -- such as counting or forming acronyms -- because the implicit prior distribution they have learned over sequences of tokens influences their responses. In this work, we show that, in at least some cases, LLMs actually compute the information needed to perform these tasks correctly, and we identify some in…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: 16 pages, 5 figures

    ACM Class: I.2.7

  12. arXiv:2504.10191  [pdf, other]

    cs.CL cs.AI

    Localized Cultural Knowledge is Conserved and Controllable in Large Language Models

    Authors: Veniamin Veselovsky, Berke Argin, Benedikt Stroebl, Chris Wendler, Robert West, James Evans, Thomas L. Griffiths, Arvind Narayanan

    Abstract: Just as humans display language patterns influenced by their native tongue when speaking new languages, LLMs often default to English-centric responses even when generating in other languages. Nevertheless, we observe that local cultural information persists within the models and can be readily activated for cultural customization. We first demonstrate that explicitly providing cultural context in…

    Submitted 14 April, 2025; originally announced April 2025.

  13. arXiv:2503.23212  [pdf, other]

    cs.CV cs.LG

    Convolutional Neural Networks Can (Meta-)Learn the Same-Different Relation

    Authors: Max Gupta, Sunayana Rane, R. Thomas McCoy, Thomas L. Griffiths

    Abstract: While convolutional neural networks (CNNs) have come to match and exceed human performance in many settings, the tasks these models optimize for are largely constrained to the level of individual objects, such as classification and captioning. Humans remain vastly superior to CNNs in visual tasks involving relations, including the ability to identify two objects as 'same' or 'different'. A number…

    Submitted 31 March, 2025; v1 submitted 29 March, 2025; originally announced March 2025.

  14. arXiv:2503.15703  [pdf, other]

    cs.MA cs.AI

    Predicting Multi-Agent Specialization via Task Parallelizability

    Authors: Elizabeth Mieczkowski, Ruaridh Mon-Williams, Neil Bramley, Christopher G. Lucas, Natalia Velez, Thomas L. Griffiths

    Abstract: Multi-agent systems often rely on specialized agents with distinct roles rather than general-purpose agents that perform the entire task independently. However, the conditions that govern the optimal degree of specialization remain poorly understood. In this work, we propose that specialist teams outperform generalist ones when environmental constraints limit task parallelizability -- the potentia…

    Submitted 19 March, 2025; originally announced March 2025.

  15. arXiv:2503.13401  [pdf, other]

    cs.CL cs.AI

    Using the Tools of Cognitive Science to Understand Large Language Models at Different Levels of Analysis

    Authors: Alexander Ku, Declan Campbell, Xuechunzi Bai, Jiayi Geng, Ryan Liu, Raja Marjieh, R. Thomas McCoy, Andrew Nam, Ilia Sucholutsky, Veniamin Veselovsky, Liyi Zhang, Jian-Qiao Zhu, Thomas L. Griffiths

    Abstract: Modern artificial intelligence systems, such as large language models, are increasingly powerful but also increasingly hard to understand. Recognizing this problem as analogous to the historical difficulties in understanding the human mind, we argue that methods developed in cognitive science can be useful for understanding large language models. We propose a framework for applying these methods b…

    Submitted 17 March, 2025; originally announced March 2025.

  16. arXiv:2502.20502  [pdf, other]

    cs.AI

    On Benchmarking Human-Like Intelligence in Machines

    Authors: Lance Ying, Katherine M. Collins, Lionel Wong, Ilia Sucholutsky, Ryan Liu, Adrian Weller, Tianmin Shu, Thomas L. Griffiths, Joshua B. Tenenbaum

    Abstract: Recent benchmark studies have claimed that AI has approached or even surpassed human-level performances on various cognitive tasks. However, this position paper argues that current AI evaluation paradigms are insufficient for assessing human-like cognitive capabilities. We identify a set of key shortcomings: a lack of human-validated labels, inadequate representation of human response variability…

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: 18 pages, 5 figures

  17. arXiv:2502.20237  [pdf, other]

    cs.LG cs.AI

    Teasing Apart Architecture and Initial Weights as Sources of Inductive Bias in Neural Networks

    Authors: Gianluca Bencomo, Max Gupta, Ioana Marinescu, R. Thomas McCoy, Thomas L. Griffiths

    Abstract: Artificial neural networks can acquire many aspects of human knowledge from data, making them promising as models of human learning. But what those networks can learn depends upon their inductive biases -- the factors other than the data that influence the solutions they discover -- and the inductive biases of neural networks remain poorly understood, limiting our ability to draw conclusions about…

    Submitted 27 February, 2025; originally announced February 2025.

    Comments: 11 pages, 6 figures, 6 tables

  18. arXiv:2502.13228  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Conformal Prediction as Bayesian Quadrature

    Authors: Jake C. Snell, Thomas L. Griffiths

    Abstract: As machine learning-based prediction systems are increasingly used in high-stakes situations, it is important to understand how such predictive models will perform upon deployment. Distribution-free uncertainty quantification techniques such as conformal prediction provide guarantees about the loss black-box models will incur even when the details of the models are hidden. However, such methods ar…

    Submitted 11 June, 2025; v1 submitted 18 February, 2025; originally announced February 2025.

    Comments: ICML 2025 camera-ready version (accepted as an oral presentation). 16 pages, 4 figures. Code available at https://github.com/jakesnell/conformal-as-bayes-quad

  19. arXiv:2502.12847  [pdf, other]

    cs.SI q-bio.NC q-bio.PE

    Characterizing the Interaction of Cultural Evolution Mechanisms in Experimental Social Networks

    Authors: Raja Marjieh, Manuel Anglada-Tort, Thomas L. Griffiths, Nori Jacoby

    Abstract: Understanding how cognitive and social mechanisms shape the evolution of complex artifacts such as songs is central to cultural evolution research. Social network topology (what artifacts are available?), selection (which are chosen?), and reproduction (how are they copied?) have all been proposed as key influencing factors. However, prior research has rarely studied them together due to methodolo…

    Submitted 18 February, 2025; originally announced February 2025.

  20. arXiv:2502.01540  [pdf, other]

    cs.CL cs.AI

    What is a Number, That a Large Language Model May Know It?

    Authors: Raja Marjieh, Veniamin Veselovsky, Thomas L. Griffiths, Ilia Sucholutsky

    Abstract: Numbers are a basic part of how humans represent and describe the world around them. As a consequence, learning effective representations of numbers is critical for the success of large language models as they become more integrated into everyday decisions. However, these models face a challenge: depending on context, the same sequence of digit tokens, e.g., 911, can be treated as a number or as a…

    Submitted 3 February, 2025; originally announced February 2025.

    Comments: 16 pages, 8 figures

  21. arXiv:2501.08617  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation

    Authors: Kaiqu Liang, Haimin Hu, Ryan Liu, Thomas L. Griffiths, Jaime Fernández Fisac

    Abstract: While Reinforcement Learning from Human Feedback (RLHF) has shown promise in aligning generative AI, we present empirical evidence that it can also cause severe, systematic misalignment. We hypothesize that this stems from evaluator feedback depending on downstream outcome predictions (foresight) that can be influenced by the AI's output, inducing Goodhart's law dynamics. We present a theoretical…

    Submitted 9 June, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

    Comments: 27 pages, 18 figures

  22. arXiv:2411.00238  [pdf, other]

    cs.AI cs.CV cs.LG q-bio.NC

    Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem

    Authors: Declan Campbell, Sunayana Rane, Tyler Giallanza, Nicolò De Sabbata, Kia Ghods, Amogh Joshi, Alexander Ku, Steven M. Frankland, Thomas L. Griffiths, Jonathan D. Cohen, Taylor W. Webb

    Abstract: Recent work has documented striking heterogeneity in the performance of state-of-the-art vision language models (VLMs), including both multimodal language models and text-to-image models. These models are able to describe and generate a diverse array of complex, naturalistic images, yet they exhibit surprising failures on basic multi-object reasoning tasks -- such as counting, localization, and si…

    Submitted 16 April, 2025; v1 submitted 31 October, 2024; originally announced November 2024.

  23. arXiv:2410.21333  [pdf, ps, other]

    cs.LG cs.AI cs.CL cs.CY

    Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse

    Authors: Ryan Liu, Jiayi Geng, Addison J. Wu, Ilia Sucholutsky, Tania Lombrozo, Thomas L. Griffiths

    Abstract: Chain-of-thought (CoT) prompting has become a widely used strategy for improving large language and multimodal model performance. However, it is still an open question under which settings CoT systematically reduces performance. In this paper, we seek to identify the characteristics of tasks where CoT reduces performance by drawing inspiration from cognitive psychology, focusing on six representat…

    Submitted 13 June, 2025; v1 submitted 27 October, 2024; originally announced October 2024.

  24. arXiv:2410.20268  [pdf, other]

    cs.LG

    Centaur: a foundation model of human cognition

    Authors: Marcel Binz, Elif Akata, Matthias Bethge, Franziska Brändle, Fred Callaway, Julian Coda-Forno, Peter Dayan, Can Demircan, Maria K. Eckstein, Noémi Éltető, Thomas L. Griffiths, Susanne Haridi, Akshay K. Jagadish, Li Ji-An, Alexander Kipnis, Sreejan Kumar, Tobias Ludwig, Marvin Mathony, Marcelo Mattar, Alireza Modirshanechi, Surabhi S. Nath, Joshua C. Peterson, Milena Rmus, Evan M. Russek, Tankred Saanum, et al. (15 additional authors not shown)

    Abstract: Establishing a unified theory of cognition has been a major goal of psychology. While there have been previous attempts to instantiate such theories by building computational models, we currently do not have one model that captures the human mind in its entirety. A first step in this direction is to create a model that can predict human behavior in a wide range of settings. Here we introduce Centa…

    Submitted 28 April, 2025; v1 submitted 26 October, 2024; originally announced October 2024.

  25. arXiv:2410.10799  [pdf, other]

    cs.CV

    Towards Foundation Models for 3D Vision: How Close Are We?

    Authors: Yiming Zuo, Karhan Kayan, Maggie Wang, Kevin Jeon, Jia Deng, Thomas L. Griffiths

    Abstract: Building a foundation model for 3D vision is a complex challenge that remains unsolved. Towards that goal, it is important to understand the 3D reasoning capabilities of current models as well as identify the gaps between these models and humans. Therefore, we construct a new 3D visual understanding benchmark named UniQA-3D. UniQA-3D covers fundamental 3D vision tasks in the Visual Question Answer…

    Submitted 9 December, 2024; v1 submitted 14 October, 2024; originally announced October 2024.

    Comments: Accepted to 3DV 2025. Update 12/09/24: Change the benchmark name to UniQA-3D, add link to code

  26. arXiv:2410.05563  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Rational Metareasoning for Large Language Models

    Authors: C. Nicolò De Sabbata, Theodore R. Sumers, Badr AlKhamissi, Antoine Bosselut, Thomas L. Griffiths

    Abstract: Being prompted to engage in reasoning has emerged as a core technique for using large language models (LLMs), deploying additional inference-time compute to improve task performance. However, as LLMs increase in both size and adoption, inference costs are correspondingly becoming increasingly burdensome. How, then, might we optimize reasoning's cost-performance tradeoff? This work introduces a nov…

    Submitted 23 June, 2025; v1 submitted 7 October, 2024; originally announced October 2024.

  27. arXiv:2410.01792  [pdf, other]

    cs.CL cs.AI

    When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1

    Authors: R. Thomas McCoy, Shunyu Yao, Dan Friedman, Mathew D. Hardy, Thomas L. Griffiths

    Abstract: In "Embers of Autoregression" (McCoy et al., 2023), we showed that several large language models (LLMs) have some important limitations that are attributable to their origins in next-word prediction. Here we investigate whether these issues persist with o1, a new system from OpenAI that differs from previous LLMs in that it is optimized for reasoning. We find that o1 substantially outperforms prev…

    Submitted 3 October, 2024; v1 submitted 2 October, 2024; originally announced October 2024.

    Comments: 6 pages; updated to fix typo in Fig 4 caption

  28. arXiv:2409.05890  [pdf, other]

    cs.CY physics.soc-ph

    Automating the Practice of Science -- Opportunities, Challenges, and Implications

    Authors: Sebastian Musslick, Laura K. Bartlett, Suyog H. Chandramouli, Marina Dubova, Fernand Gobet, Thomas L. Griffiths, Jessica Hullman, Ross D. King, J. Nathan Kutz, Christopher G. Lucas, Suhas Mahesh, Franco Pestilli, Sabina J. Sloman, William R. Holmes

    Abstract: Automation transformed various aspects of our human civilization, revolutionizing industries and streamlining processes. In the domain of scientific inquiry, automated approaches emerged as powerful tools, holding promise for accelerating discovery, enhancing reproducibility, and overcoming the traditional impediments to scientific progress. This article evaluates the scope of automation within sc…

    Submitted 27 August, 2024; originally announced September 2024.

  29. arXiv:2408.07865  [pdf, other]

    econ.GN cs.GT cs.LG

    Capturing the Complexity of Human Strategic Decision-Making with Machine Learning

    Authors: Jian-Qiao Zhu, Joshua C. Peterson, Benjamin Enke, Thomas L. Griffiths

    Abstract: Understanding how people behave in strategic settings--where they make decisions based on their expectations about the behavior of others--is a long-standing problem in the behavioral sciences. We conduct the largest study to date of strategic decision-making in the context of initial play in two-player matrix games, analyzing over 90,000 human decisions across more than 2,400 procedurally generat…

    Submitted 14 August, 2024; originally announced August 2024.

  30. arXiv:2408.03943  [pdf, other]

    cs.HC cs.AI cs.LG

    Building Machines that Learn and Think with People

    Authors: Katherine M. Collins, Ilia Sucholutsky, Umang Bhatt, Kartik Chandra, Lionel Wong, Mina Lee, Cedegao E. Zhang, Tan Zhi-Xuan, Mark Ho, Vikash Mansinghka, Adrian Weller, Joshua B. Tenenbaum, Thomas L. Griffiths

    Abstract: What do we want from machine intelligence? We envision machines that are not just tools for thought, but partners in thought: reasonable, insightful, knowledgeable, reliable, and trustworthy systems that think with us. Current artificial intelligence (AI) systems satisfy some of these criteria, some of the time. In this Perspective, we show how the science of collaborative cognition can be put to…

    Submitted 21 July, 2024; originally announced August 2024.

  31. arXiv:2407.01687  [pdf, other]

    cs.CL cs.AI

    Deciphering the Factors Influencing the Efficacy of Chain-of-Thought: Probability, Memorization, and Noisy Reasoning

    Authors: Akshara Prabhakar, Thomas L. Griffiths, R. Thomas McCoy

    Abstract: Chain-of-Thought (CoT) prompting has been shown to enhance the multi-step reasoning capabilities of Large Language Models (LLMs). However, debates persist about whether LLMs exhibit abstract generalization or rely on shallow heuristics when given CoT prompts. To understand the factors influencing CoT reasoning we provide a detailed case study of the symbolic reasoning task of decoding shift cipher…

    Submitted 3 October, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: EMNLP 2024 Findings; 9 pages plus references and appendices

  32. arXiv:2406.17055  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Large Language Models Assume People are More Rational than We Really are

    Authors: Ryan Liu, Jiayi Geng, Joshua C. Peterson, Ilia Sucholutsky, Thomas L. Griffiths

    Abstract: In order for AI systems to communicate effectively with people, they must understand how we make decisions. However, people's decisions are not always rational, so the implicit internal models of human decision-making in Large Language Models (LLMs) must account for this. Previous empirical evidence seems to suggest that these implicit models are accurate -- LLMs offer believable proxies of human…

    Submitted 10 March, 2025; v1 submitted 24 June, 2024; originally announced June 2024.

  33. arXiv:2406.04302  [pdf, other]

    cs.LG

    Representational Alignment Supports Effective Machine Teaching

    Authors: Ilia Sucholutsky, Katherine M. Collins, Maya Malaviya, Nori Jacoby, Weiyang Liu, Theodore R. Sumers, Michalis Korakakis, Umang Bhatt, Mark Ho, Joshua B. Tenenbaum, Brad Love, Zachary A. Pardos, Adrian Weller, Thomas L. Griffiths

    Abstract: A good teacher should not only be knowledgeable, but should also be able to communicate in a way that the student understands -- to share the student's representation of the world. In this work, we introduce a new controlled experimental setting, GRADE, to study pedagogy and representational alignment. We use GRADE through a series of machine-machine and machine-human teaching experiments to chara…

    Submitted 4 February, 2025; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Preprint

  34. arXiv:2406.03707  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions

    Authors: Liyi Zhang, Michael Y. Li, Thomas L. Griffiths

    Abstract: Autoregressive language models have demonstrated a remarkable ability to extract latent structure from text. The embeddings from large language models have been shown to capture aspects of the syntax and semantics of language. But what should embeddings represent? We connect the autoregressive prediction objective to the idea of constructing predictive sufficient statistics to summarize the…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 15 pages, 8 figures

    ACM Class: I.2; I.5

  35. arXiv:2406.02268  [pdf, other]

    cs.LG

    Analyzing the Benefits of Prototypes for Semi-Supervised Category Learning

    Authors: Liyi Zhang, Logan Nelson, Thomas L. Griffiths

    Abstract: Categories can be represented at different levels of abstraction, from prototypes focused on the most typical members to remembering all observed exemplars of the category. These representations have been explored in the context of supervised learning, where stimuli are presented with known category labels. We examine the benefits of prototype-based representations in a less-studied domain: semi-s…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 7 pages, 3 figures

    ACM Class: I.2; I.5

  36. arXiv:2406.01860  [pdf, other]

    cs.CL

    Eliciting the Priors of Large Language Models using Iterated In-Context Learning

    Authors: Jian-Qiao Zhu, Thomas L. Griffiths

    Abstract: As Large Language Models (LLMs) are increasingly deployed in real-world settings, understanding the knowledge they implicitly use when making decisions is critical. One way to capture this knowledge is in the form of Bayesian prior distributions. We develop a prompt-based workflow for eliciting prior distributions from LLMs. Our approach is based on iterated learning, a Markov chain Monte Carlo me…

    Submitted 3 June, 2024; originally announced June 2024.

  37. arXiv:2405.19420  [pdf, other]

    cs.LG cs.AI q-bio.NC

    Learning Human-Aligned Representations with Contrastive Learning and Generative Similarity

    Authors: Raja Marjieh, Sreejan Kumar, Declan Campbell, Liyi Zhang, Gianluca Bencomo, Jake Snell, Thomas L. Griffiths

    Abstract: Humans rely on effective representations to learn from few examples and abstract useful information from sensory data. Inducing such representations in machine learning models has been shown to improve their performance on various benchmarks such as few-shot learning and robustness. However, finding effective training procedures to achieve that goal can be challenging as psychologically rich train…

    Submitted 31 January, 2025; v1 submitted 29 May, 2024; originally announced May 2024.

  38. arXiv:2405.19313  [pdf, other]

    cs.AI cs.CL econ.GN

    Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice

    Authors: Jian-Qiao Zhu, Haijiang Yan, Thomas L. Griffiths

    Abstract: The observed similarities in the behavior of humans and Large Language Models (LLMs) have prompted researchers to consider the potential of using LLMs as models of human cognition. However, several significant challenges must be addressed before LLMs can be legitimately regarded as cognitive models. For instance, LLMs are trained on far more data than humans typically encounter, and may have been…

    Submitted 5 May, 2025; v1 submitted 29 May, 2024; originally announced May 2024.

    Journal ref: ICLR 2025

  39. arXiv:2403.19669  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Analyzing the Roles of Language and Vision in Learning from Limited Data

    Authors: Allison Chen, Ilia Sucholutsky, Olga Russakovsky, Thomas L. Griffiths

    Abstract: Does language help make sense of the visual world? How important is it to actually see the world rather than having it described with words? These basic questions about the nature of intelligence have been difficult to answer because we only had one example of an intelligent system -- humans -- and limited access to cases that isolated language or vision. However, the development of sophisticated…

    Submitted 10 May, 2024; v1 submitted 15 February, 2024; originally announced March 2024.

    Comments: 8 pages, 4 figures

  40. arXiv:2403.12482  [pdf, other]

    cs.AI cs.CL cs.CY cs.MA

    Embodied LLM Agents Learn to Cooperate in Organized Teams

    Authors: Xudong Guo, Kaixuan Huang, Jiale Liu, Wenhui Fan, Natalia Vélez, Qingyun Wu, Huazheng Wang, Thomas L. Griffiths, Mengdi Wang

    Abstract: Large Language Models (LLMs) have emerged as integral tools for reasoning, planning, and decision-making, drawing upon their extensive world knowledge and proficiency in language-related tasks. LLMs thus hold tremendous potential for natural language interaction within multi-agent systems to foster cooperation. However, LLM agents tend to over-report and comply with any instruction, which may resu…

    Submitted 23 May, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  41. arXiv:2402.18759  [pdf, other]

    cs.RO cs.AI cs.LG

    Learning with Language-Guided State Abstractions

    Authors: Andi Peng, Ilia Sucholutsky, Belinda Z. Li, Theodore R. Sumers, Thomas L. Griffiths, Jacob Andreas, Julie A. Shah

    Abstract: We describe a framework for using natural language to design state abstractions for imitation learning. Generalizable policy learning in high-dimensional observation spaces is facilitated by well-designed state representations, which can surface important features of an environment and hide irrelevant ones. These state representations are typically manually specified, or derived from other labor-i…

    Submitted 6 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: ICLR 2024

  42. arXiv:2402.16668  [pdf, other]

    cs.LG cs.AI

    Program-Based Strategy Induction for Reinforcement Learning

    Authors: Carlos G. Correa, Thomas L. Griffiths, Nathaniel D. Daw

    Abstract: Typical models of learning assume incremental estimation of continuously-varying decision variables like expected rewards. However, this class of models fails to capture more idiosyncratic, discrete heuristics and strategies that people and animals appear to exhibit. Despite recent advances in strategy discovery using tools like recurrent networks that generalize the classic models, the resulting…

    Submitted 26 February, 2024; originally announced February 2024.

  43. arXiv:2402.07282  [pdf, other]

    cs.CL cs.AI cs.LG

    How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?

    Authors: Ryan Liu, Theodore R. Sumers, Ishita Dasgupta, Thomas L. Griffiths

    Abstract: In day-to-day communication, people often approximate the truth - for example, rounding the time or omitting details - in order to be maximally helpful to the listener. How do large language models (LLMs) handle such nuanced trade-offs? To address this question, we use psychological models and experiments designed to characterize human behavior to analyze LLMs. We test a range of LLMs and explore…

    Submitted 13 February, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

  44. arXiv:2402.07035  [pdf, other]

    cs.LG cs.AI

    Distilling Symbolic Priors for Concept Learning into Neural Networks

    Authors: Ioana Marinescu, R. Thomas McCoy, Thomas L. Griffiths

    Abstract: Humans can learn new concepts from a small number of examples by drawing on their inductive biases. These inductive biases have previously been captured by using Bayesian models defined over symbolic hypothesis spaces. Is it possible to create a neural network that displays the same inductive biases? We show that inductive biases that enable rapid concept learning can be instantiated in artificial…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: 8 pages, 6 figures, 4 tables

  45. arXiv:2402.06992  [pdf, other]

    q-bio.NC cs.AI cs.CL stat.AP

    A Rational Analysis of the Speech-to-Song Illusion

    Authors: Raja Marjieh, Pol van Rijn, Ilia Sucholutsky, Harin Lee, Thomas L. Griffiths, Nori Jacoby

    Abstract: The speech-to-song illusion is a robust psychological phenomenon whereby a spoken sentence sounds increasingly more musical as it is repeated. Despite decades of research, a complete formal account of this transformation is still lacking, and some of its nuanced characteristics, namely, that certain phrases appear to transform while others do not, is not well understood. Here we provide a formal a…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: 7 pages, 5 figures

  46. arXiv:2402.04203  [pdf, other]

    cs.AI q-bio.NC

    Human-Like Geometric Abstraction in Large Pre-trained Neural Networks

    Authors: Declan Campbell, Sreejan Kumar, Tyler Giallanza, Thomas L. Griffiths, Jonathan D. Cohen

    Abstract: Humans possess a remarkable capacity to recognize and manipulate abstract structure, which is especially apparent in the domain of geometry. Recent research in cognitive science suggests neural networks do not share this capacity, concluding that human geometric abilities come from discrete symbolic structure in human mental representations. However, progress in artificial intelligence (AI) sugges…

    Submitted 6 February, 2024; originally announced February 2024.

  47. arXiv:2402.04105  [pdf, other]

    cs.CY cs.CL

    Measuring Implicit Bias in Explicitly Unbiased Large Language Models

    Authors: Xuechunzi Bai, Angelina Wang, Ilia Sucholutsky, Thomas L. Griffiths

    Abstract: Large language models (LLMs) can pass explicit social bias tests but still harbor implicit biases, similar to humans who endorse egalitarian beliefs yet exhibit subtle biases. Measuring such implicit biases can be a challenge: as LLMs become increasingly proprietary, it may not be possible to access their embeddings and apply existing bias measures; furthermore, implicit biases are primarily a con…

    Submitted 23 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  48. arXiv:2402.03618  [pdf, other]

    cs.AI cs.CL q-bio.NC

    Comparing Abstraction in Humans and Large Language Models Using Multimodal Serial Reproduction

    Authors: Sreejan Kumar, Raja Marjieh, Byron Zhang, Declan Campbell, Michael Y. Hu, Umang Bhatt, Brenden Lake, Thomas L. Griffiths

    Abstract: Humans extract useful abstractions of the world from noisy sensory data. Serial reproduction allows us to study how people construe the world through a paradigm similar to the game of telephone, where one person observes a stimulus and reproduces it for the next to form a chain of reproductions. Past serial reproduction experiments typically employ a single sensory modality, but humans often commu…

    Submitted 5 February, 2024; originally announced February 2024.

  49. arXiv:2402.03081  [pdf, other]

    cs.RO cs.AI cs.LG

    Preference-Conditioned Language-Guided Abstraction

    Authors: Andi Peng, Andreea Bobu, Belinda Z. Li, Theodore R. Sumers, Ilia Sucholutsky, Nishanth Kumar, Thomas L. Griffiths, Julie A. Shah

    Abstract: Learning from demonstrations is a common way for users to teach robots, but it is prone to spurious feature correlations. Recent work constructs state abstractions, i.e. visual representations containing task-relevant features, from language as a way to perform more generalizable learning. However, these abstractions also depend on a user's preference for what matters in a task, which may be hard…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: HRI 2024

  50. arXiv:2401.16657  [pdf, other]

    cs.AI cs.CL

    Recovering Mental Representations from Large Language Models with Markov Chain Monte Carlo

    Authors: Jian-Qiao Zhu, Haijiang Yan, Thomas L. Griffiths

    Abstract: Simulating sampling algorithms with people has proven a useful method for efficiently probing and understanding their mental representations. We propose that the same methods can be used to study the representations of Large Language Models (LLMs). While one can always directly prompt either humans or LLMs to disclose their mental representations introspectively, we show that increased efficiency…

    Submitted 29 January, 2024; originally announced January 2024.