Showing 1–50 of 148 results for author: Goodman, N

  1. arXiv:2505.23931

    cs.CL cs.AI

    Scaling up the think-aloud method

    Authors: Daniel Wurgaft, Ben Prystawski, Kanishk Gandhi, Cedegao E. Zhang, Joshua B. Tenenbaum, Noah D. Goodman

    Abstract: The think-aloud method, where participants voice their thoughts as they solve a task, is a valuable source of rich data about human reasoning processes. Yet, it has declined in popularity in contemporary cognitive science, largely because labor-intensive transcription and annotation preclude large sample sizes. Here, we develop methods to automate the transcription and annotation of verbal reports…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 8 pages, 4 figures. Daniel Wurgaft and Ben Prystawski contributed equally

  2. arXiv:2504.01849

    cs.AI cs.CY cs.LG

    An Approach to Technical AGI Safety and Security

    Authors: Rohin Shah, Alex Irpan, Alexander Matt Turner, Anna Wang, Arthur Conmy, David Lindner, Jonah Brown-Cohen, Lewis Ho, Neel Nanda, Raluca Ada Popa, Rishub Jain, Rory Greig, Samuel Albanie, Scott Emmons, Sebastian Farquhar, Sébastien Krier, Senthooran Rajamanoharan, Sophie Bridgers, Tobi Ijitoye, Tom Everitt, Victoria Krakovna, Vikrant Varma, Vladimir Mikulik, Zachary Kenton, Dave Orr, et al. (5 additional authors not shown)

    Abstract: Artificial General Intelligence (AGI) promises transformative benefits but also presents significant risks. We develop an approach to address the risk of harms consequential enough to significantly harm humanity. We identify four areas of risk: misuse, misalignment, mistakes, and structural risks. Of these, we focus on technical approaches to misuse and misalignment. For misuse, our strategy aims…

    Submitted 2 April, 2025; originally announced April 2025.

  3. arXiv:2503.24036

    cs.PL

    Automated Discovery of Tactic Libraries for Interactive Theorem Proving

    Authors: Yutong Xin, Jimmy Xin, Gabriel Poesia, Noah Goodman, Qiaochu Chen, Isil Dillig

    Abstract: Enabling more concise and modular proofs is essential for advancing formal reasoning using interactive theorem provers (ITPs). Since many ITPs, such as Rocq and Lean, use tactic-style proofs, learning higher-level custom tactics is crucial for proof modularity and automation. This paper presents a novel approach to tactic discovery, which leverages Tactic Dependence Graphs (TDGs) to identify reusa…

    Submitted 31 March, 2025; originally announced March 2025.

  4. arXiv:2503.15484

    cs.CL cs.AI cs.HC cs.LG

    Value Profiles for Encoding Human Variation

    Authors: Taylor Sorensen, Pushkar Mishra, Roma Patel, Michael Henry Tessler, Michiel Bakker, Georgina Evans, Iason Gabriel, Noah Goodman, Verena Rieser

    Abstract: Modelling human variation in rating tasks is crucial for enabling AI systems for personalization, pluralistic model alignment, and computational social science. We propose representing individuals using value profiles -- natural language descriptions of underlying values compressed from in-context demonstrations -- along with a steerable decoder model to estimate ratings conditioned on a value pro…

    Submitted 19 March, 2025; originally announced March 2025.

  5. arXiv:2503.01307

    cs.CL cs.LG

    Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

    Authors: Kanishk Gandhi, Ayush Chakravarthy, Anikait Singh, Nathan Lile, Noah D. Goodman

    Abstract: Test-time inference has emerged as a powerful paradigm for enabling language models to "think" longer and more carefully about complex challenges, much like skilled human experts. While reinforcement learning (RL) can drive self-improvement in language models on verifiable tasks, some models exhibit substantial gains while others quickly plateau. For instance, we find that Qwen-2.5-3B far exceed…

    Submitted 3 March, 2025; originally announced March 2025.

  6. arXiv:2502.06204

    cs.CL

    Non-literal Understanding of Number Words by Language Models

    Authors: Polina Tsvilodub, Kanishk Gandhi, Haoran Zhao, Jan-Philipp Fränken, Michael Franke, Noah D. Goodman

    Abstract: Humans naturally interpret numbers non-literally, effortlessly combining context, world knowledge, and speaker intent. We investigate whether large language models (LLMs) interpret numbers similarly, focusing on hyperbole and pragmatic halo effects. Through systematic comparison with human data and computational models of pragmatic reasoning, we find that LLMs diverge from human interpretation in…

    Submitted 2 June, 2025; v1 submitted 10 February, 2025; originally announced February 2025.

    Comments: 12 pages, 10 figures. To appear in the Proceedings of CogSci 2025

  7. arXiv:2501.06141

    cs.LG cs.AI

    Emergent Symbol-like Number Variables in Artificial Neural Networks

    Authors: Satchel Grant, Noah D. Goodman, James L. McClelland

    Abstract: What types of numeric representations emerge in neural systems? What would a satisfying answer to this question look like? In this work, we interpret Neural Network (NN) solutions to sequence based counting tasks through a variety of lenses. We seek to understand how well we can understand NNs through the lens of interpretable Symbolic Algorithms (SAs), where SAs are defined by precise, abstract,…

    Submitted 23 April, 2025; v1 submitted 10 January, 2025; originally announced January 2025.

  8. arXiv:2501.01540

    cs.LG cs.AI

    BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery

    Authors: Kanishk Gandhi, Michael Y. Li, Lyle Goodyear, Louise Li, Aditi Bhaskar, Mohammed Zaman, Noah D. Goodman

    Abstract: Understanding the world and explaining it with scientific theories is a central aspiration of artificial intelligence research. Proposing theories, designing experiments to test them, and then revising them based on data are fundamental to scientific discovery. Despite the significant promise of LLM-based scientific agents, no benchmarks systematically test LLMs' ability to propose scientific mode…

    Submitted 2 January, 2025; originally announced January 2025.

    Comments: KG and MYL contributed equally

  9. arXiv:2411.06590

    cs.LG cs.AI cs.CL

    CriticAL: Critic Automation with Language Models

    Authors: Michael Y. Li, Vivek Vajipey, Noah D. Goodman, Emily B. Fox

    Abstract: Understanding the world through models is a fundamental goal of scientific research. While large language model (LLM) based approaches show promise in automating scientific discovery, they often overlook the importance of criticizing scientific models. Criticizing models deepens scientific understanding and drives the development of more accurate models. Automating model criticism is difficult bec…

    Submitted 10 November, 2024; originally announced November 2024.

  10. arXiv:2410.16531

    cs.CL cs.AI cs.FL cs.LG

    Bayesian scaling laws for in-context learning

    Authors: Aryaman Arora, Dan Jurafsky, Christopher Potts, Noah D. Goodman

    Abstract: In-context learning (ICL) is a powerful technique for getting language models to perform complex tasks with no training updates. Prior work has established strong correlations between the number of in-context examples provided and the accuracy of the model's predictions. In this paper, we seek to explain this correlation by showing that ICL approximates a Bayesian learner. This perspective gives r…

    Submitted 2 November, 2024; v1 submitted 21 October, 2024; originally announced October 2024.

    Comments: 10 pages main text, 26 pages total

    ACM Class: I.2.7

  11. arXiv:2409.11733

    cs.CL

    Human-like Affective Cognition in Foundation Models

    Authors: Kanishk Gandhi, Zoe Lynch, Jan-Philipp Fränken, Kayla Patterson, Sharon Wambu, Tobias Gerstenberg, Desmond C. Ong, Noah D. Goodman

    Abstract: Understanding emotions is fundamental to human interaction and experience. Humans easily infer emotions from situations or facial expressions, situations from emotions, and do a variety of other affective cognition. How adept is modern AI at these inferences? We introduce an evaluation framework for testing affective cognition in foundation models. Starting from psychological theory, we generate 1…

    Submitted 18 September, 2024; v1 submitted 18 September, 2024; originally announced September 2024.

  12. arXiv:2409.08202

    cs.CV cs.AI cs.CL cs.LG

    What Makes a Maze Look Like a Maze?

    Authors: Joy Hsu, Jiayuan Mao, Joshua B. Tenenbaum, Noah D. Goodman, Jiajun Wu

    Abstract: A unique aspect of human visual understanding is the ability to flexibly interpret abstract concepts: acquiring lifted rules explaining what they symbolize, grounding them across familiar and unfamiliar contexts, and making predictions or reasoning about them. While off-the-shelf vision-language models excel at making literal interpretations of images (e.g., recognizing object categories such as t…

    Submitted 17 February, 2025; v1 submitted 12 September, 2024; originally announced September 2024.

    Comments: ICLR 2025

  13. arXiv:2408.03617

    cs.CL cs.AI cs.LG

    Is Child-Directed Speech Effective Training Data for Language Models?

    Authors: Steven Y. Feng, Noah D. Goodman, Michael C. Frank

    Abstract: While high-performing language models are typically trained on hundreds of billions of words, human children become fluent language users with a much smaller amount of data. What are the features of the data they receive, and how do these features support language modeling objectives? To investigate this question, we train GPT-2 and RoBERTa models on 29M words of English child-directed speech and…

    Submitted 8 October, 2024; v1 submitted 7 August, 2024; originally announced August 2024.

    Comments: EMNLP 2024. Code and data at https://github.com/styfeng/TinyDialogues

  14. arXiv:2407.15645

    cs.CL cs.AI

    Psychometric Alignment: Capturing Human Knowledge Distributions via Language Models

    Authors: Joy He-Yueya, Wanjing Anya Ma, Kanishk Gandhi, Benjamin W. Domingue, Emma Brunskill, Noah D. Goodman

    Abstract: Language models (LMs) are increasingly used to simulate human-like responses in scenarios where accurately mimicking a population's behavior can guide decision-making, such as in developing educational materials and designing public policies. The objective of these simulations is for LMs to capture the variations in human responses, rather than merely providing the expected correct answers. Prior…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: Code and data: https://github.com/joyheyueya/psychometric-alignment

  15. arXiv:2407.04622

    cs.LG

    On scalable oversight with weak LLMs judging strong LLMs

    Authors: Zachary Kenton, Noah Y. Siegel, János Kramár, Jonah Brown-Cohen, Samuel Albanie, Jannis Bulian, Rishabh Agarwal, David Lindner, Yunhao Tang, Noah D. Goodman, Rohin Shah

    Abstract: Scalable oversight protocols aim to enable humans to accurately supervise superhuman AI. In this paper we study debate, where two AIs compete to convince a judge; consultancy, where a single AI tries to convince a judge that asks questions; and compare to a baseline of direct question-answering, where the judge just answers outright without the AI. We use large language models (LLMs) as both AI a…

    Submitted 12 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

    Comments: 15 pages (53 including appendices). V2: minor correction to Figure 3; add Figure A.9 comparing open vs assigned consultancy; add a reference

  16. arXiv:2407.00900

    cs.AI cs.CL

    MathCAMPS: Fine-grained Synthesis of Mathematical Problems From Human Curricula

    Authors: Shubhra Mishra, Gabriel Poesia, Belinda Mo, Noah D. Goodman

    Abstract: Mathematical problem solving is an important skill for Large Language Models (LLMs), both as an important capability and a proxy for a range of reasoning abilities. Existing benchmarks probe a diverse set of skills, but they yield aggregate accuracy metrics, obscuring specific abilities or weaknesses. Furthermore, they are difficult to extend with new problems, risking data contamination over time…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Dataset and code: https://github.com/gpoesia/mathcamps/

  17. arXiv:2407.00695

    cs.AI cs.LO

    Learning Formal Mathematics From Intrinsic Motivation

    Authors: Gabriel Poesia, David Broman, Nick Haber, Noah D. Goodman

    Abstract: How did humanity coax mathematics from the aether? We explore the Platonic view that mathematics can be discovered from its axioms - a game of conjecture and proof. We describe Minimo (Mathematics from Intrinsic Motivation): an agent that jointly learns to pose challenging problems for itself (conjecturing) and solve them (theorem proving). Given a mathematical domain axiomatized in dependent type…

    Submitted 4 November, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

    Comments: NeurIPS 2024 Oral

  18. arXiv:2404.14313

    cs.CL

    Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels

    Authors: Jan-Philipp Fränken, Eric Zelikman, Rafael Rafailov, Kanishk Gandhi, Tobias Gerstenberg, Noah D. Goodman

    Abstract: When prompting a language model (LM), users often expect the model to adhere to a set of behavioral principles across diverse tasks, such as producing insightful content while avoiding harmful or biased language. Instilling such principles (i.e., a constitution) into a model is resource-intensive, technically challenging, and generally requires human preference labels or examples. We introduce SAM…

    Submitted 21 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  19. arXiv:2404.10975

    cs.CL

    Procedural Dilemma Generation for Evaluating Moral Reasoning in Humans and Language Models

    Authors: Jan-Philipp Fränken, Kanishk Gandhi, Tori Qiu, Ayesha Khawaja, Noah D. Goodman, Tobias Gerstenberg

    Abstract: As AI systems like language models are increasingly integrated into decision-making processes affecting people's lives, it's critical to ensure that these systems have sound moral reasoning. To test whether they do, we need to develop systematic evaluations. We provide a framework that uses a language model to translate causal graphs that capture key aspects of moral dilemmas into prompt templates…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: CogSci 2024

  20. arXiv:2404.03683

    cs.LG cs.AI cs.CL

    Stream of Search (SoS): Learning to Search in Language

    Authors: Kanishk Gandhi, Denise Lee, Gabriel Grand, Muxin Liu, Winson Cheng, Archit Sharma, Noah D. Goodman

    Abstract: Language models are rarely shown fruitful mistakes while training. They then struggle to look beyond the next token, suffering from a snowballing of errors and struggling to predict the consequence of their actions several steps ahead. In this paper, we show how language models can be taught to search by representing the process of search in language, as a flattened string -- a stream of search (S…

    Submitted 1 April, 2024; originally announced April 2024.

  21. arXiv:2403.19154

    cs.CL cs.AI

    STaR-GATE: Teaching Language Models to Ask Clarifying Questions

    Authors: Chinmaya Andukuri, Jan-Philipp Fränken, Tobias Gerstenberg, Noah D. Goodman

    Abstract: When prompting language models to complete a task, users often leave important aspects unsaid. While asking questions could resolve this ambiguity (GATE; Li et al., 2023), models often struggle to ask good questions. We explore a language model's ability to self-improve (STaR; Zelikman et al., 2022) by rewarding the model for generating useful questions, a simple method we dub STaR-GATE. We generat…

    Submitted 7 August, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  22. arXiv:2403.09629

    cs.CL cs.AI cs.LG

    Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

    Authors: Eric Zelikman, Georges Harik, Yijia Shao, Varuna Jayasiri, Nick Haber, Noah D. Goodman

    Abstract: When writing and talking, people sometimes pause to think. Although reasoning-focused works have often framed reasoning as a method of answering questions or completing agentic tasks, reasoning is implicit in almost all written text. For example, this applies to the steps not stated between the lines of a proof or to the theory of mind underlying a conversation. In the Self-Taught Reasoner (STaR,…

    Submitted 18 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  23. arXiv:2403.07809

    cs.LG cs.CL

    pyvene: A Library for Understanding and Improving PyTorch Models via Interventions

    Authors: Zhengxuan Wu, Atticus Geiger, Aryaman Arora, Jing Huang, Zheng Wang, Noah D. Goodman, Christopher D. Manning, Christopher Potts

    Abstract: Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability. To facilitate such research, we introduce pyvene, an open-source Python library that supports customizable interventions on a range of different PyTorch modules. pyvene supports complex intervention schemes with an intuiti…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 8 pages, 3 figures

  24. arXiv:2403.05534

    cs.CL

    Bayesian Preference Elicitation with Language Models

    Authors: Kunal Handa, Yarin Gal, Ellie Pavlick, Noah Goodman, Jacob Andreas, Alex Tamkin, Belinda Z. Li

    Abstract: Aligning AI systems to users' interests requires understanding and incorporating humans' complex values and preferences. Recently, language models (LMs) have been used to gather information about the preferences of human users. This preference data can be used to fine-tune or guide other LMs and/or AI systems. However, LMs have been shown to struggle with crucial aspects of preference learning: qu…

    Submitted 8 March, 2024; originally announced March 2024.

  25. arXiv:2403.03956

    cs.IR cs.CL

    Backtracing: Retrieving the Cause of the Query

    Authors: Rose E. Wang, Pawan Wirawarn, Omar Khattab, Noah Goodman, Dorottya Demszky

    Abstract: Many online content portals allow users to ask questions to supplement their understanding (e.g., of lectures). While information retrieval (IR) systems may provide answers for such user queries, they do not directly assist content creators -- such as lecturers who want to improve their content -- identify segments that _caused_ a user to ask those questions. We introduce the task of backtracing,…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Code: https://github.com/rosewang2008/backtracing; EACL 2024 Findings, Long Paper

  26. arXiv:2403.02795

    cs.AI cs.CL

    Evaluating and Optimizing Educational Content with Large Language Model Judgments

    Authors: Joy He-Yueya, Noah D. Goodman, Emma Brunskill

    Abstract: Creating effective educational materials generally requires expensive and time-consuming studies of student learning outcomes. To overcome this barrier, one idea is to build computational models of student learning and use them to optimize instructional materials. However, it is difficult to model the cognitive processes of learning dynamics. We propose an alternative approach that uses Language M…

    Submitted 6 May, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: 11 pages

  27. arXiv:2402.17879

    cs.LG cs.CL

    Automated Statistical Model Discovery with Language Models

    Authors: Michael Y. Li, Emily B. Fox, Noah D. Goodman

    Abstract: Statistical model discovery is a challenging search over a vast space of models subject to domain-specific constraints. Efficiently searching over this space requires expertise in modeling and the problem domain. Motivated by the domain knowledge and programming capabilities of large language models (LMs), we introduce a method for language model driven automated statistical model discovery. We ca…

    Submitted 22 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  28. arXiv:2401.12631

    cs.LG cs.AI cs.CL

    A Reply to Makelov et al. (2023)'s "Interpretability Illusion" Arguments

    Authors: Zhengxuan Wu, Atticus Geiger, Jing Huang, Aryaman Arora, Thomas Icard, Christopher Potts, Noah D. Goodman

    Abstract: We respond to the recent paper by Makelov et al. (2023), which reviews subspace interchange intervention methods like distributed alignment search (DAS; Geiger et al. 2023) and claims that these methods potentially cause "interpretability illusions". We first review Makelov et al. (2023)'s technical notion of what an "interpretability illusion" is, and then we show that even intuitive and desirabl…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 20 pages, 14 figures

  29. arXiv:2310.17769

    cs.CL cs.AI

    Social Contract AI: Aligning AI Assistants with Implicit Group Norms

    Authors: Jan-Philipp Fränken, Sam Kwok, Peixuan Ye, Kanishk Gandhi, Dilip Arumugam, Jared Moore, Alex Tamkin, Tobias Gerstenberg, Noah D. Goodman

    Abstract: We explore the idea of aligning an AI assistant by inverting a model of users' (unknown) preferences from observed interactions. To validate our proposal, we run proof-of-concept simulations in the economic ultimatum game, formalizing user preferences as policies that guide the actions of simulated players. We find that the AI assistant accurately aligns its behavior to match standard policies fro…

    Submitted 3 December, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: SoLaR NeurIPS 2023 Workshop (https://solar-neurips.github.io/)

  30. arXiv:2310.17230

    cs.LG cs.CL

    Codebook Features: Sparse and Discrete Interpretability for Neural Networks

    Authors: Alex Tamkin, Mohammad Taufeeque, Noah D. Goodman

    Abstract: Understanding neural networks is challenging in part because of the dense, continuous nature of their hidden states. We explore whether we can train neural networks to have hidden states that are sparse, discrete, and more interpretable by quantizing their continuous features into what we call codebook features. Codebook features are produced by finetuning neural networks with vector quantization…

    Submitted 26 October, 2023; originally announced October 2023.

  31. arXiv:2310.11589

    cs.CL cs.AI cs.LG

    Eliciting Human Preferences with Language Models

    Authors: Belinda Z. Li, Alex Tamkin, Noah Goodman, Jacob Andreas

    Abstract: Language models (LMs) can be directed to perform target tasks by using labeled examples or natural language prompts. But selecting examples or writing prompts can be challenging--especially in tasks that involve unusual edge cases, demand precise articulation of nebulous preferences, or require an accurate mental model of LM behavior. We propose to use *LMs themselves* to guide the task specif…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: 26 pages, 15 figures

  32. arXiv:2310.03635

    cs.AI cs.CL cs.CV cs.LG stat.ML

    CLEVRER-Humans: Describing Physical and Causal Events the Human Way

    Authors: Jiayuan Mao, Xuelin Yang, Xikun Zhang, Noah D. Goodman, Jiajun Wu

    Abstract: Building machines that can reason about physical events and their causal relationships is crucial for flexible interaction with the physical world. However, most existing physical and causal reasoning benchmarks are exclusively based on synthetically generated events and synthetic natural language descriptions of causal relationships. This design brings up two issues. First, there is a lack of div…

    Submitted 26 May, 2025; v1 submitted 5 October, 2023; originally announced October 2023.

    Comments: Version 3. NeurIPS 2022 (Dataset and Benchmark Track). First two authors contributed equally. Project page: https://sites.google.com/stanford.edu/clevrer-humans/home

  33. arXiv:2309.05660

    cs.LG cs.AI cs.CL

    Hypothesis Search: Inductive Reasoning with Language Models

    Authors: Ruocheng Wang, Eric Zelikman, Gabriel Poesia, Yewen Pu, Nick Haber, Noah D. Goodman

    Abstract: Inductive reasoning is a core problem-solving capacity: humans can identify underlying principles from a few examples, which robustly generalize to novel scenarios. Recent work evaluates large language models (LLMs) on inductive reasoning tasks by directly prompting them yielding "in context learning." This works well for straightforward inductive tasks but performs poorly on complex tasks such as…

    Submitted 30 May, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: ICLR 2024. The first two authors contributed equally. Code: https://github.com/Relento/hypothesis_search

  34. arXiv:2306.15448

    cs.CL cs.AI cs.HC

    Understanding Social Reasoning in Language Models with Language Models

    Authors: Kanishk Gandhi, Jan-Philipp Fränken, Tobias Gerstenberg, Noah D. Goodman

    Abstract: As Large Language Models (LLMs) become increasingly integrated into our everyday lives, understanding their ability to comprehend human mental states becomes critical for ensuring effective interactions. However, despite the recent attempts to assess the Theory-of-Mind (ToM) reasoning capabilities of LLMs, the degree to which these models can align with human ToM remains a nuanced topic of explora…

    Submitted 4 December, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

  35. arXiv:2306.12672

    cs.CL cs.AI cs.SC

    From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought

    Authors: Lionel Wong, Gabriel Grand, Alexander K. Lew, Noah D. Goodman, Vikash K. Mansinghka, Jacob Andreas, Joshua B. Tenenbaum

    Abstract: How does language inform our downstream thinking? In particular, how do humans make meaning from language--and how can we leverage a theory of linguistic meaning to build machines that think in more human-like ways? In this paper, we propose rational meaning construction, a computational framework for language-informed thinking that combines neural language models with probabilistic models for rat…

    Submitted 23 June, 2023; v1 submitted 22 June, 2023; originally announced June 2023.

  36. arXiv:2306.10015

    cs.LG cs.CL cs.DC

    Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness

    Authors: Eric Zelikman, Qian Huang, Percy Liang, Nick Haber, Noah D. Goodman

    Abstract: Language model training in distributed settings is limited by the communication cost of gradient exchanges. In this short note, we extend recent work from Malladi et al. (2023), using shared randomness to perform distributed fine-tuning with low bandwidth. The method is a natural decentralized extension of memory-efficient Simultaneous Perturbation Stochastic Approximation (SPSA). Each iteration,…

    Submitted 16 June, 2023; originally announced June 2023.

  37. arXiv:2306.09343

    cs.CL cs.AI

    SIGHT: A Large Annotated Dataset on Student Insights Gathered from Higher Education Transcripts

    Authors: Rose E. Wang, Pawan Wirawarn, Noah Goodman, Dorottya Demszky

    Abstract: Lectures are a learning experience for both students and teachers. Students learn from teachers about the subject material, while teachers learn from students about how to refine their instruction. However, online student feedback is unstructured and abundant, making it challenging for teachers to learn and improve. We take a step towards tackling this challenge. First, we contribute a dataset for…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: First two authors contributed equally. In the Proceedings of Innovative Use of NLP for Building Educational Applications 2023. The code and data are open-sourced here: https://github.com/rosewang2008/sight

  38. arXiv:2306.07012

    cs.AI cs.CL cs.HC cs.RO

    Generating Language Corrections for Teaching Physical Control Tasks

    Authors: Megha Srivastava, Noah Goodman, Dorsa Sadigh

    Abstract: AI assistance continues to help advance applications in education, from language learning to intelligent tutoring systems, yet current methods for providing students feedback are still quite limited. Most automatic feedback systems either provide binary correctness feedback, which may not help a student understand how to improve, or require hand-coding feedback templates, which may not generalize…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: International Conference on Machine Learning (ICML) 2023, 9 pages

  39. arXiv:2306.04031

    cs.AI

    Certified Deductive Reasoning with Language Models

    Authors: Gabriel Poesia, Kanishk Gandhi, Eric Zelikman, Noah D. Goodman

    Abstract: Language models often achieve higher accuracy when reasoning step-by-step in complex tasks. However, even when arriving at a correct final answer, their rationales are often logically unsound or inconsistent. This is a major issue when reliable reasoning traces are needed, such as when fine-tuning on model-generated reasoning for self-improvement. To tackle these issues, we introduce a class of tools…

    Submitted 7 November, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

  40. arXiv:2305.19165

    cs.AI cs.CL cs.GT cs.HC

    Strategic Reasoning with Language Models

    Authors: Kanishk Gandhi, Dorsa Sadigh, Noah D. Goodman

    Abstract: Strategic reasoning enables agents to cooperate, communicate, and compete with other agents in diverse situations. Existing approaches to solving strategic games rely on extensive training, yielding strategies that do not generalize to new scenarios or games without retraining. Large Language Models (LLMs), with their ability to comprehend and generate complex, context-rich language, could prove p…

    Submitted 30 May, 2023; originally announced May 2023.

  41. arXiv:2305.11374

    cs.CL

    Characterizing tradeoffs between teaching via language and demonstrations in multi-agent systems

    Authors: Dhara Yu, Noah D. Goodman, Jesse Mu

    Abstract: Humans teach others about the world through language and demonstration. When might one of these modalities be more effective than the other? In this work, we study the factors that modulate the effectiveness of language vs. demonstration using multi-agent systems to model human communication. Specifically, we train neural network agents to teach via language or demonstration in a grounded communic…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: 7 pages, 6 figures, to appear in Proceedings of the 45th Annual Conference of the Cognitive Science Society

  42. arXiv:2305.08809  [pdf, other]

    cs.CL

    Interpretability at Scale: Identifying Causal Mechanisms in Alpaca

    Authors: Zhengxuan Wu, Atticus Geiger, Thomas Icard, Christopher Potts, Noah D. Goodman

    Abstract: Obtaining human-interpretable explanations of large, general-purpose language models is an urgent goal for AI safety. However, it is just as important that our interpretability methods are faithful to the causal dynamics underlying model behavior and able to robustly generalize to unseen inputs. Distributed Alignment Search (DAS) is a powerful gradient descent method grounded in a theory of causal…

    Submitted 6 February, 2024; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023 with Author Corrections

  43. arXiv:2305.07151  [pdf, other]

    cs.CL

    Overinformative Question Answering by Humans and Machines

    Authors: Polina Tsvilodub, Michael Franke, Robert D. Hawkins, Noah D. Goodman

    Abstract: When faced with a polar question, speakers often provide overinformative answers going beyond a simple "yes" or "no". But what principles guide the selection of additional information? In this paper, we provide experimental evidence from two studies suggesting that overinformativeness in human answering is driven by considerations of relevance to the questioner's goals which they flexibly adjust g…

    Submitted 11 May, 2023; originally announced May 2023.

    Comments: 7 pages, 2 figures, to appear in the Proceedings of the 45th Annual Conference of the Cognitive Science Society (2023)

  44. arXiv:2305.03263  [pdf, other]

    cs.LG cs.AI

    Bayesian Reinforcement Learning with Limited Cognitive Load

    Authors: Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy

    Abstract: All biological and artificial agents must learn and make decisions given limits on their ability to process information. As such, a general theory of adaptive behavior should be able to account for the complex interactions between an agent's learning history, decisions, and capacity constraints. Recent work in computer science has begun to clarify the principles that shape these dynamics by bridgi…

    Submitted 4 May, 2023; originally announced May 2023.

  45. arXiv:2304.09102  [pdf, other]

    cs.CL cs.AI

    Solving Math Word Problems by Combining Language Models With Symbolic Solvers

    Authors: Joy He-Yueya, Gabriel Poesia, Rose E. Wang, Noah D. Goodman

    Abstract: Automatically generating high-quality step-by-step solutions to math word problems has many applications in education. Recently, combining large language models (LLMs) with external tools to perform complex reasoning and calculation has emerged as a promising direction for solving math word problems, but prior approaches such as Program-Aided Language model (PAL) are biased towards simple procedur…

    Submitted 16 April, 2023; originally announced April 2023.

  46. arXiv:2304.08467  [pdf, other]

    cs.CL

    Learning to Compress Prompts with Gist Tokens

    Authors: Jesse Mu, Xiang Lisa Li, Noah Goodman

    Abstract: Prompting is the primary way to utilize the multitask capabilities of language models (LMs), but prompts occupy valuable space in the input context window, and repeatedly encoding the same prompt is computationally inefficient. Finetuning and distillation methods allow for specialization of LMs without prompting, but require retraining the model for each task. To avoid this trade-off entirely, we…

    Submitted 12 February, 2024; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023, 26 pages. Version 3 updates preprint to camera-ready version and clarifies some writing in places

  47. arXiv:2304.03843  [pdf, other]

    cs.AI cs.CL cs.LG

    Why think step by step? Reasoning emerges from the locality of experience

    Authors: Ben Prystawski, Michael Y. Li, Noah D. Goodman

    Abstract: Humans have a powerful and mysterious capacity to reason. Working through a set of mental steps enables us to make inferences we would not be capable of making directly even though we get no additional data from the world. Similarly, when large language models generate intermediate steps (a chain of thought) before answering a question, they often produce better answers than they would directly. W…

    Submitted 2 November, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: 22 pages, 6 figures

  48. arXiv:2303.04448  [pdf, other]

    quant-ph nlin.PS physics.comp-ph

    The Quantum and Stochastic Toolbox: xSPDE4.2

    Authors: Peter D. Drummond, Run Yan Teh, Manushan Thenabadu, Channa Hatharasinghe, Chris McGuigan, Alex Dellios, Ned Goodman, Margaret D. Reid

    Abstract: This is the fourth major release of the xSPDE toolbox, which solves stochastic partial and ordinary differential equations, with applications in biology, chemistry, engineering, medicine, physics and quantum technologies. It computes statistical averages, including time-step and sampling error estimation. xSPDE can provide higher order convergence, Fourier spectra and probability densities. The to…

    Submitted 26 December, 2024; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: Fourth major release of the user manual for xSPDE software on Github, at https://github.com/peterddrummond/xspde_matlab

  49. arXiv:2303.02536  [pdf, other]

    cs.AI

    Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations

    Authors: Atticus Geiger, Zhengxuan Wu, Christopher Potts, Thomas Icard, Noah D. Goodman

    Abstract: Causal abstraction is a promising theoretical framework for explainable artificial intelligence that defines when an interpretable high-level causal model is a faithful simplification of a low-level deep learning system. However, existing causal abstraction methods have two major limitations: they require a brute-force search over alignments between the high-level model and the low-level one, and…

    Submitted 21 February, 2024; v1 submitted 4 March, 2023; originally announced March 2023.

  50. arXiv:2302.05757  [pdf, other]

    cs.CV cs.AI

    Multispectral Contrastive Learning with Viewmaker Networks

    Authors: Jasmine Bayrooti, Noah Goodman, Alex Tamkin

    Abstract: Contrastive learning methods have been applied to a range of domains and modalities by training models to identify similar "views" of data points. However, specialized scientific modalities pose a challenge for this paradigm, as identifying good views for each scientific instrument is complex and time-intensive. In this paper, we focus on applying contrastive learning approaches to a variety of re…

    Submitted 3 June, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

    Comments: Appearing in CVPR-PBVS 2023