Showing 1–48 of 48 results for author: Haber, N

Searching in archive cs.

  1. arXiv:2507.00769  [pdf, ps, other]

    cs.CL cs.AI

    LitBench: A Benchmark and Dataset for Reliable Evaluation of Creative Writing

    Authors: Daniel Fein, Sebastian Russo, Violet Xiang, Kabir Jolly, Rafael Rafailov, Nick Haber

    Abstract: Evaluating creative writing generated by large language models (LLMs) remains challenging because open-ended narratives lack ground truths. Without performant automated evaluation methods, off-the-shelf (OTS) language models are employed as zero-shot judges, yet their reliability is unclear in this context. In pursuit of robust evaluation for creative writing, we introduce LitBench, the first stan…

    Submitted 1 July, 2025; originally announced July 2025.

  2. arXiv:2506.20600  [pdf, ps, other]

    cs.AI

    CogGen: A Learner-Centered Generative AI Architecture for Intelligent Tutoring with Programming Video

    Authors: Wengxi Li, Roy Pea, Nick Haber, Hari Subramonyam

    Abstract: We introduce CogGen, a learner-centered AI architecture that transforms programming videos into interactive, adaptive learning experiences by integrating student modeling with generative AI tutoring based on the Cognitive Apprenticeship framework. The architecture consists of three components: (1) video segmentation by learning goals, (2) a conversational tutoring engine applying Cognitive Apprent…

    Submitted 25 June, 2025; originally announced June 2025.

  3. arXiv:2506.11343  [pdf, ps, other]

    cs.CL

    From Replication to Redesign: Exploring Pairwise Comparisons for LLM-Based Peer Review

    Authors: Yaohui Zhang, Haijing Zhang, Wenlong Ji, Tianyu Hua, Nick Haber, Hancheng Cao, Weixin Liang

    Abstract: The advent of large language models (LLMs) offers unprecedented opportunities to reimagine peer review beyond the constraints of traditional workflows. Despite these opportunities, prior efforts have largely focused on replicating traditional review workflows with LLMs serving as direct substitutes for human reviewers, while limited attention has been given to exploring new paradigms that fundamen…

    Submitted 12 June, 2025; originally announced June 2025.

  4. arXiv:2506.05579  [pdf, ps, other]

    cs.AI cs.CL cs.HC

    When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration

    Authors: Quan Shi, Carlos E. Jimenez, Shunyu Yao, Nick Haber, Diyi Yang, Karthik Narasimhan

    Abstract: Recent advancements in AI reasoning have driven substantial improvements across diverse tasks. A critical open question is whether these improvements also yield better knowledge transfer: the ability of models to communicate reasoning in ways humans can understand, apply, and learn from. To investigate this, we introduce Knowledge Integration and Transfer Evaluation (KITE), a conceptual and exper…

    Submitted 9 June, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

    Comments: For code, data, visualizer, visit: https://kite-live.vercel.app

  5. arXiv:2506.05256  [pdf, ps, other]

    cs.AI cs.LG

    Just Enough Thinking: Efficient Reasoning with Adaptive Length Penalties Reinforcement Learning

    Authors: Violet Xiang, Chase Blagden, Rafael Rafailov, Nathan Lile, Sang Truong, Chelsea Finn, Nick Haber

    Abstract: Large reasoning models (LRMs) achieve higher performance on challenging reasoning tasks by generating more tokens at inference time, but this verbosity often wastes computation on easy problems. Existing solutions, including supervised finetuning on shorter traces, user-controlled budgets, or RL with uniform penalties, either require data curation, manual configuration, or treat all problems alike…

    Submitted 5 June, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

  6. arXiv:2506.02314  [pdf, ps, other]

    cs.AI cs.CL

    ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code

    Authors: Tianyu Hua, Harper Hua, Violet Xiang, Benjamin Klieger, Sang T. Truong, Weixin Liang, Fan-Yun Sun, Nick Haber

    Abstract: Large language models (LLMs) have shown promise in transforming machine learning research, yet their capability to faithfully implement novel ideas from recent research papers-ideas unseen during pretraining-remains unclear. We introduce ResearchCodeBench, a benchmark of 212 coding challenges that evaluates LLMs' ability to translate cutting-edge ML contributions from top 2024-2025 research papers…

    Submitted 2 June, 2025; originally announced June 2025.

  7. Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers

    Authors: Jared Moore, Declan Grabb, William Agnew, Kevin Klyman, Stevie Chancellor, Desmond C. Ong, Nick Haber

    Abstract: Should a large language model (LLM) be used as a therapist? In this paper, we investigate the use of LLMs to *replace* mental health providers, a use case promoted in the tech startup and research space. We conduct a mapping review of therapy guides used by major medical institutions to identify crucial aspects of therapeutic relationships, such as the importance of a therapeutic alliance between…

    Submitted 25 April, 2025; originally announced April 2025.

  8. Media Content Atlas: A Pipeline to Explore and Investigate Multidimensional Media Space using Multimodal LLMs

    Authors: Merve Cerit, Eric Zelikman, Mu-Jung Cho, Thomas N. Robinson, Byron Reeves, Nilam Ram, Nick Haber

    Abstract: As digital media use continues to evolve and influence various aspects of life, developing flexible and scalable tools to study complex media experiences is essential. This study introduces the Media Content Atlas (MCA), a novel pipeline designed to help researchers investigate large-scale screen data beyond traditional screen-use metrics. Leveraging multimodal large language models (MLLMs), MCA e…

    Submitted 22 April, 2025; originally announced April 2025.

    Comments: Accepted to CHI 2025, in press. See the project page at mediacontentatlas.github.io

    ACM Class: H.5.0; H.5.1; I.2; J.4

  9. arXiv:2502.20663  [pdf, other]

    cs.CL

    Prediction of Item Difficulty for Reading Comprehension Items by Creation of Annotated Item Repository

    Authors: Radhika Kapoor, Sang T. Truong, Nick Haber, Maria Araceli Ruiz-Primo, Benjamin W. Domingue

    Abstract: Prediction of item difficulty based on its text content is of substantial interest. In this paper, we focus on the related problem of recovering IRT-based difficulty when the data originally reported item p-value (percent correct responses). We model this item difficulty using a repository of reading passages and student data from US standardized tests from New York and Texas for grades 3-8 spanni…

    Submitted 27 February, 2025; originally announced February 2025.

  10. arXiv:2502.17387  [pdf, other]

    cs.LG cs.AI cs.CL

    Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models

    Authors: Alon Albalak, Duy Phung, Nathan Lile, Rafael Rafailov, Kanishk Gandhi, Louis Castricato, Anikait Singh, Chase Blagden, Violet Xiang, Dakota Mahan, Nick Haber

    Abstract: Increasing interest in reasoning models has led math to become a prominent testing ground for algorithmic and methodological improvements. However, existing open math datasets either contain a small collection of high-quality, human-written problems or a large corpus of machine-generated problems of uncertain quality, forcing researchers to choose between quality and quantity. In this work, we pre…

    Submitted 24 February, 2025; originally announced February 2025.

  11. arXiv:2502.13928  [pdf, ps, other]

    cs.CV cs.AI cs.CL cs.LG

    Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images

    Authors: Shengguang Wu, Fan-Yun Sun, Kaiyue Wen, Nick Haber

    Abstract: Recent studies have shown that Large Vision-Language Models (VLMs) tend to neglect image content and over-rely on language-model priors, resulting in errors in visually grounded tasks and hallucinations. We hypothesize that this issue arises because existing VLMs are not explicitly trained to generate texts that are accurately grounded in fine-grained image details. To enhance visual feedback duri…

    Submitted 2 June, 2025; v1 submitted 19 February, 2025; originally announced February 2025.

    Comments: Accepted to ACL 2025 Main. Project Website: https://s-vco.github.io/

  12. arXiv:2501.04682  [pdf, other]

    cs.AI cs.CL

    Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

    Authors: Violet Xiang, Charlie Snell, Kanishk Gandhi, Alon Albalak, Anikait Singh, Chase Blagden, Duy Phung, Rafael Rafailov, Nathan Lile, Dakota Mahan, Louis Castricato, Jan-Philipp Franken, Nick Haber, Chelsea Finn

    Abstract: We propose a novel framework, Meta Chain-of-Thought (Meta-CoT), which extends traditional Chain-of-Thought (CoT) by explicitly modeling the underlying reasoning required to arrive at a particular CoT. We present empirical evidence from state-of-the-art models exhibiting behaviors consistent with in-context search, and explore methods for producing Meta-CoT via process supervision, synthetic data g…

    Submitted 8 January, 2025; originally announced January 2025.

  13. arXiv:2412.05766  [pdf, other]

    cs.LG cs.AI

    Policy-shaped prediction: avoiding distractions in model-based reinforcement learning

    Authors: Miles Hutson, Isaac Kauvar, Nick Haber

    Abstract: Model-based reinforcement learning (MBRL) is a promising route to sample-efficient policy optimization. However, a known vulnerability of reconstruction-based MBRL consists of scenarios in which detailed aspects of the world are highly predictable, but irrelevant to learning a good policy. Such scenarios can lead the model to exhaust its capacity on meaningless content, at the cost of neglecting i…

    Submitted 7 December, 2024; originally announced December 2024.

    Comments: Accepted at NeurIPS 2024

  14. arXiv:2412.02653  [pdf, other]

    physics.ed-ph cs.AI

    Scaffold or Crutch? Examining College Students' Use and Views of Generative AI Tools for STEM Education

    Authors: Karen D. Wang, Zhangyang Wu, L'Nard Tufts II, Carl Wieman, Shima Salehi, Nick Haber

    Abstract: Developing problem-solving competency is central to Science, Technology, Engineering, and Mathematics (STEM) education, yet translating this priority into effective approaches to problem-solving instruction and assessment remains a significant challenge. The recent proliferation of generative artificial intelligence (genAI) tools like ChatGPT in higher education introduces new considerations about…

    Submitted 3 December, 2024; originally announced December 2024.

  15. arXiv:2412.02193  [pdf, other]

    cs.CV cs.AI

    LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models

    Authors: Fan-Yun Sun, Weiyu Liu, Siyi Gu, Dylan Lim, Goutam Bhat, Federico Tombari, Manling Li, Nick Haber, Jiajun Wu

    Abstract: Spatial reasoning is a fundamental aspect of human cognition, enabling intuitive understanding and manipulation of objects in three-dimensional space. While foundation models demonstrate remarkable performance on some benchmarks, they still struggle with 3D reasoning tasks like arranging objects in space according to open-ended language instructions, particularly in dense and physically constraine…

    Submitted 11 March, 2025; v1 submitted 3 December, 2024; originally announced December 2024.

    Comments: CVPR 2025, project website: https://ai.stanford.edu/~sunfanyun/layoutvlm/

  16. arXiv:2412.01992  [pdf, other]

    cs.HC cs.AI

    ChatCollab: Exploring Collaboration Between Humans and AI Agents in Software Teams

    Authors: Benjamin Klieger, Charis Charitsis, Miroslav Suzara, Sierra Wang, Nick Haber, John C. Mitchell

    Abstract: We explore the potential for productive team-based collaboration between humans and Artificial Intelligence (AI) by presenting and conducting initial tests with a general framework that enables multiple human and AI agents to work together as peers. ChatCollab's novel architecture allows agents - human or AI - to join collaborations in any role, autonomously engage in tasks and communication withi…

    Submitted 2 December, 2024; originally announced December 2024.

    Comments: Preprint, 25 pages, 7 figures

  17. arXiv:2411.19146  [pdf, ps, other]

    cs.LG

    Puzzle: Distillation-Based NAS for Inference-Optimized LLMs

    Authors: Akhiad Bercovich, Tomer Ronen, Talor Abramovich, Nir Ailon, Nave Assaf, Mohammad Dabbah, Ido Galil, Amnon Geifman, Yonatan Geifman, Izhak Golan, Netanel Haber, Ehud Karpas, Roi Koren, Itay Levy, Pavlo Molchanov, Shahar Mor, Zach Moshe, Najeeb Nabwani, Omri Puny, Ran Rubin, Itamar Schen, Ido Shahaf, Oren Tropp, Omer Ullman Argov, Ran Zilberstein, et al. (1 additional author not shown)

    Abstract: Large language models (LLMs) offer remarkable capabilities, yet their high inference costs restrict wider adoption. While increasing parameter counts improves accuracy, it also broadens the gap between state-of-the-art capabilities and practical deployability. We present Puzzle, a hardware-aware framework that accelerates the inference of LLMs while preserving their capabilities. Using neural arch…

    Submitted 3 June, 2025; v1 submitted 28 November, 2024; originally announced November 2024.

  18. arXiv:2409.17652  [pdf, other]

    cs.AI cs.RO

    FactorSim: Generative Simulation via Factorized Representation

    Authors: Fan-Yun Sun, S. I. Harini, Angela Yi, Yihan Zhou, Alex Zook, Jonathan Tremblay, Logan Cross, Jiajun Wu, Nick Haber

    Abstract: Generating simulations to train intelligent agents in game-playing and robotics from natural language input, from user input or task documentation, remains an open-ended challenge. Existing approaches focus on parts of this challenge, such as generating reward functions or task hyperparameters. Unlike previous work, we introduce FACTORSIM that generates full simulations in code from language input…

    Submitted 11 November, 2024; v1 submitted 26 September, 2024; originally announced September 2024.

    Comments: NeurIPS 2024, project website: https://cs.stanford.edu/~sunfanyun/factorsim/

  19. arXiv:2408.01452  [pdf, other]

    cs.CY cs.AI cs.LG

    Building a Domain-specific Guardrail Model in Production

    Authors: Mohammad Niknazar, Paul V Haley, Latha Ramanan, Sang T. Truong, Yedendra Shrinivasan, Ayan Kumar Bhowmick, Prasenjit Dey, Ashish Jagmohan, Hema Maheshwari, Shom Ponoth, Robert Smith, Aditya Vempaty, Nick Haber, Sanmi Koyejo, Sharad Sundararajan

    Abstract: Generative AI holds the promise of enabling a range of sought-after capabilities and revolutionizing workflows in various consumer and enterprise verticals. However, putting a model in production involves much more than just generating an output. It involves ensuring the model is reliable, safe, performant and also adheres to the policy of operation in a particular domain. Guardrails as a necessit…

    Submitted 24 July, 2024; originally announced August 2024.

  20. arXiv:2407.07086  [pdf, other]

    cs.AI

    Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models

    Authors: Logan Cross, Violet Xiang, Agam Bhatia, Daniel LK Yamins, Nick Haber

    Abstract: Multi-agent reinforcement learning (MARL) methods struggle with the non-stationarity of multi-agent systems and fail to adaptively learn online when tested with novel agents. Here, we leverage large language models (LLMs) to create an autonomous agent that can handle these challenges. Our agent, Hypothetical Minds, consists of a cognitively-inspired architecture, featuring modular components for p…

    Submitted 11 December, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

  21. arXiv:2407.00695  [pdf, other]

    cs.AI cs.LO

    Learning Formal Mathematics From Intrinsic Motivation

    Authors: Gabriel Poesia, David Broman, Nick Haber, Noah D. Goodman

    Abstract: How did humanity coax mathematics from the aether? We explore the Platonic view that mathematics can be discovered from its axioms - a game of conjecture and proof. We describe Minimo (Mathematics from Intrinsic Motivation): an agent that jointly learns to pose challenging problems for itself (conjecturing) and solve them (theorem proving). Given a mathematical domain axiomatized in dependent type…

    Submitted 4 November, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

    Comments: NeurIPS 2024 Oral

  22. arXiv:2405.12946  [pdf, other]

    cs.HC cs.LG

    Tutorly: Turning Programming Videos Into Apprenticeship Learning Environments with LLMs

    Authors: Wengxi Li, Roy Pea, Nick Haber, Hari Subramonyam

    Abstract: Online programming videos, including tutorials and streamcasts, are widely popular and contain a wealth of expert knowledge. However, effectively utilizing these resources to achieve targeted learning goals can be challenging. Unlike direct tutoring, video content lacks tailored guidance based on individual learning paces, personalized feedback, and interactive engagement necessary for support and…

    Submitted 21 May, 2024; originally announced May 2024.

  23. arXiv:2403.09629  [pdf, other]

    cs.CL cs.AI cs.LG

    Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

    Authors: Eric Zelikman, Georges Harik, Yijia Shao, Varuna Jayasiri, Nick Haber, Noah D. Goodman

    Abstract: When writing and talking, people sometimes pause to think. Although reasoning-focused works have often framed reasoning as a method of answering questions or completing agentic tasks, reasoning is implicit in almost all written text. For example, this applies to the steps not stated between the lines of a proof or to the theory of mind underlying a conversation. In the Self-Taught Reasoner (STaR,…

    Submitted 18 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  24. arXiv:2312.09067  [pdf, other]

    cs.CV cs.AI cs.CL cs.RO

    Holodeck: Language Guided Generation of 3D Embodied AI Environments

    Authors: Yue Yang, Fan-Yun Sun, Luca Weihs, Eli VanderBilt, Alvaro Herrasti, Winson Han, Jiajun Wu, Nick Haber, Ranjay Krishna, Lingjie Liu, Chris Callison-Burch, Mark Yatskar, Aniruddha Kembhavi, Christopher Clark

    Abstract: 3D simulated environments play a critical role in Embodied AI, but their creation requires expertise and extensive manual effort, restricting their diversity and scope. To mitigate this limitation, we present Holodeck, a system that generates 3D environments to match a user-supplied prompt fully automatedly. Holodeck can generate diverse scenes, e.g., arcades, spas, and museums, adjust the designs…

    Submitted 22 April, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Published in CVPR 2024, 21 pages, 27 figures, 2 tables

  25. arXiv:2312.08662  [pdf, other]

    cs.MA

    From Centralized to Self-Supervised: Pursuing Realistic Multi-Agent Reinforcement Learning

    Authors: Violet Xiang, Logan Cross, Jan-Philipp Fränken, Nick Haber

    Abstract: In real-world environments, autonomous agents rely on their egocentric observations. They must learn adaptive strategies to interact with others who possess mixed motivations, discernible only through visible cues. Several Multi-Agent Reinforcement Learning (MARL) methods adopt centralized approaches that involve either centralized training or reward-sharing, often violating the realistic ways in…

    Submitted 14 December, 2023; originally announced December 2023.

  26. arXiv:2310.08773  [pdf, other]

    cs.AI cs.CE

    Examining the Potential and Pitfalls of ChatGPT in Science and Engineering Problem-Solving

    Authors: Karen D. Wang, Eric Burkholder, Carl Wieman, Shima Salehi, Nick Haber

    Abstract: The study explores the capabilities of OpenAI's ChatGPT in solving different types of physics problems. ChatGPT (with GPT-4) was queried to solve a total of 40 problems from a college-level engineering physics course. These problems ranged from well-specified problems, where all data required for solving the problem was provided, to under-specified, real-world problems where not all necessary data…

    Submitted 27 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: 12 pages, 2 figures

  27. arXiv:2310.06837  [pdf, other]

    cs.CL cs.LG

    Generating and Evaluating Tests for K-12 Students with Language Model Simulations: A Case Study on Sentence Reading Efficiency

    Authors: Eric Zelikman, Wanjing Anya Ma, Jasmine E. Tran, Diyi Yang, Jason D. Yeatman, Nick Haber

    Abstract: Developing an educational test can be expensive and time-consuming, as each item must be written by experts and then evaluated by collecting hundreds of student responses. Moreover, many tests require multiple distinct sets of questions administered throughout the school year to closely monitor students' progress, known as parallel tests. In this study, we focus on tests of silent sentence reading…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 (Main)

  28. arXiv:2309.11710  [pdf, other]

    cs.CL cs.CV

    ContextRef: Evaluating Referenceless Metrics For Image Description Generation

    Authors: Elisa Kreiss, Eric Zelikman, Christopher Potts, Nick Haber

    Abstract: Referenceless metrics (e.g., CLIPScore) use pretrained vision--language models to assess image descriptions directly without costly ground-truth reference texts. Such methods can facilitate rapid progress, but only if they truly align with human preference judgments. In this paper, we introduce ContextRef, a benchmark for assessing referenceless metrics for such alignment. ContextRef has two compo…

    Submitted 20 September, 2023; originally announced September 2023.

  29. arXiv:2309.05660  [pdf, other]

    cs.LG cs.AI cs.CL

    Hypothesis Search: Inductive Reasoning with Language Models

    Authors: Ruocheng Wang, Eric Zelikman, Gabriel Poesia, Yewen Pu, Nick Haber, Noah D. Goodman

    Abstract: Inductive reasoning is a core problem-solving capacity: humans can identify underlying principles from a few examples, which robustly generalize to novel scenarios. Recent work evaluates large language models (LLMs) on inductive reasoning tasks by directly prompting them yielding "in context learning." This works well for straightforward inductive tasks but performs poorly on complex tasks such as…

    Submitted 30 May, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: ICLR 2024. The first two authors contributed equally. Code: https://github.com/Relento/hypothesis_search

  30. arXiv:2306.15934  [pdf, other]

    cs.LG cs.AI stat.ML

    Curious Replay for Model-based Adaptation

    Authors: Isaac Kauvar, Chris Doyle, Linqi Zhou, Nick Haber

    Abstract: Agents must be able to adapt quickly as an environment changes. We find that existing model-based reinforcement learning agents are unable to do this well, in part because of how they use past experiences to train their world model. Here, we present Curious Replay -- a form of prioritized experience replay tailored to model-based agents through use of a curiosity-based priority signal. Agents usin…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: Accepted at ICML 2023. Website at https://sites.google.com/view/curious-replay

  31. arXiv:2306.10015  [pdf, other]

    cs.LG cs.CL cs.DC

    Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness

    Authors: Eric Zelikman, Qian Huang, Percy Liang, Nick Haber, Noah D. Goodman

    Abstract: Language model training in distributed settings is limited by the communication cost of gradient exchanges. In this short note, we extend recent work from Malladi et al. (2023), using shared randomness to perform distributed fine-tuning with low bandwidth. The method is a natural decentralized extension of memory-efficient Simultaneous Perturbation Stochastic Approximation (SPSA). Each iteration,…

    Submitted 16 June, 2023; originally announced June 2023.

  32. arXiv:2305.13452  [pdf, other]

    cs.AI cs.LG

    Measuring and Modeling Physical Intrinsic Motivation

    Authors: Julio Martinez, Felix Binder, Haoliang Wang, Nick Haber, Judith Fan, Daniel L. K. Yamins

    Abstract: Humans are interactive agents driven to seek out situations with interesting physical dynamics. Here we formalize the functional form of physical intrinsic motivation. We first collect ratings of how interesting humans find a variety of physics scenarios. We then model human interestingness responses by implementing various hypotheses of intrinsic motivation including models that rely on simple sc…

    Submitted 7 August, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: 6 pages, 5 figures, accepted to CogSci 2023 with full paper publication in the proceedings

  33. arXiv:2305.13396  [pdf, other]

    cs.LG cs.AI

    Developmental Curiosity and Social Interaction in Virtual Agents

    Authors: Chris Doyle, Sarah Shader, Michelle Lau, Megumi Sano, Daniel L. K. Yamins, Nick Haber

    Abstract: Infants explore their complex physical and social environment in an organized way. To gain insight into what intrinsic motivations may help structure this exploration, we create a virtual infant agent and place it in a developmentally-inspired 3D environment with no external rewards. The environment has a virtual caregiver agent with the capability to interact contingently with the infant agent in…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: 6 pages, 5 figures, 2 tables; accepted to CogSci 2023 with full paper publication in the proceedings

  34. arXiv:2304.00673  [pdf, other]

    cs.CV

    Partial-View Object View Synthesis via Filtered Inversion

    Authors: Fan-Yun Sun, Jonathan Tremblay, Valts Blukis, Kevin Lin, Danfei Xu, Boris Ivanovic, Peter Karkus, Stan Birchfield, Dieter Fox, Ruohan Zhang, Yunzhu Li, Jiajun Wu, Marco Pavone, Nick Haber

    Abstract: We propose Filtering Inversion (FINV), a learning framework and optimization process that predicts a renderable 3D object representation from one or few partial views. FINV addresses the challenge of synthesizing novel views of objects from partial observations, spanning cases where the object is not entirely in view, is partially occluded, or is only observed from similar views. To achieve this,…

    Submitted 17 August, 2024; v1 submitted 2 April, 2023; originally announced April 2023.

    Comments: project website: http://cs.stanford.edu/~sunfanyun/finv

  35. arXiv:2212.10561  [pdf, other]

    cs.CL cs.AI cs.LG

    Parsel: Algorithmic Reasoning with Language Models by Composing Decompositions

    Authors: Eric Zelikman, Qian Huang, Gabriel Poesia, Noah D. Goodman, Nick Haber

    Abstract: Despite recent success in large language model (LLM) reasoning, LLMs struggle with hierarchical multi-step reasoning tasks like generating complex programs. For these tasks, humans often start with a high-level algorithmic design and implement each part gradually. We introduce Parsel, a framework enabling automatic implementation and validation of complex algorithms with code LLMs. With Parsel, we…

    Submitted 28 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: humaneval results, clarity

  36. arXiv:2208.10660  [pdf, other]

    cs.LG cs.AI

    Interaction Modeling with Multiplex Attention

    Authors: Fan-Yun Sun, Isaac Kauvar, Ruohan Zhang, Jiachen Li, Mykel Kochenderfer, Jiajun Wu, Nick Haber

    Abstract: Modeling multi-agent systems requires understanding how agents interact. Such systems are often difficult to model because they can involve a variety of types of interactions that layer together to drive rich social behavioral dynamics. Here we introduce a method for accurately modeling multi-agent systems. We present Interaction Modeling with Multiplex Attention (IMMA), a forward prediction model…

    Submitted 25 January, 2023; v1 submitted 22 August, 2022; originally announced August 2022.

    Comments: NeurIPS 2022, project website: https://cs.stanford.edu/~sunfanyun/imma/

  37. arXiv:2208.03569  [pdf, other]

    eess.IV cs.CV cs.LG

    Constrained self-supervised method with temporal ensembling for fiber bundle detection on anatomic tracing data

    Authors: Vaanathi Sundaresan, Julia F. Lehman, Sean Fitzgibbon, Saad Jbabdi, Suzanne N. Haber, Anastasia Yendiki

    Abstract: Anatomic tracing data provides detailed information on brain circuitry essential for addressing some of the common errors in diffusion MRI tractography. However, automated detection of fiber bundles on tracing data is challenging due to sectioning distortions, presence of noise and artifacts and intensity/contrast variations. In this work, we propose a deep learning method with a self-supervised l…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: Accepted in 1st International Workshop on Medical Optical Imaging and Virtual Microscopy Image Analysis (MOVI 2022)

  38. arXiv:2201.11197  [pdf]

    cs.CV cs.LG

    Challenges and Opportunities for Machine Learning Classification of Behavior and Mental State from Images

    Authors: Peter Washington, Cezmi Onur Mutlu, Aaron Kline, Kelley Paskov, Nate Tyler Stockham, Brianna Chrisman, Nick Deveau, Mourya Surhabi, Nick Haber, Dennis P. Wall

    Abstract: Computer Vision (CV) classifiers which distinguish and detect nonverbal social human behavior and mental state can aid digital diagnostics and therapeutics for psychiatry and the behavioral sciences. While CV classifiers for traditional and structured classification tasks can be developed with standard machine learning pipelines for supervised learning consisting of data labeling, preprocessing, a…

    Submitted 26 January, 2022; originally announced January 2022.

    Comments: 30 pages, 1 figure, 1 table

  39. arXiv:2101.03477  [pdf]

    cs.CV cs.HC

    Training Affective Computer Vision Models by Crowdsourcing Soft-Target Labels

    Authors: Peter Washington, Onur Cezmi Mutlu, Emilie Leblanc, Aaron Kline, Cathy Hou, Brianna Chrisman, Nate Stockham, Kelley Paskov, Catalin Voss, Nick Haber, Dennis Wall

    Abstract: Emotion classifiers traditionally predict discrete emotions. However, emotion expressions are often subjective, thus requiring a method to handle subjective labels. We explore the use of crowdsourcing to acquire reliable soft-target labels and evaluate an emotion detection classifier trained with these labels. We center our study on the Child Affective Facial Expression (CAFE) dataset, a gold stan…

    Submitted 22 September, 2021; v1 submitted 10 January, 2021; originally announced January 2021.

  40. arXiv:2012.08678  [pdf]

    cs.CV cs.CY cs.HC

    Improved Digital Therapy for Developmental Pediatrics Using Domain-Specific Artificial Intelligence: Machine Learning Study

    Authors: Peter Washington, Haik Kalantarian, John Kent, Arman Husic, Aaron Kline, Emilie Leblanc, Cathy Hou, Onur Cezmi Mutlu, Kaitlyn Dunlap, Yordan Penev, Maya Varma, Nate Tyler Stockham, Brianna Chrisman, Kelley Paskov, Min Woo Sun, Jae-Yoon Jung, Catalin Voss, Nick Haber, Dennis Paul Wall

    Abstract: Background: Automated emotion classification could aid those who struggle to recognize emotions, including children with developmental behavioral conditions such as autism. However, most computer vision emotion recognition models are trained on adult emotion and therefore underperform when applied to child faces. Objective: We designed a strategy to gamify the collection and labeling of child emot…

    Submitted 3 June, 2024; v1 submitted 15 December, 2020; originally announced December 2020.

    Journal ref: JMIR pediatrics and parenting 5.2 (2022): e26760

  41. arXiv:2007.07853  [pdf, other]

    cs.LG cs.AI stat.ML

    Active World Model Learning with Progress Curiosity

    Authors: Kuno Kim, Megumi Sano, Julian De Freitas, Nick Haber, Daniel Yamins

    Abstract: World models are self-supervised predictive models of how the world evolves. Humans learn world models by curiously exploring their environment, in the process acquiring compact abstractions of high bandwidth sensory inputs, the ability to plan across long temporal horizons, and an understanding of the behavioral patterns of other agents. In this work, we study how to design such a curiosity-drive…

    Submitted 15 July, 2020; originally announced July 2020.

    Comments: ICML 2020. Video of results at https://bit.ly/31vg7v1

  42. arXiv:2007.04954  [pdf, other]

    cs.CV cs.GR cs.LG cs.RO

    ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation

    Authors: Chuang Gan, Jeremy Schwartz, Seth Alter, Damian Mrowca, Martin Schrimpf, James Traer, Julian De Freitas, Jonas Kubilius, Abhishek Bhandwaldar, Nick Haber, Megumi Sano, Kuno Kim, Elias Wang, Michael Lingelbach, Aidan Curtis, Kevin Feigelis, Daniel M. Bear, Dan Gutfreund, David Cox, Antonio Torralba, James J. DiCarlo, Joshua B. Tenenbaum, Josh H. McDermott, Daniel L. K. Yamins

    Abstract: We introduce ThreeDWorld (TDW), a platform for interactive multi-modal physical simulation. TDW enables simulation of high-fidelity sensory data and physical interactions between mobile agents and objects in rich 3D environments. Unique properties include: real-time near-photo-realistic image rendering; a library of objects and environments, and routines for their customization; generative procedu…

    Submitted 28 December, 2021; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: Oral Presentation at NeurIPS 21 Datasets and Benchmarks Track. Project page: http://www.threedworld.org

  43. arXiv:2004.14281  [pdf]

    cs.HC cs.CV cs.LG

    A Wearable Social Interaction Aid for Children with Autism

    Authors: Nick Haber, Catalin Voss, Jena Daniels, Peter Washington, Azar Fazel, Aaron Kline, Titas De, Terry Winograd, Carl Feinstein, Dennis P. Wall

    Abstract: With most recent estimates giving an incidence rate of 1 in 68 children in the United States, the autism spectrum disorder (ASD) is a growing public health crisis. Many of these children struggle to make eye contact, recognize facial expressions, and engage in social interactions. Today the standard for treatment of the core autism-related deficits focuses on a form of behavior training known as A…

    Submitted 19 April, 2020; originally announced April 2020.

  44. arXiv:2002.06581  [pdf]

    cs.HC

    Superpower Glass: Delivering Unobtrusive Real-time Social Cues in Wearable Systems

    Authors: Catalin Voss, Peter Washington, Nick Haber, Aaron Kline, Jena Daniels, Azar Fazel, Titas De, Beth McCarthy, Carl Feinstein, Terry Winograd, Dennis Wall

    Abstract: We have developed a system for automatic facial expression recognition, which runs on Google Glass and delivers real-time social cues to the wearer. We evaluate the system as a behavioral aid for children with Autism Spectrum Disorder (ASD), who can greatly benefit from real-time non-invasive emotional cues and are more sensitive to sensory input than neurotypically developing children. In additio…

    Submitted 16 February, 2020; originally announced February 2020.

    Comments: UbiComp ISWC 2016

  45. arXiv:2002.04263  [pdf]

    cs.HC

    Designing a Holistic At-Home Learning Aid for Autism

    Authors: Catalin Voss, Nick Haber, Peter Washington, Aaron Kline, Beth McCarthy, Jena Daniels, Azar Fazel, Titas De, Carl Feinstein, Terry Winograd, Dennis Wall

    Abstract: In recent years, much focus has been put on employing technology to make novel behavioural aids for those with autism. Most of these are digital adaptations of tools used in standard behavioural therapy to enforce normative skills. These digital counterparts are often used outside of both the larger therapeutic context and the real world, in which the learned skills might apply. To address this, w…

    Submitted 11 February, 2020; originally announced February 2020.

    Comments: Conference Workshop

    Journal ref: CHI 2016 - Autism Technology Workshop

  46. arXiv:1806.08047  [pdf, other]

    cs.AI cs.CV cs.LG cs.NE

    Flexible Neural Representation for Physics Prediction

    Authors: Damian Mrowca, Chengxu Zhuang, Elias Wang, Nick Haber, Li Fei-Fei, Joshua B. Tenenbaum, Daniel L. K. Yamins

    Abstract: Humans have a remarkable capacity to understand the physical dynamics of objects in their environment, flexibly capturing complex structures and interactions at multiple levels of detail. Inspired by this ability, we propose a hierarchical particle-based object representation that covers a wide variety of types of three-dimensional objects, including both arbitrary rigid geometrical shapes and def…

    Submitted 27 October, 2018; v1 submitted 20 June, 2018; originally announced June 2018.

    Comments: 23 pages, 20 figures

  47. arXiv:1802.07461  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Emergence of Structured Behaviors from Curiosity-Based Intrinsic Motivation

    Authors: Nick Haber, Damian Mrowca, Li Fei-Fei, Daniel L. K. Yamins

    Abstract: Infants are experts at playing, with an amazing ability to generate novel structured behaviors in unstructured environments that lack clear extrinsic reward signals. We seek to replicate some of these abilities with a neural network that implements curiosity-driven intrinsic motivation. Using a simple but ecologically naturalistic simulated environment in which the agent can move and interact with…

    Submitted 21 February, 2018; originally announced February 2018.

    Comments: 6 pages, 5 figures

    MSC Class: 68

  48. arXiv:1802.07442  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Learning to Play with Intrinsically-Motivated Self-Aware Agents

    Authors: Nick Haber, Damian Mrowca, Li Fei-Fei, Daniel L. K. Yamins

    Abstract: Infants are experts at playing, with an amazing ability to generate novel structured behaviors in unstructured environments that lack clear extrinsic reward signals. We seek to mathematically formalize these abilities using a neural network that implements curiosity-driven intrinsic motivation. Using a simple but ecologically naturalistic simulated environment in which an agent can move and intera…

    Submitted 30 October, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

    Comments: In NIPS 2018. 10 pages, 5 figures

    MSC Class: 68