Showing 1–10 of 10 results for author: Davidson, G

  1. arXiv:2505.21828  [pdf, ps, other]

    cs.AI

    SAGE-Eval: Evaluating LLMs for Systematic Generalizations of Safety Facts

    Authors: Chen Yueh-Han, Guy Davidson, Brenden M. Lake

    Abstract: Do LLMs robustly generalize critical safety facts to novel situations? Lacking this ability is dangerous when users ask naive questions. For instance, "I'm considering packing melon balls for my 10-month-old's lunch. What other foods would be good to include?" Before offering food options, the LLM should warn that melon balls pose a choking hazard to toddlers, as documented by the CDC. Failing to…

    Submitted 27 May, 2025; originally announced May 2025.

  2. arXiv:2505.12075  [pdf, ps, other]

    cs.CL cs.LG

    Do different prompting methods yield a common task representation in language models?

    Authors: Guy Davidson, Todd M. Gureckis, Brenden M. Lake, Adina Williams

    Abstract: Demonstrations and instructions are two primary approaches for prompting language models to perform in-context learning (ICL) tasks. Do identical tasks elicited in different ways result in similar representations of the task? An improved understanding of task representation mechanisms would offer interpretability insights and may aid in steering models. We study this through function vecto…

    Submitted 21 May, 2025; v1 submitted 17 May, 2025; originally announced May 2025.

    Comments: 9 pages, 4 figures; under review

  3. arXiv:2503.24228  [pdf, other]

    cs.AI cs.CL cs.MA

    PAARS: Persona Aligned Agentic Retail Shoppers

    Authors: Saab Mansour, Leonardo Perelli, Lorenzo Mainetti, George Davidson, Stefano D'Amato

    Abstract: In e-commerce, behavioral data is collected for decision making which can be costly and slow. Simulation with LLM powered agents is emerging as a promising alternative for representing human population behavior. However, LLMs are known to exhibit certain biases, such as brand bias, review rating bias and limited representation of certain groups in the population, hence they need to be carefully be…

    Submitted 31 March, 2025; originally announced March 2025.

  4. Goals as Reward-Producing Programs

    Authors: Guy Davidson, Graham Todd, Julian Togelius, Todd M. Gureckis, Brenden M. Lake

    Abstract: People are remarkably capable of generating their own goals, beginning with child's play and continuing into adulthood. Despite considerable empirical and computational work on goals and goal-oriented behavior, models are still far from capturing the richness of everyday human goals. Here, we bridge this gap by collecting a dataset of human-generated playful goals (in the form of scorable, single-…

    Submitted 10 September, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: Project website and goal program viewer: https://exps.gureckislab.org/guydav/goal_programs_viewer/main/

  5. arXiv:2402.03575  [pdf, other]

    cs.AI cs.HC

    Toward Human-AI Alignment in Large-Scale Multi-Player Games

    Authors: Sugandha Sharma, Guy Davidson, Khimya Khetarpal, Anssi Kanervisto, Udit Arora, Katja Hofmann, Ida Momennejad

    Abstract: Achieving human-AI alignment in complex multi-agent games is crucial for creating trustworthy AI agents that enhance gameplay. We propose a method to evaluate this alignment using an interpretable task-sets framework, focusing on high-level behavioral tasks instead of low-level policies. Our approach has three components. First, we analyze extensive human gameplay data from Xbox's Bleeding Edge (1…

    Submitted 18 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  6. arXiv:2212.04583  [pdf]

    eess.AS cs.SD

    High Quality Audio Coding with MDCTNet

    Authors: Grant Davidson, Mark Vinton, Per Ekstrand, Cong Zhou, Lars Villemoes, Lie Lu

    Abstract: We propose a neural audio generative model, MDCTNet, operating in the perceptually weighted domain of an adaptive modified discrete cosine transform (MDCT). The architecture of the model captures correlations in both time and frequency directions with recurrent layers (RNNs). An audio coding system is obtained by training MDCTNet on a diverse set of fullband monophonic audio signals at 48 kHz samp…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: Five pages, five figures

  7. arXiv:2002.06703  [pdf, other]

    cs.LG cs.AI stat.ML

    Investigating Simple Object Representations in Model-Free Deep Reinforcement Learning

    Authors: Guy Davidson, Brenden M. Lake

    Abstract: We explore the benefits of augmenting state-of-the-art model-free deep reinforcement algorithms with simple object representations. Following the Frostbite challenge posited by Lake et al. (2017), we identify object representations as a critical cognitive capacity lacking from current reinforcement learning agents. We discover that providing the Rainbow model (Hessel et al., 2018) with simple, feat…

    Submitted 28 May, 2020; v1 submitted 16 February, 2020; originally announced February 2020.

  8. arXiv:1905.10837  [pdf, other]

    cs.LG stat.ML

    Sequential mastery of multiple visual tasks: Networks naturally learn to learn and forget to forget

    Authors: Guy Davidson, Michael C. Mozer

    Abstract: We explore the behavior of a standard convolutional neural net in a continual-learning setting that introduces visual classification tasks sequentially and requires the net to master new tasks while preserving mastery of previously learned tasks. This setting corresponds to that which human learners face as they acquire domain expertise serially, for example, as an individual studies a textbook. T…

    Submitted 30 March, 2020; v1 submitted 26 May, 2019; originally announced May 2019.

  9. arXiv:1711.04471  [pdf, other]

    cs.MS cs.DC math.NA

    Domain-Specific Acceleration and Auto-Parallelization of Legacy Scientific Code in FORTRAN 77 using Source-to-Source Compilation

    Authors: Wim Vanderbauwhede, Gavin Davidson

    Abstract: Massively parallel accelerators such as GPGPUs, manycores and FPGAs represent a powerful and affordable tool for scientists who look to speed up simulations of complex systems. However, porting code to such devices requires a detailed understanding of heterogeneous programming tools and effective strategies for parallelization. In this paper we present a source to source compilation approach with…

    Submitted 13 November, 2017; originally announced November 2017.

    Comments: 12 pages, 5 figures, submitted to "Computers and Fluids" as full paper from ParCFD conference entry

  10. arXiv:1702.02111  [pdf, ps, other]

    cs.CE math.NA physics.comp-ph

    Rayleigh Quotient Iteration with a Multigrid in Energy Preconditioner for Massively Parallel Neutron Transport

    Authors: R. N. Slaybaugh, T. M. Evans, G. G. Davidson, P. P. H. Wilson

    Abstract: Three complementary methods have been implemented in the code Denovo that accelerate neutral particle transport calculations with methods that use leadership-class computers fully and effectively: a multigroup block (MG) Krylov solver, a Rayleigh quotient iteration (RQI) eigenvalue solver, and a multigrid in energy preconditioner. The multigroup Krylov solver converges more quickly than Gauss Seid…

    Submitted 7 February, 2017; originally announced February 2017.

    Comments: arXiv admin note: text overlap with arXiv:1612.00907

    Journal ref: ANS MC2015 Joint International Conference on Mathematics and Computation, Supercomputing in Nuclear Applications and the Monte Carlo Method, Nashville, Tennessee, April 19-23, 2015