Showing 1–50 of 3,215 results for author: Chris

Searching in archive cs.
  1. arXiv:2507.02639  [pdf, ps, other]

    cs.LG

    On Efficient Bayesian Exploration in Model-Based Reinforcement Learning

    Authors: Alberto Caron, Chris Hicks, Vasilios Mavroudis

    Abstract: In this work, we address the challenge of data-efficient exploration in reinforcement learning by examining existing principled, information-theoretic approaches to intrinsic motivation. Specifically, we focus on a class of exploration bonuses that targets epistemic uncertainty rather than the aleatoric noise inherent in the environment. We prove that these bonuses naturally signal epistemic infor…

    Submitted 3 July, 2025; originally announced July 2025.

  2. arXiv:2507.02083  [pdf, ps, other]

    cs.AI

    Measuring Scientific Capabilities of Language Models with a Systems Biology Dry Lab

    Authors: Haonan Duan, Stephen Zhewen Lu, Caitlin Fiona Harrigan, Nishkrit Desai, Jiarui Lu, Michał Koziarski, Leonardo Cotta, Chris J. Maddison

    Abstract: Designing experiments and result interpretations are core scientific competencies, particularly in biology, where researchers perturb complex systems to uncover the underlying systems. Recent efforts to evaluate the scientific capabilities of large language models (LLMs) fail to test these competencies because wet-lab experimentation is prohibitively expensive: in expertise, time and equipment. We…

    Submitted 2 July, 2025; originally announced July 2025.

  3. arXiv:2507.02068  [pdf, ps, other]

    cs.SE

    How do Software Engineering Candidates Prepare for Technical Interviews?

    Authors: Brian Bell, Teresa Thomas, Sang Won Lee, Chris Brown

    Abstract: To obtain employment, aspiring software engineers must complete technical interviews -- a hiring process which involves candidates writing code while communicating to an audience. However, the complexities of tech interviews are difficult to prepare for and seldom faced in computing curricula. To this end, we seek to understand how candidates prepare for technical interviews, investigating the eff…

    Submitted 2 July, 2025; originally announced July 2025.

  4. Optimising task allocation to balance business goals and worker well-being for financial service workforces

    Authors: Chris Duckworth, Zlatko Zlatev, James Sciberras, Peter Hallett, Enrico Gerding

    Abstract: Purpose: Financial service companies manage huge volumes of data which requires timely error identification and resolution. The associated tasks to resolve these errors frequently put financial analyst workforces under significant pressure leading to resourcing challenges and increased business risk. To address this challenge, we introduce a formal task allocation model which considers both busine…

    Submitted 18 June, 2025; originally announced July 2025.

    Comments: Accepted in Journal of Modelling in Management

    Journal ref: ISSN 1746-5664 (2025)

  5. arXiv:2507.01352  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

    Authors: Chris Yuhao Liu, Liang Zeng, Yuzhen Xiao, Jujie He, Jiacai Liu, Chaojie Wang, Rui Yan, Wei Shen, Fuxiang Zhang, Jiacheng Xu, Yang Liu, Yahui Zhou

    Abstract: Despite the critical role of reward models (RMs) in reinforcement learning from human feedback (RLHF), current state-of-the-art open RMs perform poorly on most existing evaluation benchmarks, failing to capture the spectrum of nuanced and sophisticated human preferences. Even approaches that incorporate advanced training techniques have not yielded meaningful performance improvements. We hypothesi…

    Submitted 3 July, 2025; v1 submitted 2 July, 2025; originally announced July 2025.

  6. arXiv:2507.01134  [pdf, ps, other]

    cs.HC

    Animated Visual Encoding and Layer Blending for Identification of Educational Game Strategies

    Authors: Braden Roper, William Thompson, Chris Weaver

    Abstract: Game-Based Learning has proven to be an effective method for enhancing engagement with educational material. However, gaining a deeper understanding of player strategies remains challenging. Sequential game-state and action-based tracking tools often gather extensive data that can be difficult to interpret as long-term strategy. This data presents unique problems to visualization, as it can be fai…

    Submitted 1 July, 2025; originally announced July 2025.

    Comments: To be published in IEEE Visualization and Visual Analytics (VIS), 2025

  7. arXiv:2507.00909  [pdf, ps, other]

    cs.DC cs.AI cs.PF eess.SY

    Turning AI Data Centers into Grid-Interactive Assets: Results from a Field Demonstration in Phoenix, Arizona

    Authors: Philip Colangelo, Ayse K. Coskun, Jack Megrue, Ciaran Roberts, Shayan Sengupta, Varun Sivaram, Ethan Tiao, Aroon Vijaykar, Chris Williams, Daniel C. Wilson, Zack MacFarland, Daniel Dreiling, Nathan Morey, Anuja Ratnayake, Baskar Vairamohan

    Abstract: Artificial intelligence (AI) is fueling exponential electricity demand growth, threatening grid reliability, raising prices for communities paying for new energy infrastructure, and stunting AI innovation as data centers wait for interconnection to constrained grids. This paper presents the first field demonstration, in collaboration with major corporate partners, of a software-only approach--Emer…

    Submitted 1 July, 2025; originally announced July 2025.

    Comments: 10 pages, 6 figures, 1 table

  8. arXiv:2507.00866  [pdf, ps, other]

    astro-ph.IM cs.LG

    Template-Fitting Meets Deep Learning: Redshift Estimation Using Physics-Guided Neural Networks

    Authors: Jonas Chris Ferrao, Dickson Dias, Pranav Naik, Glory D'Cruz, Anish Naik, Siya Khandeparkar, Manisha Gokuldas Fal Dessai

    Abstract: Accurate photometric redshift estimation is critical for observational cosmology, especially in large-scale surveys where spectroscopic measurements are impractical. Traditional approaches include template fitting and machine learning, each with distinct strengths and limitations. We present a hybrid method that integrates template fitting with deep learning using physics-guided neural networks. B…

    Submitted 1 July, 2025; originally announced July 2025.

  9. Out of the Day Job: Perspectives of Industry Practitioners in Co-Design and Delivery of Software Engineering Courses

    Authors: Gillian Daniel, Chris Hall, Per Hammer, Alec-Angus Macdonald, Hollie Marwick-Best, Emma McKenzie, George Popa, Derek Somerville, Tim Storer

    Abstract: Over more than two decades, The University of Glasgow has co-designed and delivered numerous software engineering focused courses with industry partners, covering both technical and discipline specific professional skills. Such collaborations are not unique and many of the benefits are well recognised in the literature. These include enhancing the real-world relevance of curricula, developing stud…

    Submitted 1 July, 2025; originally announced July 2025.

  10. arXiv:2507.00671  [pdf, ps, other]

    stat.CO cs.LG stat.ML

    Harnessing the Power of Reinforcement Learning for Adaptive MCMC

    Authors: Congye Wang, Matthew A. Fisher, Heishiro Kanagawa, Wilson Chen, Chris J. Oates

    Abstract: Sampling algorithms drive probabilistic machine learning, and recent years have seen an explosion in the diversity of tools for this task. However, the increasing sophistication of sampling algorithms is correlated with an increase in the tuning burden. There is now a greater need than ever to treat the tuning of samplers as a learning task in its own right. In a conceptual breakthrough, Wang et a…

    Submitted 1 July, 2025; originally announced July 2025.

  11. arXiv:2507.00669  [pdf, ps, other]

    cs.LG cs.AI cs.CV cs.RO

    Audio-3DVG: Unified Audio - Point Cloud Fusion for 3D Visual Grounding

    Authors: Duc Cao-Dinh, Khai Le-Duc, Anh Dao, Bach Phan Tat, Chris Ngo, Duy M. H. Nguyen, Nguyen X. Khanh, Thanh Nguyen-Tang

    Abstract: 3D Visual Grounding (3DVG) involves localizing target objects in 3D point clouds based on natural language. While prior work has made strides using textual descriptions, leveraging spoken language-known as Audio-based 3D Visual Grounding-remains underexplored and challenging. Motivated by advances in automatic speech recognition (ASR) and speech representation learning, we propose Audio-3DVG, a si…

    Submitted 1 July, 2025; originally announced July 2025.

    Comments: Work in progress, 42 pages

  12. arXiv:2506.23024  [pdf, other]

    cs.LG cs.AI math.NA

    BWLer: Barycentric Weight Layer Elucidates a Precision-Conditioning Tradeoff for PINNs

    Authors: Jerry Liu, Yasa Baig, Denise Hui Jean Lee, Rajat Vadiraj Dwaraknath, Atri Rudra, Chris Ré

    Abstract: Physics-informed neural networks (PINNs) offer a flexible way to solve partial differential equations (PDEs) with machine learning, yet they still fall well short of the machine-precision accuracy many scientific tasks demand. In this work, we investigate whether the precision ceiling comes from the ill-conditioning of the PDEs or from the typical multi-layer perceptron (MLP) architecture. We intr…

    Submitted 28 June, 2025; originally announced June 2025.

    Comments: Workshop for the Theory of AI for Scientific Computing @ COLT 2025 (Best Paper). 39 pages, 24 figures

  13. arXiv:2506.22694  [pdf, ps, other]

    cs.CL

    VOCABTRIM: Vocabulary Pruning for Efficient Speculative Decoding in LLMs

    Authors: Raghavv Goel, Sudhanshu Agrawal, Mukul Gagrani, Junyoung Park, Yifan Zao, He Zhang, Tian Liu, Yiping Yang, Xin Yuan, Jiuyan Lu, Chris Lott, Mingu Lee

    Abstract: In this paper, we introduce a simple training-free technique to improve the performance of drafter-based speculative decoding (SpD) methods that incorporates language modeling head (LM head) during drafting process. A drafter-based speculative decoding leverages one or more smaller language models, a.k.a. drafters or draft models, to sample a draft sequence or tree consisting of multiple tokens, f…

    Submitted 27 June, 2025; originally announced June 2025.

    Comments: 7 pages, 4 figures, 5 tables, accepted at ICML 2025 workshop on Efficient Systems for Foundational Models

  14. arXiv:2506.22062  [pdf, ps, other]

    cs.CL

    MDC-R: The Minecraft Dialogue Corpus with Reference

    Authors: Chris Madge, Maris Camilleri, Paloma Carretero Garcia, Mladen Karan, Juexi Shao, Prashant Jayannavar, Julian Hough, Benjamin Roth, Massimo Poesio

    Abstract: We introduce the Minecraft Dialogue Corpus with Reference (MDC-R). MDC-R is a new language resource that supplements the original Minecraft Dialogue Corpus (MDC) with expert annotations of anaphoric and deictic reference. MDC's task-orientated, multi-turn, situated dialogue in a dynamic environment has motivated multiple annotation efforts, owing to the interesting linguistic phenomena that this s…

    Submitted 27 June, 2025; originally announced June 2025.

  15. arXiv:2506.21648  [pdf, ps, other]

    astro-ph.IM astro-ph.EP cs.RO eess.SY

    Advanced System Engineering Approaches to Emerging Challenges in Planetary and Deep-Space Exploration

    Authors: J. de Curtò, Cristina LiCalzi, Julien Tubiana Warin, Jack Gehlert, Brian Langbein, Alexandre Gamboa, Chris Sixbey, William Maguire, Santiago Fernández, Álvaro Maestroarena, Alex Brenchley, Logan Maroclo, Philemon Mercado, Joshua DeJohn, Cesar Velez, Ethan Dahmus, Taylor Steinys, David Fritz, I. de Zarzà

    Abstract: This paper presents innovative solutions to critical challenges in planetary and deep-space exploration electronics. We synthesize findings across diverse mission profiles, highlighting advances in: (1) MARTIAN positioning systems with dual-frequency transmission to achieve $\pm$1m horizontal accuracy; (2) artificial reef platforms for Titan's hydrocarbon seas utilizing specialized sensor arrays a…

    Submitted 26 June, 2025; originally announced June 2025.

  16. arXiv:2506.21538  [pdf, ps, other]

    cs.CV cs.IR cs.LG

    Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval

    Authors: Hani Alomari, Anushka Sivakumar, Andrew Zhang, Chris Thomas

    Abstract: Cross-modal image-text retrieval is challenging because of the diverse possible associations between content from different modalities. Traditional methods learn a single-vector embedding to represent semantics of each sample, but struggle to capture nuanced and diverse relationships that can exist across modalities. Set-based approaches, which represent each sample with multiple embeddings, offer…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: Accepted at the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025 Main)

  17. arXiv:2506.19863  [pdf, ps, other]

    physics.comp-ph cs.AI

    Exploring the Capabilities of the Frontier Large Language Models for Nuclear Energy Research

    Authors: Ahmed Almeldein, Mohammed Alnaggar, Rick Archibald, Tom Beck, Arpan Biswas, Rike Bostelmann, Wes Brewer, Chris Bryan, Christopher Calle, Cihangir Celik, Rajni Chahal, Jong Youl Choi, Arindam Chowdhury, Mark Cianciosa, Franklin Curtis, Gregory Davidson, Sebastian De Pascuale, Lisa Fassino, Ana Gainaru, Yashika Ghai, Luke Gibson, Qian Gong, Christopher Greulich, Scott Greenwood, Cory Hauck, et al. (25 additional authors not shown)

    Abstract: The AI for Nuclear Energy workshop at Oak Ridge National Laboratory evaluated the potential of Large Language Models (LLMs) to accelerate fusion and fission research. Fourteen interdisciplinary teams explored diverse nuclear science challenges using ChatGPT, Gemini, Claude, and other AI models over a single day. Applications ranged from developing foundation models for fusion reactor control to au…

    Submitted 26 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

  18. arXiv:2506.19290  [pdf, ps, other]

    cs.AI cs.CL

    Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs

    Authors: Liang Zeng, Yongcong Li, Yuzhen Xiao, Changshi Li, Chris Yuhao Liu, Rui Yan, Tianwen Wei, Jujie He, Xuchen Song, Yang Liu, Yahui Zhou

    Abstract: Software engineering (SWE) has recently emerged as a crucial testbed for next-generation LLM agents, demanding inherent capabilities in two critical dimensions: sustained iterative problem-solving (e.g., >50 interaction rounds) and long-context dependency resolution (e.g., >32k tokens). However, the data curation process in SWE remains notoriously time-consuming, as it heavily relies on manual ann…

    Submitted 23 June, 2025; originally announced June 2025.

  19. arXiv:2506.18206  [pdf, ps, other]

    cs.CE math.NA physics.comp-ph

    Conservative data-driven finite element formulation

    Authors: Adriana Kuliková, Andrei G. Shvarts, Łukasz Kaczmarczyk, Chris J. Pearce

    Abstract: This paper presents a new data-driven finite element framework derived with mixed finite element formulation. The standard approach to diffusion problems requires the solution of the mathematical equations that describe both the conservation law and the constitutive relations, where the latter is traditionally obtained after fitting experimental data to simplified material models. To exploit all a…

    Submitted 22 June, 2025; originally announced June 2025.

  20. arXiv:2506.17035  [pdf]

    cs.LG

    Critical Appraisal of Fairness Metrics in Clinical Predictive AI

    Authors: João Matos, Ben Van Calster, Leo Anthony Celi, Paula Dhiman, Judy Wawira Gichoya, Richard D. Riley, Chris Russell, Sara Khalid, Gary S. Collins

    Abstract: Predictive artificial intelligence (AI) offers an opportunity to improve clinical practice and patient outcomes, but risks perpetuating biases if fairness is inadequately addressed. However, the definition of "fairness" remains unclear. We conducted a scoping review to identify and critically appraise fairness metrics for clinical predictive AI. We defined a "fairness metric" as a measure quantify…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 32 pages, 1 figure, 2 tables, 5 boxes, 4 linked supplementary materials

  21. arXiv:2506.16891  [pdf, ps, other]

    cs.CR

    Tracker Installations Are Not Created Equal: Understanding Tracker Configuration of Form Data Collection

    Authors: Julia B. Kieserman, Athanasios Andreou, Chris Geeng, Tobias Lauinger, Damon McCoy

    Abstract: Targeted advertising is fueled by the comprehensive tracking of users' online activity. As a result, advertising companies, such as Google and Meta, encourage website administrators to not only install tracking scripts on their websites but configure them to automatically collect users' Personally Identifying Information (PII). In this study, we aim to characterize how Google and Meta's trackers c…

    Submitted 20 June, 2025; originally announced June 2025.

  22. arXiv:2506.16698  [pdf, ps, other]

    cs.LG

    SIDE: Semantic ID Embedding for effective learning from sequences

    Authors: Dinesh Ramasamy, Shakti Kumar, Chris Cadonic, Jiaxin Yang, Sohini Roychowdhury, Esam Abdel Rhman, Srihari Reddy

    Abstract: Sequence-based recommendations models are driving the state-of-the-art for industrial ad-recommendation systems. Such systems typically deal with user histories or sequence lengths ranging in the order of O(10^3) to O(10^4) events. While adding embeddings at this scale is manageable in pre-trained models, incorporating them into real-time prediction models is challenging due to both storage and in…

    Submitted 19 June, 2025; originally announced June 2025.

    Comments: 7 pages, 4 images, 6 tables

    Journal ref: KDD workshop, 2025

  23. arXiv:2506.16013  [pdf, ps, other]

    cs.CE math.ST

    A Fast Iterative Robust Principal Component Analysis Method

    Authors: Timbwaoga Aime Judicael Ouermi, Jixian Li, Chris R. Johnson

    Abstract: Principal Component Analysis (PCA) is widely used for dimensionality reduction and data analysis. However, PCA results are adversely affected by outliers often observed in real-world data. Existing robust PCA methods are often computationally expensive or exhibit limited robustness. In this work, we introduce a Fast Iterative Robust (FIR) PCA method by efficiently estimating the inliers center loc…

    Submitted 19 June, 2025; originally announced June 2025.

  24. arXiv:2506.15975  [pdf, ps, other]

    cs.CR cs.CL

    Multi-use LLM Watermarking and the False Detection Problem

    Authors: Zihao Fu, Chris Russell

    Abstract: Digital watermarking is a promising solution for mitigating some of the risks arising from the misuse of automatically generated text. These approaches either embed non-specific watermarks to allow for the detection of any text generated by a particular sampler, or embed specific keys that allow the identification of the LLM user. However, simultaneously using the same embedding for both detection…

    Submitted 18 June, 2025; originally announced June 2025.

  25. arXiv:2506.15889  [pdf, ps, other]

    cs.CL

    Entropy-Driven Pre-Tokenization for Byte-Pair Encoding

    Authors: Yifan Hu, Frank Liang, Dachuan Zhao, Jonathan Geuter, Varshini Reddy, Craig W. Schmidt, Chris Tanner

    Abstract: Byte-Pair Encoding (BPE) has become a widely adopted subword tokenization method in modern language models due to its simplicity and strong empirical performance across downstream tasks. However, applying BPE to unsegmented languages such as Chinese presents significant challenges, as its frequency-driven merge operation is agnostic to linguistic boundaries. To address this, we propose two entropy…

    Submitted 18 June, 2025; originally announced June 2025.

  26. arXiv:2506.15020  [pdf, ps, other]

    math.AT cs.LG math.CO math.ST

    Data analysis using discrete cubical homology

    Authors: Chris Kapulkin, Nathan Kershaw

    Abstract: We present a new tool for data analysis: persistence discrete homology, which is well-suited to analyze filtrations of graphs. In particular, we provide a novel way of representing high-dimensional data as a filtration of graphs using pairwise correlations. We discuss several applications of these tools, e.g., in weather and financial data, comparing them to the standard methods used in the respec…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 17 pages; comments welcome

    MSC Class: 62R40; 68T09; 05C90; 55U05

  27. arXiv:2506.14723  [pdf, ps, other]

    cs.SD cs.AI

    Adaptive Accompaniment with ReaLchords

    Authors: Yusong Wu, Tim Cooijmans, Kyle Kastner, Adam Roberts, Ian Simon, Alexander Scarlatos, Chris Donahue, Cassie Tarakajian, Shayegan Omidshafiei, Aaron Courville, Pablo Samuel Castro, Natasha Jaques, Cheng-Zhi Anna Huang

    Abstract: Jamming requires coordination, anticipation, and collaborative creativity between musicians. Current generative models of music produce expressive output but are not able to generate in an \emph{online} manner, meaning simultaneously with other musicians (human or otherwise). We propose ReaLchords, an online generative model for improvising chord accompaniment to user melody. We start with an onli…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: Accepted by ICML 2024

  28. arXiv:2506.13579  [pdf, ps, other]

    cs.LG cs.AI cs.CL cs.CV

    Flexible-length Text Infilling for Discrete Diffusion Models

    Authors: Andrew Zhang, Anushka Sivakumar, Chiawei Tang, Chris Thomas

    Abstract: Discrete diffusion models are a new class of text generators that offer advantages such as bidirectional context use, parallelizable generation, and flexible prompting compared to autoregressive models. However, a critical limitation of discrete diffusion models is their inability to perform flexible-length or flexible-position text infilling without access to ground-truth positional data. We intr…

    Submitted 16 June, 2025; originally announced June 2025.

  29. arXiv:2506.13434  [pdf, ps, other]

    cs.CR

    From Promise to Peril: Rethinking Cybersecurity Red and Blue Teaming in the Age of LLMs

    Authors: Alsharif Abuadbba, Chris Hicks, Kristen Moore, Vasilios Mavroudis, Burak Hasircioglu, Diksha Goel, Piers Jennings

    Abstract: Large Language Models (LLMs) are set to reshape cybersecurity by augmenting red and blue team operations. Red teams can exploit LLMs to plan attacks, craft phishing content, simulate adversaries, and generate exploit code. Conversely, blue teams may deploy them for threat intelligence synthesis, root cause analysis, and streamlined documentation. This dual capability introduces both transformative…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: 10 pages

  30. arXiv:2506.13313  [pdf, ps, other]

    cs.CL cs.AI econ.GN

    Large Language Models as 'Hidden Persuaders': Fake Product Reviews are Indistinguishable to Humans and Machines

    Authors: Weiyao Meng, John Harvey, James Goulding, Chris James Carter, Evgeniya Lukinova, Andrew Smith, Paul Frobisher, Mina Forrest, Georgiana Nica-Avram

    Abstract: Reading and evaluating product reviews is central to how most people decide what to buy and consume online. However, the recent emergence of Large Language Models and Generative Artificial Intelligence now means writing fraudulent or fake reviews is potentially easier than ever. Through three studies we demonstrate that (1) humans are no longer able to distinguish between real and fake product rev…

    Submitted 16 June, 2025; originally announced June 2025.

    ACM Class: J.4; I.2.7

  31. arXiv:2506.12949  [pdf, ps, other]

    hep-ex cs.AI

    eLog analysis for accelerators: status and future outlook

    Authors: Antonin Sulc, Thorsten Hellert, Aaron Reed, Adam Carpenter, Alex Bien, Chris Tennant, Claudio Bisegni, Daniel Lersch, Daniel Ratner, David Lawrence, Diana McSpadden, Hayden Hoschouer, Jason St. John, Thomas Britton

    Abstract: This work demonstrates electronic logbook (eLog) systems leveraging modern AI-driven information retrieval capabilities at the accelerator facilities of Fermilab, Jefferson Lab, Lawrence Berkeley National Laboratory (LBNL), SLAC National Accelerator Laboratory. We evaluate contemporary tools and methodologies for information retrieval with Retrieval Augmented Generation (RAGs), focusing on operati…

    Submitted 15 June, 2025; originally announced June 2025.

    Comments: 4 pages, 2 figures, 16th International Particle Accelerator Conference (IPAC'25)

    Report number: THPS048

  32. arXiv:2506.12103  [pdf, other]

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  33. arXiv:2506.12071  [pdf, ps, other]

    cs.IR

    T$^2$-RAGBench: Text-and-Table Benchmark for Evaluating Retrieval-Augmented Generation

    Authors: Jan Strich, Enes Kutay Isgorur, Maximilian Trescher, Chris Biemann, Martin Semmann

    Abstract: While most financial documents contain a combination of textual and tabular information, robust Retrieval-Augmented Generation (RAG) systems are essential for effectively accessing and reasoning over such content to perform complex numerical tasks. This paper introduces T$^2$-RAGBench, a benchmark comprising 32,908 question-context-answer triples, designed to evaluate RAG methods on real-world fin…

    Submitted 4 June, 2025; originally announced June 2025.

  34. arXiv:2506.11994  [pdf, ps, other]

    stat.ML cs.LG math.NA

    Spectral Estimation with Free Decompression

    Authors: Siavash Ameli, Chris van der Heide, Liam Hodgkinson, Michael W. Mahoney

    Abstract: Computing eigenvalues of very large matrices is a critical task in many machine learning applications, including the evaluation of log-determinants, the trace of matrix functions, and other important metrics. As datasets continue to grow in scale, the corresponding covariance and kernel matrices become increasingly large, often reaching magnitudes that make their direct formation impractical or im…

    Submitted 13 June, 2025; originally announced June 2025.

  35. arXiv:2506.11970  [pdf, ps, other]

    cs.CR

    CnC-PRAC: Coalesce, not Cache, Per Row Activation Counts for an Efficient in-DRAM Rowhammer Mitigation

    Authors: Chris S. Lin, Jeonghyun Woo, Prashant J. Nair, Gururaj Saileshwar

    Abstract: JEDEC has introduced the Per Row Activation Counting (PRAC) framework for DDR5 and future DRAMs to enable precise counting of DRAM row activations using per-row activation counts. While recent PRAC implementations enable holistic mitigation of Rowhammer attacks, they impose slowdowns of up to 10% due to the increased DRAM timings for performing a read-modify-write of the counter. Alternatively, re…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: 8 pages, including appendices. The paper is presented at DRAMSec 2025. (see https://dramsec.ethz.ch/)

  36. arXiv:2506.11860  [pdf]

    eess.IV cs.AI cs.CV cs.NE

    MindGrab for BrainChop: Fast and Accurate Skull Stripping for Command Line and Browser

    Authors: Armina Fani, Mike Doan, Isabelle Le, Alex Fedorov, Malte Hoffmann, Chris Rorden, Sergey Plis

    Abstract: We developed MindGrab, a parameter- and memory-efficient deep fully-convolutional model for volumetric skull-stripping in head images of any modality. Its architecture, informed by a spectral interpretation of dilated convolutions, was trained exclusively on modality-agnostic synthetic data. MindGrab was evaluated on a retrospective dataset of 606 multimodal adult-brain scans (T1, T2, DWI, MRA, PD…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: 12 pages, 1 table, 4 figures. 2 supplementary tables, 1 supplementary figure. Brainchop-cli: https://pypi.org/project/brainchop/ . Brainchop web: https://brainchop.org/

  37. arXiv:2506.11820  [pdf, ps, other]

    cs.CV cs.CL

    Rethinking Multilingual Vision-Language Translation: Dataset, Evaluation, and Adaptation

    Authors: Xintong Wang, Jingheng Pan, Yixiao Liu, Xiaohu Zhao, Chenyang Lyu, Minghao Wu, Chris Biemann, Longyue Wang, Linlong Xu, Weihua Luo, Kaifu Zhang

    Abstract: Vision-Language Translation (VLT) is a challenging task that requires accurately recognizing multilingual text embedded in images and translating it into the target language with the support of visual context. While recent Large Vision-Language Models (LVLMs) have demonstrated strong multilingual and visual understanding capabilities, there is a lack of systematic evaluation and understanding of t…

    Submitted 13 June, 2025; originally announced June 2025.

  38. arXiv:2506.11684  [pdf, ps, other]

    cs.CV cs.AI

    MTabVQA: Evaluating Multi-Tabular Reasoning of Language Models in Visual Space

    Authors: Anshul Singh, Chris Biemann, Jan Strich

    Abstract: Vision-Language Models (VLMs) have demonstrated remarkable capabilities in interpreting visual layouts and text. However, a significant challenge remains in their ability to interpret robustly and reason over multi-tabular data presented as images, a common occurrence in real-world scenarios like web pages and digital documents. Existing benchmarks typically address single tables or non-visual dat…

    Submitted 13 June, 2025; originally announced June 2025.

  39. arXiv:2506.10910  [pdf, ps, other]

    cs.CL

    Magistral

    Authors: Mistral-AI, Abhinav Rastogi, Albert Q. Jiang, Andy Lo, Gabrielle Berrada, Guillaume Lample, Jason Rute, Joep Barmentlo, Karmesh Yadav, Kartik Khandelwal, Khyathi Raghavi Chandu, Léonard Blier, Lucile Saulnier, Matthieu Dinot, Maxime Darrin, Neha Gupta, Roman Soletskyi, Sagar Vaze, Teven Le Scao, Yihan Wang, Adam Yang, Alexander H. Liu, Alexandre Sablayrolles, Amélie Héliou, et al. (76 additional authors not shown)

    Abstract: We introduce Magistral, Mistral's first reasoning model and our own scalable reinforcement learning (RL) pipeline. Instead of relying on existing implementations and RL traces distilled from prior models, we follow a ground up approach, relying solely on our own models and infrastructure. Notably, we demonstrate a stack that enabled us to explore the limits of pure RL training of LLMs, present a s…

    Submitted 12 June, 2025; originally announced June 2025.

  40. arXiv:2506.10789  [pdf, ps, other]

    cs.CY cs.CL

    FASCIST-O-METER: Classifier for Neo-fascist Discourse Online

    Authors: Rudy Alexandro Garrido Veliz, Martin Semmann, Chris Biemann, Seid Muhie Yimam

    Abstract: Neo-fascism is a political and societal ideology that has been having remarkable growth in the last decade in the United States of America (USA), as well as in other Western societies. It poses a grave danger to democracy and the minorities it targets, and it requires active actions against it to avoid escalation. This work presents the first-of-its-kind neo-fascist coding scheme for digital disco…

    Submitted 12 June, 2025; originally announced June 2025.

  41. arXiv:2506.10383  [pdf, ps, other]

    cs.RO eess.SY

    RICE: Reactive Interaction Controller for Cluttered Canopy Environment

    Authors: Nidhi Homey Parayil, Thierry Peynot, Chris Lehnert

    Abstract: Robotic navigation in dense, cluttered environments such as agricultural canopies presents significant challenges due to physical and visual occlusion caused by leaves and branches. Traditional vision-based or model-dependent approaches often fail in these settings, where physical interaction without damaging foliage and branches is necessary to reach a target. We present a novel reactive controll…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: This work has been submitted to the IEEE RAL for possible publication

  42. arXiv:2506.09707  [pdf, ps, other]

    eess.AS cs.CL cs.HC

    Fine-Tuning Large Audio-Language Models with LoRA for Precise Temporal Localization of Prolonged Exposure Therapy Elements

    Authors: Suhas BN, Andrew M. Sherrill, Jyoti Alaparthi, Dominik Mattioli, Rosa I. Arriaga, Chris W. Wiese, Saeed Abdullah

    Abstract: Prolonged Exposure (PE) therapy is an effective treatment for post-traumatic stress disorder (PTSD), but evaluating therapist fidelity remains labor-intensive due to the need for manual review of session recordings. We present a method for the automatic temporal localization of key PE fidelity elements -- identifying their start and stop times -- directly from session audio and transcripts. Our ap…

    Submitted 19 June, 2025; v1 submitted 11 June, 2025; originally announced June 2025.

    Comments: 5 pages, 2 figures

    MSC Class: 68T07 ACM Class: I.2.7; I.5.4; H.5.2

  43. arXiv:2506.09450  [pdf, ps, other]

    cs.CL cs.AI

    UniToMBench: Integrating Perspective-Taking to Improve Theory of Mind in LLMs

    Authors: Prameshwar Thiyagarajan, Vaishnavi Parimi, Shamant Sai, Soumil Garg, Zhangir Meirbek, Nitin Yarlagadda, Kevin Zhu, Chris Kim

    Abstract: Theory of Mind (ToM), the ability to understand the mental states of oneself and others, remains a challenging area for large language models (LLMs), which often fail to predict human mental states accurately. In this paper, we introduce UniToMBench, a unified benchmark that integrates the strengths of SimToM and TOMBENCH to systematically improve and assess ToM capabilities in LLMs by integrating…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: Accepted at Conference of the North American Chapter of the Association for Computational Linguistics, Student Research Workshop 2025 (NAACL SRW 2025)

  44. arXiv:2506.09381  [pdf]

    cs.CL

    Binary classification for perceived quality of headlines and links on worldwide news websites, 2018-2024

    Authors: Austin McCutcheon, Thiago E. A. de Oliveira, Aleksandr Zheleznov, Chris Brogly

    Abstract: The proliferation of online news enables potential widespread publication of perceived low-quality news headlines/links. As a result, we investigated whether it was possible to automatically distinguish perceived lower-quality news headlines/links from perceived higher-quality headlines/links. We evaluated twelve machine learning models on a binary, balanced dataset of 57,544,214 worldwide news we…

    Submitted 11 June, 2025; originally announced June 2025.

  45. arXiv:2506.09102  [pdf, ps, other]

    cs.CY cs.AI

    Revolutionizing Clinical Trials: A Manifesto for AI-Driven Transformation

    Authors: Mihaela van der Schaar, Richard Peck, Eoin McKinney, Jim Weatherall, Stuart Bailey, Justine Rochon, Chris Anagnostopoulos, Pierre Marquet, Anthony Wood, Nicky Best, Harry Amad, Julianna Piskorz, Krzysztof Kacprzyk, Rafik Salama, Christina Gunther, Francesca Frau, Antoine Pugeat, Ramon Hernandez

    Abstract: This manifesto represents a collaborative vision forged by leaders in pharmaceuticals, consulting firms, clinical research, and AI. It outlines a roadmap for two AI technologies - causal inference and digital twins - to transform clinical trials, delivering faster, safer, and more personalized outcomes for patients. By focusing on actionable integration within existing regulatory frameworks, we pr…

    Submitted 10 June, 2025; originally announced June 2025.

  46. arXiv:2506.08514  [pdf, ps, other]

    cs.LG

    DiffGradCAM: A Universal Class Activation Map Resistant to Adversarial Training

    Authors: Jacob Piland, Chris Sweet, Adam Czakja

    Abstract: Class Activation Mapping (CAM) and its gradient-based variants (e.g., GradCAM) have become standard tools for explaining Convolutional Neural Network (CNN) predictions. However, these approaches typically focus on individual logits, while for neural networks using softmax, the class membership probability estimates depend only on the differences between logits, not on their absol…

    Submitted 10 June, 2025; originally announced June 2025.

  47. arXiv:2506.08437  [pdf, ps, other]

    cs.LO

    Forward and Backward Simulations for Partially Observable Probability

    Authors: Chris Chen, Annabelle McIver, Carroll Morgan

    Abstract: Data refinement is the standard extension of a refinement relation from programs to datatypes (i.e. a behavioural subtyping relation). Forward/backward simulations provide a tractable method for establishing data refinement, and have been thoroughly studied for nondeterministic programs. However, for standard models of mixed probability and nondeterminism, ordinary assignment statements may not co…

    Submitted 29 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

  48. arXiv:2506.08173  [pdf, ps, other]

    cs.SE cs.AI

    Repeton: Structured Bug Repair with ReAct-Guided Patch-and-Test Cycles

    Authors: Nguyen Phu Vinh, Anh Chung Hoang, Chris Ngo, Truong-Son Hy

    Abstract: Large Language Models (LLMs) have shown strong capabilities in code generation and comprehension, yet their application to complex software engineering tasks often suffers from low precision and limited interpretability. We present Repeton, a fully open-source framework that leverages LLMs for precise and automated code manipulation in real-world Git repositories. Rather than generating holistic f…

    Submitted 9 June, 2025; originally announced June 2025.

  49. arXiv:2506.07940  [pdf, other]

    cs.AI cs.LG

    Gradients: When Markets Meet Fine-tuning -- A Distributed Approach to Model Optimisation

    Authors: Christopher Subia-Waud

    Abstract: Foundation model fine-tuning faces a fundamental challenge: existing AutoML platforms rely on single optimisation strategies that explore only a fraction of viable hyperparameter configurations. In this white paper, we introduce Gradients, a decentralised AutoML platform that transforms hyperparameter optimisation into a competitive marketplace where independent miners compete to discover optimal…

    Submitted 9 June, 2025; originally announced June 2025.

  50. arXiv:2506.07639  [pdf, ps, other]

    cs.RO

    Fast ECoT: Efficient Embodied Chain-of-Thought via Thoughts Reuse

    Authors: Zhekai Duan, Yuan Zhang, Shikai Geng, Gaowen Liu, Joschka Boedecker, Chris Xiaoxuan Lu

    Abstract: Embodied Chain-of-Thought (ECoT) reasoning enhances vision-language-action (VLA) models by improving performance and interpretability through intermediate reasoning steps. However, its sequential autoregressive token generation introduces significant inference latency, limiting real-time deployment. We propose Fast ECoT, an inference-time acceleration method that exploits the structured and repeti…

    Submitted 9 June, 2025; originally announced June 2025.