Showing 1–50 of 21,312 results for author: Michael

Searching in archive cs.
  1. arXiv:2506.15507  [pdf, ps, other]

    cs.LG cs.AI

    Over-squashing in Spatiotemporal Graph Neural Networks

    Authors: Ivan Marisca, Jacob Bamberger, Cesare Alippi, Michael M. Bronstein

    Abstract: Graph Neural Networks (GNNs) have achieved remarkable success across various domains. However, recent theoretical advances have identified fundamental limitations in their information propagation capabilities, such as over-squashing, where distant nodes fail to effectively exchange information. While extensively studied in static contexts, this issue remains unexplored in Spatiotemporal GNNs (STGN…

    Submitted 18 June, 2025; originally announced June 2025.

  2. PSM: Policy Synchronised Deterministic Memory

    Authors: Michael Mendler, Marc Pouzet

    Abstract: Concurrency and determinacy do not go well with each other when resources must be shared. Haskell provides parallel programming abstractions such as IVar and LVar in the Par monad and concurrent abstractions such as MVar and TVar in the IO and STM monads, respectively. The former are determinate but have no destructive updates and the latter have destructive updates but do not guarantee determi…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: This report summarises work on coding the theory of policy-synchronised memory (see https://rdcu.be/erBwl) in Haskell. This was developed for a graduate level course on Functional Reactive Programming taught at Bamberg University by the first author during 2020-2023. An early version of the PSM library had been presented at the SYNCHRON Workshop (Aussois, France), November 2019

  3. arXiv:2506.15405  [pdf, ps, other]

    cs.CE physics.comp-ph

    Simulation of parametrized cardiac electrophysiology in three dimensions using physics-informed neural networks

    Authors: Roshan Antony Gomez, Julien Stöcker, Barış Cansız, Michael Kaliske

    Abstract: Physics-informed neural networks (PINNs) are extensively used to represent various physical systems across multiple scientific domains. The same can be said for cardiac electrophysiology, wherein fully-connected neural networks (FCNNs) have been employed to predict the evolution of an action potential in a 2D space following the two-parameter phenomenological Aliev-Panfilov (AP) model. In this pap…

    Submitted 18 June, 2025; originally announced June 2025.

  4. arXiv:2506.15383  [pdf, ps, other]

    cs.LG q-bio.GN

    Global Ground Metric Learning with Applications to scRNA data

    Authors: Damin Kühn, Michael T. Schaub

    Abstract: Optimal transport provides a robust framework for comparing probability distributions. Its effectiveness is significantly influenced by the choice of the underlying ground metric. Traditionally, the ground metric has either been (i) predefined, e.g., as the Euclidean distance, or (ii) learned in a supervised way, by utilizing labeled data to learn a suitable ground metric for enhanced task-specifi…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: This method is provided as a Python package on PyPI, see https://github.com/DaminK/ggml-ot

    Journal ref: Proceedings of The 28th International Conference on Artificial Intelligence and Statistics (AISTATS), 2025, PMLR 258:3295-3303

  5. arXiv:2506.15316  [pdf]

    cs.AR cs.AI

    J3DAI: A tiny DNN-Based Edge AI Accelerator for 3D-Stacked CMOS Image Sensor

    Authors: Benoit Tain, Raphael Millet, Romain Lemaire, Michal Szczepanski, Laurent Alacoque, Emmanuel Pluchart, Sylvain Choisnet, Rohit Prasad, Jerome Chossat, Pascal Pierunek, Pascal Vivet, Sebastien Thuries

    Abstract: This paper presents J3DAI, a tiny deep neural network-based hardware accelerator for a 3-layer 3D-stacked CMOS image sensor featuring an artificial intelligence (AI) chip integrating a Deep Neural Network (DNN)-based accelerator. The DNN accelerator is designed to efficiently perform neural network tasks such as image classification and segmentation. This paper focuses on the digital system of J3D…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: Preprint from ISLPED 2025. 979-8-3315-2710-5/25/$31.00 ©2025 IEEE

  6. arXiv:2506.15041  [pdf, ps, other]

    econ.GN cs.CL

    Identifying economic narratives in large text corpora -- An integrated approach using Large Language Models

    Authors: Tobias Schmidt, Kai-Robin Lange, Matthias Reccius, Henrik Müller, Michael Roos, Carsten Jentsch

    Abstract: As interest in economic narratives has grown in recent years, so has the number of pipelines dedicated to extracting such narratives from texts. Pipelines often employ a mix of state-of-the-art natural language processing techniques, such as BERT, to tackle this task. While effective on foundational linguistic operations essential for narrative extraction, such models lack the deeper semantic unde…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 53 pages, 5 figures

  7. arXiv:2506.14970  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    NeuroMoE: A Transformer-Based Mixture-of-Experts Framework for Multi-Modal Neurological Disorder Classification

    Authors: Wajih Hassan Raza, Aamir Bader Shah, Yu Wen, Yidan Shen, Juan Diego Martinez Lemus, Mya Caryn Schiess, Timothy Michael Ellmore, Renjie Hu, Xin Fu

    Abstract: The integration of multi-modal Magnetic Resonance Imaging (MRI) and clinical data holds great promise for enhancing the diagnosis of neurological disorders (NDs) in real-world clinical settings. Deep Learning (DL) has recently emerged as a powerful tool for extracting meaningful patterns from medical data to aid in diagnosis. However, existing DL approaches struggle to effectively leverage multi-m…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: Accepted at the 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society

  8. arXiv:2506.14923  [pdf, ps, other]

    physics.geo-ph cs.AI cs.LG

    Forecasting the spatiotemporal evolution of fluid-induced microearthquakes with deep learning

    Authors: Jaehong Chung, Michael Manga, Timothy Kneafsey, Tapan Mukerji, Mengsu Hu

    Abstract: Microearthquakes (MEQs) generated by subsurface fluid injection record the evolving stress state and permeability of reservoirs. Forecasting their full spatiotemporal evolution is therefore critical for applications such as enhanced geothermal systems (EGS), CO$_2$ sequestration and other geo-engineering applications. We present a transformer-based deep learning model that ingests hydraulic stimul…

    Submitted 17 June, 2025; originally announced June 2025.

  9. arXiv:2506.14922  [pdf, ps, other]

    cs.CY cs.LG

    FORTRESS: Frontier Risk Evaluation for National Security and Public Safety

    Authors: Christina Q. Knight, Kaustubh Deshpande, Ved Sirdeshmukh, Meher Mankikar, Scale Red Team, SEAL Research Team, Julian Michael

    Abstract: The rapid advancement of large language models (LLMs) introduces dual-use capabilities that could both threaten and bolster national security and public safety (NSPS). Models implement safeguards to protect against potential misuse relevant to NSPS and allow for benign users to receive helpful information. However, current benchmarks often fail to test safeguard robustness to potential NSPS risks…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 12 pages, 7 figures, submitted to NeurIPS

  10. arXiv:2506.14861  [pdf, ps, other]

    q-bio.GN cs.AI q-bio.QM

    BMFM-RNA: An Open Framework for Building and Evaluating Transcriptomic Foundation Models

    Authors: Bharath Dandala, Michael M. Danziger, Ella Barkan, Tanwi Biswas, Viatcheslav Gurev, Jianying Hu, Matthew Madgwick, Akira Koseki, Tal Kozlovski, Michal Rosen-Zvi, Yishai Shimoni, Ching-Huei Tsou

    Abstract: Transcriptomic foundation models (TFMs) have recently emerged as powerful tools for analyzing gene expression in cells and tissues, supporting key tasks such as cell-type annotation, batch correction, and perturbation prediction. However, the diversity of model implementations and training strategies across recent TFMs, though promising, makes it challenging to isolate the contribution of individu…

    Submitted 17 June, 2025; originally announced June 2025.

  11. arXiv:2506.14855  [pdf, other]

    cs.RO cs.AI

    Feedback-MPPI: Fast Sampling-Based MPC via Rollout Differentiation -- Adios low-level controllers

    Authors: Tommaso Belvedere, Michael Ziegltrum, Giulio Turrisi, Valerio Modugno

    Abstract: Model Predictive Path Integral control is a powerful sampling-based approach suitable for complex robotic tasks due to its flexibility in handling nonlinear dynamics and non-convex costs. However, its applicability in real-time, high-frequency robotic control scenarios is limited by computational demands. This paper introduces Feedback-MPPI (F-MPPI), a novel framework that augments standard MPPI by…

    Submitted 17 June, 2025; originally announced June 2025.

  12. arXiv:2506.14852  [pdf, ps, other]

    cs.DC cs.AI cs.CL cs.LG cs.PF

    Cost-Efficient Serving of LLM Agents via Test-Time Plan Caching

    Authors: Qizheng Zhang, Michael Wornow, Kunle Olukotun

    Abstract: LLM-based agentic applications have shown increasingly remarkable capabilities in complex workflows but incur substantial costs due to extensive planning and reasoning requirements. Existing LLM caching techniques (like context caching and semantic caching), primarily designed for serving chatbots, are insufficient for agentic applications where outputs depend on external data or environmental con…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 23 pages

  13. arXiv:2506.14754  [pdf, ps, other]

    cs.RO

    Tactile Beyond Pixels: Multisensory Touch Representations for Robot Manipulation

    Authors: Carolina Higuera, Akash Sharma, Taosha Fan, Chaithanya Krishna Bodduluri, Byron Boots, Michael Kaess, Mike Lambeta, Tingfan Wu, Zixi Liu, Francois Robert Hogan, Mustafa Mukadam

    Abstract: We present Sparsh-X, the first multisensory touch representations across four tactile modalities: image, audio, motion, and pressure. Trained on ~1M contact-rich interactions collected with the Digit 360 sensor, Sparsh-X captures complementary touch signals at diverse temporal and spatial scales. By leveraging self-supervised learning, Sparsh-X fuses these modalities into a unified representation…

    Submitted 17 June, 2025; originally announced June 2025.

  14. arXiv:2506.14652  [pdf, ps, other]

    cs.CY cs.AI cs.LG

    Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor

    Authors: Alexandra Olteanu, Su Lin Blodgett, Agathe Balayn, Angelina Wang, Fernando Diaz, Flavio du Pin Calmon, Margaret Mitchell, Michael Ekstrand, Reuben Binns, Solon Barocas

    Abstract: In AI research and practice, rigor remains largely understood in terms of methodological rigor -- such as whether mathematical, statistical, or computational methods are correctly applied. We argue that this narrow conception of rigor has contributed to the concerns raised by the responsible AI community, including overblown claims about AI capabilities. Our position is that a broader conception o…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 20 pages, 1 figure, 1 table

  15. arXiv:2506.14605  [pdf, ps, other]

    cs.CV cs.LG eess.IV

    Unsupervised Imaging Inverse Problems with Diffusion Distribution Matching

    Authors: Giacomo Meanti, Thomas Ryckeboer, Michael Arbel, Julien Mairal

    Abstract: This work addresses image restoration tasks through the lens of inverse problems using unpaired datasets. In contrast to traditional approaches -- which typically assume full knowledge of the forward model or access to paired degraded and ground-truth images -- the proposed method operates under minimal assumptions and relies only on small, unpaired datasets. This makes it particularly well-suited…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: Code available at https://github.com/inria-thoth/ddm4ip

  16. arXiv:2506.14544  [pdf, ps, other]

    cs.GT

    Infinite lexicographic products of positional objectives

    Authors: Antonio Casares, Pierre Ohlmann, Michał Skrzypczak, Igor Walukiewicz

    Abstract: This paper contributes to the study of positional determinacy of infinite duration games played on potentially infinite graphs. Recently, [Ohlmann, TheoretiCS 2023] established that positionality of prefix-independent objectives is preserved by finite lexicographic products. We propose two different notions of infinite lexicographic products indexed by arbitrary ordinals, and extend Ohlmann's resu…

    Submitted 17 June, 2025; originally announced June 2025.

  17. arXiv:2506.14467  [pdf, ps, other]

    cs.RO

    Automatic Cannulation of Femoral Vessels in a Porcine Shock Model

    Authors: Nico Zevallos, Cecilia G. Morales, Andrew Orekhov, Tejas Rane, Hernando Gomez, Francis X. Guyette, Michael R. Pinsky, John Galeotti, Artur Dubrawski, Howie Choset

    Abstract: Rapid and reliable vascular access is critical in trauma and critical care. Central vascular catheterization enables high-volume resuscitation, hemodynamic monitoring, and advanced interventions like ECMO and REBOA. While peripheral access is common, central access is often necessary but requires specialized ultrasound-guided skills, posing challenges in prehospital settings. The complexity arises…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: 2 pages, 2 figures, conference

    Journal ref: Hamlyn Symposium on Medical Robotics 2025

  18. arXiv:2506.14432  [pdf, ps, other]

    eess.IV cs.CV

    A large-scale heterogeneous 3D magnetic resonance brain imaging dataset for self-supervised learning

    Authors: Asbjørn Munk, Stefano Cerri, Jakob Ambsdorf, Julia Machnio, Sebastian Nørgaard Llambias, Vardan Nersesjan, Christian Hedeager Krag, Peirong Liu, Pablo Rocamora García, Mostafa Mehdipour Ghazi, Mikael Boesen, Michael Eriksen Benros, Juan Eugenio Iglesias, Mads Nielsen

    Abstract: We present FOMO60K, a large-scale, heterogeneous dataset of 60,529 brain Magnetic Resonance Imaging (MRI) scans from 13,900 sessions and 11,187 subjects, aggregated from 16 publicly available sources. The dataset includes both clinical- and research-grade images, multiple MRI sequences, and a wide range of anatomical and pathological variability, including scans with large brain anomalies. Minimal…

    Submitted 17 June, 2025; originally announced June 2025.

  19. arXiv:2506.14400  [pdf, ps, other]

    cs.LG cs.CY

    One Size Fits None: Rethinking Fairness in Medical AI

    Authors: Roland Roller, Michael Hahn, Ajay Madhavan Ravichandran, Bilgin Osmanodja, Florian Oetke, Zeineb Sassi, Aljoscha Burchardt, Klaus Netter, Klemens Budde, Anne Herrmann, Tobias Strapatsas, Peter Dabrock, Sebastian Möller

    Abstract: Machine learning (ML) models are increasingly used to support clinical decision-making. However, real-world medical datasets are often noisy, incomplete, and imbalanced, leading to performance disparities across patient subgroups. These differences raise fairness concerns, particularly when they reinforce existing disadvantages for marginalized groups. In this work, we analyze several medical pred…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: Accepted at the 6th Workshop on Gender Bias in Natural Language Processing at ACL 2025

  20. arXiv:2506.14386  [pdf, ps, other]

    cs.LG cs.AI

    ResNets Are Deeper Than You Think

    Authors: Christian H. X. Ali Mehmeti-Göpel, Michael Wand

    Abstract: Residual connections remain ubiquitous in modern neural network architectures nearly a decade after their introduction. Their widespread adoption is often credited to their dramatically improved trainability: residual networks train faster, more stably, and achieve higher accuracy than their feedforward counterparts. While numerous techniques, ranging from improved initialization to advanced learn…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: NeurIPS 2025 Submission

  21. arXiv:2506.14374  [pdf, ps, other]

    cs.CR cs.LG

    Excessive Reasoning Attack on Reasoning LLMs

    Authors: Wai Man Si, Mingjie Li, Michael Backes, Yang Zhang

    Abstract: Recent reasoning large language models (LLMs), such as OpenAI o1 and DeepSeek-R1, exhibit strong performance on complex tasks through test-time inference scaling. However, prior studies have shown that these models often incur significant computational costs due to excessive reasoning, such as frequent switching between reasoning trajectories (e.g., underthinking) or redundant reasoning on simple…

    Submitted 17 June, 2025; originally announced June 2025.

  22. arXiv:2506.14291  [pdf, ps, other]

    cs.LG cs.SI stat.ML

    Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models

    Authors: Ben Finkelshtein, İsmail İlkan Ceylan, Michael Bronstein, Ron Levie

    Abstract: Graph machine learning architectures are typically tailored to specific tasks on specific datasets, which hinders their broader applicability. This has led to a new quest in graph machine learning: how to build graph foundation models capable of generalizing across arbitrary graphs and features? In this work, we present a recipe for designing graph foundation models for node-level tasks from first…

    Submitted 17 June, 2025; originally announced June 2025.

  23. arXiv:2506.14223  [pdf, ps, other]

    cs.SD cs.CL cs.MM eess.AS

    Fretting-Transformer: Encoder-Decoder Model for MIDI to Tablature Transcription

    Authors: Anna Hamberger, Sebastian Murgul, Jochen Schmidt, Michael Heizmann

    Abstract: Music transcription plays a pivotal role in Music Information Retrieval (MIR), particularly for stringed instruments like the guitar, where symbolic music notations such as MIDI lack crucial playability information. This contribution introduces the Fretting-Transformer, an encoder-decoder model that utilizes a T5 transformer architecture to automate the transcription of MIDI sequences into guitar t…

    Submitted 17 June, 2025; originally announced June 2025.

    Comments: Accepted to the 50th International Computer Music Conference (ICMC), 2025

  24. arXiv:2506.14111  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Essential-Web v1.0: 24T tokens of organized web data

    Authors: Essential AI: Andrew Hojel, Michael Pust, Tim Romanski, Yash Vanjani, Ritvik Kapila, Mohit Parmar, Adarsh Chaluvaraju, Alok Tripathy, Anil Thomas, Ashish Tanwer, Darsh J Shah, Ishaan Shah, Karl Stratos, Khoi Nguyen, Kurt Smith, Michael Callahan, Peter Rushton, Philip Monk, Platon Mazarakis, Saad Jamal, Saurabh Srivastava, Somanshu Singla, Ashish Vaswani

    Abstract: Data plays the most prominent role in how language models acquire skills and knowledge. The lack of massive, well-organized pre-training datasets results in costly and inaccessible data pipelines. We present Essential-Web v1.0, a 24-trillion-token dataset in which every document is annotated with a twelve-category taxonomy covering topic, format, content complexity, and quality. Taxonomy labels ar…

    Submitted 16 June, 2025; originally announced June 2025.

  25. arXiv:2506.13998  [pdf, ps, other]

    cs.DC

    DAGs for the Masses

    Authors: Michael Anoprenko, Andrei Tonkikh, Alexander Spiegelman, Petr Kuznetsov

    Abstract: A recent approach to building consensus protocols on top of Directed Acyclic Graphs (DAGs) shows much promise due to its simplicity and stable throughput. However, as each node in the DAG typically includes a linear number of references to the nodes in the previous round, prior DAG protocols only scale up to a certain point when the overhead of maintaining the graph becomes the bottleneck. To en…

    Submitted 18 June, 2025; v1 submitted 16 June, 2025; originally announced June 2025.

  26. arXiv:2506.13996  [pdf, ps, other]

    cs.LG

    Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences

    Authors: Stas Bekman, Samyam Rajbhandari, Michael Wyatt, Jeff Rasley, Tunji Ruwase, Zhewei Yao, Aurick Qiao, Yuxiong He

    Abstract: Long sequences are critical for applications like RAG, long document summarization, multi-modality, etc., and modern LLMs, like Llama 4 Scout, support max sequence length of up to 10 million tokens. However, outside of enterprise labs, long sequence training is challenging for the AI community with limited system support in the open-source space. Out-of-box, even on a modern NVIDIA H100 80GB GPU…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: 19 pages, 13 figures

  27. arXiv:2506.13974  [pdf, ps, other]

    cs.LG

    Constant Stepsize Local GD for Logistic Regression: Acceleration by Instability

    Authors: Michael Crawshaw, Blake Woodworth, Mingrui Liu

    Abstract: Existing analysis of Local (Stochastic) Gradient Descent for heterogeneous objectives requires stepsizes $\eta \leq 1/K$ where $K$ is the communication interval, which ensures monotonic decrease of the objective. In contrast, we analyze Local Gradient Descent for logistic regression with separable, heterogeneous data using any stepsize $\eta > 0$. With $R$ communication rounds and $M$ clients, we show con…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: ICML 2025

  28. arXiv:2506.13905  [pdf, ps, other]

    cs.AR

    Spec2RTL-Agent: Automated Hardware Code Generation from Complex Specifications Using LLM Agent Systems

    Authors: Zhongzhi Yu, Mingjie Liu, Michael Zimmer, Yingyan Celine Lin, Yong Liu, Haoxing Ren

    Abstract: Despite recent progress in generating hardware RTL code with LLMs, existing solutions still suffer from a substantial gap between practical application scenarios and the requirements of real-world RTL code development. Prior approaches either focus on overly simplified hardware descriptions or depend on extensive human guidance to process complex specifications, limiting their scalability and auto…

    Submitted 16 June, 2025; originally announced June 2025.

  29. arXiv:2506.13820  [pdf, ps, other]

    cs.SE cs.AI

    Structured Program Synthesis using LLMs: Results and Insights from the IPARC Challenge

    Authors: Shraddha Surana, Ashwin Srinivasan, Michael Bain

    Abstract: The IPARC Challenge, inspired by ARC, provides controlled program synthesis tasks over synthetic images to evaluate automatic program construction, focusing on sequence, selection, and iteration. This set of 600 tasks has resisted automated solutions. This paper presents a structured inductive programming approach with LLMs that successfully solves tasks across all IPARC categories. The controlled…

    Submitted 15 June, 2025; originally announced June 2025.

  30. arXiv:2506.13776  [pdf, ps, other]

    cs.AI cs.CY cs.HC

    Recommendations and Reporting Checklist for Rigorous & Transparent Human Baselines in Model Evaluations

    Authors: Kevin L. Wei, Patricia Paskov, Sunishchal Dev, Michael J. Byun, Anka Reuel, Xavier Roberts-Gaal, Rachel Calcott, Evie Coxon, Chinmay Deshpande

    Abstract: In this position paper, we argue that human baselines in foundation model evaluations must be more rigorous and more transparent to enable meaningful comparisons of human vs. AI performance, and we provide recommendations and a reporting checklist towards this end. Human performance baselines are vital for the machine learning community, downstream users, and policymakers to interpret AI evaluatio…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: A version of this paper has been accepted to ICML 2025 as a position paper (spotlight), with the title: "Position: Human Baselines in Model Evaluations Need Rigor and Transparency (With Recommendations & Reporting Checklist)."

  31. arXiv:2506.13650  [pdf, ps, other]

    eess.SY cs.GT cs.MA

    Deceptive Path Planning: A Bayesian Game Approach

    Authors: Violetta Rostobaya, James Berneburg, Yue Guan, Michael Dorothy, Daigo Shishika

    Abstract: This paper investigates how an autonomous agent can transmit information through its motion in an adversarial setting. We consider scenarios where an agent must reach its goal while deceiving an intelligent observer about its destination. We model this interaction as a dynamic Bayesian game between a mobile Attacker with a privately known goal and a Defender who infers the Attacker's intent to all…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: 8 pages, 9 figures. This work has been submitted to the IEEE for possible publication

  32. arXiv:2506.13494  [pdf, ps, other]

    cs.CR

    Watermarking LLM-Generated Datasets in Downstream Tasks

    Authors: Yugeng Liu, Tianshuo Cong, Michael Backes, Zheng Li, Yang Zhang

    Abstract: Large Language Models (LLMs) have experienced rapid advancements, with applications spanning a wide range of fields, including sentiment classification, review generation, and question answering. Due to their efficiency and versatility, researchers and companies increasingly employ LLM-generated data to train their models. However, the inability to track content produced by LLMs poses a significan…

    Submitted 16 June, 2025; originally announced June 2025.

  33. arXiv:2506.13484  [pdf, ps, other]

    cs.CV eess.IV

    Deep Diffusion Models and Unsupervised Hyperspectral Unmixing for Realistic Abundance Map Synthesis

    Authors: Martina Pastorino, Michael Alibani, Nicola Acito, Gabriele Moser

    Abstract: This paper presents a novel methodology for generating realistic abundance maps from hyperspectral imagery using an unsupervised, deep-learning-driven approach. Our framework integrates blind linear hyperspectral unmixing with state-of-the-art diffusion models to enhance the realism and diversity of synthetic abundance maps. First, we apply blind unmixing to extract endmembers and abundance maps d…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: CVPRw2025

  34. arXiv:2506.13479  [pdf, ps, other]

    cs.CL cs.AI

    Position: Pause Recycling LoRAs and Prioritize Mechanisms to Uncover Limits and Effectiveness

    Authors: Mei-Yen Chen, Thi Thu Uyen Hoang, Michael Hahn, M. Saquib Sarfraz

    Abstract: Merging or routing low-rank adapters (LoRAs) has emerged as a popular solution for enhancing large language models, particularly when data access is restricted by regulatory or domain-specific constraints. This position paper argues that the research community should shift its focus from developing new merging or routing algorithms to understanding the conditions under which reusing LoRAs is truly…

    Submitted 16 June, 2025; originally announced June 2025.

  35. arXiv:2506.13189  [pdf, ps, other]

    cs.HC cs.RO

    Multimodal "Puppeteer": An Exploration of Robot Teleoperation Via Virtual Counterpart with LLM-Driven Voice and Gesture Interaction in Augmented Reality

    Authors: Yuchong Zhang, Bastian Orthmann, Shichen Ji, Michael Welle, Jonne Van Haastregt, Danica Kragic

    Abstract: The integration of robotics and augmented reality (AR) holds transformative potential for advancing human-robot interaction (HRI), offering enhancements in usability, intuitiveness, accessibility, and collaborative task performance. This paper introduces and evaluates a novel multimodal AR-based robot puppeteer framework that enables intuitive teleoperation via virtual counterpart through large la…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: This work has been submitted to the IEEE TVCG for possible publication

  36. arXiv:2506.13139  [pdf, ps, other]

    stat.ML cs.LG

    Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models

    Authors: Zhenyu Liao, Michael W. Mahoney

    Abstract: Modern Machine Learning (ML) and Deep Neural Networks (DNNs) often operate on high-dimensional data and rely on overparameterized models, where classical low-dimensional intuitions break down. In particular, the proportional regime where the data dimension, sample size, and number of model parameters are all large and comparable, gives rise to novel and sometimes counterintuitive behaviors. This p…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: 30 pages, 6 figures

  37. arXiv:2506.13134  [pdf, ps, other]

    quant-ph cs.AI

    Quantum AGI: Ontological Foundations

    Authors: Elija Perrier, Michael Timothy Bennett

    Abstract: We examine the implications of quantum foundations for AGI, focusing on how seminal results such as Bell's theorems (non-locality), the Kochen-Specker theorem (contextuality) and no-cloning theorem problematise practical implementation of AGI in quantum settings. We introduce a novel information-theoretic taxonomy distinguishing between classical AGI and quantum AGI and show how quantum mechanics…

    Submitted 16 June, 2025; originally announced June 2025.

    Comments: Accepted into AGI-25. Technical appendices available via link

  38. arXiv:2506.13059  [pdf, ps, other]

    cs.CL cs.LG

    Multipole Attention for Efficient Long Context Reasoning

    Authors: Coleman Hooper, Sebastian Zhao, Luca Manolache, Sehoon Kim, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami

    Abstract: Large Reasoning Models (LRMs) have shown promising accuracy improvements on complex problem-solving tasks. While these models have attained high accuracy by leveraging additional computation at test time, they need to generate long chain-of-thought reasoning in order to think before answering, which requires generating thousands of tokens. While sparse attention methods can help reduce the KV cach…

    Submitted 15 June, 2025; originally announced June 2025.

    Comments: 15 pages

  39. arXiv:2506.13040  [pdf, ps, other]

    cs.CV

    MAMMA: Markerless & Automatic Multi-Person Motion Action Capture

    Authors: Hanz Cuevas-Velasquez, Anastasios Yiannakidis, Soyong Shin, Giorgio Becherini, Markus Höschle, Joachim Tesch, Taylor Obersat, Tsvetelina Alexiadis, Michael Black

    Abstract: We present MAMMA, a markerless motion-capture pipeline that accurately recovers SMPL-X parameters from multi-view video of two-person interaction sequences. Traditional motion-capture systems rely on physical markers. Although they offer high accuracy, their requirements of specialized hardware, manual marker placement, and extensive post-processing make them costly and time-consuming. Recent lear…

    Submitted 15 June, 2025; originally announced June 2025.

  40. arXiv:2506.12932  [pdf, ps, other]

    cs.LG

    Complexity Scaling Laws for Neural Models using Combinatorial Optimization

    Authors: Lowell Weissman, Michael Krumdick, A. Lynn Abbott

    Abstract: Recent work on neural scaling laws demonstrates that model performance scales predictably with compute budget, model size, and dataset size. In this work, we develop scaling laws based on problem complexity. We analyze two fundamental complexity measures: solution space size and representation space size. Using the Traveling Salesman Problem (TSP) as a case study, we show that combinatorial optimi…

    Submitted 15 June, 2025; originally announced June 2025.

    Comments: 45 pages, 20 figures

  41. arXiv:2506.12900  [pdf, ps, other]

    cs.DC cs.CR

    Self-Stabilizing Replicated State Machine Coping with Byzantine and Recurring Transient Faults

    Authors: Shlomi Dolev, Amit Hendin, Maurice Herlihy, Maria Potop Butucaru, Elad Michael Schiller

    Abstract: The ability to perform repeated Byzantine agreement lies at the heart of important applications such as blockchain price oracles or replicated state machines. Any such protocol requires the following properties: (1) Byzantine fault-tolerance, because not all participants can be assumed to be honest, (2) recurrent transient fault-tolerance, because even honest participants may be…

    Submitted 15 June, 2025; originally announced June 2025.

  42. arXiv:2506.12724  [pdf, ps, other]

    cs.CV

    Dynamic Modality Scheduling for Multimodal Large Models via Confidence, Uncertainty, and Semantic Consistency

    Authors: Hiroshi Tanaka, Anika Rao, Hana Satou, Michael Johnson, Sofia García

    Abstract: Multimodal Large Models (MLLMs) have achieved remarkable progress in vision-language understanding and generation tasks. However, existing MLLMs typically rely on static modality fusion strategies, which treat all modalities equally regardless of their instance-level reliability or semantic contribution. This often leads to suboptimal performance, especially in scenarios with noisy, missing, or mi…

    Submitted 15 June, 2025; originally announced June 2025.

  43. arXiv:2506.12611  [pdf, ps, other]

    cs.DC

    Accelerating Cloud-Based Transcriptomics: Performance Analysis and Optimization of the STAR Aligner Workflow

    Authors: Piotr Kica, Sabina Lichołai, Michał Orzechowski, Maciej Malawski

    Abstract: In this work, we explore the Transcriptomics Atlas pipeline adapted for cost-efficient and high-throughput computing in the cloud. We propose a scalable, cloud-native architecture designed for running a resource-intensive aligner -- STAR -- and processing tens or hundreds of terabytes of RNA-sequencing data. We implement multiple optimization techniques that give significant execution time and cos…

    Submitted 14 June, 2025; originally announced June 2025.

    Comments: Accepted at ICCS2025

  44. arXiv:2506.12563  [pdf, ps, other]

    cs.CV

    Benchmarking Image Similarity Metrics for Novel View Synthesis Applications

    Authors: Charith Wickrema, Sara Leary, Shivangi Sarkar, Mark Giglio, Eric Bianchi, Eliza Mace, Michael Twardowski

    Abstract: Traditional image similarity metrics are ineffective at evaluating the similarity between a real image of a scene and an artificially generated version of that viewpoint [6, 9, 13, 14]. Our research evaluates the effectiveness of a new, perceptual-based similarity metric, DreamSim [2], and three popular image similarity metrics: Structural Similarity (SSIM), Peak Signal-to-Noise Ratio (PSNR), and…

    Submitted 14 June, 2025; originally announced June 2025.

  45. arXiv:2506.12362  [pdf, ps, other]

    cs.LG cs.AI

    HYPER: A Foundation Model for Inductive Link Prediction with Knowledge Hypergraphs

    Authors: Xingyue Huang, Mikhail Galkin, Michael M. Bronstein, İsmail İlkan Ceylan

    Abstract: Inductive link prediction with knowledge hypergraphs is the task of predicting missing hyperedges involving completely novel entities (i.e., nodes unseen during training). Existing methods for inductive link prediction with knowledge hypergraphs assume a fixed relational vocabulary and, as a result, cannot generalize to knowledge hypergraphs with novel relation types (i.e., relations unseen during…

    Submitted 14 June, 2025; originally announced June 2025.

  46. arXiv:2506.12346  [pdf, ps, other]

    cs.CL cs.AI

    Refract ICL: Rethinking Example Selection in the Era of Million-Token Models

    Authors: Arjun R. Akula, Kazuma Hashimoto, Krishna Srinivasan, Aditi Chaudhary, Karthik Raman, Michael Bendersky

    Abstract: The emergence of long-context large language models (LLMs) has enabled the use of hundreds, or even thousands, of demonstrations for in-context learning (ICL) - a previously impractical regime. This paper investigates whether traditional ICL selection strategies, which balance the similarity of ICL examples to the test input (using a text retriever) with diversity within the ICL set, remain effect…

    Submitted 14 June, 2025; originally announced June 2025.

  47. arXiv:2506.12242  [pdf]

    cs.CL cs.AI cs.CY

    Large Language Models for History, Philosophy, and Sociology of Science: Interpretive Uses, Methodological Challenges, and Critical Perspectives

    Authors: Arno Simons, Michael Zichert, Adrian Wüthrich

    Abstract: This paper explores the use of large language models (LLMs) as research tools in the history, philosophy, and sociology of science (HPSS). LLMs are remarkably effective at processing unstructured text and inferring meaning from context, offering new affordances that challenge long-standing divides between computational and interpretive methods. This raises both opportunities and challenges for HPS…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: 27 pages, 2 tables

    ACM Class: A.1; I.2.1; I.2.7; J.4; J.5

  48. arXiv:2506.12128  [pdf, ps, other]

    quant-ph cs.LG hep-lat hep-ph

    Improved Ground State Estimation in Quantum Field Theories via Normalising Flow-Assisted Neural Quantum States

    Authors: Vishal S. Ngairangbam, Michael Spannowsky, Timur Sypchenko

    Abstract: We propose a hybrid variational framework that enhances Neural Quantum States (NQS) with a Normalising Flow-based sampler to improve the expressivity and trainability of quantum many-body wavefunctions. Our approach decouples the sampling task from the variational ansatz by learning a continuous flow model that targets a discretised, amplitude-supported subspace of the Hilbert space. This overcome…

    Submitted 13 June, 2025; originally announced June 2025.

    Report number: IPPP/25/33

  49. arXiv:2506.12103  [pdf, other]

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  50. arXiv:2506.12100  [pdf, ps, other]

    cs.CR cs.AI

    LLM Embedding-based Attribution (LEA): Quantifying Source Contributions to Generative Model's Response for Vulnerability Analysis

    Authors: Reza Fayyazi, Michael Zuzak, Shanchieh Jay Yang

    Abstract: Security vulnerabilities are rapidly increasing in frequency and complexity, creating a shifting threat landscape that challenges cybersecurity defenses. Large Language Models (LLMs) have been widely adopted for cybersecurity threat analysis. When querying LLMs, dealing with new, unseen vulnerabilities is particularly challenging as it lies outside LLMs' pre-trained distribution. Retrieval-Augment…

    Submitted 12 June, 2025; originally announced June 2025.