
Showing 1–50 of 146,830 results for author: D.

Searching in archive cs.
  1. arXiv:2506.17220  [pdf, ps, other]

    cs.CV

    Emergent Temporal Correspondences from Video Diffusion Transformers

    Authors: Jisu Nam, Soowon Son, Dahyun Chung, Jiyoung Kim, Siyoon Jin, Junhwa Hur, Seungryong Kim

    Abstract: Recent advancements in video diffusion models based on Diffusion Transformers (DiTs) have achieved remarkable success in generating temporally coherent videos. Yet, a fundamental question persists: how do these models internally establish and represent temporal correspondences across frames? We introduce DiffTrack, the first quantitative analysis framework designed to answer this question. DiffTra…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Project page is available at https://cvlab-kaist.github.io/DiffTrack

  2. arXiv:2506.17218  [pdf, ps, other]

    cs.CV cs.AI

    Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens

    Authors: Zeyuan Yang, Xueyang Yu, Delin Chen, Maohao Shen, Chuang Gan

    Abstract: Vision-language models (VLMs) excel at multimodal understanding, yet their text-only decoding forces them to verbalize visual reasoning, limiting performance on tasks that demand visual imagination. Recent attempts train VLMs to render explicit images, but the heavy image-generation pre-training often hinders the reasoning ability. Inspired by the way humans reason with mental imagery-the internal…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Project page: https://vlm-mirage.github.io/

  3. arXiv:2506.17204  [pdf, ps, other]

    cs.LG cs.AI

    Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning

    Authors: Guozheng Ma, Lu Li, Zilin Wang, Li Shen, Pierre-Luc Bacon, Dacheng Tao

    Abstract: Effectively scaling up deep reinforcement learning models has proven notoriously difficult due to network pathologies during training, motivating various targeted interventions such as periodic reset and architectural advances such as layer normalization. Instead of pursuing more complex modifications, we show that introducing static network sparsity alone can unlock further scaling potential beyo…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Accepted to ICML 2025

  4. arXiv:2506.17196  [pdf, ps, other]

    cs.HC

    Detecting LLM-Generated Short Answers and Effects on Learner Performance

    Authors: Shambhavi Bhushan, Danielle R Thomas, Conrad Borchers, Isha Raghuvanshi, Ralph Abboud, Erin Gatz, Shivang Gupta, Kenneth Koedinger

    Abstract: The increasing availability of large language models (LLMs) has raised concerns about their potential misuse in online learning. While tools for detecting LLM-generated text exist and are widely used by researchers and educators, their reliability varies. Few studies have compared the accuracy of detection methods, defined criteria to identify content generated by LLM, or evaluated the effect on l…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Accepted for publication at the 19th European Conference on Technology Enhanced Learning (ECTEL 2025). This is the author's accepted manuscript

  5. arXiv:2506.17188  [pdf, ps, other]

    cs.CL cs.AI cs.IR

    Towards AI Search Paradigm

    Authors: Yuchen Li, Hengyi Cai, Rui Kong, Xinran Chen, Jiamin Chen, Jun Yang, Haojie Zhang, Jiayi Li, Jiayi Wu, Yiqun Chen, Changle Qu, Keyi Kong, Wenwen Ye, Lixin Su, Xinyu Ma, Long Xia, Daiting Shi, Jiashu Zhao, Haoyi Xiong, Shuaiqiang Wang, Dawei Yin

    Abstract: In this paper, we introduce the AI Search Paradigm, a comprehensive blueprint for next-generation search systems capable of emulating human information processing and decision-making. The paradigm employs a modular architecture of four LLM-powered agents (Master, Planner, Executor and Writer) that dynamically adapt to the full spectrum of information needs, from simple factual queries to complex m…

    Submitted 20 June, 2025; originally announced June 2025.

  6. arXiv:2506.17184  [pdf, ps, other]

    cs.RO eess.SY

    Judo: A User-Friendly Open-Source Package for Sampling-Based Model Predictive Control

    Authors: Albert H. Li, Brandon Hung, Aaron D. Ames, Jiuguang Wang, Simon Le Cleac'h, Preston Culbertson

    Abstract: Recent advancements in parallel simulation and successful robotic applications are spurring a resurgence in sampling-based model predictive control. To build on this progress, however, the robotics community needs common tooling for prototyping, evaluating, and deploying sampling-based controllers. We introduce Judo, a software package designed to address this need. To facilitate rapid prototyping…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Accepted at the 2025 RSS Workshop on Fast Motion Planning and Control in the Era of Parallelism. 5 Pages

  7. arXiv:2506.17182  [pdf, ps, other]

    cs.LG stat.ML

    Variational Learning of Disentangled Representations

    Authors: Yuli Slavutsky, Ozgur Beker, David Blei, Bianca Dumitrascu

    Abstract: Disentangled representations enable models to separate factors of variation that are shared across experimental conditions from those that are condition-specific. This separation is essential in domains such as biomedical data analysis, where generalization to new treatments, patients, or species depends on isolating stable biological signals from context-dependent effects. While extensions of the…

    Submitted 20 June, 2025; originally announced June 2025.

  8. arXiv:2506.17169  [pdf, ps, other]

    cs.NE cs.AI

    Continual Learning with Columnar Spiking Neural Networks

    Authors: Denis Larionov, Nikolay Bazenkov, Mikhail Kiselev

    Abstract: This study investigates columnar-organized spiking neural networks (SNNs) for continual learning and catastrophic forgetting. Using CoLaNET (Columnar Layered Network), we show that microcolumns adapt most efficiently to new tasks when they lack shared structure with prior learning. We demonstrate how CoLaNET hyperparameters govern the trade-off between retaining old knowledge (stability) and acqui…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 12 pages, 3 figures

  9. arXiv:2506.17164  [pdf, ps, other]

    cs.IT eess.SP

    Codeword-Segmentation Rate-Splitting Multiple Access and Evaluation under Suboptimal Decoding

    Authors: Sibo Zhang, Bruno Clerckx, David Vargas

    Abstract: Rate-Splitting Multiple Access (RSMA) has been recognized as a promising multiple access technique. We propose a novel architecture for downlink RSMA, namely Codeword-Segmentation RSMA (CS-RSMA). Different from conventional RSMA which splits users' messages into common and private parts before encoding, CS-RSMA encodes the users' messages directly, segments the codewords into common and private pa…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Submitted to IEEE for publication

  10. arXiv:2506.17155  [pdf, ps, other]

    cs.LG cs.AI

    Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity

    Authors: Samin Yeasar Arnob, Scott Fujimoto, Doina Precup

    Abstract: In this paper, we investigate the use of small datasets in the context of offline reinforcement learning (RL). While many common offline RL benchmarks employ datasets with over a million data points, many offline RL applications rely on considerably smaller datasets. We show that offline RL algorithms can overfit on small datasets, resulting in poor performance. To address this challenge, we intro…

    Submitted 20 June, 2025; originally announced June 2025.

  11. arXiv:2506.17140  [pdf, ps, other]

    eess.IV cs.AI cs.CV

    MeDi: Metadata-Guided Diffusion Models for Mitigating Biases in Tumor Classification

    Authors: David Jacob Drexlin, Jonas Dippel, Julius Hense, Niklas Prenißl, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller

    Abstract: Deep learning models have made significant advances in histological prediction tasks in recent years. However, for adaptation in clinical practice, their lack of robustness to varying conditions such as staining, scanner, hospital, and demographics is still a limiting factor: if trained on overrepresented subpopulations, models regularly struggle with less frequent patterns, leading to shortcut le…

    Submitted 20 June, 2025; originally announced June 2025.

  12. arXiv:2506.17137  [pdf, ps, other]

    cs.CV

    On the Theory of Conditional Feature Alignment for Unsupervised Domain-Adaptive Counting

    Authors: Zhuonan Liang, Dongnan Liu, Jianan Fan, Yaxuan Song, Qiang Qu, Yu Yao, Peng Fu, Weidong Cai

    Abstract: Object counting models suffer when deployed across domains with differing density variety, since density shifts are inherently task-relevant and violate standard domain adaptation assumptions. To address this, we propose a theoretical framework of conditional feature alignment. We first formalize the notion of conditional divergence by partitioning each domain into subsets (e.g., object vs. backgr…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 18 pages, 5 figures, 8 tables

  13. arXiv:2506.17136  [pdf, ps, other]

    cs.CV

    Semi-Supervised Multi-Modal Medical Image Segmentation for Complex Situations

    Authors: Dongdong Meng, Sheng Li, Hao Wu, Guoping Wang, Xueqing Yan

    Abstract: Semi-supervised learning addresses the issue of limited annotations in medical images effectively, but its performance is often inadequate for complex backgrounds and challenging tasks. Multi-modal fusion methods can significantly improve the accuracy of medical image segmentation by providing complementary information. However, they face challenges in achieving significant improvements under semi…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 10 pages, 2 figures, accepted at MICCAI 2025

  14. arXiv:2506.17121  [pdf, ps, other]

    cs.CL

    Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?

    Authors: Adithya Bhaskar, Alexander Wettig, Tianyu Gao, Yihe Dong, Danqi Chen

    Abstract: Language models handle increasingly long contexts for tasks such as book summarization, but this leads to growing memory costs for the key-value (KV) cache. Many prior works have proposed ways of discarding KVs from memory, but their approaches are tailored to favorable settings, obscuring caveats like high peak memory and performance degradation, and a fair comparison between methods is difficult…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: We release our code publicly at https://github.com/princeton-pli/PruLong

  15. arXiv:2506.17114  [pdf, ps, other]

    cs.AI

    Mathematical Proof as a Litmus Test: Revealing Failure Modes of Advanced Large Reasoning Models

    Authors: Dadi Guo, Jiayu Liu, Zhiyuan Fan, Zhitao He, Haoran Li, Yumeng Wang, Yi R. Fung

    Abstract: Large reasoning models (e.g., R1, o3) have demonstrated remarkable mathematical problem-solving abilities. However, the high reported accuracy of these advanced models on popular datasets, reliance on purely numerical evaluation and potential benchmark leakage, often masks their true reasoning shortcomings. To address this, we propose leveraging the inherent rigor and methodological complexity of…

    Submitted 20 June, 2025; originally announced June 2025.

  16. arXiv:2506.17095  [pdf, ps, other]

    cs.SE

    Software Fairness Testing in Practice

    Authors: Ronnie de Souza Santos, Matheus de Morais Leca, Reydne Santos, Cleyton Magalhaes

    Abstract: Software testing ensures that a system functions correctly, meets specified requirements, and maintains high quality. As artificial intelligence and machine learning (ML) technologies become integral to software systems, testing has evolved to address their unique complexities. A critical advancement in this space is fairness testing, which identifies and mitigates biases in AI applications to pro…

    Submitted 20 June, 2025; originally announced June 2025.

  17. arXiv:2506.17077  [pdf, ps, other]

    cs.CL

    Simultaneous Translation with Offline Speech and LLM Models in CUNI Submission to IWSLT 2025

    Authors: Dominik Macháček, Peter Polák

    Abstract: This paper describes Charles University submission to the Simultaneous Speech Translation Task of the IWSLT 2025. We cover all four language pairs with a direct or cascade approach. The backbone of our systems is the offline Whisper speech model, which we use for both translation and transcription in simultaneous mode with the state-of-the-art simultaneous policy AlignAtt. We further improve the p…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: IWSLT 2025

  18. arXiv:2506.17076  [pdf, ps, other]

    cs.IT cs.LG

    Neural Polar Decoders for DNA Data Storage

    Authors: Ziv Aharoni, Henry D. Pfister

    Abstract: Synchronization errors, such as insertions and deletions, present a fundamental challenge in DNA-based data storage systems, arising from both synthesis and sequencing noise. These channels are often modeled as insertion-deletion-substitution (IDS) channels, for which designing maximum-likelihood decoders is computationally expensive. In this work, we propose a data-driven approach based on neural…

    Submitted 20 June, 2025; originally announced June 2025.

  19. arXiv:2506.17066  [pdf, ps, other]

    cs.CC

    Quantum k-SAT Related Hypergraph Problems

    Authors: Simon-Luca Kremer, Dorian Rudolph, Sevag Gharibian

    Abstract: The Quantum k-SAT problem is the quantum generalization of the k-SAT problem. It is the problem whether a given local Hamiltonian is frustration-free. Frustration-free means that the ground state of the k-local Hamiltonian minimizes the energy of every local interaction term simultaneously. This is a central question in quantum physics and a canonical QMA_1-complete problem. The Quantum k-SAT prob…

    Submitted 20 June, 2025; originally announced June 2025.

  20. arXiv:2506.17064  [pdf, ps, other]

    q-bio.BM cs.LG

    Generative Modeling of Full-Atom Protein Conformations using Latent Diffusion on Graph Embeddings

    Authors: Aditya Sengar, Ali Hariri, Daniel Probst, Patrick Barth, Pierre Vandergheynst

    Abstract: Generating diverse, all-atom conformational ensembles of dynamic proteins such as G-protein-coupled receptors (GPCRs) is critical for understanding their function, yet most generative models simplify atomic detail or ignore conformational diversity altogether. We present latent diffusion for full protein generation (LD-FPG), a framework that constructs complete all-atom protein structures, includi…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 10 pages (main text), 4 figures, 2 tables. Submitted to NeurIPS 2025. Code and data are publicly available

  21. arXiv:2506.17051  [pdf, ps, other]

    cs.CV

    Relaxed syntax modeling in Transformers for future-proof license plate recognition

    Authors: Florent Meyer, Laurent Guichard, Denis Coquenet, Guillaume Gravier, Yann Soullard, Bertrand Coüasnon

    Abstract: Effective license plate recognition systems are required to be resilient to constant change, as new license plates are released into traffic daily. While Transformer-based networks excel in their recognition at first sight, we observe significant performance drop over time which proves them unsuitable for tense production environments. Indeed, such systems obtain state-of-the-art results on plates…

    Submitted 20 June, 2025; originally announced June 2025.

  22. arXiv:2506.17040  [pdf, ps, other]

    cs.CV cs.NE

    Stretching Beyond the Obvious: A Gradient-Free Framework to Unveil the Hidden Landscape of Visual Invariance

    Authors: Lorenzo Tausani, Paolo Muratore, Morgan B. Talbot, Giacomo Amerio, Gabriel Kreiman, Davide Zoccolan

    Abstract: Uncovering which features' combinations high-level visual units encode is critical to understand how images are transformed into representations that support recognition. While existing feature visualization approaches typically infer a unit's most exciting images, this is insufficient to reveal the manifold of transformations under which responses remain invariant, which is key to generalization…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 21 pages, 9 figures

  23. arXiv:2506.17036  [pdf, ps, other]

    stat.ME cs.LG stat.ML

    Bayesian Joint Model of Multi-Sensor and Failure Event Data for Multi-Mode Failure Prediction

    Authors: Sina Aghaee Dabaghan Fard, Minhee Kim, Akash Deep, Jaesung Lee

    Abstract: Modern industrial systems are often subject to multiple failure modes, and their conditions are monitored by multiple sensors, generating multiple time-series signals. Additionally, time-to-failure data are commonly available. Accurately predicting a system's remaining useful life (RUL) requires effectively leveraging multi-sensor time-series data alongside multi-mode failure event data. In most e…

    Submitted 20 June, 2025; originally announced June 2025.

  24. arXiv:2506.17035  [pdf]

    cs.LG

    Critical Appraisal of Fairness Metrics in Clinical Predictive AI

    Authors: João Matos, Ben Van Calster, Leo Anthony Celi, Paula Dhiman, Judy Wawira Gichoya, Richard D. Riley, Chris Russell, Sara Khalid, Gary S. Collins

    Abstract: Predictive artificial intelligence (AI) offers an opportunity to improve clinical practice and patient outcomes, but risks perpetuating biases if fairness is inadequately addressed. However, the definition of "fairness" remains unclear. We conducted a scoping review to identify and critically appraise fairness metrics for clinical predictive AI. We defined a "fairness metric" as a measure quantify…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 32 pages, 1 figure, 2 tables, 5 boxes, 4 linked supplementary materials

  25. arXiv:2506.17018  [pdf, ps, other]

    cs.AI cs.LG

    A Quantile Regression Approach for Remaining Useful Life Estimation with State Space Models

    Authors: Davide Frizzo, Francesco Borsatti, Gian Antonio Susto

    Abstract: Predictive Maintenance (PdM) is pivotal in Industry 4.0 and 5.0, proactively enhancing efficiency through accurate equipment Remaining Useful Life (RUL) prediction, thus optimizing maintenance scheduling and reducing unexpected failures and premature interventions. This paper introduces a novel RUL estimation approach leveraging State Space Models (SSM) for efficient long-term sequence modeling. T…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Submitted to IFAC Joint Conference on Computers, Cognition, and Communication (J3C) 2025

  26. arXiv:2506.17016  [pdf, ps, other]

    cs.LG cs.MM

    The Hidden Cost of an Image: Quantifying the Energy Consumption of AI Image Generation

    Authors: Giulia Bertazzini, Chiara Albisani, Daniele Baracchi, Dasara Shullani, Roberto Verdecchia

    Abstract: With the growing adoption of AI image generation, in conjunction with the ever-increasing environmental resources demanded by AI, we are urged to answer a fundamental question: What is the environmental impact hidden behind each image we generate? In this research, we present a comprehensive empirical experiment designed to assess the energy consumption of AI image generation. Our experiment compa…

    Submitted 20 June, 2025; originally announced June 2025.

  27. arXiv:2506.17015  [pdf, ps, other]

    cond-mat.str-el cs.LG hep-lat

    Simulating Correlated Electrons with Symmetry-Enforced Normalizing Flows

    Authors: Dominic Schuh, Janik Kreit, Evan Berkowitz, Lena Funcke, Thomas Luu, Kim A. Nicoli, Marcel Rodekamp

    Abstract: We present the first proof of principle that normalizing flows can accurately learn the Boltzmann distribution of the fermionic Hubbard model - a key framework for describing the electronic structure of graphene and related materials. State-of-the-art methods like Hybrid Monte Carlo often suffer from ergodicity issues near the time-continuum limit, leading to biased estimates. Leveraging symmetry-…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 9 pages, 7 figures

  28. arXiv:2506.17007  [pdf, ps, other]

    cs.LG

    Robust Reinforcement Learning for Discrete Compositional Generation via General Soft Operators

    Authors: Marco Jiralerspong, Esther Derman, Danilo Vucetic, Nikolay Malkin, Bilun Sun, Tianyu Zhang, Pierre-Luc Bacon, Gauthier Gidel

    Abstract: A major bottleneck in scientific discovery involves narrowing a large combinatorial set of objects, such as proteins or molecules, to a small set of promising candidates. While this process largely relies on expert knowledge, recent methods leverage reinforcement learning (RL) to enhance this filtering. They achieve this by estimating proxy reward functions from available datasets and using regula…

    Submitted 20 June, 2025; originally announced June 2025.

  29. arXiv:2506.17006  [pdf, ps, other]

    cs.CL cs.CY

    LLM-Generated Feedback Supports Learning If Learners Choose to Use It

    Authors: Danielle R. Thomas, Conrad Borchers, Shambhavi Bhushan, Erin Gatz, Shivang Gupta, Kenneth R. Koedinger

    Abstract: Large language models (LLMs) are increasingly used to generate feedback, yet their impact on learning remains underexplored, especially compared to existing feedback methods. This study investigates how on-demand LLM-generated explanatory feedback influences learning in seven scenario-based tutor training lessons. Analyzing over 2,600 lesson completions from 885 tutor learners, we compare posttest…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Full research paper accepted at EC-TEL '25

  30. arXiv:2506.17001  [pdf, ps, other]

    cs.CL cs.IR

    PersonalAI: Towards digital twins in the graph form

    Authors: Mikhail Menschikov, Dmitry Evseev, Ruslan Kostoev, Ilya Perepechkin, Ilnaz Salimov, Victoria Dochkina, Petr Anokhin, Evgeny Burnaev, Nikita Semenov

    Abstract: The challenge of personalizing language models, specifically the ability to account for a user's history during interactions, is of significant interest. Despite recent advancements in large language models (LLMs) and Retrieval Augmented Generation that have enhanced the factual base of LLMs, the task of retaining extensive personal information and using it to generate personalized responses remai…

    Submitted 20 June, 2025; originally announced June 2025.

  31. arXiv:2506.16994  [pdf, ps, other]

    cs.CV cs.LG

    Prmpt2Adpt: Prompt-Based Zero-Shot Domain Adaptation for Resource-Constrained Environments

    Authors: Yasir Ali Farrukh, Syed Wali, Irfan Khan, Nathaniel D. Bastian

    Abstract: Unsupervised Domain Adaptation (UDA) is a critical challenge in real-world vision systems, especially in resource-constrained environments like drones, where memory and computation are limited. Existing prompt-driven UDA methods typically rely on large vision-language models and require full access to source-domain data during adaptation, limiting their applicability. In this work, we propose Prmp…

    Submitted 20 June, 2025; originally announced June 2025.

  32. arXiv:2506.16982  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Language Bottleneck Models: A Framework for Interpretable Knowledge Tracing and Beyond

    Authors: Antonin Berthon, Mihaela van der Schaar

    Abstract: Accurately assessing student knowledge is critical for effective education, yet traditional Knowledge Tracing (KT) methods rely on opaque latent embeddings, limiting interpretability. Even LLM-based approaches generate direct predictions or summaries that may hallucinate without any accuracy guarantees. We recast KT as an inverse problem: learning the minimum natural-language summary that makes pa…

    Submitted 20 June, 2025; originally announced June 2025.

  33. arXiv:2506.16961  [pdf, ps, other]

    cs.CV eess.IV

    Reversing Flow for Image Restoration

    Authors: Haina Qin, Wenyang Luo, Libin Wang, Dandan Zheng, Jingdong Chen, Ming Yang, Bing Li, Weiming Hu

    Abstract: Image restoration aims to recover high-quality (HQ) images from degraded low-quality (LQ) ones by reversing the effects of degradation. Existing generative models for image restoration, including diffusion and score-based models, often treat the degradation process as a stochastic transformation, which introduces inefficiency and complexity. In this work, we propose ResFlow, a novel image restorat…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: CVPR2025 Final Version; Corresponding Author: Bing Li

    MSC Class: 68U10; ACM Class: I.4.4

  34. arXiv:2506.16960  [pdf, ps, other]

    cs.CV

    Visual-Instructed Degradation Diffusion for All-in-One Image Restoration

    Authors: Wenyang Luo, Haina Qin, Zewen Chen, Libin Wang, Dandan Zheng, Yuming Li, Yufan Liu, Bing Li, Weiming Hu

    Abstract: Image restoration tasks like deblurring, denoising, and dehazing usually need distinct models for each degradation type, restricting their generalization in real-world scenarios with mixed or unknown degradations. In this work, we propose Defusion, a novel all-in-one image restoration framework that utilizes visual instruction-guided degradation diffusion. Unlike existing methods that rel…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: CVPR2025 Final Version; Corresponding Author: Bing Li

    MSC Class: 68U10; ACM Class: I.4.4

  35. arXiv:2506.16956  [pdf, ps, other]

    cs.CC cs.LO math.LO

    The Proof Analysis Problem

    Authors: Noel Arteche, Albert Atserias, Susanna F. de Rezende, Erfan Khaniki

    Abstract: Atserias and Müller (JACM, 2020) proved that for every unsatisfiable CNF formula $\varphi$, the formula $\operatorname{Ref}(\varphi)$, stating "$\varphi$ has small Resolution refutations", does not have subexponential-size Resolution refutations. Conversely, when $\varphi$ is satisfiable, Pudlák (TCS, 2003) showed how to construct a polynomial-size Resolution refutation of…

    Submitted 20 June, 2025; originally announced June 2025.

  36. arXiv:2506.16940  [pdf, ps, other]

    cs.CV

    LunarLoc: Segment-Based Global Localization on the Moon

    Authors: Annika Thomas, Robaire Galliath, Aleksander Garbuz, Luke Anger, Cormac O'Neill, Trevor Johst, Dami Thomas, George Lordos, Jonathan P. How

    Abstract: Global localization is necessary for autonomous operations on the lunar surface where traditional Earth-based navigation infrastructure, such as GPS, is unavailable. As NASA advances toward sustained lunar presence under the Artemis program, autonomous operations will be an essential component of tasks such as robotic exploration and infrastructure deployment. Tasks such as excavation and transpor…

    Submitted 20 June, 2025; originally announced June 2025.

  37. arXiv:2506.16938  [pdf, ps, other]

    quant-ph cs.ET cs.LG

    Enhancing Expressivity of Quantum Neural Networks Based on the SWAP test

    Authors: Sebastian Nagies, Emiliano Tolotti, Davide Pastorello, Enrico Blanzieri

    Abstract: Parameterized quantum circuits represent promising architectures for machine learning applications, yet many lack clear connections to classical models, potentially limiting their ability to translate the wide success of classical neural networks to the quantum realm. We examine a specific type of quantum neural network (QNN) built exclusively from SWAP test circuits, and discuss its mathematical…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 15 pages, 7 figures

  38. A deep learning and machine learning approach to predict neonatal death in the context of São Paulo

    Authors: Mohon Raihan, Plabon Kumar Saha, Rajan Das Gupta, A Z M Tahmidul Kabir, Afia Anjum Tamanna, Md. Harun-Ur-Rashid, Adnan Bin Abdus Salam, Md Tanvir Anjum, A Z M Ahteshamul Kabir

    Abstract: Neonatal death is still a concerning reality for underdeveloped and even some developed countries. Worldwide data indicate that 26.693 babies out of 1,000 births die, according to Macro Trades. To reduce this number, early prediction of endangered babies is crucial. Such prediction enables the opportunity to take ample care of the child and mother so that early child death can be avoided. In this…

    Submitted 20 June, 2025; originally announced June 2025.

    Journal ref: Int J Public Health Sci, vol. 13, no. 1, pp. 179-190, 2024

  39. arXiv:2506.16923  [pdf, ps, other]

    cs.DB

    Advancing Fact Attribution for Query Answering: Aggregate Queries and Novel Algorithms

    Authors: Omer Abramovich, Daniel Deutch, Nave Frost, Ahmet Kara, Dan Olteanu

    Abstract: In this paper, we introduce a novel approach to computing the contribution of input tuples to the result of the query, quantified by the Banzhaf and Shapley values. In contrast to prior algorithmic work that focuses on Select-Project-Join-Union queries, ours is the first practical approach for queries with aggregates. It relies on two novel optimizations that are essential for its practicality and…

    Submitted 20 June, 2025; originally announced June 2025.

  40. arXiv:2506.16918  [pdf, ps, other]

    physics.comp-ph cs.CE cs.LG

    A Neural Operator based Hybrid Microscale Model for Multiscale Simulation of Rate-Dependent Materials

    Authors: Dhananjeyan Jeyaraj, Hamidreza Eivazi, Jendrik-Alexander Tröger, Stefan Wittek, Stefan Hartmann, Andreas Rausch

    Abstract: The behavior of materials is influenced by a wide range of phenomena occurring across various time and length scales. To better understand the impact of microstructure on macroscopic response, multiscale modeling strategies are essential. Numerical methods, such as the $\text{FE}^2$ approach, account for micro-macro interactions to predict the global response in a concurrent manner. However, these…

    Submitted 20 June, 2025; originally announced June 2025.

  41. arXiv:2506.16912  [pdf, ps, other]

    cs.CL cs.LG

    From Data to Knowledge: Evaluating How Efficiently Language Models Learn Facts

    Authors: Daniel Christoph, Max Ploner, Patrick Haller, Alan Akbik

    Abstract: Sample efficiency is a crucial property of language models with practical implications for training efficiency. In real-world text, information follows a long-tailed distribution. Yet, we expect models to learn and recall frequent and infrequent facts. Sample-efficient models are better equipped to handle this challenge of learning and retaining rare information without requiring excessive exposur…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Accepted to the First Workshop on Large Language Model Memorization (L2M2), co-located with ACL 2025 in Vienna

  42. arXiv:2506.16891  [pdf, ps, other]

    cs.CR

    Tracker Installations Are Not Created Equal: Understanding Tracker Configuration of Form Data Collection

    Authors: Julia B. Kieserman, Athanasios Andreou, Chris Geeng, Tobias Lauinger, Damon McCoy

    Abstract: Targeted advertising is fueled by the comprehensive tracking of users' online activity. As a result, advertising companies, such as Google and Meta, encourage website administrators to not only install tracking scripts on their websites but configure them to automatically collect users' Personally Identifying Information (PII). In this study, we aim to characterize how Google and Meta's trackers c…

    Submitted 20 June, 2025; originally announced June 2025.

  43. arXiv:2506.16842  [pdf, ps, other]

    cs.CV cs.RO

    Camera Calibration via Circular Patterns: A Comprehensive Framework with Measurement Uncertainty and Unbiased Projection Model

    Authors: Chaehyeon Song, Dongjae Lee, Jongwoo Lim, Ayoung Kim

    Abstract: Camera calibration using planar targets has been widely favored, and two types of control points have been mainly considered as measurements: the corners of the checkerboard and the centroid of circles. Since a centroid is derived from numerous pixels, the circular pattern provides more precise measurements than the checkerboard. However, the existing projection model of circle centroids is biased…

    Submitted 20 June, 2025; originally announced June 2025.

  44. arXiv:2506.16831  [pdf, ps, other]

    cs.SE

    Accountability of Robust and Reliable AI-Enabled Systems: A Preliminary Study and Roadmap

    Authors: Filippo Scaramuzza, Damian A. Tamburri, Willem-Jan van den Heuvel

    Abstract: This vision paper presents initial research on assessing the robustness and reliability of AI-enabled systems, and key factors in ensuring their safety and effectiveness in practical applications, including a focus on accountability. By exploring evolving definitions of these concepts and reviewing current literature, the study highlights major challenges and approaches in the field. A case study…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: To be published in https://link.springer.com/book/9789819672370

  45. arXiv:2506.16822  [pdf, ps, other]

    cs.RO cs.AI

    Learning Dexterous Object Handover

    Authors: Daniel Frau-Alfaro, Julio Castaño-Amoros, Santiago Puente, Pablo Gil, Roberto Calandra

    Abstract: Object handover is an important skill that we use daily when interacting with other humans. To deploy robots in collaborative settings, like houses, being able to receive and hand over objects safely and efficiently becomes a crucial skill. In this work, we demonstrate the use of Reinforcement Learning (RL) for dexterous object handover between two multi-finger hands. Key to this task is the use…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: Paper accepted for presentation in RoMan 2025

  46. arXiv:2506.16821  [pdf, ps, other]

    cs.CV

    Self-supervised Feature Extraction for Enhanced Ball Detection on Soccer Robots

    Authors: Can Lin, Daniele Affinita, Marco E. P. Zimmatore, Daniele Nardi, Domenico D. Bloisi, Vincenzo Suriani

    Abstract: Robust and accurate ball detection is a critical component for autonomous humanoid soccer robots, particularly in dynamic and challenging environments such as RoboCup outdoor fields. However, traditional supervised approaches require extensive manual annotation, which is costly and time-intensive. To overcome this problem, we present a self-supervised learning framework for domain-adaptive feature…

    Submitted 20 June, 2025; originally announced June 2025.

  47. arXiv:2506.16812  [pdf, ps, other]

    cs.CR

    Zero-Knowledge Proof-of-Location Protocols for Vehicle Subsidies and Taxation Compliance

    Authors: Dan Bogdanov, Eduardo Brito, Annika Jaakson, Peeter Laud, Raul-Martin Rebane

    Abstract: This paper introduces a new set of privacy-preserving mechanisms for verifying compliance with location-based policies for vehicle taxation, or for (electric) vehicle (EV) subsidies, using Zero-Knowledge Proofs (ZKPs). We present the design and evaluation of a Zero-Knowledge Proof-of-Location (ZK-PoL) system that ensures a vehicle's adherence to territorial driving requirements without disclosing…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: This is the extended version of the paper to appear in the Proceedings of the 5th International Workshop on Security and Privacy in Intelligent Infrastructures (SP2I 2025), held in conjunction with the 20th International Conference on Availability, Reliability and Security (ARES 2025)

  48. arXiv:2506.16802  [pdf, ps, other]

    cs.CV

    Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation

    Authors: Riccardo Corvi, Davide Cozzolino, Ekta Prashnani, Shalini De Mello, Koki Nagano, Luisa Verdoliva

    Abstract: Synthetic video generation is progressing very rapidly. The latest models can produce very realistic high-resolution videos that are virtually indistinguishable from real ones. Although several video forensic detectors have been recently proposed, they often exhibit poor generalization, which limits their applicability in a real-world scenario. Our key insight to overcome this issue is to guide th…

    Submitted 20 June, 2025; originally announced June 2025.

  49. arXiv:2506.16793  [pdf, ps, other]

    math.CO cs.IT

    A Generic Construction of $q$-ary Near-MDS Codes Supporting 2-Designs with Lengths Beyond $q+1$

    Authors: Hengfeng Liu, Chunming Tang, Zhengchun Zhou, Dongchun Han, Hao Chen

    Abstract: A linear code with parameters $[n, k, n - k + 1]$ is called maximum distance separable (MDS), and one with parameters $[n, k, n - k]$ is called almost MDS (AMDS). A code is near-MDS (NMDS) if both it and its dual are AMDS. NMDS codes supporting combinatorial $t$-designs have attracted growing interest, yet constructing such codes remains highly challenging. In 2020, Ding and Tang initiated the stu…

    Submitted 20 June, 2025; originally announced June 2025.

  50. arXiv:2506.16791  [pdf, ps, other]

    cs.LG cs.AI

    TabArena: A Living Benchmark for Machine Learning on Tabular Data

    Authors: Nick Erickson, Lennart Purucker, Andrej Tschalzev, David Holzmüller, Prateek Mutalik Desai, David Salinas, Frank Hutter

    Abstract: With the growing popularity of deep learning and foundation models for tabular data, the need for standardized and reliable benchmarks is higher than ever. However, current benchmarks are static. Their design is not updated even if flaws are discovered, model versions are updated, or new models are released. To address this, we introduce TabArena, the first continuously maintained living tabular b…

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 51 pages. Code available at https://tabarena.ai/code; examples at https://tabarena.ai/code-examples; dataset curation at https://tabarena.ai/data-tabular-ml-iid-study and https://tabarena.ai/dataset-curation