Showing 151–200 of 9,557 results for author: li, s

  1. arXiv:2506.12537  [pdf, ps, other]

    cs.CL cs.AI eess.AS

    Speech-Language Models with Decoupled Tokenizers and Multi-Token Prediction

    Authors: Xiaoran Fan, Zhichao Sun, Yangfan Gao, Jingfei Xiong, Hang Yan, Yifei Cao, Jiajun Sun, Shuo Li, Zhihao Zhang, Zhiheng Xi, Yuhao Zhou, Senjie Jin, Changhao Jiang, Junjie Ye, Ming Zhang, Rui Zheng, Zhenhua Han, Yunke Zhang, Demei Yan, Shaokang Dong, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Speech-language models (SLMs) offer a promising path toward unifying speech and text understanding and generation. However, challenges remain in achieving effective cross-modal alignment and high-quality speech generation. In this work, we systematically investigate the impact of key components (i.e., speech tokenizers, speech heads, and speaker modeling) on the performance of LLM-centric SLMs. We…

    Submitted 14 June, 2025; originally announced June 2025.

  2. arXiv:2506.12470  [pdf, ps, other]

    hep-ex

    Study on the impact of radioactive background on Dark Count Rate of 20-inch MCP-PMTs

    Authors: Zeyuan Feng, Zhaoyuan Peng, Haojie Dong, Yanfeng Li, Songyi Li, Xinzhou Guo, Wan Xie, Zhonghua Qin, Weiping Liu

    Abstract: This study systematically investigates the impact of natural radioactive background on the dark count rate (DCR) of 20-inch microchannel plate photomultiplier tubes (MCP-PMTs). Variations in PMT DCR under different radiation conditions were examined via underground tests, lead shielding experiments, and irradiation with \(^{55}\mathrm{Fe}\), \(^{60}\mathrm{Co}\), and \(^{90}\mathrm{Sr}\) sources.…

    Submitted 14 June, 2025; originally announced June 2025.

  3. arXiv:2506.12364  [pdf, ps, other]

    cs.AI cs.CL cs.CV

    MM-R5: MultiModal Reasoning-Enhanced ReRanker via Reinforcement Learning for Document Retrieval

    Authors: Mingjun Xu, Jinhan Dong, Jue Hou, Zehui Wang, Sihang Li, Zhifeng Gao, Renxin Zhong, Hengxing Cai

    Abstract: Multimodal document retrieval systems enable information access across text, images, and layouts, benefiting various domains like document-based question answering, report analysis, and interactive content summarization. Rerankers improve retrieval precision by reordering retrieved candidates. However, current multimodal reranking methods remain underexplored, with significant room for improvement…

    Submitted 22 June, 2025; v1 submitted 14 June, 2025; originally announced June 2025.

  4. arXiv:2506.12285  [pdf, ps, other]

    eess.AS cs.AI cs.LG cs.SD

    CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following

    Authors: Yinghao Ma, Siyou Li, Juntao Yu, Emmanouil Benetos, Akira Maezawa

    Abstract: Recent advances in audio-text large language models (LLMs) have opened new possibilities for music understanding and generation. However, existing benchmarks are limited in scope, often relying on simplified tasks or multi-choice evaluations that fail to reflect the complexity of real-world music analysis. We reinterpret a broad range of traditional MIR annotations as instruction-following formats…

    Submitted 27 June, 2025; v1 submitted 13 June, 2025; originally announced June 2025.

    Comments: Accepted by ISMIR 2025

  5. arXiv:2506.12202  [pdf, ps, other]

    cs.PL cs.AI cs.CR cs.LG

    A Fast, Reliable, and Secure Programming Language for LLM Agents with Code Actions

    Authors: Stephen Mell, Botong Zhang, David Mell, Shuo Li, Ramya Ramalingam, Nathan Yu, Steve Zdancewic, Osbert Bastani

    Abstract: Modern large language models (LLMs) are often deployed as agents, calling external tools adaptively to solve tasks. Rather than directly calling tools, it can be more effective for LLMs to write code to perform the tool calls, enabling them to automatically generate complex control flow such as conditionals and loops. Such code actions are typically provided as Python code, since LLMs are quite pr…

    Submitted 13 June, 2025; originally announced June 2025.

  6. arXiv:2506.12107  [pdf]

    q-bio.MN

    Network Pharmacology Reveals HSPA1A/BST2 as Potential Targets of Ci Bai Capsule's Active Compounds Intervening in Leukopenia

    Authors: Dingfan Zhang, Congshu Huang, Lei Zhou, Boyang Wang, Wei Zhou, Tiantian Xia, Pan Shen, Shao Li, Yue Gao

    Abstract: Background: Radiation-induced leukopenia caused by low-dose exposure is frequently associated with Traditional Chinese Medicine (TCM) syndromes like "blood deficiency" and "fatigue syndrome". Ci Bai Capsule (CB) has been reported to enhance white blood cell levels; however, its mechanisms and bioactive compounds remain unclear. Aim: This study aimed to identify the bioactive compounds group of CB a…

    Submitted 13 June, 2025; originally announced June 2025.

  7. arXiv:2506.12103  [pdf, other]

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  8. arXiv:2506.12073  [pdf, ps, other]

    eess.AS cs.AI cs.CL cs.SD

    Seamless Dysfluent Speech Text Alignment for Disordered Speech Analysis

    Authors: Zongli Ye, Jiachen Lian, Xuanru Zhou, Jinming Zhang, Haodong Li, Shuhe Li, Chenxu Guo, Anaisha Das, Peter Park, Zoe Ezzes, Jet Vonk, Brittany Morin, Rian Bogley, Lisa Wauters, Zachary Miller, Maria Gorno-Tempini, Gopala Anumanchipalli

    Abstract: Accurate alignment of dysfluent speech with intended text is crucial for automating the diagnosis of neurodegenerative speech disorders. Traditional methods often fail to model phoneme similarities effectively, limiting their performance. In this work, we propose Neural LCS, a novel approach for dysfluent text-text and speech-text alignment. Neural LCS addresses key challenges, including partial a…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: Accepted for Interspeech2025

  9. arXiv:2506.12055  [pdf]

    q-bio.NC cs.AI

    Towards Unified Neural Decoding with Brain Functional Network Modeling

    Authors: Di Wu, Linghao Bu, Yifei Jia, Lu Cao, Siyuan Li, Siyu Chen, Yueqian Zhou, Sheng Fan, Wenjie Ren, Dengchang Wu, Kang Wang, Yue Zhang, Yuehui Ma, Jie Yang, Mohamad Sawan

    Abstract: Recent achievements in implantable brain-computer interfaces (iBCIs) have demonstrated the potential to decode cognitive and motor behaviors with intracranial brain recordings; however, individual physiological and electrode implantation heterogeneities have constrained current approaches to neural decoding within single individuals, rendering interindividual neural decoding elusive. Here, we pres…

    Submitted 30 May, 2025; originally announced June 2025.

  10. arXiv:2506.11629  [pdf, ps, other]

    eess.SP

    FieldFormer: Self-supervised Reconstruction of Physical Fields via Tensor Attention Prior

    Authors: Panqi Chen, Siyuan Li, Lei Cheng, Xiao Fu, Yik-Chung Wu, Sergios Theodoridis

    Abstract: Reconstructing physical field tensors from \textit{in situ} observations, such as radio maps and ocean sound speed fields, is crucial for enabling environment-aware decision making in various applications, e.g., wireless communications and underwater acoustics. Field data reconstruction is often challenging, due to the limited and noisy nature of the observations, necessitating the incorporation o…

    Submitted 13 June, 2025; originally announced June 2025.

  11. arXiv:2506.11480  [pdf, ps, other]

    cs.LG cs.AI

    LearnAlign: Reasoning Data Selection for Reinforcement Learning in Large Language Models Based on Improved Gradient Alignment

    Authors: Shipeng Li, Shikun Li, Zhiqin Yang, Xinghua Zhang, Gaode Chen, Xiaobo Xia, Hengyu Liu, Zhe Peng

    Abstract: Reinforcement learning (RL) has become a key technique for enhancing LLMs' reasoning abilities, yet its data inefficiency remains a major bottleneck. To address this critical yet challenging issue, we present a novel gradient-alignment-based method, named LearnAlign, which intelligently selects the learnable and representative training reasoning data for RL post-training. To overcome the issue of…

    Submitted 4 July, 2025; v1 submitted 13 June, 2025; originally announced June 2025.

  12. arXiv:2506.11393  [pdf, ps, other]

    cs.HC cs.CY

    Co-Designing a Chatbot for Culturally Competent Clinical Communication: Experience and Reflections

    Authors: Sandro Radovanović, Shuangyu Li

    Abstract: Clinical communication skills are essential for preparing healthcare professionals to provide equitable care across cultures. However, traditional training with simulated patients can be resource intensive and difficult to scale, especially in under-resourced settings. In this project, we explore the use of an AI-driven chatbot to support culturally competent communication training for medical stu…

    Submitted 18 May, 2025; originally announced June 2025.

    Comments: 19 pages, 7 figures

  13. arXiv:2506.11293  [pdf, ps, other]

    eess.SY

    Influence Functions for Data Attribution in Linear System Identification and LQR Control

    Authors: Jiachen Li, Shihao Li, Jiamin Xu, Soovadeep Bakshi, Dongmei Chen

    Abstract: Understanding the influence of individual training data points is crucial for developing reliable machine learning-based control systems. However, conventional methods like leave-one-out retraining are computationally infeasible for large datasets. This paper introduces a framework using influence functions to efficiently approximate the impact of removing specific training trajectories on both le…

    Submitted 12 June, 2025; originally announced June 2025.

  14. arXiv:2506.11279  [pdf]

    eess.SY

    Smart Predict-Then-Control: Integrating identification and control via decision regret

    Authors: Jiachen Li, Shihao Li, Dongmei Chen

    Abstract: This paper presents Smart Predict-Then-Control (SPC) framework for integrating system identification and control. This novel SPC framework addresses the limitations of traditional methods, the unaligned modeling error and control cost. It leverages decision regret to prioritize control-relevant dynamics, optimizing prediction errors based on their impact on control performance. Furthermore, the ex…

    Submitted 12 June, 2025; originally announced June 2025.

  15. arXiv:2506.11264  [pdf]

    cs.RO eess.SY

    Robust Optimal Task Planning to Maximize Battery Life

    Authors: Jiachen Li, Chu Jian, Feiyang Zhao, Shihao Li, Wei Li, Dongmei Chen

    Abstract: This paper proposes a control-oriented optimization platform for autonomous mobile robots (AMRs), focusing on extending battery life while ensuring task completion. The requirement of fast AMR task planning while maintaining minimum battery state of charge, thus maximizing the battery life, renders a bilinear optimization problem. McCormick envelop technique is proposed to linearize the bilinear t…

    Submitted 12 June, 2025; originally announced June 2025.

  16. arXiv:2506.11182  [pdf, ps, other]

    q-bio.GN cs.AI

    Multimodal Modeling of CRISPR-Cas12 Activity Using Foundation Models and Chromatin Accessibility Data

    Authors: Azim Dehghani Amirabad, Yanfei Zhang, Artem Moskalev, Sowmya Rajesh, Tommaso Mansi, Shuwei Li, Mangal Prakash, Rui Liao

    Abstract: Predicting guide RNA (gRNA) activity is critical for effective CRISPR-Cas12 genome editing but remains challenging due to limited data, variation across protospacer adjacent motifs (PAMs-short sequence requirements for Cas binding), and reliance on large-scale training. We investigate whether pre-trained biological foundation model originally trained on transcriptomic data can improve gRNA activit…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: This manuscript has been accepted by ICML workshop 2025

  17. arXiv:2506.11078  [pdf, ps, other]

    cs.CL

    RoE-FND: A Case-Based Reasoning Approach with Dual Verification for Fake News Detection via LLMs

    Authors: Yuzhou Yang, Yangming Zhou, Zhiying Zhu, Zhenxing Qian, Xinpeng Zhang, Sheng Li

    Abstract: The proliferation of deceptive content online necessitates robust Fake News Detection (FND) systems. While evidence-based approaches leverage external knowledge to verify claims, existing methods face critical limitations: noisy evidence selection, generalization bottlenecks, and unclear decision-making processes. Recent efforts to harness Large Language Models (LLMs) for FND introduce new challen…

    Submitted 4 June, 2025; originally announced June 2025.

  18. arXiv:2506.10947  [pdf, ps, other]

    cs.AI cs.LG

    Spurious Rewards: Rethinking Training Signals in RLVR

    Authors: Rulin Shao, Shuyue Stella Li, Rui Xin, Scott Geng, Yiping Wang, Sewoong Oh, Simon Shaolei Du, Nathan Lambert, Sewon Min, Ranjay Krishna, Yulia Tsvetkov, Hannaneh Hajishirzi, Pang Wei Koh, Luke Zettlemoyer

    Abstract: We show that reinforcement learning with verifiable rewards (RLVR) can elicit strong mathematical reasoning in certain models even with spurious rewards that have little, no, or even negative correlation with the correct answer. For example, RLVR improves MATH-500 performance for Qwen2.5-Math-7B in absolute points by 21.4% (random reward), 13.8% (format reward), 24.1% (incorrect label), 26.0% (1-s…

    Submitted 12 June, 2025; originally announced June 2025.

  19. arXiv:2506.10878  [pdf, ps, other]

    quant-ph

    Quantum secret sharing in a triangular superconducting quantum network

    Authors: Haoxiong Yan, Allen Zang, Joel Grebel, Xuntao Wu, Ming-Han Chou, Gustav Andersson, Christopher R. Conner, Yash J. Joshi, Shiheng Li, Jacob M. Miller, Rhys G. Povey, Hong Qiao, Eric Chitambar, Andrew N. Cleland

    Abstract: We present a three-node quantum communication testbed with a triangular topology, each side of the triangle formed by a 1.3-meter-long transmission line. We demonstrate state transfer and entanglement generation between any two nodes, generate genuine multipartite entangled GHZ states, and implement quantum secret sharing (QSS) of classical information. Our experiments show that the QSS protocol c…

    Submitted 12 June, 2025; originally announced June 2025.

  20. arXiv:2506.10859  [pdf, ps, other]

    cs.IR cs.AI

    Precise Zero-Shot Pointwise Ranking with LLMs through Post-Aggregated Global Context Information

    Authors: Kehan Long, Shasha Li, Chen Xu, Jintao Tang, Ting Wang

    Abstract: Recent advancements have successfully harnessed the power of Large Language Models (LLMs) for zero-shot document ranking, exploring a variety of prompting strategies. Comparative approaches like pairwise and listwise achieve high effectiveness but are computationally intensive and thus less practical for larger-scale applications. Scoring-based pointwise approaches exhibit superior efficiency by i…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: Accepted by SIGIR 2025

  21. arXiv:2506.10857  [pdf, ps, other]

    cs.CV cs.AI cs.MM

    VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos

    Authors: Jiashuo Yu, Yue Wu, Meng Chu, Zhifei Ren, Zizheng Huang, Pei Chu, Ruijie Zhang, Yinan He, Qirui Li, Songze Li, Zhenxiang Li, Zhongying Tu, Conghui He, Yu Qiao, Yali Wang, Yi Wang, Limin Wang

    Abstract: We present VRBench, the first long narrative video benchmark crafted for evaluating large models' multi-step reasoning capabilities, addressing limitations in existing evaluations that overlook temporal reasoning and procedural validity. It comprises 1,010 long videos (with an average duration of 1.6 hours), along with 9,468 human-labeled multi-step question-answering pairs and 30,292 reasoning st…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: Technical Report

  22. arXiv:2506.10774  [pdf, ps, other]

    cs.CV cs.AI

    Stroke-based Cyclic Amplifier: Image Super-Resolution at Arbitrary Ultra-Large Scales

    Authors: Wenhao Guo, Peng Lu, Xujun Peng, Zhaoran Zhao, Sheng Li

    Abstract: Prior Arbitrary-Scale Image Super-Resolution (ASISR) methods often experience a significant performance decline when the upsampling factor exceeds the range covered by the training data, introducing substantial blurring. To address this issue, we propose a unified model, Stroke-based Cyclic Amplifier (SbCA), for ultra-large upsampling tasks. The key of SbCA is the stroke vector amplifier, which de…

    Submitted 12 June, 2025; originally announced June 2025.

  23. arXiv:2506.10516  [pdf, ps, other]

    cs.CV cs.AI

    CogStream: Context-guided Streaming Video Question Answering

    Authors: Zicheng Zhao, Kangyu Wang, Shijie Li, Rui Qian, Weiyao Lin, Huabin Liu

    Abstract: Despite advancements in Video Large Language Models (Vid-LLMs) improving multimodal understanding, challenges persist in streaming video reasoning due to its reliance on contextual information. Existing paradigms feed all available historical contextual information into Vid-LLMs, resulting in a significant computational burden for visual data processing. Furthermore, the inclusion of irrelevant co…

    Submitted 12 June, 2025; originally announced June 2025.

  24. arXiv:2506.10503  [pdf, ps, other]

    cs.CV cs.AI

    Semantic Localization Guiding Segment Anything Model For Reference Remote Sensing Image Segmentation

    Authors: Shuyang Li, Shuang Wang, Zhuangzhuang Sun, Jing Xiao

    Abstract: The Reference Remote Sensing Image Segmentation (RRSIS) task generates segmentation masks for specified objects in images based on textual descriptions, which has attracted widespread attention and research interest. Current RRSIS methods rely on multi-modal fusion backbones and semantic segmentation heads but face challenges like dense annotation requirements and complex scene interpretation. To…

    Submitted 12 June, 2025; originally announced June 2025.

  25. arXiv:2506.10484  [pdf, ps, other]

    cs.SE

    EXPEREPAIR: Dual-Memory Enhanced LLM-based Repository-Level Program Repair

    Authors: Fangwen Mu, Junjie Wang, Lin Shi, Song Wang, Shoubin Li, Qing Wang

    Abstract: Automatically repairing software issues remains a fundamental challenge at the intersection of software engineering and AI. Although recent advancements in Large Language Models (LLMs) have demonstrated potential for repository-level repair tasks, current methodologies exhibit two notable limitations: (1) they often address issues in isolation, neglecting to incorporate insights from previously re…

    Submitted 12 June, 2025; originally announced June 2025.

  26. arXiv:2506.10445  [pdf, ps, other]

    quant-ph

    Ultrahigh threshold nonstabilizer nonlinear quantum error correcting code

    Authors: Maga Grafe, Kaixuan Zhou, Zaman Tekin, Zhiyuan Lin, Sen Li, Fengquan Zhang, Valentin Ivannikov, Tim Byrnes

    Abstract: We introduce a novel type of quantum error correcting code, called the spinor code, based on spaces defined by total spin. The code is a nonstabilizer code, and is also a nonlinear quantum error correcting code, meaning that quantum information is encoded in a parameterized family of quantum states, rather than a linear superposition of code words. Syndrome measurements are performed by projecting…

    Submitted 12 June, 2025; originally announced June 2025.

  27. arXiv:2506.10376  [pdf, ps, other]

    cs.SE cs.HC

    MLLM-Based UI2Code Automation Guided by UI Layout Information

    Authors: Fan Wu, Cuiyun Gao, Shuqing Li, Xin-Cheng Wen, Qing Liao

    Abstract: Converting user interfaces into code (UI2Code) is a crucial step in website development, which is time-consuming and labor-intensive. The automation of UI2Code is essential to streamline this task, beneficial for improving the development efficiency. There exist deep learning-based methods for the task; however, they heavily rely on a large amount of labeled training data and struggle with general…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: Accepted by the 34th International Symposium on Software Testing and Analysis (ISSTA 2025)

  28. arXiv:2506.10359  [pdf, ps, other]

    cs.RO cs.LG

    Demonstrating Multi-Suction Item Picking at Scale via Multi-Modal Learning of Pick Success

    Authors: Che Wang, Jeroen van Baar, Chaitanya Mitash, Shuai Li, Dylan Randle, Weiyao Wang, Sumedh Sontakke, Kostas E. Bekris, Kapil Katyal

    Abstract: This work demonstrates how autonomously learning aspects of robotic operation from sparsely-labeled, real-world data of deployed, engineered solutions at industrial scale can provide with solutions that achieve improved performance. Specifically, it focuses on multi-suction robot picking and performs a comprehensive study on the application of multi-modal visual encoders for predicting the success…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: Accepted to Robotics: Science and Systems (RSS 2025), 15 pages

  29. arXiv:2506.10316  [pdf, ps, other]

    hep-ex

    Search for sub-GeV invisible particles in inclusive decays of $J/ψ$ to $φ$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (704 additional authors not shown)

    Abstract: A search for an invisible particle, $X$, with a mass between 0 and 0.96 $\textrm{GeV}/\textit{c}^{2}$, is performed in the process $J/ψ\rightarrowφ+ X$ using $(8774.0\pm39.4)\times10^{6}$ $J/ψ$ events collected with the BESIII detector from 2017 to 2019. The $φ$ meson is fully reconstructed and an efficient veto of photons, neutral and charged hadrons up to twice the $K_L^0$ mass is applied to the…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 10 pages, 3 figures

  30. arXiv:2506.09931  [pdf, ps, other]

    cs.IT eess.SP

    Faster-than-Nyquist Signaling is Good for Single-Carrier ISAC: An Analytical Study

    Authors: Shuangyang Li, Fan Liu, Yifeng Xiong, Weijie Yuan, Baoming Bai, Christos Masouros, Giuseppe Caire

    Abstract: In this paper, we provide an analytical study of single-carrier faster-than-Nyquist (FTN) signaling for integrated sensing and communications (ISAC). Our derivations show that FTN is advantageous for ISAC, and reveal new insights that these advantages come from the fact that FTN signaling can effectively avoid the spectral aliasing due to the mismatch between the symbol rate and the bandwidth of t…

    Submitted 11 June, 2025; originally announced June 2025.

  31. arXiv:2506.09765  [pdf, ps, other]

    cs.RO cs.LG

    Learning to Optimize Package Picking for Large-Scale, Real-World Robot Induction

    Authors: Shuai Li, Azarakhsh Keipour, Sicong Zhao, Srinath Rajagopalan, Charles Swan, Kostas E. Bekris

    Abstract: Warehouse automation plays a pivotal role in enhancing operational efficiency, minimizing costs, and improving resilience to workforce variability. While prior research has demonstrated the potential of machine learning (ML) models to increase picking success rates in large-scale robotic fleets by prioritizing high-probability picks and packages, these efforts primarily focused on predicting succe…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: The 19th International Symposium on Experimental Robotics (ISER 2025); 6-10 July 2025, Santa Fe, New Mexico, USA; 10 pages

  32. arXiv:2506.09712  [pdf, ps, other]

    physics.ins-det

    Optimization and validation of charge transport simulation for hybrid pixel detectors incorporating the repulsion effect

    Authors: X. Xie, A. Bergamaschi, M. Brückner, M. Carulla, R. Dinapoli, S. Ebner, K. Ferjaoui, A. Francesca Mazzoleni, J. Franklin Mulvey, V. Gautam, D. Greiffenberg, S. Hasanaj, J. Heymes, V. Hinger, V. Kedych, T. King, S. Li, C. Lopez-Cuenca, D. Mezza, K. Moustakas, A. Mozzanica, M. Müller, K. A. Paton, C. Ruder, B. Schmitt, et al. (5 additional authors not shown)

    Abstract: For emerging applications of hybrid pixel detectors which require high spatial resolution, e.g., subpixel interpolation in X-ray imaging and deep learning-based electron localization, accurate modeling of charge transport processes in the sensor is highly demanded. To address this, two open-source, time-stepping Monte Carlo simulation methods have been developed, both explicitly incorporating char…

    Submitted 11 June, 2025; originally announced June 2025.

  33. arXiv:2506.09655  [pdf, ps, other]

    cs.AI cs.LG

    DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy

    Authors: Kaixuan Xu, Jiajun Chai, Sicheng Li, Yuqian Fu, Yuanheng Zhu, Dongbin Zhao

    Abstract: Diplomacy is a complex multiplayer game that requires both cooperation and competition, posing significant challenges for AI systems. Traditional methods rely on equilibrium search to generate extensive game data for training, which demands substantial computational resources. Large Language Models (LLMs) offer a promising alternative, leveraging pre-trained knowledge to achieve strong performance…

    Submitted 23 June, 2025; v1 submitted 11 June, 2025; originally announced June 2025.

    Comments: Accepted to the 42nd International Conference on Machine Learning (ICML 2025)

  34. arXiv:2506.09562  [pdf, ps, other]

    cs.CR cs.LG

    TooBadRL: Trigger Optimization to Boost Effectiveness of Backdoor Attacks on Deep Reinforcement Learning

    Authors: Songze Li, Mingxuan Zhang, Kang Wei, Shouling Ji

    Abstract: Deep reinforcement learning (DRL) has achieved remarkable success in a wide range of sequential decision-making domains, including robotics, healthcare, smart grids, and finance. Recent research demonstrates that attackers can efficiently exploit system vulnerabilities during the training phase to execute backdoor attacks, producing malicious actions when specific trigger patterns are present in t…

    Submitted 12 June, 2025; v1 submitted 11 June, 2025; originally announced June 2025.

  35. arXiv:2506.09502  [pdf, ps, other]

    cs.CR

    The Security Overview and Analysis of 3GPP 5G MAC CE

    Authors: Jin Cao, Yuanyuan Yang, Ruhui Ma, Sheng Li, Hui Li

    Abstract: To more effectively control and allocate network resources, MAC CE has been introduced into the network protocol, which is a type of control signaling located in the MAC layer. Since MAC CE lacks encryption and integrity protection mechanisms provided by PDCP, the control signaling carried by MAC CE is vulnerable to interception or tampering by attackers during resource scheduling and allocation.…

    Submitted 11 June, 2025; v1 submitted 11 June, 2025; originally announced June 2025.

  36. arXiv:2506.09443  [pdf, ps, other]

    cs.CR

    LLMs Cannot Reliably Judge (Yet?): A Comprehensive Assessment on the Robustness of LLM-as-a-Judge

    Authors: Songze Li, Chuokun Xu, Jiaying Wang, Xueluan Gong, Chen Chen, Jirui Zhang, Jun Wang, Kwok-Yan Lam, Shouling Ji

    Abstract: Large Language Models (LLMs) have demonstrated remarkable intelligence across various tasks, which has inspired the development and widespread adoption of LLM-as-a-Judge systems for automated model testing, such as red teaming and benchmarking. However, these systems are susceptible to adversarial attacks that can manipulate evaluation outcomes, raising concerns about their robustness and, consequ…

    Submitted 11 June, 2025; originally announced June 2025.

  37. arXiv:2506.09386  [pdf, ps, other]

    hep-ex

    Search for the charmonium weak decays $J/ψ\to D_{s}^{-}ρ^{+}+c.c.$ and $J/ψ\to D_{s}^{-}π^{+}+c.c.$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (705 additional authors not shown)

    Abstract: Based on $(10087\pm44)\times 10^6$ $J/ψ$ events recorded with the BESIII detector, we search for the rare charmonium weak decays $J/ψ\to D_{s}^{-}ρ^{+}+c.c.$ and $J/ψ\to D_{s}^{-}π^{+}+c.c.$ No signal is observed, and upper limits on the branching fractions at the $90\%$ confidence level are set as $\mathcal{B}(J/ψ\to D_{s}^{-}ρ^{+}+c.c.)<8.0\times10^{-7}$ and…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 18 pages, 3 figures

  38. arXiv:2506.09340  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    RePO: Replay-Enhanced Policy Optimization

    Authors: Siheng Li, Zhanhui Zhou, Wai Lam, Chao Yang, Chaochao Lu

    Abstract: Reinforcement learning (RL) is vital for optimizing large language models (LLMs). Recent Group Relative Policy Optimization (GRPO) estimates advantages using multiple on-policy outputs per prompt, leading to high computational costs and low data efficiency. To address this, we introduce Replay-Enhanced Policy Optimization (RePO), which leverages diverse replay strategies to retrieve off-policy sam…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: Project Page: https://github.com/SihengLi99/RePO

  39. arXiv:2506.09335  [pdf, ps, other]

    cs.MA cs.AI

    Intelligent System of Emergent Knowledge: A Coordination Fabric for Billions of Minds

    Authors: Moshi Wei, Sparks Li

    Abstract: The Intelligent System of Emergent Knowledge (ISEK) establishes a decentralized network where human and artificial intelligence agents collaborate as peers, forming a self-organizing cognitive ecosystem. Built on Web3 infrastructure, ISEK combines three fundamental principles: (1) a decentralized multi-agent architecture resistant to censorship, (2) symbiotic AI-human collaboration with equal part…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 11 pages, 1 figure

  40. arXiv:2506.08873  [pdf]

    cond-mat.supr-con cond-mat.mtrl-sci

    Identifying vortex lattice in type-II superconductors via the dynamic magnetostrictive effect

    Authors: Peipei Lu, Mengju Yuan, Jing Zhang, Qiang Gao, Shuang Liu, Yugang Zhang, Shipeng Shen, Long Zhang, Jun Lu, Xiaoyuan Zhou, Mingquan He, Aifeng Wang, Yang Li, Wenshan Hong, Shiliang Li, Huiqian Luo, Xingjiang Zhou, Xianhui Chen, Young Sun, Yisheng Chai

    Abstract: In type-I superconductors, zero electrical resistivity and perfect diamagnetism define two fundamental criteria for superconducting behavior. In contrast, type-II superconductors exhibit more complex mixed-state physics, where magnetic flux penetrates the material above the lower critical field Hc1 in the form of quantized vortices, each carrying a single flux quantum. These vortices form a two-di…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 27 pages, 8 figures, submitted

  41. arXiv:2506.08773  [pdf]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Atomic to mesoscale hierarchical structures and magnetic states in an anisotropic layered ferromagnet FePd2Te2

    Authors: Shuo Mi, Manyu Wang, Bingxian Shi, Songyang Li, Xiaoxiao Pei, Yanyan Geng, Shumin Meng, Rui Xu, Li Huang, Wei Ji, Fei Pang, Peng Cheng, Jianfeng Guo, Zhihai Cheng

    Abstract: Two-dimensional (2D) magnetic materials have predominantly exhibited easy-axis or easy-plane anisotropy and display a high sensitivity to the underlying crystal structure and lattice symmetry. Recently, an in-plane anisotropic 2D ferromagnet of FePd2Te2 has been discovered with intriguing structure and quasi-one-dimensional spin system. Here, we report a real-space investigation of its twinning st…

    Submitted 10 June, 2025; originally announced June 2025.

  42. arXiv:2506.08624  [pdf, ps, other]

    nucl-ex

    Measurement of $ψ(2S)$ to $J/ψ$ cross-section ratio as function of multiplicity in $p$Pb collisions at $\sqrt{s_{NN}} = 8.16$ TeV

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, L. An, et al. (1137 additional authors not shown)

    Abstract: The production ratio of $ψ(2S)$ to $J/ψ$ charmonium states is presented as a function of multiplicity in proton-lead collisions at a centre-of-mass energy of $\sqrt{s_{NN}}=8.16$ TeV, for both prompt and nonprompt sources. The total luminosity recorded by the LHCb experiment corresponds to 13.6 $pb^{-1}$ for $p$Pb collisions and 20.8 $pb^{-1}$ for Pb$p$ collisions, where the first particle indicat…

    Submitted 12 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

    Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/4177/ (LHCb public pages)

    Report number: LHCb-PAPER-2025-011, CERN-EP-2025-114

  43. arXiv:2506.08576  [pdf, ps, other]

    hep-ex

    Measurement of the $η$ transition form factor through $η' \rightarrow π^+π^-η$ decay

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (680 additional authors not shown)

    Abstract: Based on a sample of $(1.0087\pm0.0044)\times10^{10}$ $J/ψ$ events collected at BESIII, the transition form factor of the $η$ meson is extracted by analyzing $J/ψ\toγη',~η'\toπ^+π^-η,~η\toγl^+l^-$ ($l$=$e$, $μ$) events. The measured slope of the transition form factor is $Λ^{-2}=1.645\pm0.093_{\rm stat.}\pm {0.024_{\rm sys.}}$ (GeV/$c^2$)$^{-2}$ for the di-electron channel and…

    Submitted 10 June, 2025; originally announced June 2025.

  44. arXiv:2506.08561  [pdf, ps, other]

    cs.SE

    Detecting State Manipulation Vulnerabilities in Smart Contracts Using LLM and Static Analysis

    Authors: Hao Wu, Haijun Wang, Shangwang Li, Yin Wu, Ming Fan, Yitao Zhao, Ting Liu

    Abstract: An increasing number of DeFi protocols are gaining popularity, facilitating transactions among multiple anonymous users. State Manipulation is one of the notorious attacks in DeFi smart contracts, with the price variable being the most commonly exploited state variable: attackers manipulate token prices to gain illicit profits. In this paper, we propose PriceSleuth, a novel method that leverages the La…

    Submitted 10 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

  45. arXiv:2506.08470  [pdf, ps, other]

    cs.CV

    MARMOT: Masked Autoencoder for Modeling Transient Imaging

    Authors: Siyuan Shen, Ziheng Wang, Xingyue Peng, Suan Xia, Ruiqian Li, Shiying Li, Jingyi Yu

    Abstract: Pretrained models have demonstrated impressive success in many modalities such as language and vision. Recent works facilitate the pretraining paradigm in imaging research. Transients are a novel modality, which are captured for an object as photon counts versus arrival times using a precisely time-resolved sensor. In particular for non-line-of-sight (NLOS) scenarios, transients of hidden objects…

    Submitted 10 June, 2025; originally announced June 2025.

  46. arXiv:2506.08365  [pdf, ps, other]

    cs.LG q-bio.BM

    AlphaFold Database Debiasing for Robust Inverse Folding

    Authors: Cheng Tan, Zhenxiao Cao, Zhangyang Gao, Siyuan Li, Yufei Huang, Stan Z. Li

    Abstract: The AlphaFold Protein Structure Database (AFDB) offers unparalleled structural coverage at near-experimental accuracy, positioning it as a valuable resource for data-driven protein design. However, its direct use in training deep models that are sensitive to fine-grained atomic geometry, such as inverse folding, exposes a critical limitation. Comparative analysis of structural feature distribution…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: Under review

  47. arXiv:2506.08298  [pdf, ps, other]

    cs.LG cs.SI

    H$^2$GFM: Towards unifying Homogeneity and Heterogeneity on Text-Attributed Graphs

    Authors: Trung-Kien Nguyen, Heng Ping, Shixuan Li, Peiyu Zhang, Nikos Kanakaris, Nicholas Kotov, Paul Bogdan

    Abstract: The growing interest in and applications of graph learning in diverse domains have propelled the development of a unified model generalizing well across different graphs and tasks, known as the Graph Foundation Model (GFM). Existing research has leveraged text-attributed graphs (TAGs) to tackle the heterogeneity in node features among graphs. However, they primarily focus on homogeneous TAGs (HoTAGs…

    Submitted 14 June, 2025; v1 submitted 9 June, 2025; originally announced June 2025.

  48. arXiv:2506.08003  [pdf, ps, other]

    cs.CV cs.AI

    Audio-Sync Video Generation with Multi-Stream Temporal Control

    Authors: Shuchen Weng, Haojie Zheng, Zheng Chang, Si Li, Boxin Shi, Xinlong Wang

    Abstract: Audio is inherently temporal and closely synchronized with the visual world, making it a naturally aligned and expressive control signal for controllable video generation (e.g., movies). Beyond control, directly translating audio into video is essential for understanding and visualizing rich audio narratives (e.g., Podcasts or historical recordings). However, existing approaches fall short in gene…

    Submitted 9 June, 2025; originally announced June 2025.

  49. arXiv:2506.07978  [pdf, ps, other]

    astro-ph.CO astro-ph.GA

    Not so dark, not so dense: an alternative explanation for the lensing subhalo in SDSSJ0946+1006

    Authors: Qiuhan He, Andrew Robertson, James W. Nightingale, Aristeidis Amvrosiadis, Shaun Cole, Carlos S. Frenk, Samuel C. Lange, Shubo Li, Ran Li, Xiaoyue Cao, Leo W. H. Fung, Xianghao Ma, Richard Massey, Kaihao Wang, Maximilian von Wietersheim-Kramsta

    Abstract: Previous studies of the strong lens system SDSSJ0946+1006 have reported a dark matter subhalo with an unusually high central density, potentially challenging the standard cold dark matter (CDM) paradigm. However, these analyses assumed the subhalo to be completely dark, neglecting the possibility that it may host a faint galaxy. In this work, we revisit the lensing analysis of SDSSJ0946+1006, expl…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: Comments Welcome!

  50. arXiv:2506.07966  [pdf, ps, other]

    cs.CV

    SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence

    Authors: Ziyang Gong, Wenhao Li, Oliver Ma, Songyuan Li, Jiayi Ji, Xue Yang, Gen Luo, Junchi Yan, Rongrong Ji

    Abstract: Multimodal Large Language Models (MLLMs) have achieved remarkable progress in various multimodal tasks. To pursue higher intelligence in space, MLLMs require integrating multiple atomic spatial capabilities to handle complex and dynamic tasks. However, existing benchmarks struggle to comprehensively evaluate the spatial intelligence of common MLLMs from the atomic level to the compositional level.…

    Submitted 9 June, 2025; originally announced June 2025.