
Showing 51–100 of 12,295 results for author: Li, h

  1. arXiv:2507.01492  [pdf, ps, other]

    cs.CV

    AVC-DPO: Aligned Video Captioning via Direct Preference Optimization

    Authors: Jiyang Tang, Hengyi Li, Yifan Du, Wayne Xin Zhao

    Abstract: Although video multimodal large language models (video MLLMs) have achieved substantial progress in video captioning tasks, it remains challenging to adjust the focal emphasis of video captions according to human preferences. To address this limitation, we propose Aligned Video Captioning via Direct Preference Optimization (AVC-DPO), a post-training framework designed to enhance captioning capabil…

    Submitted 2 July, 2025; originally announced July 2025.

  2. arXiv:2507.01449  [pdf, ps, other]

    cs.CL

    LogitSpec: Accelerating Retrieval-based Speculative Decoding via Next Next Token Speculation

    Authors: Tianyu Liu, Qitan Lv, Hao Li, Xing Gao, Xiao Sun

    Abstract: Speculative decoding (SD), where a small draft model is employed to propose draft tokens in advance and then the target model validates them in parallel, has emerged as a promising technique for LLM inference acceleration. Many endeavors to improve SD are to eliminate the need for a draft model and generate draft tokens in a retrieval-based manner in order to further alleviate the drafting overhea…

    Submitted 2 July, 2025; originally announced July 2025.

  3. arXiv:2507.01291  [pdf, ps, other]

    eess.IV cs.CV

    PanTS: The Pancreatic Tumor Segmentation Dataset

    Authors: Wenxuan Li, Xinze Zhou, Qi Chen, Tianyu Lin, Pedro R. A. S. Bassi, Szymon Plotka, Jaroslaw B. Cwikla, Xiaoxi Chen, Chen Ye, Zheren Zhu, Kai Ding, Heng Li, Kang Wang, Yang Yang, Yucheng Tang, Daguang Xu, Alan L. Yuille, Zongwei Zhou

    Abstract: PanTS is a large-scale, multi-institutional dataset curated to advance research in pancreatic CT analysis. It contains 36,390 CT scans from 145 medical centers, with expert-validated, voxel-wise annotations of over 993,000 anatomical structures, covering pancreatic tumors, pancreas head, body, and tail, and 24 surrounding anatomical structures such as vascular/skeletal structures and abdominal/tho…

    Submitted 1 July, 2025; originally announced July 2025.

  4. arXiv:2507.01249  [pdf, ps, other]

    hep-ex

    Search for an Axion-Like Particle in $B\rightarrow K^{(*)} a (\rightarrowγγ)$ Decays at Belle

    Authors: Belle, Belle II Collaborations: I. Adachi, L. Aggarwal, H. Ahmed, Y. Ahn, H. Aihara, N. Akopov, S. Alghamdi, M. Alhakami, A. Aloisio, N. Althubiti, K. Amos, M. Angelsmark, N. Anh Ky, C. Antonioli, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, et al. (400 additional authors not shown)

    Abstract: We report a search for an axion-like particle $a$ in $B\rightarrow K^{(*)} a (\rightarrowγγ)$ decays using data collected with the Belle detector at the KEKB asymmetric energy electron-positron collider. The search is based on a $711 \mathrm{fb^{-1}}$ data sample collected at the $Υ(4S)$ resonance energy, corresponding to a sample of $772\times10^6$ $Υ(4S)$ events. In this study, we search for the dec…

    Submitted 3 July, 2025; v1 submitted 1 July, 2025; originally announced July 2025.

    Comments: 26 pages, 15 Figures

    Report number: Belle II Preprint: 2025-017 KEK Preprint: 2025-16

  5. arXiv:2507.00790  [pdf, ps, other]

    cs.CV cs.AI

    LD-RPS: Zero-Shot Unified Image Restoration via Latent Diffusion Recurrent Posterior Sampling

    Authors: Huaqiu Li, Yong Wang, Tongwen Huang, Hailang Huang, Haoqian Wang, Xiangxiang Chu

    Abstract: Unified image restoration is a significantly challenging task in low-level vision. Existing methods either make tailored designs for specific tasks, limiting their generalizability across various types of degradation, or rely on training with paired datasets, thereby suffering from closed-set constraints. To address these issues, we propose a novel, dataset-free, and unified approach through recur…

    Submitted 4 July, 2025; v1 submitted 1 July, 2025; originally announced July 2025.

  6. arXiv:2507.00748  [pdf, ps, other]

    cs.CV

    Improving the Reasoning of Multi-Image Grounding in MLLMs via Reinforcement Learning

    Authors: Bob Zhang, Haoran Li, Tao Zhang, Cilin Yan, Jiayin Cai, Xiaolong Jiang, Yanbin Hao

    Abstract: Recently, Multimodal Large Language Models (MLLMs) excel at visual grounding in single-image scenarios with textual references. However, their performance degrades when handling real-world applications involving complex multi-image compositions and multimodal instructions, which reveals limitations in cross-image reasoning and generalization. To address these challenges, we adopt a Reinforcement L…

    Submitted 1 July, 2025; originally announced July 2025.

    Comments: 11 pages

  7. arXiv:2507.00557  [pdf, ps, other]

    cs.AI cs.LO cs.SC

    Advancing Local Search in SMT-NRA with MCSAT Integration

    Authors: Tianyi Ding, Haokun Li, Xinpeng Ni, Bican Xia, Tianqi Zhao

    Abstract: In this paper, we advance local search for Satisfiability Modulo the Theory of Nonlinear Real Arithmetic (SMT-NRA for short). First, we introduce a two-dimensional cell-jump move, called \emph{$2d$-cell-jump}, generalizing the key operation, cell-jump, of the local search method for SMT-NRA. Then, we propose an extended local search framework, named \emph{$2d$-LS} (following the local search frame…

    Submitted 1 July, 2025; originally announced July 2025.

  8. arXiv:2507.00356  [pdf]

    cs.CV cs.AI

    CGEarthEye: A High-Resolution Remote Sensing Vision Foundation Model Based on the Jilin-1 Satellite Constellation

    Authors: Zhiwei Yi, Xin Cheng, Jingyu Ma, Ruifei Zhu, Junwei Tian, Yuanxiu Zhou, Xinge Zhao, Hongzhe Li

    Abstract: Deep learning methods have significantly advanced the development of intelligent interpretation in remote sensing (RS), with foundational model research based on large-scale pre-training paradigms rapidly reshaping various domains of Earth Observation (EO). However, compared to the open accessibility and high spatiotemporal coverage of medium-resolution data, the limited acquisition channels for…

    Submitted 30 June, 2025; originally announced July 2025.

    Comments: A Remote Sensing Foundation Model for Very High Resolution Images

  9. arXiv:2507.00111  [pdf, ps, other]

    hep-ph

    Sudakov evolution without unitarity

    Authors: Javira Altmann, Hai Tao Li, Ludovic Scyboz, Peter Skands

    Abstract: We present a method for sampling singular functions defined on (nested) multi-particle phase spaces, based on a generalisation of parton-shower phase-space generation techniques. At the heart of the method are three key ingredients: 1) the Sudakov sampling by which shower-style calculations sweep across phase space in an ordered manner, from hard to soft; 2) the sequential nesting of multiparticle…

    Submitted 30 June, 2025; originally announced July 2025.

    Comments: 30 pages, 13 figures, 1 appendix

  10. arXiv:2507.00025  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Generalizing to New Dynamical Systems via Frequency Domain Adaptation

    Authors: Tiexin Qin, Hong Yan, Haoliang Li

    Abstract: Learning the underlying dynamics from data with deep neural networks has shown remarkable potential in modeling various complex physical dynamics. However, current approaches are constrained in their ability to make reliable predictions in a specific domain and struggle with generalizing to unseen systems that are governed by the same general dynamics but differ in environmental characteristics. I…

    Submitted 17 June, 2025; originally announced July 2025.

    Comments: Accepted by TPAMI 2025

  11. arXiv:2506.24118  [pdf, ps, other]

    cs.CY cs.SI

    Scaling Human Judgment in Community Notes with LLMs

    Authors: Haiwen Li, Soham De, Manon Revel, Andreas Haupt, Brad Miller, Keith Coleman, Jay Baxter, Martin Saveski, Michiel A. Bakker

    Abstract: This paper argues for a new paradigm for Community Notes in the LLM era: an open ecosystem where both humans and LLMs can write notes, and the decision of which notes are helpful enough to show remains in the hands of humans. This approach can accelerate the delivery of notes, while maintaining trust and legitimacy through Community Notes' foundational principle: A community of diverse human rater…

    Submitted 30 June, 2025; originally announced June 2025.

  12. arXiv:2506.24045  [pdf, ps, other]

    cs.DC cs.LG

    Agent.xpu: Efficient Scheduling of Agentic LLM Workloads on Heterogeneous SoC

    Authors: Xinming Wei, Jiahao Zhang, Haoran Li, Jiayu Chen, Rui Qu, Maoliang Li, Xiang Chen, Guojie Luo

    Abstract: The proliferation of agentic Large Language Models (LLMs) on personal devices introduces a new class of workloads characterized by a dichotomy of objectives. Reactive tasks, initiated by users, demand immediate, low-latency responses, while proactive tasks operate invisibly and prioritize throughput. Existing on-device LLM engines, designed for isolated inferences, fail to efficiently manage these…

    Submitted 30 June, 2025; originally announced June 2025.

  13. arXiv:2506.23673  [pdf, ps, other]

    cs.AI

    HASD: Hierarchical Adaption for pathology Slide-level Domain-shift

    Authors: Jingsong Liu, Han Li, Chen Yang, Michael Deutges, Ario Sadafi, Xin You, Katharina Breininger, Nassir Navab, Peter J. Schüffler

    Abstract: Domain shift is a critical problem for pathology AI as pathology data is heavily influenced by center-specific conditions. Current pathology domain adaptation methods focus on image patches rather than WSI, thus failing to capture global WSI features required in typical clinical scenarios. In this work, we address the challenges of slide-level domain shift by proposing a Hierarchical Adaptation fr…

    Submitted 30 June, 2025; originally announced June 2025.

  14. arXiv:2506.23623  [pdf, ps, other]

    cs.CV

    Revisiting Audio-Visual Segmentation with Vision-Centric Transformer

    Authors: Shaofei Huang, Rui Ling, Tianrui Hui, Hongyu Li, Xu Zhou, Shifeng Zhang, Si Liu, Richang Hong, Meng Wang

    Abstract: Audio-Visual Segmentation (AVS) aims to segment sound-producing objects in video frames based on the associated audio signal. Prevailing AVS methods typically adopt an audio-centric Transformer architecture, where object queries are derived from audio features. However, audio-centric Transformers suffer from two limitations: perception ambiguity caused by the mixed nature of audio, and weakened de…

    Submitted 30 June, 2025; originally announced June 2025.

    Comments: Accepted by CVPR 2025; Code: https://github.com/spyflying/VCT_AVS; Models: https://huggingface.co/nowherespyfly/VCT_AVS

  15. arXiv:2506.23543  [pdf, ps, other]

    cs.CV

    Pyramidal Patchification Flow for Visual Generation

    Authors: Hui Li, Baoyou Chen, Liwei Zhang, Jiaye Li, Jingdong Wang, Siyu Zhu

    Abstract: Diffusion transformers (DiTs) adopt Patchify, mapping patch representations to token representations through linear projections, to adjust the number of tokens input to DiT blocks and thus the computation cost. Instead of a single patch size for all the timesteps, we introduce a Pyramidal Patchification Flow (PPFlow) approach: Large patch sizes are used for high noise timesteps and small patch siz…

    Submitted 30 June, 2025; originally announced June 2025.

    Comments: 10 pages, 9 figures

  16. arXiv:2506.23487  [pdf, ps, other]

    stat.ML cs.LG

    Test of partial effects for Frechet regression on Bures-Wasserstein manifolds

    Authors: Haoshu Xu, Hongzhe Li

    Abstract: We propose a novel test for assessing partial effects in Frechet regression on Bures-Wasserstein manifolds. Our approach employs a sample splitting strategy: the first subsample is used to fit the Frechet regression model, yielding estimates of the covariance matrices and their associated optimal transport maps, while the second subsample is used to construct the test statistic. We prove that this…

    Submitted 29 June, 2025; originally announced June 2025.

  17. arXiv:2506.23201  [pdf, ps, other]

    cs.LG eess.SY

    External Data-Enhanced Meta-Representation for Adaptive Probabilistic Load Forecasting

    Authors: Haoran Li, Muhao Guo, Marija Ilic, Yang Weng, Guangchun Ruan

    Abstract: Accurate residential load forecasting is critical for power system reliability with rising renewable integration and demand-side flexibility. However, most statistical and machine learning models treat external factors, such as weather, calendar effects, and pricing, as extra input, ignoring their heterogeneity, and thus limiting the extraction of useful external information. We propose a paradigm…

    Submitted 29 June, 2025; originally announced June 2025.

    Comments: 10 pages

  18. arXiv:2506.23125  [pdf, ps, other]

    cs.RO

    Learning Motion Skills with Adaptive Assistive Curriculum Force in Humanoid Robots

    Authors: Zhanxiang Cao, Yang Zhang, Buqing Nie, Huangxuan Lin, Haoyang Li, Yue Gao

    Abstract: Learning policies for complex humanoid tasks remains both challenging and compelling. Inspired by how infants and athletes rely on external support--such as parental walkers or coach-applied guidance--to acquire skills like walking, dancing, and performing acrobatic flips, we propose A2CF: Adaptive Assistive Curriculum Force for humanoid motion learning. A2CF trains a dual-agent system, in which a…

    Submitted 29 June, 2025; originally announced June 2025.

    Comments: 8 pages, 8 figures

  19. arXiv:2506.23086  [pdf, ps, other]

    cs.CV

    Frequency-enhanced Multi-granularity Context Network for Efficient Vertebrae Segmentation

    Authors: Jian Shi, Tianqi You, Pingping Zhang, Hongli Zhang, Rui Xu, Haojie Li

    Abstract: Automated and accurate segmentation of individual vertebra in 3D CT and MRI images is essential for various clinical applications. Due to the limitations of current imaging techniques and the complexity of spinal structures, existing methods still struggle with reducing the impact of image blurring and distinguishing similar vertebrae. To alleviate these issues, we introduce a Frequency-enhanced M…

    Submitted 29 June, 2025; originally announced June 2025.

    Comments: Accepted by MICCAI2025. More modifications may be performed

  20. arXiv:2506.23068  [pdf, ps, other]

    cs.LG cs.AI stat.AP

    Curious Causality-Seeking Agents Learn Meta Causal World

    Authors: Zhiyu Zhao, Haoxuan Li, Haifeng Zhang, Jun Wang, Francesco Faccio, Jürgen Schmidhuber, Mengyue Yang

    Abstract: When building a world model, a common assumption is that the environment has a single, unchanging underlying causal rule, like applying Newton's laws to every situation. In reality, what appears as a drifting causal mechanism is often the manifestation of a fixed underlying mechanism seen through a narrow observational window. This brings about a problem that, when building a world model, even sub…

    Submitted 28 June, 2025; originally announced June 2025.

    Comments: 33 pages

  21. arXiv:2506.22923  [pdf, ps, other]

    math.OC eess.SY

    Energy-Aware Model Predictive Control for Batch Manufacturing System Scheduling Under Different Electricity Pricing Strategies

    Authors: Hongliang Li, Herschel C. Pangborn, Ilya Kovalenko

    Abstract: Manufacturing industries are among the highest energy-consuming sectors, facing increasing pressure to reduce energy costs. This paper presents an energy-aware Model Predictive Control (MPC) framework to dynamically schedule manufacturing processes in response to time-varying electricity prices without compromising production goals or violating production constraints. A network-based manufacturing…

    Submitted 28 June, 2025; originally announced June 2025.

  22. arXiv:2506.22824  [pdf, ps, other]

    eess.SP

    Sensing Security Oriented OFDM-ISAC Against Multi-Intercept Threats

    Authors: Lingyun Xu, Bowen Wang, Huiyong Li, Ziyang Cheng

    Abstract: In recent years, security has emerged as a critical aspect of integrated sensing and communication (ISAC) systems. While significant research has focused on secure communications, particularly in ensuring physical layer security, the issue of sensing security has received comparatively less attention. This paper addresses the sensing security problem in ISAC, particularly under the threat of multi…

    Submitted 28 June, 2025; originally announced June 2025.

  23. arXiv:2506.22815  [pdf, ps, other]

    cs.HC

    Memory as a Service (MaaS): Rethinking Contextual Memory as Service-Oriented Modules for Collaborative Agents

    Authors: Haichang Li

    Abstract: This position paper aims to rethink the role and design of memory in Large Language Model (LLM)-based agent systems. We observe that while current memory practices have begun to transcend the limitations of single interactions, they remain conceptually grounded in "bound memory" in terms of design concept, where memory is treated as local state attached to specific context or entities, forming "mem…

    Submitted 28 June, 2025; originally announced June 2025.

    Comments: Position Paper for workshop. This is an initial version for discussion purposes

    ACM Class: H.5.0

  24. arXiv:2506.22736  [pdf, ps, other]

    cs.CV

    UniFuse: A Unified All-in-One Framework for Multi-Modal Medical Image Fusion Under Diverse Degradations and Misalignments

    Authors: Dayong Su, Yafei Zhang, Huafeng Li, Jinxing Li, Yu Liu

    Abstract: Current multimodal medical image fusion typically assumes that source images are of high quality and perfectly aligned at the pixel level. Its effectiveness heavily relies on these conditions and often deteriorates when handling misaligned or degraded medical images. To address this, we propose UniFuse, a general fusion framework. By embedding a degradation-aware prompt learning module, UniFuse se…

    Submitted 27 June, 2025; originally announced June 2025.

    Comments: Accepted by ICCV2025

  25. arXiv:2506.22554  [pdf, ps, other]

    cs.CV cs.AI

    Seamless Interaction: Dyadic Audiovisual Motion Modeling and Large-Scale Dataset

    Authors: Vasu Agrawal, Akinniyi Akinyemi, Kathryn Alvero, Morteza Behrooz, Julia Buffalini, Fabio Maria Carlucci, Joy Chen, Junming Chen, Zhang Chen, Shiyang Cheng, Praveen Chowdary, Joe Chuang, Antony D'Avirro, Jon Daly, Ning Dong, Mark Duppenthaler, Cynthia Gao, Jeff Girard, Martin Gleize, Sahir Gomez, Hongyu Gong, Srivathsan Govindarajan, Brandon Han, Sen He, Denise Hernandez, et al. (59 additional authors not shown)

    Abstract: Human communication involves a complex interplay of verbal and nonverbal signals, essential for conveying meaning and achieving interpersonal goals. To develop socially intelligent AI technologies, it is crucial to develop models that can both comprehend and generate dyadic behavioral dynamics. To this end, we introduce the Seamless Interaction Dataset, a large-scale collection of over 4,000 hours…

    Submitted 30 June, 2025; v1 submitted 27 June, 2025; originally announced June 2025.

  26. arXiv:2506.22465  [pdf, ps, other]

    eess.SP cs.IT

    Preconditioned Conjugate Gradient for MIMO-AFDM System

    Authors: Jun Zhu, Yin Xu, Dazhi He, Haoyang Li, Yunfeng Guan, Wenjun Zhang

    Abstract: Affine frequency division multiplexing (AFDM) is a promising chirp-assisted multicarrier waveform for future high mobility communications. A significant challenge in MIMO-AFDM systems is the multi-user interference (MUI), which can be effectively addressed by employing precoding techniques. However, the complexity introduced by AFDM makes the precoding process computationally expensive and challen…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: arXiv admin note: text overlap with arXiv:2503.10525

  27. arXiv:2506.22161  [pdf, ps, other]

    cs.CV

    Attention-disentangled Uniform Orthogonal Feature Space Optimization for Few-shot Object Detection

    Authors: Taijin Zhao, Heqian Qiu, Yu Dai, Lanxiao Wang, Fanman Meng, Qingbo Wu, Hongliang Li

    Abstract: Few-shot object detection (FSOD) aims to detect objects with limited samples for novel classes, while relying on abundant data for base classes. Existing FSOD approaches, predominantly built on the Faster R-CNN detector, entangle objectness recognition and foreground classification within shared feature spaces. This paradigm inherently establishes class-specific objectness criteria and suffers fro…

    Submitted 27 June, 2025; originally announced June 2025.

  28. arXiv:2506.22090  [pdf, ps, other]

    hep-ex

    Updated measurement of $CP$ violation and polarisation in $B^0_s \rightarrow J/ψ\overline{K}{}^{*}\kern-1pt(892)^{0}$ decays

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, R. Aleksiejunas, F. Alessio, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, et al. (1168 additional authors not shown)

    Abstract: A time-integrated angular analysis of the decay $B^0_s \rightarrow J/ψ\overline{K}{}^{*}\kern-1pt(892)^{0}$, with $J/ψ\rightarrow μ^{+} μ^{-}$ and $\overline{K}{}^{*}\kern-1pt(892)^{0} \rightarrow K^{-} π^{+}$, is presented. The analysis employs a sample of proton-proton collision data collected by the LHCb experiment during 2015-2018 at a centre-of-mass energy of $13 \text{TeV}$, corresponding to…

    Submitted 27 June, 2025; originally announced June 2025.

    Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/4457/ (LHCb public pages)

    Report number: LHCb-PAPER-2025-020, CERN-EP-2025-131

  29. arXiv:2506.21682  [pdf, ps, other]

    cs.CL

    Do We Really Need GNNs with Explicit Structural Modeling? MLPs Suffice for Language Model Representations

    Authors: Li Zhou, Hao Jiang, Junjie Li, Zefeng Zhao, Feng Jiang, Wenyu Chen, Haizhou Li

    Abstract: Explicit structural information has been proven to be encoded by Graph Neural Networks (GNNs), serving as auxiliary knowledge to enhance model capabilities and improve performance in downstream NLP tasks. However, recent studies indicate that GNNs fail to fully utilize structural information, whereas Multi-Layer Perceptrons (MLPs), despite lacking the message-passing mechanisms inherent to GNNs, e…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: Graph Neural Networks, Multi-Layer Perceptrons, Explicit Structural Modeling, Probing Classifier

  30. arXiv:2506.21420  [pdf, ps, other]

    cs.CV cs.RO

    EndoFlow-SLAM: Real-Time Endoscopic SLAM with Flow-Constrained Gaussian Splatting

    Authors: Taoyu Wu, Yiyi Miao, Zhuoxiao Li, Haocheng Zhao, Kang Dang, Jionglong Su, Limin Yu, Haoang Li

    Abstract: Efficient three-dimensional reconstruction and real-time visualization are critical in surgical scenarios such as endoscopy. In recent years, 3D Gaussian Splatting (3DGS) has demonstrated remarkable performance in efficient 3D reconstruction and rendering. Most 3DGS-based Simultaneous Localization and Mapping (SLAM) methods only rely on the appearance constraints for optimizing both 3DGS and camer…

    Submitted 5 July, 2025; v1 submitted 26 June, 2025; originally announced June 2025.

    Comments: This paper has been accepted at MICCAI2025

  31. arXiv:2506.21215  [pdf, ps, other]

    cs.AI cs.CL cs.LG

    Unveiling Causal Reasoning in Large Language Models: Reality or Mirage?

    Authors: Haoang Chi, He Li, Wenjing Yang, Feng Liu, Long Lan, Xiaoguang Ren, Tongliang Liu, Bo Han

    Abstract: Causal reasoning capability is critical in advancing large language models (LLMs) toward strong artificial intelligence. While versatile LLMs appear to have demonstrated capabilities in understanding contextual causality and providing responses that obey the laws of causality, it remains unclear whether they perform genuine causal reasoning akin to humans. However, current evidence indicates the c…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: 24 pages, accepted at NeurIPS 2024

    Journal ref: Advances in Neural Information Processing Systems, 2024, 37: 96640-96670

  32. arXiv:2506.21154  [pdf, ps, other]

    stat.ME cs.AI cs.LG

    Transformer-Based Spatial-Temporal Counterfactual Outcomes Estimation

    Authors: He Li, Haoang Chi, Mingyu Liu, Wanrong Huang, Liyang Xu, Wenjing Yang

    Abstract: The real world naturally has dimensions of time and space. Therefore, estimating the counterfactual outcomes with spatial-temporal attributes is a crucial problem. However, previous methods are based on classical statistical models, which still have limitations in performance and generalization. This paper proposes a novel framework for estimating counterfactual outcomes with spatial-temporal attr…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: 24 pages, accepted at ICML 2025

  33. arXiv:2506.21032  [pdf, ps, other]

    cs.IR

    RecCoT: Enhancing Recommendation via Chain-of-Thought

    Authors: Shuo Yang, Jiangxia Cao, Haipeng Li, Yuqi Mao, Shuchao Pang

    Abstract: In real-world applications, users always interact with items in multiple aspects, such as through implicit binary feedback (e.g., clicks, dislikes, long views) and explicit feedback (e.g., comments, reviews). Modern recommendation systems (RecSys) learn user-item collaborative signals from these implicit feedback signals as a large-scale binary data-streaming, subsequently recommending other highl…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: Work in progress

  34. FedSC: Federated Learning with Semantic-Aware Collaboration

    Authors: Huan Wang, Haoran Li, Huaming Chen, Jun Yan, Jiahua Shi, Jun Shen

    Abstract: Federated learning (FL) aims to train models collaboratively across clients without sharing data for privacy-preserving. However, one major challenge is the data heterogeneity issue, which refers to the biased labeling preferences at multiple clients. A number of existing FL methods attempt to tackle data heterogeneity locally (e.g., regularizing local models) or globally (e.g., fine-tuning global…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: 12 pages, KDD 2025

  35. arXiv:2506.20926  [pdf, ps, other]

    cs.CR

    Towards Generalized and Stealthy Watermarking for Generative Code Models

    Authors: Haoxuan Li, Jiale Zhang, Xiaobing Sun, Xiapu Luo

    Abstract: Generative code models (GCMs) significantly enhance development efficiency through automated code generation and code summarization. However, building and training these models require computational resources and time, necessitating effective digital copyright protection to prevent unauthorized leaks and misuse. Backdoor watermarking, by embedding hidden identifiers, simplifies copyright verificat…

    Submitted 29 June, 2025; v1 submitted 25 June, 2025; originally announced June 2025.

    Comments: 13 pages

  36. arXiv:2506.20788  [pdf, ps, other]

    math.OC

    Robust and Flexible Microtransit Design: Chance-Constrained Dial-a-Ride Problem with Soft Time Windows

    Authors: Hongli Li, Zengxiang Lei, Xinwu Qian, Satish V. Ukkusuri

    Abstract: Microtransit offers a promising blend of rideshare flexibility and public transit efficiency. In practice, it faces unanticipated but spatially aligned requests, passengers seeking to join ongoing schedules, leading to underutilized capacity and degraded service if not properly managed. At the same time, it must accommodate diverse passenger needs, from routine errands to time-sensitive trips such…

    Submitted 25 June, 2025; originally announced June 2025.

    Comments: 27 pages, 10 figures, arXiv:2402.01265 [math.OC] (v1, Feb 2 2024); Plan to submit to Journal

  37. arXiv:2506.20756  [pdf, ps, other]

    cs.CV

    StereoDiff: Stereo-Diffusion Synergy for Video Depth Estimation

    Authors: Haodong Li, Chen Wang, Jiahui Lei, Kostas Daniilidis, Lingjie Liu

    Abstract: Recent video depth estimation methods achieve great performance by following the paradigm of image depth estimation, i.e., typically fine-tuning pre-trained video diffusion models with massive data. However, we argue that video depth estimation is not a naive extension of image depth estimation. The temporal consistency requirements for dynamic and static regions in videos are fundamentally differ…

    Submitted 30 June, 2025; v1 submitted 25 June, 2025; originally announced June 2025.

    Comments: Work done in Nov 2024, during an internship at the University of Pennsylvania. Project page: https://stereodiff.github.io/

  38. arXiv:2506.20661  [pdf, ps, other]

    quant-ph cond-mat.quant-gas physics.atom-ph

    Architectural mechanisms of a universal fault-tolerant quantum computer

    Authors: Dolev Bluvstein, Alexandra A. Geim, Sophie H. Li, Simon J. Evered, J. Pablo Bonilla Ataides, Gefen Baranes, Andi Gu, Tom Manovitz, Muqing Xu, Marcin Kalinowski, Shayan Majidy, Christian Kokail, Nishad Maskara, Elias C. Trapp, Luke M. Stewart, Simon Hollerith, Hengyun Zhou, Michael J. Gullans, Susanne F. Yelin, Markus Greiner, Vladan Vuletic, Madelyn Cain, Mikhail D. Lukin

    Abstract: Quantum error correction (QEC) is believed to be essential for the realization of large-scale quantum computers. However, due to the complexity of operating on the encoded `logical' qubits, understanding the physical principles for building fault-tolerant quantum devices and combining them into efficient architectures is an outstanding scientific challenge. Here we utilize reconfigurable arrays of…

    Submitted 25 June, 2025; originally announced June 2025.

    Comments: Main text + Methods. Ancillary files: 3 movies, error model, raw experimental commands

  39. arXiv:2506.20660  [pdf, ps, other]

    quant-ph cond-mat.quant-gas physics.atom-ph

    Continuous operation of a coherent 3,000-qubit system

    Authors: Neng-Chun Chiu, Elias C. Trapp, Jinen Guo, Mohamed H. Abobeih, Luke M. Stewart, Simon Hollerith, Pavel Stroganov, Marcin Kalinowski, Alexandra A. Geim, Simon J. Evered, Sophie H. Li, Lisa M. Peters, Dolev Bluvstein, Tout T. Wang, Markus Greiner, Vladan Vuletić, Mikhail D. Lukin

    Abstract: Neutral atoms are a promising platform for quantum science, enabling advances in areas ranging from quantum simulations and computation to metrology, atomic clocks and quantum networking. While atom losses typically limit these systems to a pulsed mode, continuous operation could significantly enhance cycle rates, remove bottlenecks in metrology, and enable deep-circuit quantum evolution through q…

    Submitted 25 June, 2025; originally announced June 2025.

    Comments: Main text: 8 pages, 4 figures. Methods: 7 pages, 10 figures. Ancillary files: one supplementary movie and caption

  40. arXiv:2506.20644  [pdf, ps, other]

    cs.LG

    Efficient Federated Learning with Encrypted Data Sharing for Data-Heterogeneous Edge Devices

    Authors: Hangyu Li, Hongyue Wu, Guodong Fan, Zhen Zhang, Shizhan Chen, Zhiyong Feng

    Abstract: As privacy protection gains increasing importance, more models are being trained on edge devices and subsequently merged into the central server through Federated Learning (FL). However, current research overlooks the impact of network topology, physical distance, and data heterogeneity on edge devices, leading to issues such as increased latency and degraded model performance. To address these is…

    Submitted 25 June, 2025; originally announced June 2025.

    Comments: Accepted by ICWS 2025

  41. arXiv:2506.20599  [pdf, ps, other]

    cs.CV

    SFNet: Fusion of Spatial and Frequency-Domain Features for Remote Sensing Image Forgery Detection

    Authors: Ji Qi, Xinchang Zhang, Dingqi Ye, Yongjia Ruan, Xin Guo, Shaowen Wang, Haifeng Li

    Abstract: The rapid advancement of generative artificial intelligence is producing fake remote sensing imagery (RSI) that is increasingly difficult to detect, potentially leading to erroneous intelligence, fake news, and even conspiracy theories. Existing forgery detection methods typically rely on single visual features to capture predefined artifacts, such as spatial-domain cues to detect forged objects l…

    Submitted 25 June, 2025; originally announced June 2025.

  42. arXiv:2506.20590  [pdf, ps, other]

    cs.CV

    WonderFree: Enhancing Novel View Quality and Cross-View Consistency for 3D Scene Exploration

    Authors: Chaojun Ni, Jie Li, Haoyun Li, Hengyu Liu, Xiaofeng Wang, Zheng Zhu, Guosheng Zhao, Boyuan Wang, Chenxin Li, Guan Huang, Wenjun Mei

    Abstract: Interactive 3D scene generation from a single image has gained significant attention due to its potential to create immersive virtual worlds. However, a key challenge in current 3D generation methods is the limited explorability, which cannot render high-quality images during larger maneuvers beyond the original viewpoint, particularly when attempting to move forward into unseen areas. To address…

    Submitted 25 June, 2025; originally announced June 2025.

  43. arXiv:2506.20502  [pdf]

    astro-ph.SR physics.space-ph

    Probing Solar Polar Regions

    Authors: Yuanyong Deng, Hui Tian, Jie Jiang, Shuhong Yang, Hao Li, Robert Cameron, Laurent Gizon, Louise Harra, Robert F. Wimmer-Schweingruber, Frédéric Auchère, Xianyong Bai, Luis Bellot Rubio, Linjie Chen, Pengfei Chen, Lakshmi Pradeep Chitta, Jackie Davies, Fabio Favata, Li Feng, Xueshang Feng, Weiqun Gan, Don Hassler, Jiansen He, Junfeng Hou, Zhenyong Hou, Chunlan Jin, et al. (23 additional authors not shown)

    Abstract: The magnetic fields and dynamical processes in the solar polar regions play a crucial role in the solar magnetic cycle and in supplying mass and energy to the fast solar wind, ultimately being vital in controlling solar activities and driving space weather. Despite numerous efforts to explore these regions, to date no imaging observations of the Sun's poles have been achieved from vantage points o…

    Submitted 28 June, 2025; v1 submitted 25 June, 2025; originally announced June 2025.

    Comments: Accepted for publication in Chinese Journal of Space Science

  44. arXiv:2506.20443  [pdf, ps, other]

    physics.plasm-ph

    MHD simulation of tilt instability during the dynamic FRC magnetic compression process

    Authors: Yiming Ma, Ping Zhu, Bo Rao, Haolong Li

    Abstract: The nonlinear evolution of the tilt instability in a field reversed configuration (FRC) during the dynamic magnetic compression process has been investigated using magnetohydrodynamic (MHD) simulations with the NIMROD code [C. R. Sovinec et al., J. Comput. Phys. 195, 355 (2004)]. The tilt mode induces significant deformations in the linear growth phase and results in complete con…

    Submitted 25 June, 2025; originally announced June 2025.

  45. arXiv:2506.20231  [pdf, ps, other]

    eess.SP

    Sensing-Aware Transmit Waveform/Receive Filter Design for OFDM-MBS Systems

    Authors: Xinghe Li, Kainan Cheng, Hongzhi Guo, Huiyong Li, Ziyang Cheng

    Abstract: In this letter, we study the problem of cooperative sensing design for an orthogonal frequency division multiplexing (OFDM) multiple base stations (MBS) system. We consider a practical scenario where the base stations (BSs) exploit certain subcarriers to realize a sensing function. Since the high sidelobe level (SLL) of OFDM waveforms degrades radar detection for weak targets, and the cross-correl…

    Submitted 30 June, 2025; v1 submitted 25 June, 2025; originally announced June 2025.

  46. arXiv:2506.19816  [pdf, ps, other]

    cs.RO cs.CV

    CronusVLA: Transferring Latent Motion Across Time for Multi-Frame Prediction in Manipulation

    Authors: Hao Li, Shuai Yang, Yilun Chen, Yang Tian, Xiaoda Yang, Xinyi Chen, Hanqing Wang, Tai Wang, Feng Zhao, Dahua Lin, Jiangmiao Pang

    Abstract: Recent vision-language-action (VLA) models built on pretrained vision-language models (VLMs) have demonstrated strong generalization across manipulation tasks. However, they remain constrained by a single-frame observation paradigm and cannot fully benefit from the motion information offered by aggregated multi-frame historical observations, as the large vision-language backbone introduces substan…

    Submitted 24 June, 2025; originally announced June 2025.

    Comments: 36 pages, 21 figures

  47. arXiv:2506.19742  [pdf, ps, other]

    eess.IV cs.AI cs.CV

    NeRF-based CBCT Reconstruction needs Normalization and Initialization

    Authors: Zhuowei Xu, Han Li, Dai Sun, Zhicheng Li, Yujia Li, Qingpeng Kong, Zhiwei Cheng, Nassir Navab, S. Kevin Zhou

    Abstract: Cone Beam Computed Tomography (CBCT) is widely used in medical imaging. However, the limited number and intensity of X-ray projections make reconstruction an ill-posed problem with severe artifacts. NeRF-based methods have achieved great success in this task. However, they suffer from a local-global training mismatch between their two key components: the hash encoder and the neural network. Specif…

    Submitted 24 June, 2025; originally announced June 2025.

  48. arXiv:2506.19545  [pdf, ps, other]

    math.OC

    Fast convergence of a primal-dual dynamical system with implicit Hessian damping and Tikhonov regularization

    Authors: Hong-lu Li, Xin He, Yi-bin Xiao

    Abstract: This paper proposes two primal-dual dynamical systems for solving linear equality constrained convex optimization problems: one with implicit Hessian damping only, and the other further incorporating Tikhonov regularization. We analyze the fast convergence properties of both dynamical systems and show that they achieve the same convergence rates. Moreover, we show that the trajectory generated by…

    Submitted 26 June, 2025; v1 submitted 24 June, 2025; originally announced June 2025.

    Comments: 25 pages, 9 figures

  49. arXiv:2506.19180  [pdf, ps, other]

    hep-ex hep-ph

    Precise Measurement of the $Λ$ Electric Dipole Moment through the Entangled Strange Baryon-Antibaryon System

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (696 additional authors not shown)

    Abstract: The dominance of matter over antimatter in the universe has consistently driven the pursuit of new physics beyond the Standard Model that violates charge-parity symmetry. Unlike the well-constrained electrons and neutrons, strange baryons (hyperons) remain a largely unexplored territory, in which interactions between hyperons and particles from new physics could induce a non-trivial electric dipol…

    Submitted 28 June, 2025; v1 submitted 23 June, 2025; originally announced June 2025.

  50. arXiv:2506.18882  [pdf, ps, other]

    cs.CV

    Light of Normals: Unified Feature Representation for Universal Photometric Stereo

    Authors: Hong Li, Houyuan Chen, Chongjie Ye, Zhaoxi Chen, Bohan Li, Shaocong Xu, Xianda Guo, Xuhui Liu, Yikai Wang, Baochang Zhang, Satoshi Ikehata, Boxin Shi, Anyi Rao, Hao Zhao

    Abstract: Universal photometric stereo (PS) aims to recover high-quality surface normals from objects under arbitrary lighting conditions without relying on specific illumination models. Despite recent advances such as SDM-UniPS and Uni MS-PS, two fundamental challenges persist: 1) the deep coupling between varying illumination and surface normal features, where ambiguity in observed intensity makes it diff…

    Submitted 24 June, 2025; v1 submitted 23 June, 2025; originally announced June 2025.

    Comments: Home: https://houyuanchen111.github.io/lino.github.io Github: https://github.com/houyuanchen111/LINO_UniPS HuggingFace Demo: https://huggingface.co/spaces/houyuanchen/lino