
Showing 1–50 of 18,777 results for author: Shen

  1. arXiv:2506.10959  [pdf, ps, other]

    cs.LG cs.AI math.ST

    Understanding In-Context Learning on Structured Manifolds: Bridging Attention to Kernel Methods

    Authors: Zhaiming Shen, Alexander Hsu, Rongjie Lai, Wenjing Liao

    Abstract: While in-context learning (ICL) has achieved remarkable success in natural language and vision domains, its theoretical understanding--particularly in the context of structured geometric data--remains unexplored. In this work, we initiate a theoretical study of ICL for regression of Hölder functions on manifolds. By establishing a novel connection between the attention mechanism and classical kern…

    Submitted 12 June, 2025; originally announced June 2025.

  2. arXiv:2506.10712  [pdf, ps, other]

    cs.CV

    Uncertainty-Masked Bernoulli Diffusion for Camouflaged Object Detection Refinement

    Authors: Yuqi Shen, Fengyang Xiao, Sujie Hu, Youwei Pang, Yifan Pu, Chengyu Fang, Xiu Li, Chunming He

    Abstract: Camouflaged Object Detection (COD) presents inherent challenges due to the subtle visual differences between targets and their backgrounds. While existing methods have made notable progress, there remains significant potential for post-processing refinement that has yet to be fully explored. To address this limitation, we propose the Uncertainty-Masked Bernoulli Diffusion (UMBD) model, the first g…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: 16 pages, 7 figures

  3. arXiv:2506.10675  [pdf, ps, other]

    eess.IV cs.CV

    ConStyX: Content Style Augmentation for Generalizable Medical Image Segmentation

    Authors: Xi Chen, Zhiqiang Shen, Peng Cao, Jinzhu Yang, Osmar R. Zaiane

    Abstract: Medical images are usually collected from multiple domains, leading to domain shifts that impair the performance of medical image segmentation models. Domain Generalization (DG) aims to address this issue by training a robust model with strong generalizability. Recently, numerous domain randomization-based DG methods have been proposed. However, these methods suffer from the following limitations:…

    Submitted 12 June, 2025; originally announced June 2025.

  4. arXiv:2506.10648  [pdf, ps, other]

    physics.flu-dyn astro-ph.SR physics.plasm-ph

    Vortex-magnetic competition and regime transitions in antiparallel flux tubes

    Authors: Weiyu Shen, Rodolfo Ostilla-Mónico, Xiaojue Zhu

    Abstract: Vortex-magnetic interactions shape magnetohydrodynamic (MHD) turbulence, influencing energy transfer in astrophysical, geophysical, and industrial systems. On the Sun, granular-scale vortex flows couple strongly with magnetic fields, channelling energy into the corona. At high Reynolds numbers, vorticity and magnetic fields are nearly frozen into the charged fluid, and MHD flows emerge from the in…

    Submitted 12 June, 2025; originally announced June 2025.

  5. arXiv:2506.10520  [pdf, ps, other]

    cs.IR cs.LG

    Macro Graph of Experts for Billion-Scale Multi-Task Recommendation

    Authors: Hongyu Yao, Zijin Hong, Hao Chen, Yuanchen Bei, Zhiqing Li, Qijie Shen, Zuobin Ying, Huan Gong, Feiran Huang

    Abstract: Graph-based multi-task learning at billion-scale presents a significant challenge, as different tasks correspond to distinct billion-scale graphs. Traditional multi-task learning methods often neglect these graph structures, relying solely on individual user and item embeddings. However, disregarding graph structures overlooks substantial potential for improving performance. In this paper, we intr…

    Submitted 12 June, 2025; originally announced June 2025.

  6. arXiv:2506.10472  [pdf, ps, other]

    cond-mat.mtrl-sci cond-mat.other physics.optics

    Disentangling Electronic and Ionic Nonlinear Polarization Effects in the THz Kerr Response of LaAlO$_{3}$

    Authors: Chao Shen, Maximilian Frenzel, Sebastian F. Maehrlein, Zhanybek Alpichshev

    Abstract: Nonlinear responses to intense terahertz (THz) fields provide unique insights into complex dynamics of contemporary material systems. However, the interpretation of the obtained data, in particular, distinguishing genuine ionic oscillations from the instantaneous electronic responses in THz Kerr effect remains challenging. Here, we combine two-dimensional Terahertz Kerr effect (2D-TKE) spectroscop…

    Submitted 12 June, 2025; originally announced June 2025.

  7. arXiv:2506.10468  [pdf, ps, other]

    cs.GR cs.CV

    Low-Barrier Dataset Collection with Real Human Body for Interactive Per-Garment Virtual Try-On

    Authors: Zaiqiang Wu, Yechen Li, Jingyuan Liu, Yuki Shibata, Takayuki Hori, I-Chao Shen, Takeo Igarashi

    Abstract: Existing image-based virtual try-on methods are often limited to the front view and lack real-time performance. While per-garment virtual try-on methods have tackled these issues by capturing per-garment datasets and training per-garment neural networks, they still encounter practical limitations: (1) the robotic mannequin used to capture per-garment datasets is prohibitively expensive for widespr…

    Submitted 12 June, 2025; originally announced June 2025.

  8. arXiv:2506.10465  [pdf, ps, other]

    cs.CV

    MedSeg-R: Reasoning Segmentation in Medical Images with Multimodal Large Language Models

    Authors: Yu Huang, Zelin Peng, Yichen Zhao, Piao Yang, Xiaokang Yang, Wei Shen

    Abstract: Medical image segmentation is crucial for clinical diagnosis, yet existing models are limited by their reliance on explicit human instructions and lack the active reasoning capabilities to understand complex clinical questions. While recent advancements in multimodal large language models (MLLMs) have improved medical question-answering (QA) tasks, most methods struggle to generate precise segment…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: †: Equal contribution

  9. arXiv:2506.10365  [pdf]

    cs.SE

    AutoGEEval++: A Multi-Level and Multi-Geospatial-Modality Automated Evaluation Framework for Large Language Models in Geospatial Code Generation on Google Earth Engine

    Authors: Shuyang Hou, Zhangxiao Shen, Huayi Wu, Haoyue Jiao, Ziqi Liu, Lutong Xie, Chang Liu, Jianyuan Liang, Yaxian Qing, Xiaopu Zhang, Dehua Peng, Zhipeng Gui, Xuefeng Guan

    Abstract: Geospatial code generation is becoming a key frontier in integrating artificial intelligence with geo-scientific analysis, yet standardised automated evaluation tools for this task remain absent. This study presents AutoGEEval++, an enhanced framework building on AutoGEEval, and the first automated assessment system for large language models (LLMs) generating geospatial code on Google Earth Engine…

    Submitted 12 June, 2025; originally announced June 2025.

  10. arXiv:2506.10337  [pdf, ps, other]

    cs.CV

    GeoCAD: Local Geometry-Controllable CAD Generation

    Authors: Zhanwei Zhang, Kaiyuan Liu, Junjie Liu, Wenxiao Wang, Binbin Lin, Liang Xie, Chen Shen, Deng Cai

    Abstract: Local geometry-controllable computer-aided design (CAD) generation aims to modify local parts of CAD models automatically, enhancing design efficiency. It also ensures that the shapes of newly generated local parts follow user-specific geometric instructions (e.g., an isosceles right triangle or a rectangle with one corner cut off). However, existing methods encounter challenges in achieving this…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: 18 pages, 12 figures

  11. arXiv:2506.10333  [pdf, ps, other]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Nonlinear Néel Spin-Orbit Torque in Centrosymmetric Antiferromagnets

    Authors: Jin Cao, Weikang Wu, Huiying Liu, Shen Lai, Cong Xiao, X. C. Xie, Shengyuan A. Yang

    Abstract: Electric control of Néel vector is a central task of antiferromagnetic (AFM) spintronics. The major scheme so far relies on the linear Néel torque, which however is restricted to AFMs with broken inversion symmetry. Here, we propose a nonlinear Néel spin-orbit torque, uniquely enabling electric control in the vast class of centrosymmetric AFMs, where the existing scheme fails. Importantly, its int…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 6 pages, 4 figures and 1 table

  12. arXiv:2506.10329  [pdf, ps, other]

    cs.IR

    Context-Adaptive Graph Neural Networks for Next POI Recommendation

    Authors: Yu Lei, Limin Shen, Zhu Sun, Tiantian He, Yew-Soon Ong

    Abstract: Next Point-of-Interest (POI) recommendation is a critical task in location-based services, aiming to predict users' next visits based on their check-in histories. While many existing methods leverage Graph Neural Networks (GNNs) to incorporate collaborative information and improve recommendation accuracy, most of them model each type of context using separate graphs, treating different factors in…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 12 pages, 6 figures

  13. arXiv:2506.10316  [pdf, ps, other]

    hep-ex

    Search for sub-GeV invisible particles in inclusive decays of $J/ψ$ to $φ$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (704 additional authors not shown)

    Abstract: A search for an invisible particle, $X$, with a mass between 0 and 0.96 $\textrm{GeV}/\textit{c}^{2}$, is performed in the process $J/ψ\rightarrowφ+ X$ using $(8774.0\pm39.4)\times10^{6}$ $J/ψ$ events collected with the BESIII detector from 2017 to 2019. The $φ$ meson is fully reconstructed and an efficient veto of photons, neutral and charged hadrons up to twice the $K_L^0$ mass is applied to the…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 10 pages, 3 figures

  14. arXiv:2506.10282  [pdf, ps, other]

    cs.LG

    Graph-MLLM: Harnessing Multimodal Large Language Models for Multimodal Graph Learning

    Authors: Jiajin Liu, Dongzhe Fan, Jiacheng Shen, Chuanhao Ji, Daochen Zha, Qiaoyu Tan

    Abstract: Multimodal Large Language Models (MLLMs) have demonstrated remarkable capabilities in representing and understanding diverse modalities. However, they typically focus on modality alignment in a pairwise manner while overlooking structural relationships across data points. Integrating multimodality with structured graph information (i.e., multimodal graphs, MMGs) is essential for real-world applica…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 16 pages, 4 figures

  15. arXiv:2506.10264  [pdf, ps, other]

    cs.AI

    WGSR-Bench: Wargame-based Game-theoretic Strategic Reasoning Benchmark for Large Language Models

    Authors: Qiyue Yin, Pei Xu, Qiaozhe Li, Shengda Liu, Shengqi Shen, Tong Wang, Yihong Han, Xiaonan Zhao, Likun Yang, Shiyue Cao, Shiyu Qiu, Yuxuan Liu, Shizhao Yu, Lei Cui, Chengxin Yan, Jie Sun, Xiangquan Tang, Kaiqi Huang

    Abstract: Recent breakthroughs in Large Language Models (LLMs) have led to a qualitative leap in artificial intelligence's performance on reasoning tasks, particularly demonstrating remarkable capabilities in mathematical, symbolic, and commonsense reasoning. However, as a critical component of advanced human cognition, strategic reasoning, i.e., the ability to assess multi-agent behaviors in dynamic envir…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 15 pages, 17 figures

  16. arXiv:2506.10235  [pdf, ps, other]

    cs.LG cs.AI cs.AR

    LaMAGIC2: Advanced Circuit Formulations for Language Model-Based Analog Topology Generation

    Authors: Chen-Chia Chang, Wan-Hsuan Lin, Yikang Shen, Yiran Chen, Xin Zhang

    Abstract: Automation of analog topology design is crucial due to customized requirements of modern applications with heavily manual engineering efforts. The state-of-the-art work applies a sequence-to-sequence approach and supervised finetuning on language models to generate topologies given user specifications. However, its circuit formulation is inefficient due to O(|V|^2) token length and suffers from lo…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: Accepted at 42nd International Conference on Machine Learning (ICML) 2025

  17. arXiv:2506.10220  [pdf, ps, other]

    astro-ph.GA astro-ph.CO

    Kinematic Confirmation of a Remarkable Linear Trail of Galaxies in the NGC 1052 Field, Consistent with Formation in a High-Speed Bullet Dwarf Collision

    Authors: Michael A. Keim, Pieter van Dokkum, Zili Shen, Harrison Souchereau, Imad Pasha, Shany Danieli, Roberto Abraham, Aaron J. Romanowsky, Yimeng Tang

    Abstract: A unique linear trail of diffuse galaxies was recently identified in the NGC 1052 field. This trail includes the remarkable, ultra-diffuse galaxies DF2 and DF4 which lack dark matter and host unusually luminous globular clusters. It has been proposed that the trail formed via a high-speed collision between two gas-rich dwarf galaxies. This scenario predicts that the trail galaxies are kinematicall…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: Accepted for publication in ApJ

  18. Physical Layer-Based Device Fingerprinting for Wireless Security: From Theory to Practice

    Authors: Junqing Zhang, Francesco Ardizzon, Mattia Piana, Guanxiong Shen, Stefano Tomasin

    Abstract: The identification of the devices from which a message is received is part of security mechanisms to ensure authentication in wireless communications. Conventional authentication approaches are cryptography-based, which, however, are usually computationally expensive and not adequate in the Internet of Things (IoT), where devices tend to be low-cost and with limited resources. This paper provides…

    Submitted 11 June, 2025; originally announced June 2025.

  19. arXiv:2506.09665  [pdf, ps, other]

    cs.GR cs.CV

    VideoMat: Extracting PBR Materials from Video Diffusion Models

    Authors: Jacob Munkberg, Zian Wang, Ruofan Liang, Tianchang Shen, Jon Hasselgren

    Abstract: We leverage finetuned video diffusion models, intrinsic decomposition of videos, and physically-based differentiable rendering to generate high quality materials for 3D models given a text prompt or a single image. We condition a video diffusion model to respect the input geometry and lighting condition. This model produces multiple views of a given 3D model with coherent material properties. Seco…

    Submitted 11 June, 2025; originally announced June 2025.

  20. arXiv:2506.09656  [pdf, ps, other]

    cs.AI

    Application-Driven Value Alignment in Agentic AI Systems: Survey and Perspectives

    Authors: Wei Zeng, Hengshu Zhu, Chuan Qin, Han Wu, Yihang Cheng, Sirui Zhang, Xiaowei Jin, Yinuo Shen, Zhenxing Wang, Feimin Zhong, Hui Xiong

    Abstract: The ongoing evolution of AI paradigms has propelled AI research into the Agentic AI stage. Consequently, the focus of research has shifted from single agents and simple applications towards multi-agent autonomous decision-making and task collaboration in complex environments. As Large Language Models (LLMs) advance, their applications become more diverse and complex, leading to increasingly situat…

    Submitted 11 June, 2025; originally announced June 2025.

  21. arXiv:2506.09645  [pdf, ps, other]

    cs.CL cs.IR cs.LG

    Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering

    Authors: Tianjun Yao, Haoxuan Li, Zhiqiang Shen, Pan Li, Tongliang Liu, Kun Zhang

    Abstract: Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains, but their reliability is hindered by the outdated knowledge and hallucinations. Retrieval-Augmented Generation mitigates these issues by grounding LLMs with external knowledge; however, most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning. Knowledg…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 32 pages, 28 figures

    ACM Class: I.2.6

  22. arXiv:2506.09386  [pdf, ps, other]

    hep-ex

    Search for the charmonium weak decays $J/ψ\to D_{s}^{-}ρ^{+}+c.c.$ and $J/ψ\to D_{s}^{-}π^{+}+c.c.$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (705 additional authors not shown)

    Abstract: Based on $(10087\pm44)\times 10^6$ $J/ψ$ events recorded with the BESIII detector, we search for the rare charmonium weak decays $J/ψ\to D_{s}^{-}ρ^{+}+c.c.$ and $J/ψ\to D_{s}^{-}π^{+}+c.c.$ No signal is observed, and upper limits on the branching fractions at the $90\%$ confidence level are set as $\mathcal{B}(J/ψ\to D_{s}^{-}ρ^{+}+c.c.)<8.0\times10^{-7}$ and…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 18 pages, 3 figures

  23. arXiv:2506.09369  [pdf, ps, other]

    cs.CV

    ScaleLSD: Scalable Deep Line Segment Detection Streamlined

    Authors: Zeran Ke, Bin Tan, Xianwei Zheng, Yujun Shen, Tianfu Wu, Nan Xue

    Abstract: This paper studies the problem of Line Segment Detection (LSD) for the characterization of line geometry in images, with the aim of learning a domain-agnostic robust LSD model that works well for any natural images. With the focus of scalable self-supervised learning of LSD, we revisit and streamline the fundamental designs of (deep and non-deep) LSD approaches to have a high-performing and effici…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: accepted to CVPR 2025; 17 pages, appendices included

  24. arXiv:2506.09351  [pdf, ps, other]

    cs.CL

    DIVE into MoE: Diversity-Enhanced Reconstruction of Large Language Models from Dense into Mixture-of-Experts

    Authors: Yuchen Feng, Bowen Shen, Naibin Gu, Jiaxuan Zhao, Peng Fu, Zheng Lin, Weiping Wang

    Abstract: Large language models (LLMs) with the Mixture-of-Experts (MoE) architecture achieve high cost-efficiency by selectively activating a subset of the parameters. Despite the inference efficiency of MoE LLMs, the training of extensive experts from scratch incurs substantial overhead, whereas reconstructing a dense LLM into an MoE LLM significantly reduces the training budget. However, existing reconst…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: ACL 2025

  25. arXiv:2506.09344  [pdf, ps, other]

    cs.AI cs.CL cs.CV cs.LG cs.SD eess.AS

    Ming-Omni: A Unified Multimodal Model for Perception and Generation

    Authors: Inclusion AI, Biao Gong, Cheng Zou, Chuanyang Zheng, Chunluan Zhou, Canxiang Yan, Chunxiang Jin, Chunjie Shen, Dandan Zheng, Fudong Wang, Furong Xu, GuangMing Yao, Jun Zhou, Jingdong Chen, Jianxin Sun, Jiajia Liu, Jianjiang Zhu, Jun Peng, Kaixiang Ji, Kaiyou Song, Kaimeng Ren, Libin Wang, Lixiang Ru, Lele Xie, Longhua Tan, et al. (33 additional authors not shown)

    Abstract: We propose Ming-Omni, a unified multimodal model capable of processing images, text, audio, and video, while demonstrating strong proficiency in both speech and image generation. Ming-Omni employs dedicated encoders to extract tokens from different modalities, which are then processed by Ling, an MoE architecture equipped with newly proposed modality-specific routers. This design enables a single…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 18 pages, 8 figures

  26. arXiv:2506.09260  [pdf, ps, other]

    cs.IR cs.CL

    ThinkQE: Query Expansion via an Evolving Thinking Process

    Authors: Yibin Lei, Tao Shen, Andrew Yates

    Abstract: Effective query expansion for web search benefits from promoting both exploration and result diversity to capture multiple interpretations and facets of a query. While recent LLM-based methods have improved retrieval performance and demonstrate strong domain generalization without additional training, they often generate narrowly focused expansions that overlook these desiderata. We propose ThinkQ…

    Submitted 10 June, 2025; originally announced June 2025.

  27. arXiv:2506.09182  [pdf, ps, other]

    cs.RO cs.ET

    Towards Full-Scenario Safety Evaluation of Automated Vehicles: A Volume-Based Method

    Authors: Hang Zhou, Chengyuan Ma, Shiyu Shen, Xiaopeng Li

    Abstract: With the rapid development of automated vehicles (AVs) in recent years, commercially available AVs are increasingly demonstrating high-level automation capabilities. However, most existing AV safety evaluation methods are primarily designed for simple maneuvers such as car-following and lane-changing. While suitable for basic tests, these methods are insufficient for assessing high-level automatio…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: NA

  28. arXiv:2506.09042  [pdf, ps, other]

    cs.CV

    Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models

    Authors: Xuanchi Ren, Yifan Lu, Tianshi Cao, Ruiyuan Gao, Shengyu Huang, Amirmojtaba Sabour, Tianchang Shen, Tobias Pfaff, Jay Zhangjie Wu, Runjian Chen, Seung Wook Kim, Jun Gao, Laura Leal-Taixe, Mike Chen, Sanja Fidler, Huan Ling

    Abstract: Collecting and annotating real-world data for safety-critical physical AI systems, such as Autonomous Vehicle (AV), is time-consuming and costly. It is especially challenging to capture rare edge cases, which play a critical role in training and testing of an AV system. To address this challenge, we introduce the Cosmos-Drive-Dreams - a synthetic data generation (SDG) pipeline that aims to generat…

    Submitted 11 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

    Comments: Only the core contributors are listed. The full list of contributors can be found in Appendix A of this paper

  29. arXiv:2506.08989  [pdf, ps, other]

    cs.LG cs.CL

    SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

    Authors: Xiao Liang, Zhong-Zhi Li, Yeyun Gong, Yang Wang, Hengyuan Zhang, Yelong Shen, Ying Nian Wu, Weizhu Chen

    Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has proven effective for training large language models (LLMs) on complex reasoning tasks, such as mathematical problem solving. A prerequisite for the scalability of RLVR is a high-quality problem set with precise and verifiable answers. However, the scarcity of well-crafted human-labeled math problems and limited-verification answers in exist…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: Reinforcement Learning; Large Language Models; LLM Reasoning

  30. arXiv:2506.08979  [pdf, other]

    cs.CV cs.RO

    Rethinking Range-View LiDAR Segmentation in Adverse Weather

    Authors: Longyu Yang, Ping Hu, Lu Zhang, Jun Liu, Yap-Peng Tan, Heng Tao Shen, Xiaofeng Zhu

    Abstract: LiDAR segmentation has emerged as an important task to enrich multimedia experiences and analysis. Range-view-based methods have gained popularity due to their high computational efficiency and compatibility with real-time deployment. However, their generalized performance under adverse weather conditions remains underexplored, limiting their reliability in real-world environments. In this work, w…

    Submitted 10 June, 2025; originally announced June 2025.

  31. arXiv:2506.08887  [pdf, ps, other]

    cs.CV

    DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval

    Authors: Leqi Shen, Guoqiang Gong, Tianxiang Hao, Tao He, Yifeng Zhang, Pengzhang Liu, Sicheng Zhao, Jungong Han, Guiguang Ding

    Abstract: The parameter-efficient adaptation of the image-text pretraining model CLIP for video-text retrieval is a prominent area of research. While CLIP is focused on image-level vision-language matching, video-text retrieval demands comprehensive understanding at the video level. Three key discrepancies emerge in the transfer from image-level to video-level: vision, language, and alignment. However, exis…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: CVPR 2025

  32. arXiv:2506.08873  [pdf]

    cond-mat.supr-con cond-mat.mtrl-sci

    Identifying vortex lattice in type-II superconductors via the dynamic magnetostrictive effect

    Authors: Peipei Lu, Mengju Yuan, Jing Zhang, Qiang Gao, Shuang Liu, Yugang Zhang, Shipeng Shen, Long Zhang, Jun Lu, Xiaoyuan Zhou, Mingquan He, Aifeng Wang, Yang Li, Wenshan Hong, Shiliang Li, Huiqian Luo, Xingjiang Zhou, Xianhui Chen, Young Sun, Yisheng Chai

    Abstract: In type-I superconductors, zero electrical resistivity and perfect diamagnetism define two fundamental criteria for superconducting behavior. In contrast, type-II superconductors exhibit more complex mixed-state physics, where magnetic flux penetrates the material above the lower critical field Hc1 in the form of quantized vortices, each carrying a single flux quantum. These vortices form a two-di…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 27 pages, 8 figures, submitted

  33. arXiv:2506.08863  [pdf, ps, other]

    astro-ph.SR

    Responses of a Coronal Hole to a Fast Flare-Driven Coronal Wave

    Authors: Xiaofan Zhang, Huadong Chen, Guiping Zhou, Li Feng, Yang Su, Jinhan Guo, Leping Li, Wei Lin, Suli Ma, Yuandeng Shen, Ruisheng Zheng, Suo Liu, Xianyong Bai, Yuanyong Deng, Jingxiu Wang

    Abstract: Coronal waves, significant solar phenomena, act as diagnostic tools for scientists studying solar atmosphere properties. Here, we present a novel observation detailing how a coronal wave event, associated with an X5.0 class flare, influenced the properties of an adjacent coronal hole through interaction. The coronal wave was observed in both extreme ultraviolet observations from the Atmospheric Im…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 10 pages, 5 figures, Accepted in ApJL

  34. arXiv:2506.08640  [pdf, ps, other]

    cs.CV

    Orientation Matters: Making 3D Generative Models Orientation-Aligned

    Authors: Yichong Lu, Yuzhuo Tian, Zijin Jiang, Yikun Zhao, Yuanbo Yang, Hao Ouyang, Haoji Hu, Huimin Yu, Yujun Shen, Yiyi Liao

    Abstract: Humans intuitively perceive object shape and orientation from a single image, guided by strong priors about canonical poses. However, existing 3D generative models often produce misaligned results due to inconsistent training data, limiting their usability in downstream tasks. To address this gap, we introduce the task of orientation-aligned 3D object generation: producing 3D objects from single i…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: Project Page: https://xdimlab.github.io/Orientation_Matters

  35. arXiv:2506.08632  [pdf, other]

    cs.CV

    RoboSwap: A GAN-driven Video Diffusion Framework For Unsupervised Robot Arm Swapping

    Authors: Yang Bai, Liudi Yang, George Eskandar, Fengyi Shen, Dong Chen, Mohammad Altillawi, Ziyuan Liu, Gitta Kutyniok

    Abstract: Recent advancements in generative models have revolutionized video synthesis and editing. However, the scarcity of diverse, high-quality datasets continues to hinder video-conditioned robotic learning, limiting cross-platform generalization. In this work, we address the challenge of swapping a robotic arm in one video with another: a key step for cross-embodiment learning. Unlike previous methods t…

    Submitted 10 June, 2025; originally announced June 2025.

  36. arXiv:2506.08624  [pdf, ps, other]

    nucl-ex

    Measurement of $ψ(2S)$ to $J/ψ$ cross-section ratio as function of multiplicity in $p$Pb collisions at $\sqrt{s_{NN}} = 8.16$ TeV

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, L. An, et al. (1137 additional authors not shown)

    Abstract: The production ratio of $ψ(2S)$ to $J/ψ$ charmonium states is presented as a function of multiplicity in proton-lead collisions at a centre-of-mass energy of $\sqrt{s_{NN}}=8.16$ TeV, for both prompt and nonprompt sources. The total luminosity recorded by the LHCb experiment corresponds to 13.6 $pb^{-1}$ for $p$Pb collisions and 20.8 $pb^{-1}$ for Pb$p$ collisions, where the first particle indicat…

    Submitted 12 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

    Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/4177/ (LHCb public pages)

    Report number: LHCb-PAPER-2025-011, CERN-EP-2025-114

  37. arXiv:2506.08591  [pdf, ps, other]

    cs.CV cs.LG cs.MM

    Diversity-Guided MLP Reduction for Efficient Large Vision Transformers

    Authors: Chengchao Shen, Hourun Zhu, Gongfan Fang, Jianxin Wang, Xinchao Wang

    Abstract: Transformer models achieve excellent scaling property, where the performance is improved with the increment of model capacity. However, large-scale model parameters lead to an unaffordable cost of computing and memory. We analyze popular transformer architectures and find that multilayer perceptron (MLP) modules take up the majority of model parameters. To this end, we focus on the recoverability…

    Submitted 10 June, 2025; originally announced June 2025.

  38. arXiv:2506.08576  [pdf, ps, other]

    hep-ex

    Measurement of the $η$ transition form factor through $η' \rightarrow π^+π^-η$ decay

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (680 additional authors not shown)

    Abstract: Based on a sample of $(1.0087\pm0.0044)\times10^{10}$ $J/ψ$ events collected at BESIII, the transition form factor of the $η$ meson is extracted by analyzing $J/ψ\toγη',~η'\toπ^+π^-η,~η\toγl^+l^-$ ($l$=$e$, $μ$) events. The measured slope of the transition form factor is $Λ^{-2}=1.645\pm0.093_{\rm stat.}\pm {0.024_{\rm sys.}}$ (GeV/$c^2$)$^{-2}$ for the di-electron channel and…

    Submitted 10 June, 2025; originally announced June 2025.

  39. arXiv:2506.08470  [pdf, ps, other]

    cs.CV

    MARMOT: Masked Autoencoder for Modeling Transient Imaging

    Authors: Siyuan Shen, Ziheng Wang, Xingyue Peng, Suan Xia, Ruiqian Li, Shiying Li, Jingyi Yu

    Abstract: Pretrained models have demonstrated impressive success in many modalities such as language and vision. Recent works facilitate the pretraining paradigm in imaging research. Transients are a novel modality, which are captured for an object as photon counts versus arrival times using a precisely time-resolved sensor. In particular for non-line-of-sight (NLOS) scenarios, transients of hidden objects…

    Submitted 10 June, 2025; originally announced June 2025.

  40. arXiv:2506.08421  [pdf]

    physics.acc-ph

    High-precision Beam Optics Calculation of the HIAF-BRing Using Measured Fields

    Authors: Ke Wang, Li-Na Sheng, Geng Wang, Wei-Ping Chai, You-Jin Yuan, Jian-Cheng Yang, Guo-Dong Shen, Liang Lu

    Abstract: The construction of the High Intensity heavy ion Accelerator Facility (HIAF) has been completed, with current efforts focused on subsystem commissioning. Beam commissioning is scheduled for autumn 2025, marking a critical milestone in HIAF's operational readiness. This paper presents high-precision optics calculations for the Booster Ring (BRing) of HIAF, a key component for achieving stable heavy…

    Submitted 9 June, 2025; originally announced June 2025.

  41. arXiv:2506.08390  [pdf, ps, other]

    cs.AI

    On Reasoning Strength Planning in Large Reasoning Models

    Authors: Leheng Sheng, An Zhang, Zijian Wu, Weixiang Zhao, Changshuo Shen, Yi Zhang, Xiang Wang, Tat-Seng Chua

    Abstract: Recent studies empirically reveal that large reasoning models (LRMs) can automatically allocate more reasoning strengths (i.e., the number of reasoning tokens) for harder problems, exhibiting difficulty-awareness for better task performance. While this automatic reasoning strength allocation phenomenon has been widely observed, its underlying mechanism remains largely unexplored. To this end, we p…

    Submitted 9 June, 2025; originally announced June 2025.

  42. arXiv:2506.08371  [pdf, ps, other]

    cs.CL

    Mitigating Posterior Salience Attenuation in Long-Context LLMs with Positional Contrastive Decoding

    Authors: Zikai Xiao, Ziyang Wang, Wen Ma, Yan Zhang, Wei Shen, Yan Wang, Luqi Gong, Zuozhu Liu

    Abstract: While Large Language Models (LLMs) support long contexts, they struggle with performance degradation within the context window. Current solutions incur prohibitive training costs, leaving statistical behaviors and cost-effective approaches underexplored. From the decoding perspective, we identify the Posterior Salience Attenuation (PSA) phenomenon, where the salience ratio correlates with long-tex…

    Submitted 10 June, 2025; v1 submitted 9 June, 2025; originally announced June 2025.

  43. arXiv:2506.08368  [pdf, ps, other]

    astro-ph.HE astro-ph.CO astro-ph.GA

    Prospects for Time-Domain and Multi-Messenger Science with eXTP

    Authors: Shu-Xu Yi, Wen Zhao, Ren-Xin Xu, Xue-Feng Wu, Giulia Stratta, Simone Dall'Osso, Yan-Jun Xu, Andrea Santangelo, Silvia Zane, Shuang-Nan Zhang, Hua Feng, Huan Yang, Junjie Mao, Junqiang Ge, Lijing Shao, Mi-Xiang Lan, He Gao, Lin Lin, Ning Jiang, Qingwen Wu, Tong Liu, Yun-Wei Yu, Xiang-Yu Wang, Jin Zhang, Dafne Guetta, et al. (49 additional authors not shown)

    Abstract: In this new era of time-domain and multi-messenger astronomy, various new transients and new phenomena are constantly being discovered thanks to the rapid advances in observations, which provide the excellent opportunity to study the physics in the extreme environments. The enhanced X-ray Timing and Polarimetry mission (eXTP), planned to be launched in 2030, has several key advantages, including a…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: Submitted to the SCIENCE CHINA Physics, Mechanics & Astronomy

  44. arXiv:2506.08352  [pdf, ps, other]

    cs.IR

    Reinforcement Fine-Tuning for Reasoning towards Multi-Step Multi-Source Search in Large Language Models

    Authors: Wentao Shi, Yiqing Shen

    Abstract: Large language models (LLMs) can face factual limitations when responding to time-sensitive queries about recent events that arise after their knowledge thresholds in the training corpus. Existing search-augmented approaches fall into two categories, each with distinct limitations: multi-agent search frameworks incur substantial computational overhead by separating search planning and response syn…

    Submitted 9 June, 2025; originally announced June 2025.

  45. arXiv:2506.08326  [pdf, other]

    cs.LG cs.AI

    Graph Prompting for Graph Learning Models: Recent Advances and Future Directions

    Authors: Xingbo Fu, Zehong Wang, Zihan Chen, Jiazheng Li, Yaochen Zhu, Zhenyu Lei, Cong Shen, Yanfang Ye, Chuxu Zhang, Jundong Li

    Abstract: Graph learning models have demonstrated great prowess in learning expressive representations from large-scale graph data in a wide variety of real-world scenarios. As a prevalent strategy for training powerful graph learning models, the "pre-training, adaptation" scheme first pre-trains graph learning models on unlabeled graph data in a self-supervised manner and then adapts them to specific downs…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: Accepted by KDD 2025 Tutorial/Survey Track

  46. arXiv:2506.08319  [pdf, ps, other]

    eess.SY cs.RO

    DEKC: Data-Enable Control for Tethered Space Robot Deployment in the Presence of Uncertainty via Koopman Operator Theory

    Authors: Ao Jin, Qinyi Wang, Sijie Wen, Ya Liu, Ganghui Shen, Panfeng Huang, Fan Zhang

    Abstract: This work focuses the deployment of tethered space robot in the presence of unknown uncertainty. A data-enable framework called DEKC which contains offline training part and online execution part is proposed to deploy tethered space robot in the presence of uncertainty. The main idea of this work is modeling the unknown uncertainty as a dynamical system, which enables high accuracy and convergence…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: 12 pages

  47. arXiv:2506.08228  [pdf, ps, other]

    cs.LG cs.AI cs.RO

    Scaling Laws of Motion Forecasting and Planning -- A Technical Report

    Authors: Mustafa Baniodeh, Kratarth Goel, Scott Ettinger, Carlos Fuertes, Ari Seff, Tim Shen, Cole Gulino, Chenjie Yang, Ghassen Jerfel, Dokook Choe, Rui Wang, Vinutha Kallem, Sergio Casas, Rami Al-Rfou, Benjamin Sapp, Dragomir Anguelov

    Abstract: We study the empirical scaling laws of a family of encoder-decoder autoregressive transformer models on the task of joint motion forecasting and planning in the autonomous driving domain. Using a 500 thousand hours driving dataset, we demonstrate that, similar to language modeling, model performance improves as a power-law function of the total compute budget, and we observe a strong correlation b…

    Submitted 9 June, 2025; originally announced June 2025.

  48. arXiv:2506.08123  [pdf, ps, other]

    cs.CL

    QA-LIGN: Aligning LLMs through Constitutionally Decomposed QA

    Authors: Jacob Dineen, Aswin RRV, Qin Liu, Zhikun Xu, Xiao Ye, Ming Shen, Zhaonan Li, Shijie Lu, Chitta Baral, Muhao Chen, Ben Zhou

    Abstract: Alignment of large language models with explicit principles (such as helpfulness, honesty, and harmlessness) is crucial for ensuring safe and reliable AI systems. However, standard reward-based alignment methods typically collapse diverse feedback into a single scalar reward, entangling multiple objectives into one opaque training signal, which hinders interpretability. In this work, we introduce…

    Submitted 9 June, 2025; originally announced June 2025.

  49. arXiv:2506.08105  [pdf, ps, other]

    astro-ph.HE gr-qc

    Probing the Strong Gravity Region of Black Holes with eXTP

    Authors: Qingcui Bu, Cosimo Bambi, Lijun Gou, Yanjun Xu, Phil Uttley, Alessandra De Rosa, Andrea Santangelo, Silvia Zane, Hua Feng, Shuang-Nan Zhang, Chichuan Jin, Haiwu Pan, Xinwen Shu, Francesco Ursini, Yanan Wang, Jianfeng Wu, Bei You, Yefei Yuan, Wenda Zhang, Stefano Bianchi, Lixin Dai, Tiziana Di Salvo, Michal Dovciak, Yuan Feng, Hengxiao Guo, et al. (18 additional authors not shown)

    Abstract: We present the novel capabilities of the enhanced X-ray Timing and Polarimetry (eXTP) mission to study the strong gravity region around stellar-mass black holes in X-ray binary systems and supermassive black holes in active galactic nuclei. eXTP can combine X-ray spectral, timing, and polarimetric techniques to study the accretion process near black holes, measure black hole masses and spins, and…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: submitted to the SCIENCE CHINA Physics, Mechanics & Astronomy

  50. arXiv:2506.08104  [pdf, ps, other]

    astro-ph.HE astro-ph.SR hep-ph nucl-th

    Dense Matter in Neutron Stars with eXTP

    Authors: Ang Li, Anna L. Watts, Guobao Zhang, Sebastien Guillot, Yanjun Xu, Andrea Santangelo, Silvia Zane, Hua Feng, Shuang-Nan Zhang, Mingyu Ge, Liqiang Qi, Tuomo Salmi, Bas Dorsman, Zhiqiang Miao, Zhonghao Tu, Yuri Cavecchi, Xia Zhou, Xiaoping Zheng, Weihua Wang, Quan Cheng, Xuezhi Liu, Yining Wei, Wei Wang, Yujing Xu, Shanshan Weng, et al. (58 additional authors not shown)

    Abstract: In this White Paper, we present the potential of the enhanced X-ray Timing and Polarimetry (eXTP) mission to constrain the equation of state of dense matter in neutron stars, exploring regimes not directly accessible to terrestrial experiments. By observing a diverse population of neutron stars - including isolated objects, X-ray bursters, and accreting systems - eXTP's unique combination of timin…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: submitted to the SCIENCE CHINA Physics, Mechanics & Astronomy