Showing 1–50 of 459 results for author: Xinyang

  1. arXiv:2506.10808  [pdf, ps, other]

    gr-qc cond-mat.stat-mech hep-th

    Analogous supercritical crossovers in black holes and water

    Authors: Shoucheng Wang, Xinyang Li, Yuliang Jin, Li Li

    Abstract: We investigate the supercritical crossovers for black hole thermodynamics in the supercritical regime beyond the critical point, where small and large black holes are indistinguishable from the conventional viewpoint. We establish a refined supercritical phase diagram that comprehensively characterizes small, large, and indistinguishable black hole phases, whose boundaries are defined by two super…

    Submitted 12 June, 2025; originally announced June 2025.

    Comments: 8 pages, 6 figures

  2. arXiv:2506.08849  [pdf, ps, other]

    cs.CV

    Adapting Vision-Language Foundation Model for Next Generation Medical Ultrasound Image Analysis

    Authors: Jingguo Qu, Xinyang Han, Tonghuan Xiao, Jia Ai, Juan Wu, Tong Zhao, Jing Qin, Ann Dorothy King, Winnie Chiu-Wing Chu, Jing Cai, Michael Tin-Cheung Ying

    Abstract: Medical ultrasonography is an essential imaging technique for examining superficial organs and tissues, including lymph nodes, breast, and thyroid. It employs high-frequency ultrasound waves to generate detailed images of the internal structures of the human body. However, manually contouring regions of interest in these images is a labor-intensive task that demands expertise and often results in…

    Submitted 10 June, 2025; v1 submitted 10 June, 2025; originally announced June 2025.

  3. arXiv:2506.06254  [pdf, ps, other]

    cs.AI cs.CL cs.LG

    PersonaAgent: When Large Language Model Agents Meet Personalization at Test Time

    Authors: Weizhi Zhang, Xinyang Zhang, Chenwei Zhang, Liangwei Yang, Jingbo Shang, Zhepei Wei, Henry Peng Zou, Zijie Huang, Zhengyang Wang, Yifan Gao, Xiaoman Pan, Lian Xiong, Jingguo Liu, Philip S. Yu, Xian Li

    Abstract: Large Language Model (LLM) empowered agents have recently emerged as advanced paradigms that exhibit impressive capabilities in a wide range of domains and tasks. Despite their potential, current LLM agents often adopt a one-size-fits-all approach, lacking the flexibility to respond to users' varying needs and preferences. This limitation motivates us to develop PersonaAgent, the first personalize…

    Submitted 6 June, 2025; originally announced June 2025.

  4. arXiv:2506.05901  [pdf, other]

    cs.CL cs.AI

    Route-and-Reason: Scaling Large Language Model Reasoning with Reinforced Model Router

    Authors: Chenyang Shao, Xinyang Liu, Yutang Lin, Fengli Xu, Yong Li

    Abstract: Multi-step reasoning has proven essential for enhancing the problem-solving capabilities of Large Language Models (LLMs) by decomposing complex tasks into intermediate steps, either explicitly or implicitly. Extending the reasoning chain at test time through deeper thought processes or broader exploration can further improve performance, but often incurs substantial costs due to the explosion in…

    Submitted 6 June, 2025; originally announced June 2025.

  5. arXiv:2506.05828  [pdf, ps, other]

    cs.CL cs.CE

    FinanceReasoning: Benchmarking Financial Numerical Reasoning More Credible, Comprehensive and Challenging

    Authors: Zichen Tang, Haihong E, Ziyan Ma, Haoyang He, Jiacheng Liu, Zhongjun Yang, Zihua Rong, Rongjin Li, Kun Ji, Qing Huang, Xinyang Hu, Yang Liu, Qianhe Zheng

    Abstract: We introduce FinanceReasoning, a novel benchmark designed to evaluate the reasoning capabilities of large reasoning models (LRMs) in financial numerical reasoning problems. Compared to existing benchmarks, our work provides three key advancements. (1) Credibility: We update 15.6% of the questions from four public datasets, annotating 908 new questions with detailed Python solutions and rigorously…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: Accepted by ACL 2025 Main Conference

  6. arXiv:2506.03391  [pdf, ps, other]

    cs.IR cs.AI cs.DB cs.LG

    Universal Reusability in Recommender Systems: The Case for Dataset- and Task-Independent Frameworks

    Authors: Tri Kurniawan Wijaya, Xinyang Shao, Gonzalo Fiz Pontiveros, Edoardo D'Amico

    Abstract: Recommender systems are pivotal in delivering personalized experiences across industries, yet their adoption and scalability remain hindered by the need for extensive dataset- and task-specific configurations. Existing systems often require significant manual intervention, domain expertise, and engineering effort to adapt to new datasets or tasks, creating barriers to entry and limiting reusabilit…

    Submitted 3 June, 2025; originally announced June 2025.

  7. arXiv:2506.02774  [pdf, ps, other]

    cs.GR

    Voyager: Real-Time Splatting City-Scale 3D Gaussians on Your Phone

    Authors: Zheng Liu, He Zhu, Xinyang Li, Yirun Wang, Yujiao Shi, Wei Li, Jingwen Leng, Minyi Guo, Yu Feng

    Abstract: 3D Gaussian Splatting (3DGS) is an emerging technique for photorealistic 3D scene rendering. However, rendering city-scale 3DGS scenes on mobile devices, e.g., your smartphones, remains a significant challenge due to the limited resources on mobile devices. A natural solution is to offload computation to the cloud; however, naively streaming rendered frames from the cloud to the client introduces…

    Submitted 3 June, 2025; v1 submitted 3 June, 2025; originally announced June 2025.

  8. arXiv:2505.24125  [pdf]

    q-bio.NC

    Weak but influential: Nonlinear contributions of structural connectivity to human cognitive abilities and brain functions

    Authors: Rong Wang, Zhao Chang, Xuechun Liu, Daniel Kristanto, Étienne Gérard Guy Gartner, Xinyang Liu, Mianxin Liu, Ying Wu, Ming Lui, Changsong Zhou

    Abstract: Diverse human cognitive abilities are rooted in brain structural connectivity which has weights spanning several orders of magnitude. However, due to false-positive challenges in tractography, weak connectivity has been often treated as noise and ignored - despite its prevalence across mammalian brains. Here we show that weak connectivity significantly predicts human cognitive abilities and suppor…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 26 pages, 6 figures

  9. arXiv:2505.23940  [pdf, ps, other]

    physics.flu-dyn

    Diff-FlowFSI: A GPU-Optimized Differentiable CFD Platform for High-Fidelity Turbulence and FSI Simulations

    Authors: Xiantao Fan, Xinyang Liu, Meng Wang, Jian-Xun Wang

    Abstract: Turbulent flows and fluid-structure interactions (FSI) are ubiquitous in scientific and engineering applications, but their accurate and efficient simulation remains a major challenge due to strong nonlinearities, multiscale interactions, and high computational demands. Traditional CFD solvers, though effective, struggle with scalability and adaptability for tasks such as inverse modeling, optimiz…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 59 pages, 21 figures

  10. arXiv:2505.23922  [pdf, ps, other]

    cs.CV cs.CL

    ScaleLong: A Multi-Timescale Benchmark for Long Video Understanding

    Authors: David Ma, Huaqing Yuan, Xingjian Wang, Qianbo Zang, Tianci Liu, Xinyang He, Yanbin Wei, Jiawei Guo, Ni Jiahui, Zhenzhu Yang, Meng Cao, Shanghaoran Quan, Yizhi Li, Wangchunshu Zhou, Jiaheng Liu, Wenhao Huang, Ge Zhang, Shiwen Ni, Xiaojie Jin

    Abstract: Although long-video understanding demands that models capture hierarchical temporal information -- from clip (seconds) and shot (tens of seconds) to event (minutes) and story (hours) -- existing benchmarks either neglect this multi-scale design or scatter scale-specific questions across different videos, preventing direct comparison of model performance across timescales on the same content. To ad…

    Submitted 29 May, 2025; originally announced May 2025.

  11. arXiv:2505.23687  [pdf]

    physics.optics physics.app-ph

    Enhanced Light Extraction and Beam Focusing in GaN LEDs Using Hybrid Metasurface-Distributed Bragg Reflector Structures

    Authors: Hanbo Xu, Xinyang Liu, Lei Wang

    Abstract: This study presents an optimized hybrid design integrating a distributed Bragg reflector (DBR) and a TiO2 nanocylinder metasurface to enhance light extraction efficiency (LEE) and beam directionality (narrow divergence angle) in light-emitting diodes (LEDs) based on gallium nitride (GaN). Parametric simulations were used to identify an optimal device architecture. The resulting structure comprises a…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 25 pages, 10 charts

    Report number: JNO25-2929

  12. arXiv:2505.21962  [pdf, ps, other]

    cs.CV

    A2Seek: Towards Reasoning-Centric Benchmark for Aerial Anomaly Understanding

    Authors: Mengjingcheng Mo, Xinyang Tong, Jiaxu Leng, Mingpi Tan, Jiankang Zheng, Yiran Liu, Haosheng Chen, Ji Gan, Weisheng Li, Xinbo Gao

    Abstract: While unmanned aerial vehicles (UAVs) offer wide-area, high-altitude coverage for anomaly detection, they face challenges such as dynamic viewpoints, scale variations, and complex scenes. Existing datasets and methods, mainly designed for fixed ground-level views, struggle to adapt to these conditions, leading to significant performance drops in drone-view scenarios. To bridge this gap, we introdu…

    Submitted 28 May, 2025; originally announced May 2025.

  13. arXiv:2505.21396  [pdf, ps, other]

    cs.CL cs.AI cs.CY cs.HC

    Improving Research Idea Generation Through Data: An Empirical Investigation in Social Science

    Authors: Xiao Liu, Xinyi Dong, Xinyang Gao, Yansong Feng, Xun Pang

    Abstract: Recent advancements in large language models (LLMs) have shown promise in generating novel research ideas. However, these ideas often face challenges related to feasibility and expected effectiveness. This paper explores how augmenting LLMs with relevant data during the idea generation process can enhance the quality of generated ideas. We introduce two ways of incorporating data: (1) providing me…

    Submitted 27 May, 2025; originally announced May 2025.

  14. arXiv:2505.16314  [pdf, ps, other]

    cs.CV cs.AI

    NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment

    Authors: Shuhao Han, Haotian Fan, Fangyuan Kong, Wenjie Liao, Chunle Guo, Chongyi Li, Radu Timofte, Liang Li, Tao Li, Junhui Cui, Yunqiu Wang, Yang Tai, Jingwei Sun, Jianhui Sun, Xinli Yue, Tianyi Wang, Huan Hou, Junda Lu, Xinyang Huang, Zitang Zhou, Zijian Zhang, Xuhui Zheng, Xuecheng Wu, Chong Peng, Xuezhi Cao, et al. (90 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2025 challenge on Text to Image (T2I) generation model quality assessment, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2025. The aim of this challenge is to address the fine-grained quality assessment of text-to-image generation models. This challenge evaluates text-to-image models from two aspe…

    Submitted 22 May, 2025; originally announced May 2025.

  15. arXiv:2505.16166  [pdf, ps, other]

    cs.CV

    TRAIL: Transferable Robust Adversarial Images via Latent diffusion

    Authors: Yuhao Xue, Zhifei Zhang, Xinyang Jiang, Yifei Shen, Junyao Gao, Wentao Gu, Jiale Zhao, Miaojing Shi, Cairong Zhao

    Abstract: Adversarial attacks exploiting unrestricted natural perturbations present severe security risks to deep learning systems, yet their transferability across models remains limited due to distribution mismatches between generated adversarial features and real-world data. While recent works utilize pre-trained diffusion models as adversarial priors, they still encounter challenges due to the distribut…

    Submitted 21 May, 2025; originally announced May 2025.

  16. arXiv:2505.06118  [pdf, ps, other]

    eess.IV cs.AI cs.CV

    The Application of Deep Learning for Lymph Node Segmentation: A Systematic Review

    Authors: Jingguo Qu, Xinyang Han, Man-Lik Chui, Yao Pu, Simon Takadiyi Gunda, Ziman Chen, Jing Qin, Ann Dorothy King, Winnie Chiu-Wing Chu, Jing Cai, Michael Tin-Cheung Ying

    Abstract: Automatic lymph node segmentation is the cornerstone for advances in computer vision tasks for early detection and staging of cancer. Traditional segmentation methods are constrained by manual delineation and variability in operator proficiency, limiting their ability to achieve high accuracy. The introduction of deep learning technologies offers new possibilities for improving the accuracy of lym…

    Submitted 9 May, 2025; originally announced May 2025.

  17. arXiv:2505.05064  [pdf, ps, other]

    cs.LG

    WaterDrum: Watermarking for Data-centric Unlearning Metric

    Authors: Xinyang Lu, Xinyuan Niu, Gregory Kang Ruey Lau, Bui Thi Cam Nhung, Rachael Hwee Ling Sim, Fanyu Wen, Chuan-Sheng Foo, See-Kiong Ng, Bryan Kian Hsiang Low

    Abstract: Large language model (LLM) unlearning is critical in real-world applications where it is necessary to efficiently remove the influence of private, copyrighted, or harmful data from some users. However, existing utility-centric unlearning metrics (based on model utility) may fail to accurately evaluate the extent of unlearning in realistic settings such as when (a) the forget and retain set have se…

    Submitted 8 May, 2025; originally announced May 2025.

  18. arXiv:2505.04376  [pdf, other]

    eess.IV cs.CV

    Label-efficient Single Photon Images Classification via Active Learning

    Authors: Zili Zhang, Ziting Wen, Yiheng Qiang, Hongzhou Dong, Wenle Dong, Xinyang Li, Xiaofan Wang, Xiaoqiang Ren

    Abstract: Single-photon LiDAR achieves high-precision 3D imaging in extreme environments through quantum-level photon detection technology. Current research primarily focuses on reconstructing 3D scenes from sparse photon events, whereas the semantic interpretation of single-photon images remains underexplored, due to high annotation costs and inefficient labeling strategies. This paper presents the first a…

    Submitted 7 May, 2025; originally announced May 2025.

  19. arXiv:2505.03912  [pdf, other]

    cs.RO cs.CV

    OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation

    Authors: Can Cui, Pengxiang Ding, Wenxuan Song, Shuanghao Bai, Xinyang Tong, Zirui Ge, Runze Suo, Wanqi Zhou, Yang Liu, Bofang Jia, Han Zhao, Siteng Huang, Donglin Wang

    Abstract: Dual-system VLA (Vision-Language-Action) architectures have become a hot topic in embodied intelligence research, but there is a lack of sufficient open-source work for further performance analysis and optimization. To address this problem, this paper will summarize and compare the structural designs of existing dual-system architectures, and conduct systematic empirical evaluations on the core de…

    Submitted 6 May, 2025; originally announced May 2025.

  20. arXiv:2505.02056  [pdf, other]

    cs.CV cs.LG

    Handling Imbalanced Pseudolabels for Vision-Language Models with Concept Alignment and Confusion-Aware Calibrated Margin

    Authors: Yuchen Wang, Xuefeng Bai, Xiucheng Li, Weili Guan, Liqiang Nie, Xinyang Chen

    Abstract: Adapting vision-language models (VLMs) to downstream tasks with pseudolabels has gained increasing attention. A major obstacle is that the pseudolabels generated by VLMs tend to be imbalanced, leading to inferior performance. While existing methods have explored various strategies to address this, the underlying causes of imbalance remain insufficiently investigated. To fill this gap, we delve int…

    Submitted 4 May, 2025; originally announced May 2025.

    Comments: Accepted to ICML 2025

  21. arXiv:2504.20285  [pdf, other]

    cs.IT eess.SP math.OC

    Computation of Capacity-Distortion-Cost Functions for Continuous Memoryless Channels

    Authors: Xinyang Li, Ziyou Tang, Vlad C. Andrei, Ullrich J. Mönich, Fan Liu, Holger Boche

    Abstract: This paper aims at computing the capacity-distortion-cost (CDC) function for continuous memoryless channels, which is defined as the supremum of the mutual information between channel input and output, constrained by an input cost and an expected distortion of estimating channel state. Solving the optimization problem is challenging because the input distribution does not lie in a finite-dimension…

    Submitted 28 April, 2025; originally announced April 2025.

    Comments: Accepted by ISIT 2025

  22. arXiv:2504.19506  [pdf, other]

    cs.CV

    SynergyAmodal: Deocclude Anything with Text Control

    Authors: Xinyang Li, Chengjie Yi, Jiawei Lai, Mingbao Lin, Yansong Qu, Shengchuan Zhang, Liujuan Cao

    Abstract: Image deocclusion (or amodal completion) aims to recover the invisible regions (i.e., shape and appearance) of occluded instances in images. Despite recent advances, the scarcity of high-quality data that balances diversity, plausibility, and fidelity remains a major obstacle. To address this challenge, we identify three critical elements: leveraging in-the-wild image data for diversity, incorporat…

    Submitted 28 April, 2025; originally announced April 2025.

    Comments: 17 pages

  23. arXiv:2504.18809  [pdf]

    physics.optics quant-ph

    Metasurface-Assisted Adaptive Quantum Phase Contrast Imaging

    Authors: Xiaojing Feng, Juanzi He, Xingyu Liu, Xiaoshu Zhu, Yifan Zhou, Xinyang Feng, Shuming Wang

    Abstract: Quantum imaging employs the nonclassical correlation of photons to break through the noise limitation of classical imaging, realizing high sensitivity, high SNR imaging and multifunctional image processing. To enhance the flexibility and imaging performance of the optical systems, metasurfaces composed of subwavelength structural units provide a powerful optimization approach, enabling advanced ap…

    Submitted 26 April, 2025; originally announced April 2025.

  24. arXiv:2504.13153  [pdf, other]

    cs.CV

    Training-Free Hierarchical Scene Understanding for Gaussian Splatting with Superpoint Graphs

    Authors: Shaohui Dai, Yansong Qu, Zheyan Li, Xinyang Li, Shengchuan Zhang, Liujuan Cao

    Abstract: Bridging natural language and 3D geometry is a crucial step toward flexible, language-driven scene understanding. While recent advances in 3D Gaussian Splatting (3DGS) have enabled fast and high-quality scene reconstruction, research has also explored incorporating open-vocabulary understanding into 3DGS. However, most existing methods require iterative optimization over per-view 2D semantic featu…

    Submitted 17 April, 2025; originally announced April 2025.

  25. arXiv:2504.12527  [pdf]

    q-bio.OT eess.IV

    Analysis of the MICCAI Brain Tumor Segmentation -- Metastases (BraTS-METS) 2025 Lighthouse Challenge: Brain Metastasis Segmentation on Pre- and Post-treatment MRI

    Authors: Nazanin Maleki, Raisa Amiruddin, Ahmed W. Moawad, Nikolay Yordanov, Athanasios Gkampenis, Pascal Fehringer, Fabian Umeh, Crystal Chukwurah, Fatima Memon, Bojan Petrovic, Justin Cramer, Mark Krycia, Elizabeth B. Shrickel, Ichiro Ikuta, Gerard Thompson, Lorenna Vidal, Vilma Kosovic, Adam E. Goldman-Yassen, Virginia Hill, Tiffany So, Sedra Mhana, Albara Alotaibi, Nathan Page, Prisha Bhatia, Yasaman Sharifi, et al. (218 additional authors not shown)

    Abstract: Despite continuous advancements in cancer treatment, brain metastatic disease remains a significant complication of primary cancer and is associated with an unfavorable prognosis. One approach for improving diagnosis, management, and outcomes is to implement algorithms based on artificial intelligence for the automated segmentation of both pre- and post-treatment MRI brain images. Such algorithms…

    Submitted 6 May, 2025; v1 submitted 16 April, 2025; originally announced April 2025.

    Comments: 28 pages, 4 figures, 2 tables

  26. arXiv:2504.09466  [pdf, other]

    cs.CR cs.CL

    AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender

    Authors: Weixiang Zhao, Jiahe Guo, Yulin Hu, Yang Deng, An Zhang, Xingyu Sui, Xinyang Han, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu

    Abstract: Despite extensive efforts in safety alignment, large language models (LLMs) remain vulnerable to jailbreak attacks. Activation steering offers a training-free defense method but relies on fixed steering coefficients, resulting in suboptimal protection and increased false rejections of benign inputs. To address this, we propose AdaSteer, an adaptive activation steering method that dynamically adjus…

    Submitted 13 April, 2025; originally announced April 2025.

    Comments: 17 pages, 6 figures, 9 tables

  27. arXiv:2504.07881  [pdf]

    q-bio.GN

    An LLM-Driven Multi-Agent Debate System for Mendelian Diseases

    Authors: Xinyang Zhou, Yongyong Ren, Qianqian Zhao, Daoyi Huang, Xinbo Wang, Tingting Zhao, Zhixing Zhu, Wenyuan He, Shuyuan Li, Yan Xu, Yu Sun, Yongguo Yu, Shengnan Wu, Jian Wang, Guangjun Yu, Dake He, Bo Ban, Hui Lu

    Abstract: Accurate diagnosis of Mendelian diseases is crucial for precision therapy and assistance in preimplantation genetic diagnosis. However, existing methods often fall short of clinical standards or depend on extensive datasets to build pretrained machine learning models. To address this, we introduce an innovative LLM-Driven multi-agent debate system (MD2GPS) with natural language explanations of the…

    Submitted 11 April, 2025; v1 submitted 10 April, 2025; originally announced April 2025.

    Comments: 21 pages, 5 figures, 1 table

  28. arXiv:2504.07547  [pdf, other]

    eess.SY

    Strategic learning for disturbance rejection in multi-agent systems: Nash and Minmax in graphical games

    Authors: Xinyang Wang, Martin Guay, Shimin Wang, Hongwei Zhang

    Abstract: This article investigates the optimal control problem with disturbance rejection for discrete-time multi-agent systems under cooperative and non-cooperative graphical games frameworks. Given the practical challenges of obtaining accurate models, Q-function-based policy iteration methods are proposed to seek the Nash equilibrium solution for the cooperative graphical game and the distributed minmax…

    Submitted 10 April, 2025; originally announced April 2025.

  29. arXiv:2504.05052  [pdf, other]

    physics.soc-ph

    Assess Space-Based Solar Power in European-Scale Power System Decarbonization

    Authors: Xinyang Che, Lijun Liu, Wei He

    Abstract: Meeting net-zero targets remains formidable as terrestrial renewables grapple with intermittency and regional variability. Here, we integrate space-based solar power (SBSP) -- a potential near-constant, orbital solar technology -- into a high-resolution, Europe-wide capacity-expansion and dispatch model to quantify its contribution under net-zero constraints. We examine two advanced SBSP designs:…

    Submitted 7 April, 2025; originally announced April 2025.

  30. arXiv:2503.21684  [pdf, other]

    math.AP

    Decorated phases in triblock copolymers: zeroth- and first-order analysis

    Authors: Stanley Alama, Lia Bronsard, Xinyang Lu, Chong Wang

    Abstract: We study a two-dimensional inhibitory ternary system characterized by a free energy functional which combines an interface short-range interaction energy promoting micro-domain growth with a Coulomb-type long-range interaction energy which prevents micro-domains from unlimited spreading. Here we consider a scenario in which two species are dominant and one species is vanishingly small. In this sce…

    Submitted 27 March, 2025; originally announced March 2025.

  31. arXiv:2503.13882  [pdf, other]

    cs.LG cs.AI

    MoK-RAG: Mixture of Knowledge Paths Enhanced Retrieval-Augmented Generation for Embodied AI Environments

    Authors: Zhengsheng Guo, Linwei Zheng, Xinyang Chen, Xuefeng Bai, Kehai Chen, Min Zhang

    Abstract: While human cognition inherently retrieves information from diverse and specialized knowledge sources during decision-making processes, current Retrieval-Augmented Generation (RAG) systems typically operate through single-source knowledge retrieval, leading to a cognitive-algorithmic discrepancy. To bridge this gap, we introduce MoK-RAG, a novel multi-source RAG framework that implements a mixture…

    Submitted 18 March, 2025; originally announced March 2025.

  32. arXiv:2503.08590  [pdf, ps, other]

    math.FA math.CV

    A counterexample of the Fredholm of Toeplitz operator

    Authors: Hua Liu, Xinyang Zhang

    Abstract: In this paper we study the essential spectra of the Toeplitz operator on the Hardy space $H^1$. We give a counterexample to show that the Toeplitz operator with symbol is not Fredholm, which gives a counterexample to the conjecture by J. A. Virtanen in 2006.

    Submitted 11 March, 2025; originally announced March 2025.

  33. arXiv:2503.08007  [pdf, other]

    cs.RO cs.AI

    MoRE: Unlocking Scalability in Reinforcement Learning for Quadruped Vision-Language-Action Models

    Authors: Han Zhao, Wenxuan Song, Donglin Wang, Xinyang Tong, Pengxiang Ding, Xuelian Cheng, Zongyuan Ge

    Abstract: Developing versatile quadruped robots that can smoothly perform various actions and tasks in real-world environments remains a significant challenge. This paper introduces a novel vision-language-action (VLA) model, mixture of robotic experts (MoRE), for quadruped robots that aim to introduce reinforcement learning (RL) for fine-tuning large-scale VLA models with a large amount of mixed-quality da…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: Accepted by ICRA 2025

  34. arXiv:2503.07253  [pdf, other]

    cs.CV

    AnomalyPainter: Vision-Language-Diffusion Synergy for Zero-Shot Realistic and Diverse Industrial Anomaly Synthesis

    Authors: Zhangyu Lai, Yilin Lu, Xinyang Li, Jianghang Lin, Yansong Qu, Liujuan Cao, Ming Li, Rongrong Ji

    Abstract: While existing anomaly synthesis methods have made remarkable progress, achieving both realism and diversity in synthesis remains a major obstacle. To address this, we propose AnomalyPainter, a zero-shot framework that breaks the diversity-realism trade-off dilemma through synergizing Vision Language Large Model (VLLM), Latent Diffusion Model (LDM), and our newly introduced texture library Tex-9K.…

    Submitted 11 March, 2025; v1 submitted 10 March, 2025; originally announced March 2025.

    Comments: anomaly synthesis, anomaly detection

  35. arXiv:2503.06084  [pdf, other]

    cs.CV

    Exploring Interpretability for Visual Prompt Tuning with Hierarchical Concepts

    Authors: Yubin Wang, Xinyang Jiang, De Cheng, Xiangqian Zhao, Zilong Wang, Dongsheng Li, Cairong Zhao

    Abstract: Visual prompt tuning offers significant advantages for adapting pre-trained visual foundation models to specific tasks. However, current research provides limited insight into the interpretability of this approach, which is essential for enhancing AI reliability and enabling AI-driven knowledge discovery. In this paper, rather than learning abstract prompt embeddings, we propose the first framewor…

    Submitted 8 March, 2025; originally announced March 2025.

    Comments: 10 pages, 9 figures

  36. arXiv:2503.05362  [pdf, other]

    cs.CL

    Chain of Strategy Optimization Makes Large Language Models Better Emotional Supporter

    Authors: Weixiang Zhao, Xingyu Sui, Xinyang Han, Yang Deng, Yulin Hu, Jiahe Guo, Libo Qin, Qianyun Du, Shijin Wang, Yanyan Zhao, Bing Qin, Ting Liu

    Abstract: The growing emotional stress in modern society has increased the demand for Emotional Support Conversations (ESC). While Large Language Models (LLMs) show promise for ESC, they face two key challenges: (1) low strategy selection accuracy, and (2) preference bias, limiting their adaptability to emotional needs of users. Existing supervised fine-tuning (SFT) struggles to address these issues, as it…

    Submitted 7 March, 2025; originally announced March 2025.

    Comments: 19 pages, 9 figures, 15 tables

  37. arXiv:2503.03179  [pdf, other]

    hep-ph

    Probing the couplings of an axion-like particle with leptons via three-lepton final state processes at future $e^{-}p$ colliders

    Authors: Chong-Xing Yue, Xin-Yang Li, Mei-Shu-Yu Wang, Yang-Yang Bu

    Abstract: The axion-like particle (ALP) is one of the best motivated particles beyond the Standard Model (SM). We explore the possibility of detecting the couplings of ALP with leptons via three-lepton final state processes $e^- p \to e^- j a~(a \to \ell^+ \ell^-)$ at the LHeC (FCC-eh). For completeness, we investigate the cases where the ALP decays not only into electron and muon pairs but also into tau pa…

    Submitted 2 April, 2025; v1 submitted 4 March, 2025; originally announced March 2025.

    Comments: 32 pages, 16 figures, 12 tables, accepted for publication in PRD

  38. arXiv:2503.01776  [pdf, other]

    cs.LG cs.AI cs.CV cs.IR

    Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation

    Authors: Tiansheng Wen, Yifei Wang, Zequn Zeng, Zhong Peng, Yudi Su, Xinyang Liu, Bo Chen, Hongwei Liu, Stefanie Jegelka, Chenyu You

    Abstract: Many large-scale systems rely on high-quality deep representations (embeddings) to facilitate tasks like retrieval, search, and generative modeling. Matryoshka Representation Learning (MRL) recently emerged as a solution for adaptive embedding lengths, but it requires full model retraining and suffers from noticeable performance degradations at short lengths. In this paper, we show that sparse cod…

    Submitted 19 May, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

    Comments: Accepted by ICML 2025

  39. arXiv:2503.00881  [pdf, other]

    cs.CV cs.AI

    Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization

    Authors: You Shen, Zhipeng Zhang, Xinyang Li, Yansong Qu, Yu Lin, Shengchuan Zhang, Liujuan Cao

    Abstract: Representing 3D scenes from multiview images is a core challenge in computer vision and graphics, which requires both precise rendering and accurate reconstruction. Recently, 3D Gaussian Splatting (3DGS) has garnered significant attention for its high-quality rendering and fast inference speed. Yet, due to the unstructured and irregular nature of Gaussian point clouds, ensuring accurate geometry r…

    Submitted 2 March, 2025; originally announced March 2025.

  40. arXiv:2502.20968  [pdf, other]

    cs.CL

    Beware of Your Po! Measuring and Mitigating AI Safety Risks in Role-Play Fine-Tuning of LLMs

    Authors: Weixiang Zhao, Yulin Hu, Yang Deng, Jiahe Guo, Xingyu Sui, Xinyang Han, An Zhang, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu

    Abstract: Role-playing enables large language models (LLMs) to engage users in immersive and personalized interactions, but it also introduces significant safety risks. Existing role-play fine-tuning techniques improve role adaptability but may degrade safety performance, particularly for villainous characters. In this work, we conduct the first comprehensive assessment of role-play fine-tuning risks by tra…

    Submitted 27 May, 2025; v1 submitted 28 February, 2025; originally announced February 2025.

    Comments: To appear at ACL 2025 (Main)

  41. arXiv:2502.18915   

    cs.CL cs.AI

    END: Early Noise Dropping for Efficient and Effective Context Denoising

    Authors: Hongye Jin, Pei Chen, Jingfeng Yang, Zhengyang Wang, Meng Jiang, Yifan Gao, Binxuan Huang, Xinyang Zhang, Zheng Li, Tianyi Liu, Huasheng Li, Bing Yin

    Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of natural language processing tasks. However, they are often distracted by irrelevant or noisy context in input sequences that degrades output quality. This problem affects both long- and short-context scenarios, such as retrieval-augmented generation, table question-answering, and in-context learning. We re…

    Submitted 25 March, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

    Comments: arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission

  42. arXiv:2502.18808  [pdf, other]

    cs.LG stat.ML

    Optimal Stochastic Trace Estimation in Generative Modeling

    Authors: Xinyang Liu, Hengrong Du, Wei Deng, Ruqi Zhang

    Abstract: Hutchinson estimators are widely employed in training divergence-based likelihoods for diffusion models to ensure optimal transport (OT) properties. However, this estimator often suffers from high variance and scalability concerns. To address these challenges, we investigate Hutch++, an optimal stochastic trace estimator for generative models, designed to minimize training variance while maintaini…

    Submitted 25 February, 2025; originally announced February 2025.

    Comments: Accepted by AISTATS 2025

  43. arXiv:2502.14795  [pdf, other]

    cs.RO cs.CV

    Humanoid-VLA: Towards Universal Humanoid Control with Visual Integration

    Authors: Pengxiang Ding, Jianfei Ma, Xinyang Tong, Binghong Zou, Xinxin Luo, Yiguo Fan, Ting Wang, Hongchao Lu, Panzhong Mo, Jinxin Liu, Yuefan Wang, Huaicheng Zhou, Wenshuo Feng, Jiacheng Liu, Siteng Huang, Donglin Wang

    Abstract: This paper addresses the limitations of current humanoid robot control frameworks, which primarily rely on reactive mechanisms and lack autonomous interaction capabilities due to data scarcity. We propose Humanoid-VLA, a novel framework that integrates language understanding, egocentric scene perception, and motion control, enabling universal humanoid control. Humanoid-VLA begins with language-mot…

    Submitted 21 February, 2025; v1 submitted 20 February, 2025; originally announced February 2025.

  44. arXiv:2502.12151  [pdf, other]

    cs.CV eess.SY

    VoLUT: Efficient Volumetric streaming enhanced by LUT-based super-resolution

    Authors: Chendong Wang, Anlan Zhang, Yifan Yang, Lili Qiu, Yuqing Yang, Xinyang Jiang, Feng Qian, Suman Banerjee

    Abstract: 3D volumetric video provides immersive experience and is gaining traction in digital media. Despite its rising popularity, the streaming of volumetric video content poses significant challenges due to the high data bandwidth requirement. A natural approach to mitigate the bandwidth issue is to reduce the volumetric video's data rate by downsampling the content prior to transmission. The video can…

    Submitted 17 February, 2025; originally announced February 2025.

  45. arXiv:2502.09473  [pdf, other]

    cs.LG eess.SP

    Learning to Predict Global Atrial Fibrillation Dynamics from Sparse Measurements

    Authors: Alexander Jenkins, Andrea Cini, Joseph Barker, Alexander Sharp, Arunashis Sau, Varun Valentine, Srushti Valasang, Xinyang Li, Tom Wong, Timothy Betts, Danilo Mandic, Cesare Alippi, Fu Siong Ng

    Abstract: Catheter ablation of Atrial Fibrillation (AF) consists of a one-size-fits-all treatment with limited success in persistent AF. This may be due to our inability to map the dynamics of AF with the limited resolution and coverage provided by sequential contact mapping catheters, preventing effective patient phenotyping for personalised, targeted ablation. Here we introduce FibMap, a graph recurrent n…

    Submitted 14 February, 2025; v1 submitted 13 February, 2025; originally announced February 2025.

    Comments: Under review

  46. arXiv:2502.06195  [pdf, other]

    cs.SD cs.RO

    Calibration of Multiple Asynchronous Microphone Arrays using Hybrid TDOA

    Authors: Chengjie Zhang, Wenda Pan, Xinyang Han, He Kong

    Abstract: Accurate calibration of acoustic sensing systems made of multiple asynchronous microphone arrays is essential for satisfactory performance in sound source localization and tracking. State-of-the-art calibration methods for this type of system rely on the time difference of arrival and direction of arrival measurements among the microphone arrays (denoted as TDOA-M and DOA, respectively). In this p…

    Submitted 10 February, 2025; originally announced February 2025.

    Comments: This paper was accepted and is going to be presented at ICASSP 2025

  47. arXiv:2501.19051  [pdf, other]

    cs.NI

    Swift: Rethinking RDMA Control Plane for Elastic Computing

    Authors: Junxue Zhang, Han Tian, Xinyang Huang, Wenxue Li, Kaiqiang Xu, Dian Shen, Yong Wang, Kai Chen

    Abstract: Elastic computing enables dynamic scaling to meet workload demands, and Remote Direct Memory Access (RDMA) enhances this by providing high-throughput, low-latency network communication. However, integrating RDMA into elastic computing remains a challenge, particularly in control plane operations for RDMA connection setup. This paper revisits the assumptions of prior work on high-performance RDMA…

    Submitted 31 January, 2025; originally announced January 2025.

  48. arXiv:2501.18672  [pdf, other]

    cs.GR cs.CV

    Drag Your Gaussian: Effective Drag-Based Editing with Score Distillation for 3D Gaussian Splatting

    Authors: Yansong Qu, Dian Chen, Xinyang Li, Xiaofan Li, Shengchuan Zhang, Liujuan Cao, Rongrong Ji

    Abstract: Recent advancements in 3D scene editing have been propelled by the rapid development of generative models. Existing methods typically utilize generative models to perform text-guided editing on 3D representations, such as 3D Gaussian Splatting (3DGS). However, these methods are often limited to texture modifications and fail when addressing geometric changes, such as editing a character's head to…

    Submitted 25 May, 2025; v1 submitted 30 January, 2025; originally announced January 2025.

    Comments: Visit our project page at https://quyans.github.io/Drag-Your-Gaussian

  49. arXiv:2501.15373  [pdf, other]

    eess.SY cs.AI cs.LG math.OC nlin.AO

    Learning-Enhanced Safeguard Control for High-Relative-Degree Systems: Robust Optimization under Disturbances and Faults

    Authors: Xinyang Wang, Hongwei Zhang, Shimin Wang, Wei Xiao, Martin Guay

    Abstract: Merely pursuing performance may adversely affect the safety, while a conservative policy for safe exploration will degrade the performance. How to balance the safety and performance in learning-based control problems is an interesting yet challenging issue. This paper aims to enhance system performance with safety guarantee in solving the reinforcement learning (RL)-based optimal control problems…

    Submitted 25 January, 2025; originally announced January 2025.

    Comments: 16 pages, 6 figures

  50. arXiv:2501.11687  [pdf, ps, other]

    eess.SP

    SE(3)-Based Trajectory Optimization and Target Tracking in UAV-Enabled ISAC Systems

    Authors: Dongxiao Xu, Xinyang Li, Vlad C. Andrei, Moritz Wiese, Ullrich J. Moenich, Holger Boche

    Abstract: This paper presents a novel approach to enhance sensing capabilities in UAV-enabled MIMO-OFDM ISAC systems by leveraging UAV mobility as a mono-static radar. By integrating uniform planar arrays (UPAs) and modeling the UAV dynamics in $SE(3)$, we address key challenges such as 3D space sensing and trajectory design. We propose a target tracking scheme using extended Kalman filtering (EKF) in…

    Submitted 29 April, 2025; v1 submitted 20 January, 2025; originally announced January 2025.

    Comments: Accepted by IEEE International Symposium on Information Theory 2025