Search | arXiv e-print repository

Hysteresis-Aware Neural Network Modeling and Whole-Body Reinforcement Learning Control of Soft Robots

Authors: Zongyuan Chen, Yan Xia, Jiayuan Liu, Jijia Liu, Wenhao Tang, Jiayu Chen, Feng Gao, Longfei Ma, Hongen Liao, Yu Wang, Chao Yu, Boyu Zhang, Fei Xing

Abstract: Soft robots exhibit inherent compliance and safety, which makes them particularly suitable for applications requiring direct physical interaction with humans, such as surgical procedures. However, their nonlinear and hysteretic behavior, resulting from the properties of soft materials, presents substantial challenges for accurate modeling and control. In this study, we present a soft robotic system designed for surgical applications and propose a hysteresis-aware whole-body neural network model that accurately captures and predicts the soft robot's whole-body motion, including its hysteretic behavior. Building upon the high-precision dynamic model, we construct a highly parallel simulation environment for soft robot control and apply an on-policy reinforcement learning algorithm to efficiently train whole-body motion control strategies. Based on the trained control policy, we developed a soft robotic system for surgical applications and validated it through phantom-based laser ablation experiments in a physical environment. The results demonstrate that the hysteresis-aware modeling reduces the Mean Squared Error (MSE) by 84.95 percent compared to traditional modeling methods. The deployed control algorithm achieved a trajectory tracking error ranging from 0.126 to 0.250 mm on the real soft robot, highlighting its precision in real-world conditions. The proposed method showed strong performance in phantom-based surgical experiments and demonstrates its potential for complex scenarios, including future real-world clinical applications.

Submitted 18 April, 2025; originally announced April 2025.

arXiv:2504.13539 [pdf, other]

Search for $1^{-+}$ charmonium-like hybrid via $e^{+}e^{-}\rightarrow γη^{(\prime)} η_{c}$ at center-of-mass energies between 4.258 and 4.681 GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai , et al. (696 additional authors not shown)

Abstract: Using $e^{+}e^{-}$ collision data corresponding to an integrated luminosity of 10.6 fb$^{-1}$ collected at center-of-mass energies between 4.258 and 4.681 GeV with the BESIII detector at the BEPCII collider, we search for the $1^{- +}$ charmonium-like hybrid via $e^{+}e^{-}\rightarrowγηη_{c}$ and $e^{+}e^{-}\rightarrowγη^{\prime}η_{c}$ decays for the first time. No significant signal is observed and the upper limits on the Born cross sections for both processes are set at the 90% confidence level.

Submitted 18 April, 2025; originally announced April 2025.

arXiv:2504.13437 [pdf]

Chirality-induced quantum nonreciprocity

Authors: Zimo Zhang, Zhongxiao Xu, Ran Huang, Xingda Lu, Fengbo Zhang, Donghao Li, Şahin K. Özdemir, Franco Nori, Han Bao, Yanhong Xiao, Bing Chen, Hui Jing, Heng Shen

Abstract: Chirality, nonreciprocity, and quantum correlations are at the center of a wide range of intriguing effects and applications across natural sciences and emerging quantum technologies. However, the direct link combining these three essential concepts has remained unknown until now. Here, we establish a chiral non-Hermitian platform with flying atoms and demonstrate chirality-induced nonreciprocal bipartite quantum correlations between two channels: quantum correlation emerges when two spatially separated light beams of the same polarization propagate in opposite directions in the atomic cloud, and it becomes zero when they travel in the same direction. Thus, just by flipping the propagation direction of one of the beams while keeping its polarization the same as the other beam, we can create or annihilate quantum correlations between two channels. We also show that this nonreciprocal quantum correlation can be extended to multi-color sidebands with Floquet engineering. Our findings may pave the way for realizing one-way quantum effects, such as nonreciprocal squeezing or entanglement, with a variety of chiral devices, for emerging applications such as directional quantum networks or nonreciprocal quantum metrology.

Submitted 21 April, 2025; v1 submitted 17 April, 2025; originally announced April 2025.

Comments: To be published

arXiv:2504.13267 [pdf, other]

Leveraging Functional Encryption and Deep Learning for Privacy-Preserving Traffic Forecasting

Authors: Isaac Adom, Mohammmad Iqbal Hossain, Hassan Mahmoud, Ahmad Alsharif, Mahmoud Nabil Mahmoud, Yang Xiao

Abstract: Over the past few years, traffic congestion has continuously plagued the nation's transportation system creating several negative impacts including longer travel times, increased pollution rates, and higher collision risks. To overcome these challenges, Intelligent Transportation Systems (ITS) aim to improve mobility and vehicular systems, ensuring higher levels of safety by utilizing cutting-edge technologies, sophisticated sensing capabilities, and innovative algorithms. Drivers' participatory sensing, current/future location reporting, and machine learning algorithms have considerably improved real-time congestion monitoring and future traffic management. However, each driver's sensitive spatiotemporal location information can create serious privacy concerns. To address these challenges, we propose in this paper a secure, privacy-preserving location reporting and traffic forecasting system that guarantees privacy protection of driver data while maintaining high traffic forecasting accuracy. Our novel k-anonymity scheme utilizes functional encryption to aggregate encrypted location information submitted by drivers while ensuring the privacy of driver location data. Additionally, using the aggregated encrypted location information as input, this research proposes a deep learning model that incorporates a Convolutional-Long Short-Term Memory (Conv-LSTM) module to capture spatial and short-term temporal features and a Bidirectional Long Short-Term Memory (Bi-LSTM) module to recover long-term periodic patterns for traffic forecasting. With extensive evaluation on real datasets, we demonstrate the effectiveness of the proposed scheme with less than 10% mean absolute error for a 60-minute forecasting horizon, all while protecting driver privacy.

Submitted 17 April, 2025; originally announced April 2025.

Comments: 17 pages, 14 Figures, Journal Publication

arXiv:2504.13044 [pdf, other]

The Dissipation Theory of Aging: A Quantitative Analysis Using a Cellular Aging Map

Authors: Farhan Khodaee, Rohola Zandie, Yufan Xia, Elazer R. Edelman

Abstract: We propose a new theory for aging based on dynamical systems and provide a data-driven computational method to quantify the changes at the cellular level. We use ergodic theory to decompose the dynamics of changes during aging and show that aging is fundamentally a dissipative process within biological systems, akin to dynamical systems where dissipation occurs due to non-conservative forces. To quantify the dissipation dynamics, we employ a transformer-based machine learning algorithm to analyze gene expression data, incorporating age as a token to assess how age-related dissipation is reflected in the embedding space. By evaluating the dynamics of gene and age embeddings, we provide a cellular aging map (CAM) and identify patterns indicative of divergence in gene embedding space, nonlinear transitions, and entropy variations during aging for various tissues and cell types. Our results provide a novel perspective on aging as a dissipative process and introduce a computational framework that enables measuring age-related changes with molecular resolution.

Submitted 17 April, 2025; originally announced April 2025.

arXiv:2504.12824 [pdf, other]

Mixed Structural Choice Operator: Enhancing Technology Mapping with Heterogeneous Representations

Authors: Zhang Hu, Hongyang Pan, Yinshui Xia, Lunyao Wang, Zhufei Chu

Abstract: The independence of logic optimization and technology mapping poses a significant challenge in achieving high-quality synthesis results. Recent studies have improved optimization outcomes through collaborative optimization of multiple logic representations and have improved structural bias through structural choices. However, these methods still rely on technology-independent optimization and fail to truly resolve structural bias issues. This paper proposes a scalable and efficient framework based on Mixed Structural Choices (MCH). This is a novel heterogeneous mapping method that combines multiple logic representations with technology-aware optimization. MCH flexibly integrates different logic representations and stores candidates for various optimization strategies. By comprehensively evaluating the technology costs of these candidates, it enhances technology mapping and addresses structural bias issues in logic synthesis. Notably, the MCH-based lookup table (LUT) mapping algorithm set new records in the EPFL Best Results Challenge by combining the structural strengths of both And-Inverter Graph (AIG) and XOR-Majority Graph (XMG) logic representations. Additionally, MCH-based ASIC technology mapping achieves a 3.73% area and 8.94% delay reduction (balanced), 20.35% delay reduction (delay-oriented), and 21.02% area reduction (area-oriented), outperforming traditional structural choice methods. Furthermore, MCH-based logic optimization utilizes diverse structures to surpass local optima and achieve better results.

Submitted 17 April, 2025; originally announced April 2025.

Comments: Accepted by DAC 2025. Please note that this is not the final camera-ready version

arXiv:2504.12345 [pdf, other]

Reimagining Urban Science: Scaling Causal Inference with Large Language Models

Authors: Yutong Xia, Ao Qu, Yunhan Zheng, Yihong Tang, Dingyi Zhuang, Yuxuan Liang, Shenhao Wang, Cathy Wu, Lijun Sun, Roger Zimmermann, Jinhua Zhao

Abstract: Urban causal research is essential for understanding the complex dynamics of cities and informing evidence-based policies. However, it is challenged by the inefficiency and bias of hypothesis generation, barriers to multimodal data complexity, and the methodological fragility of causal experimentation. Recent advances in large language models (LLMs) present an opportunity to rethink how urban causal analysis is conducted. This Perspective examines current urban causal research by analyzing taxonomies that categorize research topics, data sources, and methodological approaches to identify structural gaps. We then introduce an LLM-driven conceptual framework, AutoUrbanCI, composed of four distinct modular agents responsible for hypothesis generation, data engineering, experiment design and execution, and results interpretation with policy recommendations. We propose evaluation criteria for rigor and transparency and reflect on implications for human-AI collaboration, equity, and accountability. We call for a new research agenda that embraces AI-augmented workflows not as replacements for human expertise but as tools to broaden participation, improve reproducibility, and unlock more inclusive forms of urban causal reasoning.

Submitted 9 May, 2025; v1 submitted 15 April, 2025; originally announced April 2025.

arXiv:2504.12341 [pdf, other]

Streamlining Biomedical Research with Specialized LLMs

Authors: Linqing Chen, Weilei Wang, Yubin Xia, Wentao Wu, Peng Xu, Zilong Bai, Jie Fang, Chaobo Xu, Ran Hu, Licong Xu, Haoran Hua, Jing Sun, Hanmeng Zhong, Jin Liu, Tian Qiu, Haowen Liu, Meng Hu, Xiuwen Li, Fei Gao, Yong Gu, Tao Shi, Chaochao Wang, Jianping Lu, Cheng Sun, Yixin Wang , et al. (8 additional authors not shown)

Abstract: In this paper, we propose a novel system that integrates state-of-the-art, domain-specific large language models with advanced information retrieval techniques to deliver comprehensive and context-aware responses. Our approach facilitates seamless interaction among diverse components, enabling cross-validation of outputs to produce accurate, high-quality responses enriched with relevant data, images, tables, and other modalities. We demonstrate the system's capability to enhance response precision by leveraging a robust question-answering model, significantly improving the quality of dialogue generation. The system provides an accessible platform for real-time, high-fidelity interactions, allowing users to benefit from efficient human-computer interaction, precise retrieval, and simultaneous access to a wide range of literature and data. This dramatically improves the research efficiency of professionals in the biomedical and pharmaceutical domains and facilitates faster, more informed decision-making throughout the R&D process. Furthermore, the system proposed in this paper is available at https://synapse-chat.patsnap.com.

Submitted 15 April, 2025; originally announced April 2025.

Journal ref: Proceedings of the 31st International Conference on Computational Linguistics: System Demonstrations, pp. 9-19, 2025

arXiv:2504.12285 [pdf, other]

BitNet b1.58 2B4T Technical Report

Authors: Shuming Ma, Hongyu Wang, Shaohan Huang, Xingxing Zhang, Ying Hu, Ting Song, Yan Xia, Furu Wei

Abstract: We introduce BitNet b1.58 2B4T, the first open-source, native 1-bit Large Language Model (LLM) at the 2-billion parameter scale. Trained on a corpus of 4 trillion tokens, the model has been rigorously evaluated across benchmarks covering language understanding, mathematical reasoning, coding proficiency, and conversational ability. Our results demonstrate that BitNet b1.58 2B4T achieves performance on par with leading open-weight, full-precision LLMs of similar size, while offering significant advantages in computational efficiency, including substantially reduced memory footprint, energy consumption, and decoding latency. To facilitate further research and adoption, the model weights are released via Hugging Face along with open-source inference implementations for both GPU and CPU architectures.

Submitted 24 April, 2025; v1 submitted 16 April, 2025; originally announced April 2025.

Comments: Work in progress

arXiv:2504.12194 [pdf, ps, other]

The Optimal Condition Number for ReLU Function

Authors: Yu Xia, Haoyu Zhou

Abstract: ReLU is a widely used activation function in deep neural networks. This paper explores the stability properties of the ReLU map. For any weight matrix $\boldsymbol{A} \in \mathbb{R}^{m \times n}$ and bias vector $\boldsymbol{b} \in \mathbb{R}^{m}$ at a given layer, we define the condition number $β_{\boldsymbol{A},\boldsymbol{b}}$ as $β_{\boldsymbol{A},\boldsymbol{b}} = \frac{\mathcal{U}_{\boldsymbol{A},\boldsymbol{b}}}{\mathcal{L}_{\boldsymbol{A},\boldsymbol{b}}}$, where $\mathcal{U}_{\boldsymbol{A},\boldsymbol{b}}$ and $\mathcal{L}_{\boldsymbol{A},\boldsymbol{b}}$ are the upper and lower Lipschitz constants, respectively. We first demonstrate that for any given $\boldsymbol{A}$ and $\boldsymbol{b}$, the condition number satisfies $β_{\boldsymbol{A},\boldsymbol{b}} \geq \sqrt{2}$. Moreover, when the weights of the network at a given layer are initialized as random i.i.d. Gaussian variables and the bias term is set to zero, the condition number asymptotically approaches this lower bound. This theoretical finding suggests that Gaussian weight initialization is optimal for preserving distances in the context of random deep neural network weights.

Submitted 16 April, 2025; originally announced April 2025.

Comments: 29 pages

arXiv:2504.12167 [pdf, other]

RADLER: Radar Object Detection Leveraging Semantic 3D City Models and Self-Supervised Radar-Image Learning

Authors: Yuan Luo, Rudolf Hoffmann, Yan Xia, Olaf Wysocki, Benedikt Schwab, Thomas H. Kolbe, Daniel Cremers

Abstract: Semantic 3D city models are easily accessible worldwide, providing accurate, object-oriented, and semantic-rich 3D priors. To date, their potential to mitigate the noise impact on radar object detection remains under-explored. In this paper, we first introduce a unique dataset, RadarCity, comprising 54K synchronized radar-image pairs and semantic 3D city models. Moreover, we propose a novel neural network, RADLER, leveraging the effectiveness of contrastive self-supervised learning (SSL) and semantic 3D city models to enhance radar object detection of pedestrians, cyclists, and cars. Specifically, we first obtain the robust radar features via an SSL network in the radar-image pretext task. We then use a simple yet effective feature fusion strategy to incorporate semantic-depth features from semantic 3D city models. Having prior 3D information as guidance, RADLER obtains more fine-grained details to enhance radar object detection. We extensively evaluate RADLER on the collected RadarCity dataset and demonstrate average improvements of 5.46% in mean average precision (mAP) and 3.51% in mean average recall (mAR) over previous radar object detection methods. We believe this work will foster further research on semantic-guided and map-supported radar object detection. Our project page is publicly available at https://gpp-communication.github.io/RADLER.

Submitted 16 April, 2025; originally announced April 2025.

Comments: Accepted for CVPRW '25 (PBVS 2025 - the Perception Beyond the Visible Spectrum)

arXiv:2504.11637 [pdf, other]

DamageCAT: A Deep Learning Transformer Framework for Typology-Based Post-Disaster Building Damage Categorization

Authors: Yiming Xiao, Ali Mostafavi

Abstract: Natural disasters increasingly threaten communities worldwide, creating an urgent need for rapid, reliable building damage assessment to guide emergency response and recovery efforts. Current methods typically classify damage in binary (damaged/undamaged) or ordinal severity terms, limiting their practical utility. In fact, the determination of damage typology is crucial for response and recovery efforts. To address this important gap, this paper introduces DamageCAT, a novel framework that provides typology-based categorical damage descriptions rather than simple severity ratings. Accordingly, this study presents two key contributions: (1) the BD-TypoSAT dataset containing satellite image triplets (pre-disaster, post-disaster, and damage masks) from Hurricane Ida with four damage categories (partial roof damage, total roof damage, partial structural collapse, and total structural collapse), and (2) a hierarchical U-Net-based transformer architecture that effectively processes pre-post disaster image pairs to identify and categorize building damage. Despite significant class imbalances in the training data, our model achieved robust performance with overall metrics of 0.7921 Intersection over Union (IoU) and 0.8835 F1 scores across all categories. The model's capability to recognize intricate damage typology in less common categories is especially remarkable. The DamageCAT framework advances automated damage assessment by providing actionable, typological information that better supports disaster response decision-making and resource allocation compared to traditional severity-based approaches.

Submitted 15 April, 2025; originally announced April 2025.

Comments: 23 pages, 6 figures

arXiv:2504.10867 [pdf, other]

Precise measurement of the form factors in $D^0\rightarrow K^*(892)^-μ^+ν_μ$ and test of lepton universality with $D^0\rightarrow K^*(892)^-\ell^+ν_{\ell}$ decays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai , et al. (696 additional authors not shown)

Abstract: We report a study of the semileptonic decay $D^0 \rightarrow \bar{K}^0π^-μ^+ν_μ$ based on a sample of $7.9~\mathrm{fb}^{-1}$ of $e^+e^-$ annihilation data collected at a center-of-mass energy of 3.773~GeV with the BESIII detector at the BEPCII collider. The branching fraction of the decay is measured for the first time to be $\mathcal{B}(D^0\rightarrow \bar{K}^0π^-μ^+ν_μ) = (1.373 \pm 0.020_{\rm stat} \pm 0.023_{\rm syst})\%$, where the first uncertainty is statistical and the second is systematic. Based on the investigation of the decay dynamics, we find that the decay is dominated by the $K^{*}(892)^-$ resonance with the branching fraction measured to be $\mathcal{B}(D^0\rightarrow K^{*}(892)^-μ^+ν_μ) = (1.948 \pm 0.033_{\rm stat} \pm 0.036_{\rm syst})\%$. We also determine the hadronic form factors for the $D^0\rightarrow K^{*}(892)^-μ^+ν_μ$ decay to be $r_{V} = V(0)/A_1(0) = 1.46 \pm 0.11_{\rm stat} \pm 0.04_{\rm syst}$, $r_{2} = A_2(0)/A_1(0) = 0.71 \pm 0.08_{\rm stat} \pm 0.03_{\rm syst}$, and $A_1(0)=0.609 \pm 0.008_{\rm stat} \pm 0.008_{\rm syst}$, where $V(0)$ is the vector form factor and $A_{1,2}(0)$ are the axial form factors evaluated at $q^2=0$. The $A_1(0)$ is measured for the first time in $D^0\rightarrow K^{*}(892)^-μ^+ν_μ$ decay. Averaging the form-factor parameters that we reported previously in $D^0\rightarrow K^*(892)^-(\rightarrow \bar{K}^0π^-)e^+ν_{e}$ and $D^0\rightarrow K^*(892)^-(\rightarrow K^-π^0)μ^+ν_μ$ decays, we obtain $r_{V}=1.456\pm0.040_{\rm stat}\pm0.016_{\rm syst}$, $r_{2}=0.715\pm0.031_{\rm stat}\pm0.014_{\rm syst}$, and $A_1(0)=0.614\pm0.005_{\rm stat}\pm0.004_{\rm syst}$. This is the most precise determination to date of the form-factor parameters measured in the $D\rightarrow K^*(892)$ transition, which provides the most stringent test of various theoretical models.

Submitted 15 April, 2025; originally announced April 2025.

Comments: 9 pages, 4 figures

arXiv:2504.10806 [pdf, other]

ACSNet: A Deep Neural Network for Compound GNSS Jamming Signal Classification

Authors: Min Jiang, Ziqiang Ye, Yue Xiao, Yulan Gao, Ming Xiao, Dusit Niyato

Abstract: In the global navigation satellite system (GNSS), identifying not only single but also compound jamming signals is crucial for ensuring reliable navigation and positioning, particularly in future wireless communication scenarios such as the space-air-ground integrated network (SAGIN). However, conventional techniques often struggle with low recognition accuracy and high computational complexity, especially under low jamming-to-noise ratio (JNR) conditions. To overcome the challenge of accurately identifying compound jamming signals embedded within GNSS signals, we propose ACSNet, a novel convolutional neural network designed specifically for this purpose. Unlike traditional methods that tend to exhibit lower accuracy and higher computational demands, particularly in low JNR environments, ACSNet addresses these issues by integrating asymmetric convolution blocks, which enhance its sensitivity to subtle signal variations. Simulations demonstrate that ACSNet significantly improves accuracy in low JNR regions and shows robust resilience to power ratio (PR) variations, confirming its effectiveness and efficiency for practical GNSS interference management applications.

Submitted 14 April, 2025; originally announced April 2025.

arXiv:2504.10218 [pdf]

A Novel Quantum Fourier Ordinary Differential Equation Solver for Solving Linear and Nonlinear Partial Differential Equations

Authors: Yang Xiao, Liming Yang, Chang Shu, Yinjie Du, Yuxin Song

Abstract: In this work, a novel quantum Fourier ordinary differential equation (ODE) solver is proposed to solve both linear and nonlinear partial differential equations (PDEs). Traditional quantum ODE solvers transform a PDE into an ODE system via spatial discretization and then integrate it, thereby converting the task of solving the PDE into computing the integral for the driving function $f(x)$. These solvers rely on the quantum amplitude estimation algorithm, which requires the driving function $f(x)$ to be within the range of [0, 1] and necessitates the construction of a quantum circuit for the oracle R that encodes $f(x)$. This construction can be highly complex, even for simple functions like $f(x) = x$. An important exception arises for the specific case of $f(x) = \sin^2(mx+c)$, which can be encoded more efficiently using a set of $R_y$ rotation gates. To address these challenges, we expand the driving function $f(x)$ as a Fourier series and propose the Quantum Fourier ODE Solver. This approach not only simplifies the construction of the oracle R but also removes the restriction that $f(x)$ must lie within [0, 1]. The proposed method was evaluated by solving several representative linear and nonlinear PDEs, including the Navier-Stokes (N-S) equations. The results show that the quantum Fourier ODE solver produces results that closely match both analytical and reference solutions.

Submitted 14 April, 2025; originally announced April 2025.

Comments: 37 pages, 13 figures

arXiv:2504.09964 [pdf, other]

Characterizing the Palomar 5 Stream: HDBSCAN Analysis and Galactic Halo Constraints

Authors: Yun-Ao Xiao, Hu Zou, Lu Feng, Wei-Jian Guo, Niu Li, Wen-Xiong Li, Shu-Fei Liu, Gaurav Singh, Ji-Peng Sui, Jia-Li Wang, Sui-Jian Xue

Abstract: We utilize the DESI Legacy Imaging Surveys DR10 to investigate the previously undetected faint extension of the Palomar 5 stellar stream. By applying the HDBSCAN clustering algorithm, we identify stream members and successfully extend the leading arm of the stream to approximately $\mathrm{DEC} \sim -15^\circ$. Combining the fully detected stream with a suite of mock stream simulations, we conduct a detailed comparison to constrain both the intrinsic properties of the stream and the dynamical parameters of the Milky Way (MW) halo. Our analysis yields a best-fit model characterized by eight parameters: $M_{\mathrm{halo}} = 5.67\times10^{11}\ M_{\odot}$, $r_{s,\mathrm{halo}} = 28.94\ \mathrm{kpc}$, $q_z = 0.93$, $M_{\mathrm{gc}} = 4.31\times10^{3}\ M_{\odot}$, $dM_{\mathrm{gc}}/dt = 1.81\ M_{\odot}\ \mathrm{Myr}^{-1}$, $μ_α\cosδ= -2.28\ \mathrm{mas\ yr}^{-1}$, $μ_δ = -2.26\ \mathrm{mas\ yr}^{-1}$, and $D = 23.25\ \mathrm{kpc}$. Notably, our constraints on the halo shape indicate that the MW's dark matter halo exhibits a flattened potential, with a minor-to-major axis ratio of $q_z = 0.93$. This finding aligns well with theoretical expectations and previous observational estimates. Additionally, the best-fit model accurately reproduces the observed stream morphology and dynamics, providing a more precise understanding of both the evolution of the stream and the overall structure of the Galactic halo.

Submitted 14 April, 2025; originally announced April 2025.

Comments: Submitted to Research in Astronomy and Astrophysics (RAA). Comments welcome

arXiv:2504.09178 [pdf, other]

Hybrid Beamforming for RIS-Assisted Multiuser Fluid Antenna Systems

Authors: Jiangong Chen, Yue Xiao, Zhendong Peng, Jing Zhu, Xia Lei, Christos Masouros, Kai-Kit Wong

Abstract: Recent advances in reconfigurable antennas have led to the new concept of the fluid antenna system (FAS) for shape and position flexibility, as another degree of freedom for wireless communication enhancement. This paper explores the integration of a transmit FAS array for hybrid beamforming (HBF) into a reconfigurable intelligent surface (RIS)-assisted communication architecture for multiuser communications in the downlink, corresponding to the downlink RIS-assisted multiuser multiple-input single-output (MISO) FAS model (Tx RIS-assisted-MISO-FAS). By considering Rician channel fading, we formulate a sum-rate maximization optimization problem to alternately optimize the HBF matrix, the RIS phase-shift matrix, and the FAS position. Due to the strong coupling of multiple optimization variables, the multi-fractional summation in the sum-rate expression, the modulus-1 limitation of analog phase shifters and RIS, and the antenna position variables appearing in the exponent, this problem is highly non-convex, which is addressed through the block coordinate descent (BCD) framework in conjunction with semidefinite relaxation (SDR) and majorization-minimization (MM) methods. To reduce the computational complexity, we then propose a low-complexity grating-lobe (GL)-based telescopic-FAS (TFA) with multiple delicately deployed RISs under the sub-connected HBF architecture and the line-of-sight (LoS)-dominant channel condition, to allow closed-form solutions for the HBF and TFA position. Our simulation results illustrate that the former optimization scheme significantly enhances the achievable rate of the proposed system, while the GL-based TFA scheme also provides a considerable gain over conventional fixed-position antenna (FPA) systems, requiring statistical channel state information (CSI) only and with low computational complexity.

Submitted 12 April, 2025; originally announced April 2025.

arXiv:2504.09039 [pdf, other]

Sculpting Memory: Multi-Concept Forgetting in Diffusion Models via Dynamic Mask and Concept-Aware Optimization

Authors: Gen Li, Yang Xiao, Jie Ji, Kaiyuan Deng, Bo Hui, Linke Guo, Xiaolong Ma

Abstract: Text-to-image (T2I) diffusion models have achieved remarkable success in generating high-quality images from textual prompts. However, their ability to store vast amounts of knowledge raises concerns in scenarios where selective forgetting is necessary, such as removing copyrighted content, reducing biases, or eliminating harmful concepts. While existing unlearning methods can remove certain concepts, they struggle with multi-concept forgetting due to instability, residual knowledge persistence, and generation quality degradation. To address these challenges, we propose Dynamic Mask coupled with Concept-Aware Loss, a novel unlearning framework designed for multi-concept forgetting in diffusion models. Our Dynamic Mask mechanism adaptively updates gradient masks based on current optimization states, allowing selective weight modifications that prevent interference with unrelated knowledge. Additionally, our Concept-Aware Loss explicitly guides the unlearning process by enforcing semantic consistency through superclass alignment, while a regularization loss based on knowledge distillation ensures that previously unlearned concepts remain forgotten during sequential unlearning. We conduct extensive experiments to evaluate our approach. Results demonstrate that our method outperforms existing unlearning techniques in forgetting effectiveness, output fidelity, and semantic coherence, particularly in multi-concept scenarios. Our work provides a principled and flexible framework for stable and high-fidelity unlearning in generative models. The code will be released publicly.

Submitted 11 April, 2025; originally announced April 2025.

arXiv:2504.08096 [pdf, other]

Cellular Development Follows the Path of Minimum Action

Authors: Rohola Zandie, Farhan Khodaee, Yufan Xia, Elazer R. Edelman

Abstract: Cellular development follows a stochastic yet rule-governed trajectory, though the underlying principles remain elusive. Here, we propose that cellular development follows paths of least action, aligning with foundational physical laws that govern dynamic systems across nature. We introduce a computational framework that takes advantage of the deep connection between the principle of least action and maximum entropy to model developmental processes using the Transformer architecture. This approach enables precise quantification of entropy production, information flow curvature, and local irreversibility for developmental asymmetry in single-cell RNA sequence data. Within this unified framework, we provide interpretable metrics: entropy to capture exploration-exploitation trade-offs, curvature to assess plasticity-elasticity dynamics, and entropy production to characterize dedifferentiation and transdifferentiation. We validate our method across both single-cell and embryonic development datasets, demonstrating its ability to reveal hidden thermodynamic and informational constraints shaping cellular fate decisions.

Submitted 10 April, 2025; originally announced April 2025.

arXiv:2504.07817 [pdf, other]

Search for the baryon and lepton number violating decay $J/ψ\to pe^-$ + c.c

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (664 additional authors not shown)

Abstract: Based on $(2712.4\pm 14.3) \times 10^{6} $ ${ψ(3686)}$ events collected by the BESIII detector operating at the BEPCII storage ring, we perform a search for the baryon- and lepton-number violating decay $J/ψ\to pe^{-}+c.c.$ via $ψ(3686) \to π^{+}π^{-}J/ψ$. No significant signal is found. The upper limit on the branching fraction is determined to be $\mathcal{B}(J/ψ\to p e^{-}+ c.c.) < 3.1 \times 10^{-8}$ at the 90\% confidence level.

Submitted 10 April, 2025; originally announced April 2025.

Comments: 8 pages, 1 figure

arXiv:2504.07787 [pdf, other]

doi 10.1145/3728881

Fairness Mediator: Neutralize Stereotype Associations to Mitigate Bias in Large Language Models

Authors: Yisong Xiao, Aishan Liu, Siyuan Liang, Xianglong Liu, Dacheng Tao

Abstract: LLMs have demonstrated remarkable performance across diverse applications, yet they inadvertently absorb spurious correlations from training data, leading to stereotype associations between biased concepts and specific social groups. These associations perpetuate and even amplify harmful social biases, raising significant fairness concerns. To mitigate such biases, prior studies have attempted to project model embeddings into unbiased spaces during inference. However, these approaches have shown limited effectiveness due to their weak alignment with downstream social biases. Inspired by the observation that concept cognition in LLMs is primarily represented through a linear associative memory mechanism, where key-value mapping occurs in the MLP layers, we posited that biased concepts and social groups are similarly encoded as entity (key) and information (value) pairs, which can be manipulated to promote fairer associations. To this end, we propose Fairness Mediator (FairMed), a bias mitigation framework that neutralizes stereotype associations. Our framework comprises two main components: a stereotype association prober and an adversarial debiasing neutralizer. The prober captures stereotype associations encoded within MLP layer activations by employing prompts centered around biased concepts to detect the emission probabilities for social groups. Subsequently, the adversarial debiasing neutralizer intervenes in MLP activations during inference to equalize the association probabilities among different social groups. Extensive experiments across nine protected attributes show that FairMed significantly outperforms SOTA methods in effectiveness. Compared to the most effective baseline, FairMed presents competitive efficiency by cutting mitigation overhead by hundreds of minutes. FairMed also maintains the LLM's language understanding capabilities without compromising overall performance.

Submitted 10 April, 2025; originally announced April 2025.

Comments: Accepted by ISSTA 2025. 20 pages

arXiv:2504.07765 [pdf, ps, other]

Finite pattern problems related to Engel expansion

Authors: Chun-Yun Cao, Yang Xiao

Abstract: Let $\mathcal{F}$ be a countable collection of functions $f$ defined on the integers with integer values, such that for every $f\in \mathcal{F}$, $f(n)\to +\infty$ as $n\to +\infty$. This paper primarily investigates the Hausdorff dimension of the set of points whose digit sequences of the Engel expansion are strictly increasing and contain any finite pattern of $\mathcal{F}$, demonstrating applications with representative examples.

Submitted 10 April, 2025; originally announced April 2025.

Comments: 9 pages

MSC Class: 28A80; 11K55

arXiv:2504.07753 [pdf]

Virtual-mask Informed Prior for Sparse-view Dual-Energy CT Reconstruction

Authors: Zini Chen, Yao Xiao, Junyan Zhang, Shaoyu Wang, Liu Shi, Qiegen Liu

Abstract: Sparse-view sampling in dual-energy computed tomography (DECT) significantly reduces radiation dose and increases imaging speed, yet is highly prone to artifacts. Although diffusion models have demonstrated potential in effectively handling incomplete data, most existing methods in this field focus on the image domain and lack global constraints, which consequently leads to insufficient reconstruction quality. In this study, we propose a dual-domain virtual-mask informed diffusion model for sparse-view reconstruction by leveraging the high inter-channel correlation in DECT. Specifically, the study designs a virtual mask and applies it to the high-energy and low-energy data to perform perturbation operations, thus constructing high-dimensional tensors that serve as the prior information of the diffusion model. In addition, a dual-domain collaboration strategy is adopted to integrate the information of the randomly selected high-frequency components in the wavelet domain with the information in the projection domain, for the purpose of optimizing the global structures and local details. Experimental results indicate that the present method exhibits excellent performance across multiple datasets.

Submitted 10 April, 2025; originally announced April 2025.

arXiv:2504.07733 [pdf, other]

DeepGreen: Effective LLM-Driven Green-washing Monitoring System Designed for Empirical Testing -- Evidence from China

Authors: Congluo Xu, Yu Miao, Yiling Xiao, Chengmengjia Lin

Abstract: This paper proposes DeepGreen, a Large Language Model Driven (LLM-Driven) system for detecting corporate green-washing behaviour. Utilizing dual-layer LLM analysis, DeepGreen preliminarily identifies potential green keywords in financial statements and then assesses their implementation degree via iterative semantic analysis of LLM. A core variable GreenImplement is derived from the ratio of the two layers' output. We extract 204 financial statements of 68 companies from the A-share market over three years, comprising 89,893 words, and analyse them through DeepGreen. Our analysis, supported by violin plots and K-means clustering, reveals insights and validates the variable against the Huazheng ESG rating. It offers a novel perspective for regulatory agencies and investors, serving as a proactive monitoring tool that complements traditional methods. Empirical tests show that green implementation can significantly boost the asset return rate of companies, but there is heterogeneity in scale. Small and medium-sized companies have limited contribution to asset return via green implementation, so there is a stronger motivation for green-washing.

Submitted 10 April, 2025; originally announced April 2025.

arXiv:2504.07321 [pdf, other]

A Unified Framework for Large-Scale Classification: Error Rate Control and Optimality

Authors: Yinrui Sun, Yin Xia

Abstract: Classification is a fundamental task in supervised learning, while achieving valid misclassification rate control remains challenging, possibly due to the limited predictive capability of the classifiers or the intrinsic complexity of the classification task. In this article, we address large-scale multi-class classification problems with general error rate guarantees to enhance algorithmic trustworthiness. To this end, we first introduce a notion of group-wise classification, which unifies the common class-wise and overall classifications as special cases. We then develop a unified algorithmic framework for the general group-wise classification that consists of three steps: Pre-classification, Selective $p$-value construction, and large-scale Post-classification decisions (PSP). Theoretically, PSP is distribution-free and provides valid finite-sample guarantees for controlling general group-wise false decision rates at target levels. To show the power of PSP, we demonstrate that the step of post-classification decisions never degrades the power of pre-classification, provided that pre-classification has been sufficiently powerful to meet the target error levels. Additionally, we further establish general power optimality theories for PSP from both non-asymptotic and asymptotic perspectives. Numerical results in both simulations and real data analysis validate the performance of the proposed PSP approach.

Submitted 9 April, 2025; originally announced April 2025.

arXiv:2504.07070 [pdf, other]

A Survey on Personalized and Pluralistic Preference Alignment in Large Language Models

Authors: Zhouhang Xie, Junda Wu, Yiran Shen, Yu Xia, Xintong Li, Aaron Chang, Ryan Rossi, Sachin Kumar, Bodhisattwa Prasad Majumder, Jingbo Shang, Prithviraj Ammanabrolu, Julian McAuley

Abstract: Personalized preference alignment for large language models (LLMs), the process of tailoring LLMs to individual users' preferences, is an emerging research direction spanning the area of NLP and personalization. In this survey, we present an analysis of works on personalized alignment and modeling for LLMs. We introduce a taxonomy of preference alignment techniques, including training time, inference time, and additionally, user-modeling based methods. We provide analysis and discussion on the strengths and limitations of each group of techniques and then cover evaluation, benchmarks, as well as open problems in the field.

Submitted 9 April, 2025; originally announced April 2025.

arXiv:2504.07002 [pdf, ps, other]

doi 10.1145/3728952

DeCoMa: Detecting and Purifying Code Dataset Watermarks through Dual Channel Code Abstraction

Authors: Yuan Xiao, Yuchen Chen, Shiqing Ma, Haocheng Huang, Chunrong Fang, Yanwei Chen, Weisong Sun, Yunfeng Zhu, Xiaofang Zhang, Zhenyu Chen

Abstract: Watermarking is a technique to help identify the source of data points, which can be used to help prevent the misuse of protected datasets. Existing methods on code watermarking, leveraging the idea from the backdoor research, embed stealthy triggers as watermarks. Despite their high resilience against dilution attacks and backdoor detections, the robustness has not been fully evaluated. To fill this gap, we propose DeCoMa, a dual-channel approach to Detect and purify Code dataset waterMarks. To overcome the high barrier created by the stealthy and hidden nature of code watermarks, DeCoMa leverages dual-channel constraints on code to generalize and map code samples into standardized templates. Subsequently, DeCoMa extracts hidden watermarks by identifying outlier associations between paired elements within the standardized templates. Finally, DeCoMa purifies the watermarked dataset by removing all samples containing the detected watermark, enabling the silent appropriation of protected code. We conduct extensive experiments to evaluate the effectiveness and efficiency of DeCoMa, covering 14 types of code watermarks and 3 representative intelligent code tasks (a total of 14 scenarios). Experimental results demonstrate that DeCoMa achieves a stable recall of 100% in 14 code watermark detection scenarios, significantly outperforming the baselines. Additionally, DeCoMa effectively attacks code watermarks with embedding rates as low as 0.1%, while maintaining comparable model performance after training on the purified dataset. Furthermore, as DeCoMa requires no model training for detection, it achieves substantially higher efficiency than all baselines, with a speedup ranging from 31.5 to 130.9X. The results call for more advanced watermarking techniques for code models, while DeCoMa can serve as a baseline for future evaluation.

Submitted 9 April, 2025; originally announced April 2025.

Comments: Accepted to ISSTA 2025. Code is available at https://github.com/xiaoyuanpigo/DeCoMa

arXiv:2504.06426 [pdf, other]

S'MoRE: Structural Mixture of Residual Experts for LLM Fine-tuning

Authors: Hanqing Zeng, Yinglong Xia, Zhuokai Zhao, Gilbert Jiang, Qiang Zhang, Jiayi Liu, Lizhu Zhang, Xiangjun Fan, Benyu Zhang

Abstract: Fine-tuning pre-trained large language models (LLMs) presents a dual challenge of balancing parameter efficiency and model capacity. Existing methods like low-rank adaptations (LoRA) are efficient but lack flexibility, while Mixture-of-Experts (MoE) architectures enhance model capacity at the cost of more & under-utilized parameters. To address these limitations, we propose Structural Mixture of Residual Experts (S'MoRE), a novel framework that seamlessly integrates the efficiency of LoRA with the flexibility of MoE. Specifically, S'MoRE employs hierarchical low-rank decomposition of expert weights, yielding residuals of varying orders interconnected in a multi-layer structure. By routing input tokens through sub-trees of residuals, S'MoRE emulates the capacity of many experts by instantiating and assembling just a few low-rank matrices. We craft the inter-layer propagation of S'MoRE's residuals as a special type of Graph Neural Network (GNN), and prove that under similar parameter budget, S'MoRE improves "structural flexibility" of traditional MoE (or Mixture-of-LoRA) by exponential order. Comprehensive theoretical analysis and empirical results demonstrate that S'MoRE achieves superior fine-tuning performance, offering a transformative approach for efficient LLM adaptation.

Submitted 8 April, 2025; originally announced April 2025.

arXiv:2504.06271 [pdf, other]

ER-RAG: Enhance RAG with ER-Based Unified Modeling of Heterogeneous Data Sources

Authors: Yikuan Xia, Jiazun Chen, Yirui Zhan, Suifeng Zhao, Weipeng Jiang, Chaorui Zhang, Wei Han, Bo Bai, Jun Gao

Abstract: Large language models (LLMs) excel in question-answering (QA) tasks, and retrieval-augmented generation (RAG) enhances their precision by incorporating external evidence from diverse sources like web pages, databases, and knowledge graphs. However, current RAG methods rely on agent-specific strategies for individual data sources, posing challenges in low-resource or black-box environments and complicating operations when evidence is fragmented across sources. To address these limitations, we propose ER-RAG, a framework that unifies evidence integration across heterogeneous data sources using the Entity-Relationship (ER) model. ER-RAG standardizes entity retrieval and relationship querying through ER-based APIs with GET and JOIN operations. It employs a two-stage generation process: first, a preference optimization module selects optimal sources; second, another module constructs API chains based on source schemas. This unified approach allows efficient fine-tuning and seamless integration across diverse data sources. ER-RAG demonstrated its effectiveness by winning all three tracks of the 2024 KDDCup CRAG Challenge, achieving performance on par with commercial RAG pipelines using an 8B LLM backbone. It outperformed hybrid competitors by 3.1% in LLM score and accelerated retrieval by 5.5X.

Submitted 2 March, 2025; originally announced April 2025.

arXiv:2504.05584 [pdf, other]

Observation of Transverse Polarization and Determination of Electromagnetic Form Factor of $Λ$ Hyperon at $\sqrt{s}= 3.773$ GeV

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (697 additional authors not shown)

Abstract: Using a 20.3 fb$^{-1}$ sample of $e^{+}e^{-}$ collision data collected by the BESIII detector at the BEPCII collider, we present an observation of transverse polarization and a complete determination of the electromagnetic form factor of the $Λ$ hyperon in $e^{+}e^{-}\toΛ\barΛ$ decay with the entangled $Λ-\barΛ$ pair at $\sqrt{s}=3.773$ GeV. The relative phase between the electric and magnetic form factors is determined to be $ΔΦ=(1.53\pm0.36\pm0.03)$ rad with a significance of 5.5$σ$ taking into account systematic uncertainty. This result indicates a non-zero phase between the transition amplitudes of the $Λ\barΛ$ helicity states. Additionally, the angular distribution parameter and the modulus of the ratio between the electric and magnetic form factors are found to be $η=0.86\pm0.05\pm0.03$ and $R(s)=|G_{E}(s)/G_{M}(s)|=0.47\pm0.08\pm0.05$, respectively, where the first uncertainty is statistical and the second systematic.

Submitted 7 April, 2025; originally announced April 2025.

Comments: 9 pages, 1 table, 5 figures

arXiv:2504.05541 [pdf, other]

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting

Authors: Yunlong Tang, Jing Bi, Chao Huang, Susan Liang, Daiki Shimada, Hang Hua, Yunzhong Xiao, Yizhi Song, Pinxin Liu, Mingqian Feng, Junjia Guo, Zhuo Liu, Luchuan Song, Ali Vosoughi, Jinxi He, Liu He, Zeliang Zhang, Jiebo Luo, Chenliang Xu

Abstract: We present CAT-V (Caption AnyThing in Video), a training-free framework for fine-grained object-centric video captioning that enables detailed descriptions of user-selected objects through time. CAT-V integrates three key components: a Segmenter based on SAMURAI for precise object segmentation across frames, a Temporal Analyzer powered by TRACE-Uni for accurate event boundary detection and temporal analysis, and a Captioner using InternVL-2.5 for generating detailed object-centric descriptions. Through spatiotemporal visual prompts and chain-of-thought reasoning, our framework generates detailed, temporally-aware descriptions of objects' attributes, actions, statuses, interactions, and environmental contexts without requiring additional training data. CAT-V supports flexible user interactions through various visual prompts (points, bounding boxes, and irregular regions) and maintains temporal sensitivity by tracking object states and interactions across different time segments. Our approach addresses limitations of existing video captioning methods, which either produce overly abstract descriptions or lack object-level precision, enabling fine-grained, object-specific descriptions while maintaining temporal coherence and spatial accuracy. The GitHub repository for this project is available at https://github.com/yunlong10/CAT-V

Submitted 8 April, 2025; v1 submitted 7 April, 2025; originally announced April 2025.

arXiv:2504.04524 [pdf, other]

Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning

Authors: Xuerui Su, Shufang Xie, Guoqing Liu, Yingce Xia, Renqian Luo, Peiran Jin, Zhiming Ma, Yue Wang, Zun Wang, Yuting Liu

Abstract: Recently, Large Language Models (LLMs) have rapidly evolved, approaching Artificial General Intelligence (AGI) while benefiting from large-scale reinforcement learning to enhance Human Alignment (HA) and Reasoning. Recent reward-based optimization algorithms, such as Proximal Policy Optimization (PPO) and Group Relative Policy Optimization (GRPO), have achieved significant performance on reasoning tasks, whereas preference-based optimization algorithms such as Direct Preference Optimization (DPO) significantly improve the performance of LLMs on human alignment. However, despite the strong performance of reward-based optimization methods in alignment tasks, they remain vulnerable to reward hacking. Furthermore, preference-based algorithms (such as Online DPO) haven't yet matched the performance of reward-based optimization algorithms (like PPO) on reasoning tasks, making their exploration in this specific area still a worthwhile pursuit. Motivated by these challenges, we propose the Trust Region Preference Approximation (TRPA) algorithm, which integrates rule-based optimization with preference-based optimization for reasoning tasks. As a preference-based algorithm, TRPA naturally eliminates the reward hacking issue. TRPA constructs preference levels using predefined rules, forms corresponding preference pairs, and leverages a novel optimization algorithm for RL training with a theoretical monotonic improvement guarantee. Experimental results demonstrate that TRPA not only achieves competitive performance on reasoning tasks but also exhibits robust stability. The code for this paper has been released and is being updated at https://github.com/XueruiSu/Trust-Region-Preference-Approximation.git.

Submitted 6 April, 2025; originally announced April 2025.

Comments: 10 pages

arXiv:2504.04420 [pdf, other]

Observation of $ψ(3686) \to Ξ^- K^0_S \bar{Ω}^+ + c.c.$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (680 additional authors not shown)

Abstract: Using a sample of $(2.712\pm0.014) \times 10^{9}$ $ψ(3686)$ events collected with the BESIII detector at the electron-positron collider BEPCII, the decay $ψ(3686) \to Ξ^- K^0_S \bar{Ω}^+ + c.c.$ is observed for the first time, with a significance of 5.9 standard deviations. The branching fraction of this decay is measured to be $(2.91\pm0.47\pm0.33)\times 10^{-6}$, where the first and second uncertainties are statistical and systematic, respectively. The ratio between $\mathcal{B}_{ψ(3686) \to Ξ^- K^0_S \bar{Ω}^+ + c.c.}$ and $\mathcal{B}_{ψ(3686) \to Ω^- K^+ \bar{Ξ}^0 + c.c.}$ is determined to be $1.05\pm0.23\pm0.14$, which deviates from the value of 0.5 predicted by isospin symmetry conservation by $2.1σ$.

Submitted 6 April, 2025; originally announced April 2025.

arXiv:2504.04414 [pdf, ps, other]

Redefining Information Freshness: AoGI for Generative AI in 6G Networks

Authors: Yuquan Xiao, Qinghe Du, Wenchi Cheng, George K. Karagiannidis, Arumugam Nallanathan, Mohsen Guizani

Abstract: Generative Artificial Intelligence (GenAI) is playing an increasingly important role in enriching and facilitating human life by generating various useful information, of which real-time GenAI is a significant part and has great potential in applications such as real-time robot control, automated driving, augmented reality, etc. There are a variety of information updating processes in real-time GenAI, and the age of information (AoI) is an effective metric for evaluating information freshness. However, due to the diversity and generativity of information in real-time GenAI, it may be incompatible to directly use existing information aging metrics to assess its timeliness. In this article, we introduce a new concept called Age of Generative Information (AoGI) to evaluate the freshness of generative information, which takes into account the information delay caused not only by sampling and transmission, but also by computation. Furthermore, since real-time GenAI services are often supported by mobile-edge-cloud (MEC) collaborative computing in 6G networks and some of the generated information is privacy sensitive, it is recommended that the identities of edge and cloud should always be verified in a zero-trust manner. We introduce the concept of Age of Trust (AoT) to characterise the decay process of their trust level. We also discuss the optimisations of these evolved information aging metrics, focusing on the impact of dynamic external conditions, including wireless environments and limited computational resources.
Finally, we highlight several open challenges in providing timeliness guarantees for real-time GenAI services.

Submitted 6 April, 2025; originally announced April 2025.

arXiv:2504.04096 [pdf, ps, other]

Observation of a Three-Resonance Structure in the Cross Section of $e^+e^-\to π^+π^- h_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (697 additional authors not shown)

Abstract: Using $e^+e^-$ collision data collected with the BESIII detector operating at the Beijing Electron Positron Collider, the cross section of $e^+e^-\to π^+π^- h_c$ is measured at 59 points with center-of-mass energy $\sqrt{s}$ ranging from $4.009$ to $4.950~\mathrm{GeV}$ with a total integrated luminosity of $22.2~\mathrm{fb}^{-1}$. The cross section between $4.3$ and $4.45~\mathrm{GeV}$ exhibits a plateau-like shape and drops sharply around $4.5~\mathrm{GeV}$, which cannot be described by two resonances only. Three coherent Breit-Wigner functions are used to parameterize the $\sqrt{s}$-dependent cross section line shape. The masses and widths are determined to be $M_1=(4223.6_{-3.7-2.9}^{+3.6+2.6})~\mathrm{MeV}/c^2$, $Γ_1=(58.5_{-11.4-6.5}^{+10.8+6.7})~\mathrm{MeV}$, $M_2=(4327.4_{-18.8-9.3}^{+20.1+10.7})~\mathrm{MeV}/c^2$, $Γ_2=(244.1_{-27.1-18.0}^{+34.0+23.9})~\mathrm{MeV}$, and $M_3=(4467.4_{-5.4-2.7}^{+7.2+3.2})~\mathrm{MeV}/c^2$, $Γ_3=(62.8_{-14.4-6.6}^{+19.2+9.8})~\mathrm{MeV}$. The first uncertainties are statistical and the other two are systematic. The statistical significance of the three Breit-Wigner assumption over the two Breit-Wigner assumption is greater than $5σ$.

Submitted 5 April, 2025; originally announced April 2025.

arXiv:2504.03941 [pdf]

Diagnosing Biases in Tropical Atlantic-Pacific Multi-Decadal Teleconnections Across CMIP6 and E3SM Models

Authors: Yan Xia, Yong-Fu Lin, Jin-Yi Yu, Walter Hannah, Mike Pritchard

Abstract: Decadal-scale interactions between the tropical Atlantic and Pacific Oceans play a crucial role in global climate variability through bidirectional teleconnections. Current climate models show persistent biases in representing these basin interactions, particularly in simulating the Atlantic's influence on Pacific climate. Using historical simulations from 27 CMIP6 models and two configurations of the Energy Exascale Earth System Model (E3SM) during 1950-2015, we systematically evaluate tropical Atlantic-Pacific teleconnections through both Walker circulation and extratropical wave responses. Most models exhibit Pacific-dominated teleconnections, contradicting observational evidence of Atlantic control on Pacific variability during the past 40 years. By developing a performance metric that combines tropical circulation patterns and extratropical wave propagation, we identify two distinct model behaviors: high-skill models capture the bidirectional Atlantic-Pacific teleconnections with a secondary symptom of systematic 20-degree westward shifts in convective centers, while low-skill models display amplified Pacific dominance through reversed Walker circulation responses warming in both tropical basins. Comparative analysis between standard E3SMv2 and its multi-scale modeling framework configuration demonstrates that implementing more sophisticated cloud-scale processes alone, with limited model tuning, cannot resolve these teleconnection biases.
Our results identify four CMIP6 models and E3SMv2 that effectively reproduce observed teleconnection pathways, offering a comprehensive diagnostic framework for evaluating decadal Atlantic-Pacific interactions in climate models.

Submitted 4 April, 2025; originally announced April 2025.

arXiv:2504.02956 [pdf, other]

Understanding Aha Moments: from External Observations to Internal Mechanisms

Authors: Shu Yang, Junchao Wu, Xin Chen, Yunze Xiao, Xinyi Yang, Derek F. Wong, Di Wang

Abstract: Large Reasoning Models (LRMs), capable of reasoning through complex problems, have become crucial for tasks like programming, mathematics, and commonsense reasoning. However, a key challenge lies in understanding how these models acquire reasoning capabilities and exhibit "aha moments" when they reorganize their methods to allocate more thinking time to problems. In this work, we systematically study "aha moments" in LRMs, from linguistic patterns, description of uncertainty, "Reasoning Collapse" to analysis in latent space. We demonstrate that the "aha moment" is externally manifested in a more frequent use of anthropomorphic tones for self-reflection and an adaptive adjustment of uncertainty based on problem difficulty. This process helps the model complete reasoning without succumbing to "Reasoning Collapse". Internally, it corresponds to a separation between anthropomorphic characteristics and pure reasoning, with an increased anthropomorphic tone for more difficult problems. Furthermore, we find that the "aha moment" helps models solve complex problems by altering their perception of problem difficulty. As the layer of the model increases, simpler problems tend to be perceived as more complex, while more difficult problems appear simpler.

Submitted 3 April, 2025; originally announced April 2025.

arXiv:2504.02605 [pdf, other]

Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving

Authors: Daoguang Zan, Zhirong Huang, Wei Liu, Hanwu Chen, Linhao Zhang, Shulin Xin, Lu Chen, Qi Liu, Xiaojian Zhong, Aoyan Li, Siyao Liu, Yongsheng Xiao, Liangqiang Chen, Yuyu Zhang, Jing Su, Tianyu Liu, Rui Long, Kai Shen, Liang Xiang

Abstract: The task of issue resolving is to modify a codebase to generate a patch that addresses a given issue. However, existing benchmarks, such as SWE-bench, focus almost exclusively on Python, making them insufficient for evaluating Large Language Models (LLMs) across diverse software ecosystems. To address this, we introduce a multilingual issue-resolving benchmark, called Multi-SWE-bench, covering Java, TypeScript, JavaScript, Go, Rust, C, and C++. It includes a total of 1,632 high-quality instances, which were carefully annotated from 2,456 candidates by 68 expert annotators, ensuring that the benchmark can provide an accurate and reliable evaluation. Based on Multi-SWE-bench, we evaluate a series of state-of-the-art models using three representative methods (Agentless, SWE-agent, and OpenHands) and present a comprehensive analysis with key empirical insights. In addition, we launch a Multi-SWE-RL open-source community, aimed at building large-scale reinforcement learning (RL) training datasets for issue-resolving tasks. As an initial contribution, we release a set of 4,723 well-structured instances spanning seven programming languages, laying a solid foundation for RL research in this domain. More importantly, we open-source our entire data production pipeline, along with detailed tutorials, encouraging the open-source community to continuously contribute and expand the dataset. We envision our Multi-SWE-bench and the ever-growing Multi-SWE-RL community as catalysts for advancing RL toward its full potential, bringing us one step closer to the dawn of AGI.

Submitted 3 April, 2025; originally announced April 2025.

arXiv:2504.02202 [pdf]

Photon-number-resolving single-photon detector with a system detection efficiency of 98% and photon-number resolution of 32

Authors: Chaomeng Ding, Xingyu Zhang, Jiamin Xiong, You Xiao, Tianzhu Zhang, Jia Huang, Hongxin Xu, Xiaoyu Liu, Lixing You, Zhen Wang, Hao Li

Abstract: Efficiently distinguishing photon numbers is a crucial yet challenging technology for various quantum information and quantum metrology applications. While superconducting transition edge sensors offer good photon-number-resolving (PNR) capabilities, they are hampered by low detection speed, timing jitter, and complex cooling and readout requirements. In this work, we present a significant advancement toward achieving high-fidelity PNR single-photon detectors. The unique twin-layer configuration of superconducting nanowire atop a dielectric mirror ensures near-unity detection efficiency. The segmented design enables spatial multiplexing, establishing a mapping relationship between pulse amplitude and registered photons. The fabricated detector exhibits impressive performance metrics, including a single-photon system detection efficiency (SDE) of ~98% at a dark count rate of 20 cps and photon-number resolution capability up to 32. Further characterization through detector tomography reveals high fidelities for two-, three-, and four-photon events, approximately 87%, 73%, and 40%, respectively. Moreover, the detector operates at a high count rate of 41 MHz at 3dB-SDE, with a timing jitter as low as 40 ps. With its near-unity efficiency, high photon-number resolution, low dark count rate and fast detection speed, we expect significant interest in these detectors, promising substantial benefits for weak light detection and optical quantum information applications.

Submitted 2 April, 2025; originally announced April 2025.

arXiv:2504.01911 [pdf, other]

Advancing AI-Scientist Understanding: Making LLM Think Like a Physicist with Interpretable Reasoning

Authors: Yinggan Xu, Hana Kimlee, Yijia Xiao, Di Luo

Abstract: Large Language Models (LLMs) are playing an expanding role in physics research by enhancing reasoning, symbolic manipulation, and numerical computation. However, ensuring the reliability and interpretability of their outputs remains a significant challenge. In our framework, we conceptualize the collaboration between AI and human scientists as a dynamic interplay among three modules: the reasoning module, the interpretation module, and the AI-scientist interaction module. Recognizing that effective physics reasoning demands rigorous logical consistency, quantitative precision, and deep integration with established theoretical models, we introduce the interpretation module to improve the understanding of AI-generated outputs, which has not previously been explored in the literature. This module comprises multiple specialized agents, including summarizers, model builders, UI builders, and testers, which collaboratively structure LLM outputs within a physically grounded framework by constructing a more interpretable science model. A case study demonstrates that our approach enhances transparency, facilitates validation, and strengthens AI-augmented reasoning in scientific discovery.

Submitted 2 April, 2025; originally announced April 2025.

arXiv:2504.01823 [pdf, other]

Evidence of doubly OZI-suppressed decay $η_{c} \to ωφ$ in the radiative decay $J/ψ\to γη_{c}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (680 additional authors not shown)

Abstract: Using a sample of $(10087\pm44) \times 10^{6}$ $J/ψ$ events collected with the BESIII detector at the BEPCII collider, the first evidence for the doubly OZI-suppressed decay $η_{c} \to ωφ$ is reported with a significance of 4.0$σ$. The branching fraction of $η_{c} \to ωφ$ is measured to be $\mathcal{B}(η_{c} \to ωφ) = (3.86 \pm 0.92 \pm 0.62) \times 10^{-5}$, where the first uncertainty is statistical and the second is systematic. This result provides valuable insights into the underlying mechanisms of charmonium decays, particularly for processes such as $η_{c} \to VV$ (where $V$ represents a vector meson).

Submitted 2 April, 2025; originally announced April 2025.

arXiv:2504.01541 [pdf, other]

Hyperbolic Diffusion Recommender Model

Authors: Meng Yuan, Yutian Xiao, Wei Chen, Chu Zhao, Deqing Wang, Fuzhen Zhuang

Abstract: Diffusion models (DMs) have emerged as the new state-of-the-art family of deep generative models. To gain deeper insights into the limitations of diffusion models in recommender systems, we investigate the fundamental structural disparities between images and items. Consequently, items often exhibit distinct anisotropic and directional structures that are less prevalent in images. However, the traditional forward diffusion process continuously adds isotropic Gaussian noise, causing anisotropic signals to degrade into noise, which impairs the semantically meaningful representations in recommender systems. Inspired by the advancements in hyperbolic spaces, we propose a novel Hyperbolic Diffusion Recommender Model (named HDRM). Unlike existing directional diffusion methods based on Euclidean space, the intrinsic non-Euclidean structure of hyperbolic space makes it particularly well-adapted for handling anisotropic diffusion processes. In particular, we begin by formulating concepts to characterize latent directed diffusion processes within a geometrically grounded hyperbolic space. Subsequently, we propose a novel hyperbolic latent diffusion process specifically tailored for users and items. Drawing upon the natural geometric attributes of hyperbolic spaces, we impose structural restrictions on the space to enhance hyperbolic diffusion propagation, thereby ensuring the preservation of the intrinsic topology of user-item graphs.
Extensive experiments on three benchmark datasets demonstrate the effectiveness of HDRM.

Submitted 10 April, 2025; v1 submitted 2 April, 2025; originally announced April 2025.

arXiv:2504.01369 [pdf, other]

LITE: LLM-Impelled efficient Taxonomy Evaluation

Authors: Lin Zhang, Zhouhong Gu, Suhang Zheng, Tao Wang, Tianyu Li, Hongwei Feng, Yanghua Xiao

Abstract: This paper presents LITE, an LLM-based evaluation method designed for efficient and flexible assessment of taxonomy quality. To address challenges in large-scale taxonomy evaluation, such as efficiency, fairness, and consistency, LITE adopts a top-down hierarchical evaluation strategy, breaking down the taxonomy into manageable substructures and ensuring result reliability through cross-validation and standardized input formats. LITE also introduces a penalty mechanism to handle extreme cases and provides both quantitative performance analysis and qualitative insights by integrating evaluation metrics closely aligned with task objectives. Experimental results show that LITE demonstrates high reliability in complex evaluation tasks, effectively identifying semantic errors, logical contradictions, and structural flaws in taxonomies, while offering directions for improvement. Code is available at https://github.com/Zhang-l-i-n/TAXONOMY_DETECT.

Submitted 2 April, 2025; originally announced April 2025.

arXiv:2504.00851 [pdf, other]

Generalized Tensor-based Parameter-Efficient Fine-Tuning via Lie Group Transformations

Authors: Chongjie Si, Zhiyi Shi, Xuehui Wang, Yichen Xiao, Xiaokang Yang, Wei Shen

Abstract: Adapting pre-trained foundation models for diverse downstream tasks is a core practice in artificial intelligence. However, the wide range of tasks and high computational costs make full fine-tuning impractical. To overcome this, parameter-efficient fine-tuning (PEFT) methods like LoRA have emerged and are becoming a growing research focus. Despite the success of these methods, they are primarily designed for linear layers, focusing on two-dimensional matrices while largely ignoring higher-dimensional parameter spaces like convolutional kernels. Moreover, directly applying these methods to higher-dimensional parameter spaces often disrupts their structural relationships. Given the rapid advancements in matrix-based PEFT methods, rather than designing a specialized strategy, we propose a generalization that extends matrix-based PEFT methods to higher-dimensional parameter spaces without compromising their structural properties. Specifically, we treat parameters as elements of a Lie group, with updates modeled as perturbations in the corresponding Lie algebra. These perturbations are mapped back to the Lie group through the exponential map, ensuring smooth, consistent updates that preserve the inherent structure of the parameter space. Extensive experiments on computer vision and natural language processing validate the effectiveness and versatility of our approach, demonstrating clear improvements over existing methods.

Submitted 1 April, 2025; originally announced April 2025.

arXiv:2504.00756 [pdf, other]

RECKON: Large-scale Reference-based Efficient Knowledge Evaluation for Large Language Model

Authors: Lin Zhang, Zhouhong Gu, Xiaoran Shi, Hongwei Feng, Yanghua Xiao

Abstract: As large language models (LLMs) advance, efficient knowledge evaluation becomes crucial to verifying their capabilities. Traditional methods, relying on benchmarks, face limitations such as high resource costs and information loss. We propose the Large-scale Reference-based Efficient Knowledge Evaluation for Large Language Model (RECKON), which directly uses reference data to evaluate models. RECKON organizes unstructured data into manageable units and generates targeted questions for each cluster, improving evaluation accuracy and efficiency. Experimental results show that RECKON reduces resource consumption by 56.5% compared to traditional methods while achieving over 97% accuracy across various domains, including world knowledge, code, legal, and biomedical datasets. Code is available at https://github.com/MikeGu721/reckon

Submitted 1 April, 2025; originally announced April 2025.

arXiv:2504.00695 [pdf, other]

ToReMi: Topic-Aware Data Reweighting for Dynamic Pre-Training Data Selection

Authors: Xiaoxuan Zhu, Zhouhong Gu, Baiqian Wu, Suhang Zheng, Tao Wang, Tianyu Li, Hongwei Feng, Yanghua Xiao

Abstract: Pre-training large language models (LLMs) necessitates enormous diverse textual corpora, making effective data selection a key challenge for balancing computational resources and model performance. Current methodologies primarily emphasize data quality metrics and mixing proportions, yet they fail to adequately capture the underlying semantic connections between training samples and quality disparities within individual domains. We introduce ToReMi (Topic-based Reweighting for Model improvement), a novel two-stage framework that dynamically adjusts training sample weights according to their topical associations and observed learning patterns. Our comprehensive experiments reveal that ToReMi variants consistently achieve superior performance over conventional pre-training approaches, demonstrating accelerated perplexity reduction across multiple domains and enhanced capabilities on downstream evaluation tasks. Code is available at https://github.com/zxx000728/ToReMi.

Submitted 20 April, 2025; v1 submitted 1 April, 2025; originally announced April 2025.

arXiv:2504.00561 [pdf, other]

Continual Cross-Modal Generalization

Authors: Yan Xia, Hai Huang, Minghui Fang, Zhou Zhao

Abstract: Cross-modal generalization aims to learn a shared discrete representation space from multimodal pairs, enabling knowledge transfer across unannotated modalities. However, achieving a unified representation for all modality pairs requires extensive paired data, which is often impractical. Inspired by the availability of abundant bimodal data (e.g., in ImageBind), we explore a continual learning approach that incrementally maps new modalities into a shared discrete codebook via a mediator modality. We propose the Continual Mixture of Experts Adapter (CMoE-Adapter) to project diverse modalities into a unified space while preserving prior knowledge. To align semantics across stages, we introduce a Pseudo-Modality Replay (PMR) mechanism with a dynamically expanding codebook, enabling the model to adaptively incorporate new modalities using learned ones as guidance. Extensive experiments on image-text, audio-text, video-text, and speech-text show that our method achieves strong performance on various cross-modal generalization tasks. Code is provided in the supplementary material.

Submitted 1 April, 2025; originally announced April 2025.

arXiv:2503.24178 [pdf, other]

Beijing Normal University 12-meter Interferometric kHz GW Detector Prototype: Design and Scientific Prospects

Authors: Mengyao Wang, Fan Zhang, Xinyao Guo, Haixing Miao, Huan Yang, Yiqiu Ma, Haoyu Wang, Teng Zhang, Mengdi Cao, Yuchao Chen, Xiaoman Huang, Junlang Li, Fangfei Liu, Jianyu Liu, Yuan Pan, Yulin Xia, Jianbo Xing, Yujie Yu, Chenjie Zhou, Zong-hong Zhu

Abstract: Gravitational wave (GW) astronomy has opened a new window into the universe, enabling the study of extreme astrophysical phenomena that are otherwise obscured in traditional electromagnetic observations. While global efforts have predominantly focused on low- and mid-frequency GW detection, the high-frequency regime, particularly in the kilohertz (kHz) range, remains underexplored despite its potential to reveal critical insights into compact binary mergers, neutron star physics, and other exotic astrophysical sources. In this context, the Beijing Normal University (BNU) prototype represents a pioneering effort to develop a dedicated kHz GW detector. Featuring a 12-meter L-shaped resonator within a two-arm vacuum system, the BNU prototype is designed to test innovative configurations and address key technical challenges for kHz GW detection. Beyond its primary focus on being a technology testbed and demonstrator for kHz detection, the prototype is also being evaluated for its own sensitivity in the megahertz (MHz) range, offering the potential to explore even higher-frequency signals from, e.g., primordial black holes and geontropic fluctuations. This paper provides a comprehensive overview of the BNU prototype, detailing its design, key components, and scientific objectives.

Submitted 1 April, 2025; v1 submitted 31 March, 2025; originally announced March 2025.

Comments: 21 pages, 7 figures

arXiv:2503.24081

Cell-Free Massive MIMO Under Mobility: A Fairness-Differentiated Handover Scheme

Authors: Yunlu Xiao, Marina Petrova, Ljiljana Simić

Abstract: While cell-free massive MIMO (CF-mMIMO) offers both uniform and high network-wide throughput in static networks, its performance in a mobile network is not yet fully addressed. In this paper, we evaluate the performance of a mobile CF-mMIMO network under a comprehensive throughput model and show that it suffers from large performance degradation due to the combined effect of channel aging and handover delay. To improve the performance of CF-mMIMO under mobility, we propose a fairness-differentiated handover scheme. Our scheme differentiates the handover policy for different users by their channel conditions compared to a threshold based on Jain's fairness index, in order to prioritize handovers for the poorly served users. We present an extensive evaluation of the mobile throughput performance of our handover scheme with realistic urban network distributions and UE mobility patterns. Our results show that our scheme significantly outperforms the existing literature benchmarks when considering both channel aging and handover delay cost. Importantly, the advantage of UE-centric over network-centric CF-mMIMO, of uniformly good performance over the network, is uniquely preserved under mobility by our handover scheme. We thus show that CF-mMIMO can be a feasible architecture for practical mobile networks.

Submitted 31 March, 2025; originally announced March 2025.

arXiv:2503.23952

HeteroPod: XPU-Accelerated Infrastructure Offloading for Commodity Cloud-Native Applications

Authors: Bicheng Yang, Jingkai He, Dong Du, Yubin Xia, Haibo Chen

Abstract: Cloud-native systems increasingly rely on infrastructure services (e.g., service meshes, monitoring agents), which compete for resources with user applications, degrading performance and scalability. We propose HeteroPod, a new abstraction that offloads these services to Data Processing Units (DPUs) to enforce strict isolation while reducing host resource contention and operational costs. To realize HeteroPod, we introduce HeteroNet, a cross-PU (XPU) network system featuring: (1) split network namespace, a unified network abstraction for processes spanning CPU and DPU, and (2) elastic and efficient XPU networking, a communication mechanism achieving shared-memory performance without pinned resource overhead and polling costs. By leveraging HeteroNet and the compositional nature of cloud-native workloads, HeteroPod can optimally offload infrastructure containers to DPUs. We implement HeteroNet based on Linux, and implement a cloud-native system called HeteroK8s based on Kubernetes. We evaluate the systems using NVIDIA Bluefield-2 DPUs and CXL-based DPUs (simulated with real CXL memory devices). The results show that HeteroK8s effectively supports complex (unmodified) commodity cloud-native applications (up to 1 million LoC) and provides up to 31.9x better latency and 64x less resource consumption (compared with kernel-bypass design), 60% better end-to-end latency, and 55% higher scalability compared with SOTA systems.

Submitted 31 March, 2025; originally announced March 2025.

Showing 51–100 of 3,638 results for author: Xiao, Y