Search | arXiv e-print repository

Some Consistent Power Constructions

Abstract: Consistent Hoare, Smyth and Plotkin power domains are introduced and discussed by Yuan and Kou. The consistent algebraic operation $+$ defined by them is a binary partial Scott continuous operation satisfying the requirement: $a+b$ exists whenever there exists a $c$ which is greater than $a$ and $b$. We extend consistency to a categorical concept and obtain an approach to generating consistent monads from monads on dcpos whose images are equipped with some algebraic operations. Then we provide two new power constructions over domains: the consistent Plotkin index power domain and the consistent probabilistic power domain. Moreover, we verify that these power constructions are free.

Submitted 7 March, 2025; originally announced March 2025.

MSC Class: 06B35; 54A35; 54D05

arXiv:2503.04681 [pdf, other]

Mixed Near-field and Far-field Target Localization for Low-altitude Economy

Authors: Cong Zhou, Changsheng You, Chao Zhou, Hongqiang Cheng, Shuo Shi

Abstract: In this paper, we study efficient mixed near-field and far-field target localization methods for low-altitude economy, by capitalizing on extremely large-scale multiple-input multiple-output (XL-MIMO) communication systems. Compared with existing works, we address three new challenges in localization, arising from 1) half-wavelength antenna spacing constraint, 2) hybrid uniform planar array (UPA) architecture, and 3) incorrect mixed-field target classification for near-field targets. To address these issues, we propose a new three-step mixed-field localization method. First, we reconstruct the signals received at UPA antennas by judiciously designing analog combining matrices over time with minimum recovery errors, thus tackling the reduced-dimensional signal-space issue in hybrid arrays. Second, based on the recovered signals, we devise a modified MUSIC algorithm (tailored to the UPA architecture) to estimate the 2D angular parameters of both far- and near-field targets. Due to half-wavelength inter-antenna spacing, ambiguous angles arise when estimating the true angles of targets. In the third step, we design an effective classification method to distinguish mixed-field targets, determine the true angles of all targets, and estimate the ranges of near-field targets. In particular, angular ambiguity is resolved by showing that the three types of estimated angles (i.e., far-field, near-field, and ambiguous angles) exhibit significantly different patterns in the range-domain MUSIC spectrum. Furthermore, to characterize the lower bound on the estimation error, we obtain closed-form matrix Cramér-Rao bounds for mixed-field target localization. Finally, numerical results demonstrate the effectiveness of our proposed mixed-field localization method, which improves target-classification accuracy and achieves a lower root mean square error than various benchmark schemes.

Submitted 6 March, 2025; originally announced March 2025.

Comments: An effective mixed near-field and far-field target localization method by employing typical wireless communication infrastructures is proposed in this paper

arXiv:2503.04035 [pdf]

Unveiling the Oxidation Mechanisms of Octa-Penta Graphene: A Multidimensional Exploration from First-Principles to Machine Learning

Authors: Chenyi Zhou, Rubin Huo, Boyi Situ, Zihan Yan, Zhe Zhang, Yusong Tu

Abstract: Octa-penta graphene (OPG), a novel carbon allotrope characterized by its distinctive arrangement of pentagonal and octagonal rings, has garnered considerable attention due to its exceptional structure and functional properties. This study systematically investigates the oxidation mechanisms of OPG and elucidates the oxygen migration patterns on the OPG monolayer through first-principles calculations and machine-learning-based molecular dynamics (MLMD) simulations. Specifically, the oxidation processes on OPG-L and OPG-Z involve exothermic chemisorption, where oxygen molecules dissociate at the surfaces, forming stable epoxy groups. Furthermore, the integrated-crystal orbital Hamilton population (ICOHP) and Bader charge analyses provide insights into the physical mechanisms of oxygen atom adsorption. Importantly, we found that oxidation also impacts the electronic properties of OPG, with OPG-L retaining its metallic characteristics after oxygen adsorption, whereas OPG-Z undergoes a transformation from a metallic to a semiconducting state due to the introduction of oxygen. Oxygen migration on the OPG monolayer involves breaking and reforming of C-O bonds, with varying stability across adsorption sites and limited migration along the basal plane. MLMD simulations corroborate these migration patterns, offering detailed migration trajectories consistent with theoretical predictions. These findings enhance the understanding of oxygen migration dynamics on OPG, facilitate its experimental validation, and highlight its potential as a novel 2D material for applications in batteries, heat-resistant materials, and oxidation-resistant coatings.

Submitted 5 March, 2025; originally announced March 2025.

arXiv:2503.03137 [pdf, other]

Learning to Reduce Search Space for Generalizable Neural Routing Solver

Authors: Changliang Zhou, Xi Lin, Zhenkun Wang, Qingfu Zhang

Abstract: Constructive neural combinatorial optimization (NCO) has attracted growing research attention due to its ability to solve complex routing problems without relying on handcrafted rules. However, existing NCO methods face significant challenges in generalizing to large-scale problems due to high computational complexity and inefficient capture of structural patterns. To address this issue, we propose a novel learning-based search space reduction method that adaptively selects a small set of promising candidate nodes at each step of the constructive NCO process. Unlike traditional methods that rely on fixed heuristics, our selection model dynamically prioritizes nodes based on learned patterns, significantly reducing the search space while maintaining solution quality. Experimental results demonstrate that our method, trained solely on 100-node instances from uniform distribution, generalizes remarkably well to large-scale Traveling Salesman Problem (TSP) and Capacitated Vehicle Routing Problem (CVRP) instances with up to 1 million nodes from the uniform distribution and over 80K nodes from other distributions.

Submitted 19 May, 2025; v1 submitted 4 March, 2025; originally announced March 2025.

Comments: 37 pages, 7 figures

arXiv:2503.02241 [pdf]

Unsupervised Waste Classification By Dual-Encoder Contrastive Learning and Multi-Clustering Voting (DECMCV)

Authors: Kui Huang, Mengke Song, Shuo Ba, Ling An, Huajie Liang, Huanxi Deng, Yang Liu, Zhenyu Zhang, Chichun Zhou

Abstract: Waste classification is crucial for improving processing efficiency and reducing environmental pollution. Supervised deep learning methods are commonly used for automated waste classification, but they rely heavily on large labeled datasets, which are costly and inefficient to obtain. Real-world waste data often exhibit category and style biases, such as variations in camera angles, lighting conditions, and types of waste, which can impact the model's performance and generalization ability. Therefore, constructing a bias-free dataset is essential. Manual labeling is not only costly but also inefficient. While self-supervised learning helps address data scarcity, it still depends on some labeled data and generally results in lower accuracy compared to supervised methods. Unsupervised methods show potential in certain cases but typically do not perform as well as supervised models, highlighting the need for an efficient and cost-effective unsupervised approach. This study presents a novel unsupervised method, Dual-Encoder Contrastive Learning with Multi-Clustering Voting (DECMCV). The approach involves using a pre-trained ConvNeXt model for image encoding, leveraging VisionTransformer to generate positive samples, and applying a multi-clustering voting mechanism to address data labeling and domain shift issues. Experimental results demonstrate that DECMCV achieves classification accuracies of 93.78% and 98.29% on the TrashNet and Huawei Cloud datasets, respectively, outperforming or matching supervised models. On a real-world dataset of 4,169 waste images, only 50 labeled samples were needed to accurately label thousands, improving classification accuracy by 29.85% compared to supervised models. This method effectively addresses style differences, enhances model generalization, and contributes to the advancement of automated waste classification.

Submitted 3 March, 2025; originally announced March 2025.

arXiv:2503.02196 [pdf, ps, other]

First Measurement of the Decay Dynamics in the Semileptonic Transition of the $D^{+(0)}$ into the Axial-vector Meson $\bar K_1(1270)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (680 additional authors not shown)

Abstract: Using $e^+e^-$ collision data taken at the center-of-mass energy of 3.773 GeV with the BESIII detector, corresponding to an integrated luminosity of 20.3 fb$^{-1}$, we report the first amplitude and angular analyses of the semileptonic decays $D^{+(0)}\to K^-π^+π^{0(-)} e^+ν_e$. From the amplitude analysis, we determine for the first time the hadronic form factors of the semileptonic $D$ decays into the axial-vector meson $\bar{K}_1(1270)$ to be $r_A=(-11.2\pm1.0\pm0.9)\times10^{-2}$ and $r_V = (-4.3\pm 1.0\pm2.4)\times 10^{-2}$. The angular analysis yields an up-down asymmetry $\mathcal{A}^\prime_{ud} = 0.01\pm0.11$, which is consistent with the Standard Model prediction.

Submitted 3 March, 2025; originally announced March 2025.

Comments: 15 pages, 6 figures, submitted to PRL

arXiv:2503.01253 [pdf, other]

NM-SpMM: Accelerating Matrix Multiplication Using N:M Sparsity with GPGPU

Authors: Cong Ma, Du Wu, Zhelang Deng, Jiang Chen, Xiaowen Huang, Jintao Meng, Wenxi Zhu, Bingqiang Wang, Amelie Chi Zhou, Peng Chen, Minwen Deng, Yanjie Wei, Shengzhong Feng, Yi Pan

Abstract: Deep learning demonstrates effectiveness across a wide range of tasks. However, the dense and over-parameterized nature of these models results in significant resource consumption during deployment. In response to this issue, weight pruning, particularly through N:M sparsity matrix multiplication, offers an efficient solution by transforming dense operations into semi-sparse ones. N:M sparsity provides an option for balancing performance and model accuracy, but introduces more complex programming and optimization challenges. To address these issues, we design a systematic top-down performance analysis model for N:M sparsity. Meanwhile, NM-SpMM is proposed as an efficient general N:M sparsity implementation. Based on our performance analysis, NM-SpMM employs a hierarchical blocking mechanism as a general optimization to enhance data locality, while memory access optimization and pipeline design are introduced as sparsity-aware optimization, allowing it to achieve close-to-theoretical peak performance across different sparsity levels. Experimental results show that NM-SpMM is 2.1x faster than nmSPARSE (the state-of-the-art for general N:M sparsity) and 1.4x to 6.3x faster than cuBLAS's dense GEMM operations, closely approaching the theoretical maximum speedup resulting from the reduction in computation due to sparsity. NM-SpMM is open source and publicly available at https://github.com/M-H482/NM-SpMM.

Submitted 4 March, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

Comments: 12 pages, 10 figures, accepted at IPDPS 2025. Code: https://github.com/M-H482/NM-SpMM

ACM Class: C.1.4; D.1.3; G.1.0
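
For readers unfamiliar with the term, the N:M sparsity pattern mentioned in this abstract can be illustrated with a minimal CPU sketch. It assumes magnitude-based pruning (keep the N largest-magnitude weights in every group of M consecutive weights), which is a common convention and not something the abstract specifies. The function names `prune_n_m` and `sparse_dot` are hypothetical, and this is not the NM-SpMM GPU kernel.

```python
def prune_n_m(row, n, m):
    """Keep the n largest-magnitude weights in each group of m consecutive weights.

    Returns the kept values and their column indices (a compressed form).
    Magnitude-based selection is an assumption made for illustration.
    """
    if len(row) % m != 0:
        raise ValueError("row length must be a multiple of m")
    values, indices = [], []
    for start in range(0, len(row), m):
        group = row[start:start + m]
        # Rank positions in the group by absolute value, keep the top n,
        # then restore ascending column order.
        ranked = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = sorted(ranked[:n])
        values.extend(group[i] for i in keep)
        indices.extend(start + i for i in keep)
    return values, indices


def sparse_dot(values, indices, x):
    """Dot product that touches only the kept weights (n/m of the dense work)."""
    return sum(v * x[i] for v, i in zip(values, indices))


weights = [0.1, -0.9, 0.5, 0.2, 1.0, 0.0, -0.3, 0.4]
values, indices = prune_n_m(weights, n=2, m=4)
result = sparse_dot(values, indices, [1.0] * len(weights))
```

With 2:4 sparsity the dot product performs half the multiplications of the dense version, which is the source of the theoretical speedup the abstract refers to.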

arXiv:2503.00934 [pdf, other]

A sharp-interface approach for simulating solid-state dewetting of thin films with double-bubble structure

Authors: Meng Li, Nan Wang, Ruofan Zhao, Chunjie Zhou

Abstract: We develop a sharp-interface model for solid-state dewetting of double-bubble thin films using an energy variational approach based on a newly proposed interfacial energy. This model characterizes the dynamic evolution of interfaces in double-bubble thin films, a process primarily governed by surface diffusion and junction/contact point migration, and fundamentally distinct from the behavior observed in a single thin film. Subsequently, a structure-preserving parametric finite element approximation is developed for the sharp-interface model, which can preserve both area conservation and energy stability. Extensive numerical experiments are presented to demonstrate the convergence, structure-preserving properties, and superior mesh quality of the proposed method. Additionally, we investigate several specific evolution processes, including the equilibrium shapes of double-bubble thin films and the pinch-off dynamics of long islands.

Submitted 4 March, 2025; v1 submitted 2 March, 2025; originally announced March 2025.

arXiv:2502.20915 [pdf, other]

The Feasibility Study of the GeV-Energy Muon Source Based on HIAF

Authors: Yu Xu, Xueheng Zhang, Yuhong Yu, Pei Yu, Li Deng, Jiajia Zhai, Liangwen Chen, He Zhao, Lina Sheng, Guodong Shen, Ziwen Pan, Qite Li, Chen Zhou, Qiang Li, Lei Yang, Zhiyu Sun

Abstract: Generating a mono-energetic, high-energy muon beam using accelerator facilities can be very attractive for many purposes, for example, improving muon tomography, which is currently limited by the low flux and wide energy spread of cosmic ray muons, and searching for muon-related new physics beyond the Standard Model. One potential accelerator facility is the High Intensity Heavy-Ion Accelerator Facility (HIAF), which is currently under construction in Huizhou City, China. Considering the projectile energy and beamline length, a high-intensity and GeV-energy muon flux could be produced and delivered by the High Energy Fragment Separator beamline of the HIAF facility. In this paper, the flux intensity and purity of the muon beam based on HIAF are discussed in detail. For the $μ^+$ beam, the highest muon yield reaches $8.2 \times 10^6 ~ μ$/s with a purity of approximately $2\%$ at a momentum of 3.5 GeV/c; meanwhile, for the $μ^-$ beam, the maximum muon yield is $4.2 \times 10^6 ~ μ$/s with a purity of around $20\%$ at a momentum of 1.5 GeV/c. The results also indicate that, for muon beams with an energy of several GeV, by applying a suitable purification strategy, we can get a muon beam with a purity of $100\%$ and an intensity of the order of $10^5 ~ μ$/s.

Submitted 21 May, 2025; v1 submitted 28 February, 2025; originally announced February 2025.

Comments: 12 pages, accepted for publication in the PHYSICAL REVIEW ACCELERATORS AND BEAMS

arXiv:2502.20821 [pdf, ps, other]

doi 10.1007/JHEP06(2025)194

Improved measurement of absolute branching fraction of the inclusive decay $Λ_{c}^{+} \to K_{S}^{0} X$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (679 additional authors not shown)

Abstract: By analyzing $4.5$ fb$^{-1}$ of $e^{+}e^{-}$ collision data accumulated with the BESIII detector at center-of-mass energies ranging from $4599.53$ MeV to $4698.82$ MeV, we report the measurement of the absolute branching fraction (BF) of the inclusive decay $Λ_{c}^{+} \to K_{S}^{0} X$ using the double-tag technique. The result is $\mathcal{B}(Λ_{c}^{+} \to K_{S}^{0} X)=(10.9\pm0.2\pm0.1)\%$, where the first uncertainty is statistical and the second is systematic. This result indicates that there are still undiscovered decay channels containing $K_{S}^{0}$ in the final state with a combined BF of $(3.1\pm0.4)\%$. The BF of the inclusive decay $Λ_{c}^{+} \to \overline{K}^{0} / K^{0} X$ is calculated to be $\mathcal{B}(Λ_{c}^{+} \to \overline{K}^{0} / K^{0} X)=(21.8 \pm0.4 \pm0.2 \pm1.1)\%$, where the third uncertainty accounts for a possible difference between $\mathcal{B}(Λ_{c}^{+} \to K_{S}^{0} X)$ and $\mathcal{B}(Λ_{c}^{+} \to K_{L}^{0} X)$. The result is in agreement with the prediction of the statistical isospin model.

Submitted 21 June, 2025; v1 submitted 28 February, 2025; originally announced February 2025.

Journal ref: J. High Energ. Phys. 2025, 194 (2025)
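
The two branching fractions quoted in this abstract are numerically consistent under one simple reading. This is a reader's check, not the collaboration's stated procedure: it assumes the central value is obtained by taking $\mathcal{B}(Λ_{c}^{+} \to K_{L}^{0} X) = \mathcal{B}(Λ_{c}^{+} \to K_{S}^{0} X)$, with the third uncertainty covering any departure from that equality.

$$
\mathcal{B}(\Lambda_{c}^{+} \to \overline{K}^{0}/K^{0} X)
= \mathcal{B}(\Lambda_{c}^{+} \to K_{S}^{0} X) + \mathcal{B}(\Lambda_{c}^{+} \to K_{L}^{0} X)
\approx 2 \times (10.9 \pm 0.2 \pm 0.1)\%
= (21.8 \pm 0.4 \pm 0.2)\%
$$

The doubled statistical and systematic uncertainties ($0.4$ and $0.2$) match the first two uncertainties quoted in the abstract.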

arXiv:2502.18977 [pdf]

A space-resolved visible spectrometer system using compact endoscopic optics for full vertical profile measurement of impurity line emissions in superconducting EAST tokamak

Authors: A. Hu, Y. Cheng, L. Zhang, S. Morita, J. Ma, M. Kobayashi, C. Zhou, J. Chen, Y. Cao, F. Zhang, W. Zhang, Z. Li, D. Mitnik, S. Wang, Y. Jie, G. Zuo, J. Qian, H. Liu, G. Xu, J. Hu, K. Lu, Y. Song

Abstract: In the Experimental Advanced Superconducting Tokamak (EAST), which has tungsten divertors and a molybdenum first wall, lithiumization and boronization have been frequently carried out to improve the plasma performance, in particular in long-pulse discharges. A study of the impurity behavior of lithium, boron and tungsten atoms/ions in the edge plasma is therefore crucially important. For this purpose, a space-resolved visible spectrometer system has been newly developed to observe full vertical profiles, over a length of 1.7 m, of impurity line emissions in the wavelength range of 320-800 nm. For the full vertical profile measurement, compact endoscopic optics with an optical fiber bundle is employed, which can be inserted into a 1.5 m long extension tube called the 'long nose', because the distance between the diagnostic port and the plasma center is considerably long. Therefore, a quartz glass window mounted from the vacuum vessel side is designed to withstand the reverse pressure. A mechanical shutter is also designed to open at a large angle of 235 degrees so that the viewing angle of nearby ports is not blocked. Two sets of fiber bundles, a 60-channel linear array and an 11×10-channel planar array, each with a length of 30 m, are attached to two sets of Czerny-Turner visible spectrometers for one-dimensional (1D) vertical profile measurement of the core plasma and two-dimensional (2D) spectroscopy of the divertor plasma, respectively. A complementary metal oxide semiconductor (CMOS) detector with 2048×2048 pixels is used for the visible spectrometers. A preliminary result on the full vertical profile is obtained for the BII line emission at 703.19 nm in the 1D system.

Submitted 26 February, 2025; originally announced February 2025.

arXiv:2502.17441 [pdf, other]

Renaissance of Literate Programming in the Era of LLMs: Enhancing LLM-Based Code Generation in Large-Scale Projects

Authors: Wuyang Zhang, Yansong Li, Zeyu Dong, Yu Wu, Yingyao Zhou, Duolei Wang, Songsirou Xing, Chichun Zhou, Da Shen

Abstract: Large Language Models (LLMs) have helped programmers increase efficiency through code generation, comprehension, and repair. However, their application to large-scale projects remains challenging due to complex interdependencies and the extensive size of modern codebases. Although Knuth's concept of Literate Programming (LP) combines code and natural language to convey logic and intent, its potential for enhancing relationships in large projects has not been fully explored. In this study, we introduce the idea of Interoperable LP (ILP), which leverages literate programming principles to enhance the development of both small-scale documents and large-scale projects with LLMs. We investigate how LLMs perform under ILP-style instructions for both document-oriented tasks and entire projects. Recognizing that many researchers rely on well-structured templates to guide LLMs, we propose a concise prompt engineering method to write LP documents so LLMs can better be involved in code generation. We also examine the capacity of various LLMs to generate Scheme and Python code on the RepoBench benchmark, illustrating the advantages of our approach. Our findings indicate that ILP with LLMs can enhance LLM-based code generation in large-scale project development.

Submitted 25 December, 2024; originally announced February 2025.

arXiv:2502.16163 [pdf, other]

Large Language Model for Lossless Image Compression with Visual Prompts

Authors: Junhao Du, Chuqin Zhou, Ning Cao, Gang Chen, Yunuo Chen, Zhengxue Cheng, Li Song, Guo Lu, Wenjun Zhang

Abstract: Recent advancements in deep learning have driven significant progress in lossless image compression. With the emergence of Large Language Models (LLMs), preliminary attempts have been made to leverage the extensive prior knowledge embedded in these pretrained models to enhance lossless image compression, particularly by improving the entropy model. However, a significant challenge remains in bridging the gap between the textual prior knowledge within LLMs and lossless image compression. To tackle this challenge and unlock the potential of LLMs, this paper introduces a novel paradigm for lossless image compression that incorporates LLMs with visual prompts. Specifically, we first generate a lossy reconstruction of the input image as visual prompts, from which we extract features to serve as visual embeddings for the LLM. The residual between the original image and the lossy reconstruction is then fed into the LLM along with these visual embeddings, enabling the LLM to function as an entropy model to predict the probability distribution of the residual. Extensive experiments on multiple benchmark datasets demonstrate our method achieves state-of-the-art compression performance, surpassing both traditional and learning-based lossless image codecs. Furthermore, our approach can be easily extended to images from other domains, such as medical and screen content images, achieving impressive performance. These results highlight the potential of LLMs for lossless image compression and may inspire further research in related directions.

Submitted 22 February, 2025; originally announced February 2025.
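
The lossy-plus-residual idea in this abstract can be sketched in a few lines. This is a toy illustration under loud assumptions: coarse quantization stands in for the lossy codec, and the residuals' empirical frequencies stand in for the probability distribution that the paper's LLM would predict. The function names are hypothetical and nothing here reproduces the authors' model.

```python
import math
from collections import Counter


def lossy_reconstruction(pixels, step=8):
    """Hypothetical stand-in for a lossy codec: coarse uniform quantization."""
    return [(p // step) * step + step // 2 for p in pixels]


def ideal_code_length_bits(symbols, probs):
    """Ideal entropy-coding cost: the sum of -log2 p(symbol) over all symbols."""
    return sum(-math.log2(probs[s]) for s in symbols)


pixels = [12, 13, 15, 200, 201, 199, 50, 52]
recon = lossy_reconstruction(pixels)

# The residual carries exactly what the lossy step threw away.
residual = [p - r for p, r in zip(pixels, recon)]

# Decoder side: lossy reconstruction plus residual restores the image exactly.
restored = [r + d for r, d in zip(recon, residual)]

# Assumption: empirical frequencies play the role of the entropy model.
counts = Counter(residual)
probs = {s: c / len(residual) for s, c in counts.items()}
bits = ideal_code_length_bits(residual, probs)
```

The sharper the predicted distribution over residuals, the fewer bits an entropy coder needs, which is why a stronger entropy model translates directly into better lossless compression.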

arXiv:2502.16097 [pdf, other]

LitLinker: Supporting the Ideation of Interdisciplinary Contexts with Large Language Models for Teaching Literature in Elementary Schools

Authors: Haoxiang Fan, Changshuang Zhou, Hao Yu, Xueyang Wu, Jiangyu Gu, Zhenhui Peng

Abstract: Teaching literature under interdisciplinary contexts (e.g., science, art) that connect reading materials has become popular in elementary schools. However, constructing such contexts is challenging as it requires teachers to explore substantial amounts of interdisciplinary content and link it to the reading materials. In this paper, we develop LitLinker via an iterative design process involving 13 teachers to facilitate the ideation of interdisciplinary contexts for teaching literature. Powered by a large language model (LLM), LitLinker can recommend interdisciplinary topics and contextualize them with the literary elements (e.g., paragraphs, viewpoints) in the reading materials. A within-subjects study (N=16) shows that compared to an LLM chatbot, LitLinker can improve the integration depth of different subjects and reduce workload in this ideation task. Expert interviews (N=9) also demonstrate LitLinker's usefulness for supporting the ideation of interdisciplinary contexts for teaching literature. We conclude with concerns and design considerations for supporting interdisciplinary teaching with LLMs.

Submitted 22 February, 2025; originally announced February 2025.

arXiv:2502.14495 [pdf, other]

Nearshore Underwater Target Detection Meets UAV-borne Hyperspectral Remote Sensing: A Novel Hybrid-level Contrastive Learning Framework and Benchmark Dataset

Authors: Jiahao Qi, Chuanhong Zhou, Xingyue Liu, Chen Chen, Dehui Zhu, Kangcheng Bin, Ping Zhong

Abstract: UAV-borne hyperspectral remote sensing has emerged as a promising approach for underwater target detection (UTD). However, its effectiveness is hindered by spectral distortions in nearshore environments, which compromise the accuracy of traditional hyperspectral UTD (HUTD) methods that rely on a bathymetric model. These distortions lead to significant uncertainty in target and background spectra, challenging the detection process. To address this, we propose the Hyperspectral Underwater Contrastive Learning Network (HUCLNet), a novel framework that integrates contrastive learning with a self-paced learning paradigm for robust HUTD in nearshore regions. HUCLNet extracts discriminative features from distorted hyperspectral data through contrastive learning, while the self-paced learning strategy selectively prioritizes the most informative samples. Additionally, a reliability-guided clustering strategy enhances the robustness of learned representations. To evaluate the method's effectiveness, we construct a novel nearshore HUTD benchmark dataset, ATR2-HUTD, covering three diverse scenarios with varying water types, turbidity, and target types. Extensive experiments demonstrate that HUCLNet significantly outperforms state-of-the-art methods. The dataset and code will be publicly available at: https://github.com/qjh1996/HUTD

Submitted 20 February, 2025; originally announced February 2025.

Comments: 18 pages, 13 figures

arXiv:2502.12603 [pdf, other]

Disentangling Long-Short Term State Under Unknown Interventions for Online Time Series Forecasting

Authors: Ruichu Cai, Haiqin Huang, Zhifang Jiang, Zijian Li, Changze Zhou, Yuequn Liu, Yuming Liu, Zhifeng Hao

Abstract: Current methods for time series forecasting struggle in the online scenario, since it is difficult to preserve long-term dependency while adapting to short-term changes when data arrive sequentially. Although some recent methods solve this problem by controlling the updates of latent states, they cannot disentangle the long/short-term states, leading to the inability to effectively adapt to nonstationarity. To tackle this challenge, we propose a general framework to disentangle long/short-term states for online time series forecasting. Our idea is inspired by the observation that short-term changes can be caused by unknown interventions, such as abrupt policies in the stock market. Based on this insight, we formalize a data generation process with unknown interventions on short-term states. Under mild assumptions, we further leverage the independence of short-term states led by unknown interventions to establish the identification theory to achieve the disentanglement of long/short-term states. Built on this theory, we develop a long short-term disentanglement model (LSTD) to extract the long/short-term states with long/short-term encoders, respectively. Furthermore, the LSTD model incorporates a smooth constraint to preserve the long-term dependencies and an interrupted dependency constraint to enforce the forgetting of short-term dependencies, together boosting the disentanglement of long/short-term states. Experimental results on several benchmark datasets show that our LSTD model outperforms existing methods for online time series forecasting, validating its efficacy in real-world applications.

Submitted 18 February, 2025; originally announced February 2025.

Journal ref: AAAI2025

arXiv:2502.11588 [pdf, other]

A Unified Modeling Framework for Automated Penetration Testing

Authors: Yunfei Wang, Shixuan Liu, Wenhao Wang, Changling Zhou, Chao Zhang, Jiandong Jin, Cheng Zhu

Abstract: The integration of artificial intelligence into automated penetration testing (AutoPT) has highlighted the necessity of simulation modeling for the training of intelligent agents, due to its cost-efficiency and swift feedback capabilities. Despite the proliferation of AutoPT research, there is a recognized gap in the availability of a unified framework for simulation modeling methods. This paper presents a systematic review and synthesis of existing techniques, introducing MDCPM to categorize studies based on literature objectives, network simulation complexity, dependency of technical and tactical operations, and scenario feedback and variation. To bridge the gap in unified methods for multi-dimensional and multi-level simulation modeling, dynamic environment modeling, and the scarcity of public datasets, we introduce AutoPT-Sim, a novel modeling framework that is based on policy automation and encompasses the combination of all sub-dimensions. AutoPT-Sim offers a comprehensive approach to modeling network environments, attackers, and defenders, transcending the constraints of static modeling and accommodating networks of diverse scales. We publicly release a generated standard network environment dataset and the code of Network Generator. By flexibly integrating publicly available datasets, support is offered for various simulation modeling levels focused on policy automation in MDCPM, and the network generator helps researchers output customized target network data by adjusting parameters or fine-tuning the network generator.

Submitted 17 February, 2025; originally announced February 2025.

arXiv:2502.11516 [pdf, ps, other]

CRB-Rate Tradeoff in RSMA-enabled Near-Field Integrated Multi-Target Sensing and Multi-User Communications

Authors: Jiasi Zhou, Cong Zhou, Yanjing Sun, Chintha Tellambura

Abstract: Extremely large-scale antenna arrays enhance spectral efficiency and spatial resolution in integrated sensing and communication (ISAC) networks while expanding the Rayleigh distance, triggering a shift from conventional far-field plane waves to near-field (NF) spherical waves. However, full-digital beamforming is infeasible due to the need for dedicated radio frequency (RF) chains. To address this, NF-ISAC with a rate-splitting multiple access (RSMA) scheme is developed for advanced interference management, considering fully-connected and partially-connected hybrid analog and digital (HAD) beamforming architectures. Specifically, the Cramér-Rao bound (CRB) for joint distance and angle sensing is derived, and the achievable performance region between the max-min communication rate and the multi-target CRB is defined. To fully characterize the Pareto boundary of the CRB-rate region, a sensing-centric minimization problem is formulated under communication rate constraints for two HAD beamforming architectures. A penalty dual decomposition (PDD)-based double-loop algorithm is developed to optimize fully-connected HAD beamformers. To reduce computational complexity, a two-stage design algorithm for fully-connected HAD beamforming is also proposed. Additionally, the PDD-based double-loop algorithm is extended to the partially-connected HAD architecture. Simulations demonstrate that the proposed schemes and algorithms: 1) achieve performance comparable to a fully digital beamformer with fewer RF chains, 2) outperform space division multiple access and far-field ISAC, and 3) yield enhanced CRB-rate trade-off performance.

Submitted 17 February, 2025; originally announced February 2025.

Comments: 13 pages, 9 figures

arXiv:2502.11506 [pdf, other]

Learning Surrogate Potential Mean Field Games via Gaussian Processes: A Data-Driven Approach to Ill-Posed Inverse Problems

Authors: Jingguo Zhang, Xianjin Yang, Chenchen Mou, Chao Zhou

Abstract: Mean field games (MFGs) describe the collective behavior of large populations of interacting agents. In this work, we tackle ill-posed inverse problems in potential MFGs, aiming to recover the agents' population, momentum, and environmental setup from limited, noisy measurements and partial observations. These problems are ill-posed because multiple MFG configurations can explain the same data, or different parameters can yield nearly identical observations. Nonetheless, they remain crucial in practice for real-world scenarios where data are inherently sparse or noisy, or where the MFG structure is not fully determined. Our focus is on finding surrogate MFGs that accurately reproduce the observed data despite these challenges. We propose two Gaussian process (GP)-based frameworks: an inf-sup formulation and a bilevel approach. The choice between them depends on whether the unknown parameters introduce concavity in the objective. In the inf-sup framework, we use the linearity of GPs and their parameterization structure to maintain convex-concave properties, allowing us to apply standard convex optimization algorithms. In the bilevel framework, we employ a gradient-descent-based algorithm and introduce two methods for computing the outer gradient. The first method leverages an existing solver for the inner potential MFG and applies automatic differentiation, while the second adopts an adjoint-based strategy that computes the outer gradient independently of the inner solver. Our numerical experiments show that when sufficient prior information is available, the unknown parameters can be accurately recovered. Otherwise, if prior information is limited, the inverse problem is ill-posed, but our frameworks can still produce surrogate MFG models that closely match observed data.

Submitted 17 February, 2025; originally announced February 2025.

Comments: 36 pages

arXiv:2502.11058 [pdf, other]

DreamDDP: Accelerating Data Parallel Distributed LLM Training with Layer-wise Scheduled Partial Synchronization

Authors: Zhenheng Tang, Zichen Tang, Junlin Huang, Xinglin Pan, Rudan Yan, Yuxin Wang, Amelie Chi Zhou, Shaohuai Shi, Xiaowen Chu, Bo Li

Abstract: The growth of large language models (LLMs) increases the challenges of accelerating distributed training across multiple GPUs in different data centers. Moreover, concerns about data privacy and data exhaustion have heightened interest in geo-distributed data centers. Communication in geo-distributed data parallel training (DDP) with stochastic gradient descent (S-SGD) is the main bottleneck in low-bandwidth environments. Local SGD mitigates communication overhead by reducing synchronization frequency, and recent studies have successfully applied it to pre-train LLMs in geo-distributed settings. However, we identify that its model synchronization mechanism prevents the system from overlapping communication and computation. To overcome this limitation, we expand the design space of local SGD by decoupling model synchronization layer-wise. In each iteration, only some layers are synchronized instead of the entire model after a specific number of iterations. Leveraging this methodology, we introduce DreamDDP, a training framework to accelerate low-bandwidth distributed training with three key innovations: (1) partial local SGD with theoretical assurances of convergence rates comparable to S-SGD; (2) overlapping parameter synchronization with computation without extra GPU memory occupation; (3) identifying and exploiting three properties to schedule the communication and computation to reduce the training time based on fine-grained profiling of layer-wise communication and computation time. Empirical evaluations conducted on 32 GPUs using prominent deep learning models, including ResNet-18, ResNet-50, GPT-2, and Llama-2, demonstrate that DreamDDP enhances the convergence properties of Local SGD (and Adam) and achieves speedups ranging from $1.49\times$ to $3.91\times$ over leading baseline methods.

Submitted 16 February, 2025; originally announced February 2025.

arXiv:2502.09613 [pdf, other]

Latent Radiance Fields with 3D-aware 2D Representations

Authors: Chaoyi Zhou, Xi Liu, Feng Luo, Siyu Huang

Abstract: Latent 3D reconstruction has shown great promise in empowering 3D semantic understanding and 3D generation by distilling 2D features into the 3D space. However, existing approaches struggle with the domain gap between 2D feature space and 3D representations, resulting in degraded rendering performance. To address this challenge, we propose a novel framework that integrates 3D awareness into the 2D latent space. The framework consists of three stages: (1) a correspondence-aware autoencoding method that enhances the 3D consistency of 2D latent representations, (2) a latent radiance field (LRF) that lifts these 3D-aware 2D representations into 3D space, and (3) a VAE-Radiance Field (VAE-RF) alignment strategy that improves image decoding from the rendered 2D representations. Extensive experiments demonstrate that our method outperforms the state-of-the-art latent 3D reconstruction approaches in terms of synthesis performance and cross-dataset generalizability across diverse indoor and outdoor scenes. To our knowledge, this is the first work showing the radiance field representations constructed from 2D latent representations can yield photorealistic 3D reconstruction performance.

Submitted 13 February, 2025; originally announced February 2025.

Comments: Accepted to ICLR 2025; Project page: https://latent-radiance-field.github.io/LRF

arXiv:2502.09183 [pdf, other]

RefineCoder: Iterative Improving of Large Language Models via Adaptive Critique Refinement for Code Generation

Authors: Changzhi Zhou, Xinyu Zhang, Dandan Song, Xiancai Chen, Wanli Gu, Huipeng Ma, Yuhang Tian, Mengdi Zhang, Linmei Hu

Abstract: Code generation has attracted increasing attention with the rise of Large Language Models (LLMs). Many studies have developed powerful code LLMs by synthesizing code-related instruction data and applying supervised fine-tuning. However, these methods are limited by teacher model distillation and ignore the potential of iterative refinement by self-generated code. In this paper, we propose Adaptive Critique Refinement (ACR), which enables the model to refine itself by self-generated code and external critique, rather than directly imitating the code responses of the teacher model. Concretely, ACR includes a composite scoring system with LLM-as-a-Judge to evaluate the quality of code responses and a selective critique strategy with LLM-as-a-Critic to critique self-generated low-quality code responses. We develop the RefineCoder series by iteratively applying ACR, achieving continuous performance improvement on multiple code generation benchmarks. Compared to the baselines of the same size, our proposed RefineCoder series can achieve comparable or even superior performance using less data.

Submitted 13 February, 2025; originally announced February 2025.

Comments: work in progress

arXiv:2502.08929 [pdf, ps, other]

Precise Measurement of the $χ_{c0}$ Resonance Parameters and Branching Fractions of $χ_{c0,c2}\toπ^+π^-/K^+K^-$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (648 additional authors not shown)

Abstract: By analyzing a $ψ(3686)$ data sample containing $(107.7\pm0.6)\times10^{6}$ events taken with the BESIII detector at the BEPCII storage ring in 2009, the $χ_{c0}$ resonance parameters are precisely measured using $χ_{c0,c2} \to π^+π^-/K^+K^-$ events. The mass of $χ_{c0}$ is determined to be $M(χ_{c0})=(3415.67\pm0.07\pm0.06\pm0.07)$~MeV/$c^2$, and its full width is $Γ(χ_{c0})=(12.44\pm0.12\pm0.12)~{\rm MeV}$, where the first uncertainty is statistical, the second systematic, and the third for mass comes from $χ_{c2}$ mass uncertainty. These measurements improve the precision of $χ_{c0}$ mass by a factor of four and width by one order of magnitude over the previous individual measurements, and significantly boost our knowledge about the charmonium spectrum. Together with additional $(345.4\pm2.6)\times10^{6}$ $ψ(3686)$ data events taken in 2012, the decay branching fractions of $χ_{c0,c2}\toπ^+π^-/K^+K^-$ are measured as well, with precision improved by a factor of three compared to previous measurements. These $χ_{c0}$ decay branching fractions provide important inputs for the study of glueballs.

Submitted 1 July, 2025; v1 submitted 12 February, 2025; originally announced February 2025.

Comments: 9 pages, 2 figures

arXiv:2502.08092 [pdf, ps, other]

GCoT: Chain-of-Thought Prompt Learning for Graphs

Authors: Xingtong Yu, Chang Zhou, Zhongwei Kuai, Xinming Zhang, Yuan Fang

Abstract: Chain-of-thought (CoT) prompting has achieved remarkable success in natural language processing (NLP). However, its vast potential remains largely unexplored for graphs. This raises an interesting question: How can we design CoT prompting for graphs to guide graph models to learn step by step? On one hand, unlike natural languages, graphs are non-linear and characterized by complex topological structures. On the other hand, many graphs lack textual data, making it difficult to formulate language-based CoT prompting. In this work, we propose the first CoT prompt learning framework for text-free graphs, GCoT. Specifically, we decompose the adaptation process for each downstream task into a series of inference steps, with each step consisting of prompt-based inference, ``thought'' generation, and thought-conditioned prompt learning. While the steps mimic CoT prompting in NLP, the exact mechanism differs significantly. Specifically, at each step, an input graph, along with a prompt, is first fed into a pre-trained graph encoder for prompt-based inference. We then aggregate the hidden layers of the encoder to construct a ``thought'', which captures the working state of each node in the current step. Conditioned on this thought, we learn a prompt specific to each node based on the current state. These prompts are fed into the next inference step, repeating the cycle. To evaluate and analyze the effectiveness of GCoT, we conduct comprehensive experiments on eight public datasets, which demonstrate the advantage of our approach.

Submitted 2 June, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

Comments: Accepted by SIGKDD2025

arXiv:2502.07597 [pdf, other]

doi 10.1103/m9z9-1jpd

Testing spooky action between free-traveling electron-positron pairs

Authors: Leyun Gao, Alim Ruzi, Qite Li, Chen Zhou, Qiang Li

Abstract: Quantum entanglement is a cornerstone of quantum mechanics. While the entanglement of confined electron pairs has been established early on, the entanglement of free-traveling electron pairs, particularly at high energies, remains largely unexplored due to the substantial challenges involved in measuring the spins of free-traveling electrons. In this study, we investigate the entanglement and the Bell inequality violation of free-traveling electron-positron pairs generated in a fixed-target experiment. This experimental setup facilitates the creation of a controllable source of entangled electron-positron pairs, where entangled events are produced in specific phase spaces. Based on this source and the prior knowledge of the entangled state, we demonstrate the feasibility of measuring the polarization correlations of the entangled $e^+e^-$ pairs through their individual secondary scatterings off two separate additional targets.

Submitted 11 February, 2025; originally announced February 2025.

Comments: 6 pages, 4 figures, PKMu Quantum proposal with positron beam

Journal ref: Phys. Rev. D 111, 116018 (2025)

arXiv:2502.07466 [pdf, other]

Less is More: Masking Elements in Image Condition Features Avoids Content Leakages in Style Transfer Diffusion Models

Authors: Lin Zhu, Xinbing Wang, Chenghu Zhou, Qinying Gu, Nanyang Ye

Abstract: Given a style-reference image as the additional image condition, text-to-image diffusion models have demonstrated impressive capabilities in generating images that possess the content of text prompts while adopting the visual style of the reference image. However, current state-of-the-art methods often struggle to disentangle content and style from style-reference images, leading to issues such as content leakages. To address this issue, we propose a masking-based method that efficiently decouples content from style without the need for tuning any model parameters. By simply masking specific elements in the style reference's image features, we uncover a critical yet under-explored principle: guiding with fewer, appropriately selected conditions (e.g., dropping several image feature elements) can efficiently prevent unwanted content from flowing into the diffusion models, enhancing the style transfer performance of text-to-image diffusion models. In this paper, we validate this finding both theoretically and experimentally. Extensive experiments across various styles demonstrate the effectiveness of our masking-based method and support our theoretical results.

Submitted 11 February, 2025; originally announced February 2025.

arXiv:2502.07406 [pdf, other]

doi 10.1007/JHEP05(2025)144

Search for $e^+e^-\to K_S^0 K_S^0 h_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (642 additional authors not shown)

Abstract: Using $e^+e^-$ collision data at 13 center-of-mass energies ranging from 4.600 to 4.950 GeV collected with the BESIII detector, we search for the unmeasured $e^+e^-\to K_S^0 K_S^0 h_c$ process. No significant signal is observed, and the upper limits of the Born cross sections at each center-of-mass energy are presented.

Submitted 27 May, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

arXiv:2502.06846 [pdf, other]

Prot2Chat: Protein LLM with Early-Fusion of Text, Sequence and Structure

Authors: Zhicong Wang, Zicheng Ma, Ziqiang Cao, Changlong Zhou, Jun Zhang, Yiqin Gao

Abstract: Motivation: Proteins are of great significance in living organisms. However, understanding their functions encounters numerous challenges, such as insufficient integration of multimodal information, a large number of training parameters, limited flexibility of classification-based methods, and the lack of systematic evaluation metrics for protein Q&A systems. To tackle these issues, we propose the Prot2Chat framework. Results: We modified ProteinMPNN to encode protein sequence and structural information in a unified way. We used a large language model (LLM) to encode questions into vectors and developed a protein-text adapter to compress protein information into virtual tokens based on these vectors, achieving the early fusion of text and protein information. Finally, the same LLM reads the virtual tokens and the questions to generate answers. To optimize training efficiency, we froze the encoder and employed Low-Rank Adaptation (LoRA) techniques for the LLM. Experiments on two datasets show that our model achieves superior performance on both automated metrics and expert evaluations, and zero-shot prediction results highlight its generalization ability. The models and codes are available at https://github.com/wangzc1233/Prot2Chat. Contact: [email protected] or [email protected] Key words: Protein Q&A, Early-Fusion, LLM

Submitted 22 May, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

Comments: 8 pages, 3 figures

arXiv:2502.05424 [pdf, other]

SAMGPT: Text-free Graph Foundation Model for Multi-domain Pre-training and Cross-domain Adaptation

Authors: Xingtong Yu, Zechuan Gong, Chang Zhou, Yuan Fang, Hui Zhang

Abstract: Graphs are able to model interconnected entities in many online services, supporting a wide range of applications on the Web. This raises an important question: How can we train a graph foundational model on multiple source domains and adapt to an unseen target domain? A major obstacle is that graphs from different domains often exhibit divergent characteristics. Some studies leverage large language models to align multiple domains based on textual descriptions associated with the graphs, limiting their applicability to text-attributed graphs. For text-free graphs, a few recent works attempt to align different feature distributions across domains, while generally neglecting structural differences. In this work, we propose a novel Structure Alignment framework for text-free Multi-domain Graph Pre-Training and cross-domain adaptation (SAMGPT). It is designed to learn multi-domain knowledge from graphs originating in multiple source domains, which can then be adapted to address applications in an unseen target domain. Specifically, we introduce a set of structure tokens to harmonize structure-based aggregation across source domains during the pre-training phase. Next, for cross-domain adaptation, we design dual prompts, namely, holistic prompts and specific prompts, which adapt unified multi-domain structural knowledge and fine-grained, domain-specific information, respectively, to a target domain. Finally, we conduct comprehensive experiments on seven public datasets to evaluate and analyze the effectiveness of SAMGPT.

Submitted 12 April, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

Comments: Accepted by WWW2025 Main Track

arXiv:2502.05158 [pdf, other]

Relationship between 2D and 3D Galaxy Stellar Mass and Correlations with Halo Mass

Authors: Conghao Zhou, Alexie Leauthaud, Shuo Xu, Benedikt Diemer, Song Huang, Katya Leidig, Tesla Jeltema, Marco Gatti, Yifei Luo, Carlo Cannarozzo, Sven Heydenreich

Abstract: Recent studies suggest that the stars in the outer regions of massive galaxies trace halo mass better than the inner regions and that an annular stellar mass provides a low scatter method of selecting galaxy clusters. However, we can only observe galaxies as projected two-dimensional objects on the sky. In this paper, we use a sample of simulated galaxies to study how well galaxy stellar mass profiles in three dimensions correlate with halo mass, and what effects arise when observationally projecting stellar profiles into two dimensions. We compare 2D and 3D outer stellar mass selections and find that they have similar performance as halo mass proxies and that, surprisingly, a 2D selection sometimes has marginally better performance. We also investigate whether the weak lensing profiles around galaxies selected by 2D outer stellar mass suffer from projection effects. We find that the lensing profiles of samples selected by 2D and 3D definitions are nearly identical, suggesting that the 2D selection does not create a bias. These findings underscore the promise of using outer stellar mass as a tool for identifying galaxy clusters.

Submitted 7 February, 2025; originally announced February 2025.

Comments: 31 pages, 11 figures. To be submitted to JCAP

arXiv:2502.03828 [pdf, ps, other]

doi 10.1103/PhysRevD.111.L071101

Observation of $D\to \bar{K}_{1}(1270)μ^+ν_μ$ and test of lepton flavor universality with $D\to \bar{K}_1(1270) \ell^{+} ν_{\ell}$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (646 additional authors not shown)

Abstract: By analyzing 7.93 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at the center-of-mass energy of 3.773 GeV with the BESIII detector operated at the BEPCII collider, we report the observation of the semimuonic decays of $D^+\to \bar K_1(1270)^0μ^+ν_μ$ and $D^0\to K_1(1270)^-μ^+ν_μ$ with statistical significances of $12.5σ$ and $6.0σ$, respectively. Their decay branching fractions are determined to be ${\mathcal B}[D^{+}\to \bar{K}_1(1270)^0 μ^{+}ν_μ]=(2.36\pm0.20^{+0.18}_{-0.27}\pm 0.48)\times10^{-3}$ and ${\mathcal B}[D^{0}\to K_1(1270)^{-} μ^{+}ν_μ]=(0.78\pm0.11^{+0.05}_{-0.09}\pm 0.15)\times10^{-3}$, where the first and second uncertainties are statistical and systematic, respectively, and the third originates from the input branching fraction of $\bar K_{1}(1270)^0\to K^- π^+π^0$ or $K_1(1270)^-\to K^-π^+π^-$. Combining our branching fractions with the previous measurements of ${\mathcal B}[D^+\to \bar K_1(1270)^0e^+ν_{e}]$ and ${\mathcal B}[D^0\to K_1(1270)^-e^+ν_{e}]$, we determine the branching fraction ratios to be ${\mathcal B}[D^+\to \bar K_1(1270)^0μ^+ν_μ]/{\mathcal B}[D^+\to \bar K_1(1270)^0e^+ν_{e}]=1.03 \pm 0.14 \substack{+0.11\\-0.15}$ and ${\mathcal B}[D^0\to K_1(1270)^-μ^+ν_μ]/{\mathcal B}[D^0\to K_1(1270)^-e^+ν_{e}]=0.74\pm 0.13 \substack{+0.08\\-0.13}$. Using the branching fractions measured in this work and the world-average lifetimes of the $D^+$ and $D^0$ mesons, we determine the semimuonic partial decay width ratio to be $Γ[D^+\to \bar K_1(1270)^0 μ^+ν_μ]/Γ[D^0\to K_1(1270)^- μ^+ν_μ]=1.22\pm 0.10\substack{+0.06\\-0.09}$, which is consistent with unity as predicted by isospin conservation.

Submitted 18 April, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

Comments: 11 pages, 5 figures

Journal ref: Phys. Rev. D 111, L071101 (2025)

arXiv:2502.01243 [pdf, other]

OphthBench: A Comprehensive Benchmark for Evaluating Large Language Models in Chinese Ophthalmology

Authors: Chengfeng Zhou, Ji Wang, Juanjuan Qin, Yining Wang, Ling Sun, Weiwei Dai

Abstract: Large language models (LLMs) have shown significant promise across various medical applications, with ophthalmology being a notable area of focus. Many ophthalmic tasks have shown substantial improvement through the integration of LLMs. However, before these models can be widely adopted in clinical practice, evaluating their capabilities and identifying their limitations is crucial. To address this research gap and support the real-world application of LLMs, we introduce OphthBench, a specialized benchmark designed to assess LLM performance within the context of Chinese ophthalmic practices. This benchmark systematically divides a typical ophthalmic clinical workflow into five key scenarios: Education, Triage, Diagnosis, Treatment, and Prognosis. For each scenario, we developed multiple tasks featuring diverse question types, resulting in a comprehensive benchmark comprising 9 tasks and 591 questions. This comprehensive framework allows for a thorough assessment of LLMs' capabilities and provides insights into their practical application in Chinese ophthalmology. Using this benchmark, we conducted extensive experiments and analyzed the results from 39 popular LLMs. Our evaluation highlights the current gap between LLM development and its practical utility in clinical settings, providing a clear direction for future advancements. By bridging this gap, we aim to unlock the potential of LLMs and advance their development in ophthalmology.

Submitted 3 February, 2025; originally announced February 2025.

arXiv:2502.00850 [pdf, other]

Dual Alignment Maximin Optimization for Offline Model-based RL

Authors: Chi Zhou, Wang Luo, Haoran Li, Congying Han, Tiande Guo, Zicheng Zhang

Abstract: Offline reinforcement learning agents face significant deployment challenges due to the synthetic-to-real distribution mismatch. While most prior research has focused on improving the fidelity of synthetic sampling and incorporating off-policy mechanisms, the directly integrated paradigm often fails to ensure consistent policy behavior in biased models and underlying environmental dynamics, which inherently arise from discrepancies between behavior and learning policies. In this paper, we first shift the focus from model reliability to policy discrepancies while optimizing for expected returns, and then self-consistently incorporate synthetic data, deriving a novel actor-critic paradigm, Dual Alignment Maximin Optimization (DAMO). It is a unified framework to ensure both model-environment policy consistency and synthetic and offline data compatibility. The inner minimization performs dual conservative value estimation, aligning policies and trajectories to avoid out-of-distribution states and actions, while the outer maximization ensures that policy improvements remain consistent with inner value estimates. Empirical evaluations demonstrate that DAMO effectively ensures model and policy alignments, achieving competitive performance across diverse benchmark tasks.

Submitted 10 May, 2025; v1 submitted 2 February, 2025; originally announced February 2025.

arXiv:2502.00217 [pdf, other]

Fantastic Multi-Task Gradient Updates and How to Find Them In a Cone

Authors: Negar Hassanpour, Muhammad Kamran Janjua, Kunlin Zhang, Sepehr Lavasani, Xiaowen Zhang, Chunhua Zhou, Chao Gao

Abstract: Balancing competing objectives remains a fundamental challenge in multi-task learning (MTL), primarily due to conflicting gradients across individual tasks. A common solution relies on computing a dynamic gradient update vector that balances competing tasks as optimization progresses. Building on this idea, we propose ConicGrad, a principled, scalable, and robust MTL approach formulated as a constrained optimization problem. Our method introduces an angular constraint to dynamically regulate gradient update directions, confining them within a cone centered on the reference gradient of the overall objective. By balancing task-specific gradients without over-constraining their direction or magnitude, ConicGrad effectively resolves inter-task gradient conflicts. Moreover, our framework ensures computational efficiency and scalability to high-dimensional parameter spaces. We conduct extensive experiments on standard supervised learning and reinforcement learning MTL benchmarks, and demonstrate that ConicGrad achieves state-of-the-art performance across diverse tasks.

Submitted 31 January, 2025; originally announced February 2025.

Comments: 16 pages, 7 figures, 5 tables

arXiv:2501.19106 [pdf]

Subtle variations in stiff dimensions of brain networks account for individual differences in cognitive ability

Authors: Sida Chen, Qianyuan Tang, Taro Toyoizumi, Werner Sommer, Lianchun Yu, Changsong Zhou

Abstract: Explaining individual differences in cognitive abilities requires both identifying brain parameters that vary across individuals and understanding how brain networks are recruited for specific tasks. Typically, task performance relies on the integration and segregation of functional subnetworks, often captured by parameters like regional excitability and connectivity. Yet, the high dimensionality of these parameters hinders pinpointing their functional relevance. Here, we apply stiff-sloppy analysis to human brain data, revealing that certain subtle parameter combinations ("stiff dimensions") powerfully influence neural activity during task processing, whereas others ("sloppy dimensions") vary more extensively but exert minimal impact. Using a pairwise maximum entropy model of task fMRI, we show that even small deviations in stiff dimensions, derived through Fisher Information Matrix analysis, govern the dynamic interplay of segregation and integration between the default mode network (DMN) and a working memory network (WMN). Crucially, separating a 0-back task (vigilant attention) from a 2-back task (working memory updating) uncovers partially distinct stiff dimensions predicting performance in each condition, along with a global DMN-WMN segregation shared across both tasks. Altogether, stiff-sloppy analysis challenges the conventional focus on large parameter variability by highlighting these subtle yet functionally decisive parameter combinations.

Submitted 27 April, 2025; v1 submitted 31 January, 2025; originally announced January 2025.

arXiv:2501.18993 [pdf, other]

Visual Autoregressive Modeling for Image Super-Resolution

Authors: Yunpeng Qu, Kun Yuan, Jinhua Hao, Kai Zhao, Qizhi Xie, Ming Sun, Chao Zhou

Abstract: Image Super-Resolution (ISR) has seen significant progress with the introduction of remarkable generative models. However, challenges such as the trade-off between fidelity and realism, as well as computational complexity, have limited their application. Building upon the tremendous success of autoregressive models in the language domain, we propose VARSR, a novel visual autoregressive modeling framework for ISR in the form of next-scale prediction. To effectively integrate and preserve semantic information in low-resolution images, we propose using prefix tokens to incorporate the condition. Scale-aligned Rotary Positional Encodings are introduced to capture spatial structures, and a diffusion refiner is utilized for modeling quantization residual loss to achieve pixel-level fidelity. Image-based Classifier-free Guidance is proposed to guide the generation of more realistic images. Furthermore, we collect large-scale data and design a training process to obtain robust generative priors. Quantitative and qualitative results show that VARSR is capable of generating high-fidelity and high-realism images with more efficiency than diffusion-based methods. Our code will be released at https://github.com/qyp2000/VARSR.

Submitted 31 January, 2025; originally announced January 2025.

Comments: 20 pages; 17 figures

arXiv:2501.18850 [pdf, other]

Equivariant Hypergraph Diffusion for Crystal Structure Prediction

Authors: Yang Liu, Chuan Zhou, Shuai Zhang, Peng Zhang, Xixun Lin, Shirui Pan

Abstract: Crystal Structure Prediction (CSP) remains a fundamental challenge with significant implications for the development of new materials and the advancement of various scientific disciplines. Recent developments have shown that generative models, particularly diffusion models, hold great promise for CSP. However, traditional graph-based representations, where atomic bonds are modeled as pairwise graph edges, fail to fully capture the intricate high-order interactions essential for accurately representing crystal structures. In this work, we propose a novel approach that utilizes hypergraphs to represent crystal structures, providing a more expressive abstraction for modeling multi-way atomic interactions. By adopting hypergraphs, we can effectively capture complex high-order relationships and symmetries, such as permutation and periodic translation invariance, which are crucial for characterizing crystal structures. Specifically, we propose the Equivariant Hypergraph Diffusion Model (EH-Diff), a generative model designed to take advantage of the symmetry-preserving properties of hypergraphs. EH-Diff exploits these features to offer an efficient and accurate method for predicting crystal structures with a strong theoretical justification to preserve invariance properties. Empirically, we conduct extensive experiments on four benchmark datasets, and the results demonstrate that EH-Diff outperforms state-of-the-art CSP methods with only one sample.

Submitted 30 January, 2025; originally announced January 2025.

Comments: 14 pages, 4 figures

arXiv:2501.17992 [pdf, other]

Reinforcement-Learning Portfolio Allocation with Dynamic Embedding of Market Information

Authors: Jinghai He, Cheng Hua, Chunyang Zhou, Zeyu Zheng

Abstract: We develop a portfolio allocation framework that leverages deep learning techniques to address challenges arising from high-dimensional, non-stationary, and low-signal-to-noise market information. Our approach includes a dynamic embedding method that reduces the non-stationary, high-dimensional state space into a lower-dimensional representation. We design a reinforcement learning (RL) framework that integrates generative autoencoders and online meta-learning to dynamically embed market information, enabling the RL agent to focus on the most impactful parts of the state space for portfolio allocation decisions. Empirical analysis based on the top 500 U.S. stocks demonstrates that our framework outperforms common portfolio benchmarks and the predict-then-optimize (PTO) approach using machine learning, particularly during periods of market stress. Traditional factor models do not fully explain this superior performance. The framework's ability to time volatility reduces its market exposure during turbulent times. Ablation studies confirm the robustness of this performance across various reinforcement learning algorithms. Additionally, the embedding and meta-learning techniques effectively manage the complexities of high-dimensional, noisy, and non-stationary financial data, enhancing both portfolio performance and risk management.

Submitted 29 January, 2025; originally announced January 2025.

arXiv:2501.17489 [pdf, other]

Neural Spelling: A Spell-Based BCI System for Language Neural Decoding

Authors: Xiaowei Jiang, Charles Zhou, Yiqun Duan, Ziyi Zhao, Thomas Do, Chin-Teng Lin

Abstract: Brain-computer interfaces (BCIs) present a promising avenue by translating neural activity directly into text, eliminating the need for physical actions. However, existing non-invasive BCI systems have not successfully covered the entire alphabet, limiting their practicality. In this paper, we propose a novel non-invasive EEG-based BCI system with a Curriculum-based Neural Spelling Framework, which recognizes all 26 alphabet letters by first decoding neural signals associated with handwriting, and then applies Generative AI (GenAI) to enhance spell-based neural language decoding tasks. Our approach combines the ease of handwriting with the accessibility of EEG technology, utilizing advanced neural decoding algorithms and pre-trained large language models (LLMs) to translate EEG patterns into text with high accuracy. This system shows how GenAI can improve the performance of typical spelling-based neural language decoding tasks and addresses the limitations of previous methods, offering a scalable and user-friendly solution for individuals with communication impairments, thereby enhancing inclusive communication options.

Submitted 29 January, 2025; originally announced January 2025.

arXiv:2501.16720 [pdf, other]

One Head Eight Arms: Block Matrix based Low Rank Adaptation for CLIP-based Few-Shot Learning

Authors: Chunpeng Zhou, Qianqian Shen, Zhi Yu, Jiajun Bu, Haishuai Wang

Abstract: Recent advancements in fine-tuning Vision-Language Foundation Models (VLMs) have garnered significant attention for their effectiveness in downstream few-shot learning tasks. While these recent approaches exhibit some performance improvements, they often suffer from excessive training parameters and high computational costs. To address these challenges, we propose a novel Block matrix-based low-rank adaptation framework, called Block-LoRA, for fine-tuning VLMs on downstream few-shot tasks. Inspired by recent work on Low-Rank Adaptation (LoRA), Block-LoRA partitions the original low-rank decomposition matrix of LoRA into a series of sub-matrices while sharing all down-projection sub-matrices. This structure not only reduces the number of training parameters, but also transforms certain complex matrix multiplication operations into simpler matrix addition, significantly lowering the computational cost of fine-tuning. Notably, Block-LoRA enables fine-tuning CLIP on the ImageNet few-shot benchmark using a single 24GB GPU. We also show that Block-LoRA has a tighter generalization error bound than vanilla LoRA. Without bells and whistles, extensive experiments demonstrate that Block-LoRA achieves competitive performance compared to state-of-the-art CLIP-based few-shot methods, while maintaining a low training parameter count and reduced computational overhead.

Submitted 28 January, 2025; originally announced January 2025.

Comments: Under Review

arXiv:2501.15529 [pdf, other]

UNIDOOR: A Universal Framework for Action-Level Backdoor Attacks in Deep Reinforcement Learning

Authors: Oubo Ma, Linkang Du, Yang Dai, Chunyi Zhou, Qingming Li, Yuwen Pu, Shouling Ji

Abstract: Deep reinforcement learning (DRL) is widely applied to safety-critical decision-making scenarios. However, DRL is vulnerable to backdoor attacks, especially action-level backdoors, which pose significant threats through precise manipulation and flexible activation, risking outcomes like vehicle collisions or drone crashes. The key distinction of action-level backdoors lies in the utilization of the backdoor reward function to associate triggers with target actions. Nevertheless, existing studies typically rely on backdoor reward functions with fixed values or conditional flipping, which lack universality across diverse DRL tasks and backdoor designs, resulting in fluctuations or even failure in practice. This paper proposes the first universal action-level backdoor attack framework, called UNIDOOR, which enables adaptive exploration of backdoor reward functions through performance monitoring, eliminating the reliance on expert knowledge and grid search. We highlight that action tampering serves as a crucial component of action-level backdoor attacks in continuous action scenarios, as it addresses attack failures caused by low-frequency target actions. Extensive evaluations demonstrate that UNIDOOR significantly enhances the attack performance of action-level backdoors, showcasing its universality across diverse attack scenarios, including single/multiple agents, single/multiple backdoors, discrete/continuous action spaces, and sparse/dense reward signals. Furthermore, visualization results encompassing state distribution, neuron activation, and animations demonstrate the stealthiness of UNIDOOR. The source code of UNIDOOR can be found at https://github.com/maoubo/UNIDOOR.

Submitted 26 January, 2025; originally announced January 2025.

Comments: 21 pages, 12 figures, 7 tables

arXiv:2501.15447 [pdf, ps, other]

Observation of $h_{c}$ radiative decays to multiple light hadrons and the tensor state $f_2(1270)$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (666 additional authors not shown)

Abstract: Using $ψ(3686)\rightarrow π^{0} h_{c}$ decays from a data sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider, $h_c$ radiative decays to $γπ^{+}π^{-},~γπ^{+}π^{-}η,~γ2(π^{+}π^{-})$, and $γp\bar{p}$ are observed for the first time, each with a significance greater than $5σ$. The corresponding branching fractions are measured. Furthermore, intermediate states below 2.8 GeV/$c^{2}$ are investigated, leading to the first observation of the decay process of $h_c\rightarrowγf_{2}(1270)\rightarrowγπ^{+}π^{-}$ with a significance of $5.5\,σ$. This observation represents the first instance of $h_c$ radiative decay to a tensor state.

Submitted 26 January, 2025; originally announced January 2025.

arXiv:2501.14771

Dynamic Adaptation in Data Storage: Real-Time Machine Learning for Enhanced Prefetching

Authors: Chiyu Cheng, Chang Zhou, Yang Zhao, Jin Cao

Abstract: The exponential growth of data storage demands has necessitated the evolution of hierarchical storage management strategies [1]. This study explores the application of streaming machine learning [3] to revolutionize data prefetching within multi-tiered storage systems. Unlike traditional batch-trained models, streaming machine learning [5] offers adaptability, real-time insights, and computational efficiency, responding dynamically to workload variations. This work designs and validates an innovative framework that integrates streaming classification models for predicting file access patterns, specifically the next file offset. Leveraging comprehensive feature engineering and real-time evaluation over extensive production traces, the proposed methodology achieves substantial improvements in prediction accuracy, memory efficiency, and system adaptability. The results underscore the potential of streaming models in real-time storage management, setting a precedent for advanced caching and tiering strategies.

Submitted 28 January, 2025; v1 submitted 29 December, 2024; originally announced January 2025.

Comments: I uploaded the paper without obtaining consent from all the authors. One of the authors now refuses to publish this paper, as it has been demonstrated to be unreliable, contains significant flaws in prior research, and is missing proper citations in Sections 2 and 3

arXiv:2501.14770

Optimizing SSD Caches for Cloud Block Storage Systems Using Machine Learning Approaches

Authors: Chiyu Cheng, Chang Zhou, Yang Zhao, Jin Cao

Abstract: The growing demand for efficient cloud storage solutions has led to the widespread adoption of Solid-State Drives (SSDs) for caching in cloud block storage systems. The management of data writes to SSD caches plays a crucial role in improving overall system performance, reducing latency, and extending the lifespan of storage devices. A critical challenge arises from the large volume of write-only data, which significantly impacts the performance of SSD caches when handled inefficiently. Specifically, writes that have not been read for a certain period may introduce unnecessary write traffic to the SSD cache without offering substantial benefits for cache performance. This paper proposes a novel approach to mitigate this issue by leveraging machine learning techniques to dynamically optimize the write policy in cloud-based storage systems. The proposed method identifies write-only data and selectively filters it out in real-time, thereby minimizing the number of unnecessary write operations and improving the overall performance of the cache system. Experimental results demonstrate that the proposed machine learning-based policy significantly outperforms traditional approaches by reducing the number of harmful writes and optimizing cache utilization. This solution is particularly suitable for cloud environments with varying and unpredictable workloads, where traditional cache management strategies often fall short.

Submitted 28 January, 2025; v1 submitted 29 December, 2024; originally announced January 2025.

Comments: I uploaded the paper without obtaining consent from all the authors. One of the authors now refuses to publish this paper, as it has been demonstrated to be unreliable, contains significant flaws in prior research, and is missing citations in Sections 2

arXiv:2501.14213 [pdf]

Comprehensive Analog Signal Processing Platform Enabled with Acoustic Charge Transport in Two-dimensional Materials

Authors: Yueyi Sun, Siming Liu, Yingjie Luo, Jiwei Chen, Yihong Sun, Changjian Zhou

Abstract: Two-dimensional Acoustic Charge Transport (2D-ACT) devices, which integrate a two-dimensional semiconductor field-effect transistor (FET) with a high-frequency surface acoustic wave (SAW) device, provide a potential compact platform for processing analog signals in a wireless, non-contact, low-loss, and real-time way. They are expected to be used in long-distance space communication and sensing. However, current investigations into 2D-ACT devices are still limited to the observation of DC acoustoelectric currents and have yet to achieve real-time electronic signal processing capabilities. In this paper, we design a hybrid acoustoelectric platform composed of a two-dimensional semiconductor FET and a SAW device. The platform is capable of processing DC signals, exhibiting ambipolar transport behavior. The sub-wavelength channel length of the FET within the platform allows for the real-time observation of carrier distribution at a microscopic scale in conjunction with the SAW potential, and facilitates the reproduction and intensity regulation of AC signals. By adjusting the relative phase and intensity ratio of two counter-propagating SAWs, the platform also enables the addition and subtraction of AC signals.

Submitted 27 January, 2025; v1 submitted 23 January, 2025; originally announced January 2025.

Comments: 26 pages, 10 figures

arXiv:2501.12793 [pdf, other]

Revisit Self-Debugging with Self-Generated Tests for Code Generation

Authors: Xiancai Chen, Zhengwei Tao, Kechi Zhang, Changzhi Zhou, Wanli Gu, Yuanpeng He, Mengdi Zhang, Xunliang Cai, Haiyan Zhao, Zhi Jin

Abstract: Large language models (LLMs) have shown significant advancements in code generation, but still face challenges on tasks beyond their basic capabilities. Recently, the notion of self-debugging has been proposed to boost the performance of code generation by leveraging execution feedback from tests. Despite its promise, the availability of high-quality tests in real-world scenarios is limited. In this context, self-debugging with self-generated tests is a promising solution but lacks a full exploration of its limitations and practical potential. Therefore, we investigate its efficacy on diverse programming problems. To deepen our understanding, we propose two distinct paradigms for the process: post-execution and in-execution self-debugging. Within the scope of self-contained Python programming tasks, we find that post-execution self-debugging struggles on basic problems but shows potential for improvement on competitive ones, due to the bias introduced by self-generated tests. On the other hand, in-execution self-debugging enables LLMs to mitigate the bias by solely leveraging intermediate states during execution, thereby enhancing code generation.

Submitted 22 January, 2025; originally announced January 2025.

Comments: Work in Progress

arXiv:2501.12399 [pdf, other]

FinSphere: A Conversational Stock Analysis Agent Equipped with Quantitative Tools based on Real-Time Database

Authors: Shijie Han, Changhai Zhou, Yiqing Shen, Tianning Sun, Yuhua Zhou, Xiaoxia Wang, Zhixiao Yang, Jingshu Zhang, Hongguang Li

Abstract: Current financial Large Language Models (LLMs) struggle with two critical limitations: a lack of depth in stock analysis, which impedes their ability to generate professional-grade insights, and the absence of objective evaluation metrics to assess the quality of stock analysis reports. To address these challenges, this paper introduces FinSphere, a conversational stock analysis agent, along with three major contributions: (1) Stocksis, a dataset curated by industry experts to enhance LLMs' stock analysis capabilities, (2) AnalyScore, a systematic evaluation framework for assessing stock analysis quality, and (3) FinSphere, an AI agent that can generate high-quality stock analysis reports in response to user queries. Experiments demonstrate that FinSphere achieves superior performance compared to both general and domain-specific LLMs, as well as existing agent-based systems, even when they are enhanced with real-time data access and few-shot guidance. The integrated framework, which combines real-time data feeds, quantitative tools, and an instruction-tuned LLM, yields substantial improvements in both analytical quality and practical applicability for real-world stock analysis.

Submitted 8 January, 2025; originally announced January 2025.

arXiv:2501.11478 [pdf, other]

Each Graph is a New Language: Graph Learning with LLMs

Authors: Huachi Zhou, Jiahe Du, Chuang Zhou, Chang Yang, Yilin Xiao, Yuxuan Xie, Xiao Huang

Abstract: Recent efforts leverage Large Language Models (LLMs) for modeling text-attributed graph structures in node classification tasks. These approaches describe graph structures for LLMs to understand or aggregate LLM-generated textual attribute embeddings through graph structure. However, these approaches face two main limitations in modeling graph structures with LLMs. (i) Graph descriptions become verbose in describing high-order graph structure. (ii) Textual attributes alone do not contain adequate graph structure information. It is challenging to model graph structure concisely and adequately with LLMs. LLMs lack built-in mechanisms to model graph structures directly. They also struggle with complex long-range dependencies between high-order nodes and target nodes. Inspired by the observation that LLMs pre-trained on one language can achieve exceptional performance on another with minimal additional training, we propose Graph-Defined Language for Large Language Model (GDL4LLM). This novel framework enables LLMs to transfer their powerful language understanding capabilities to graph-structured data. GDL4LLM translates graphs into a graph language corpus instead of graph descriptions and pre-trains LLMs on this corpus to adequately understand graph structures. During fine-tuning, this corpus describes the structural information of target nodes concisely with only a few tokens. By treating graphs as a new language, GDL4LLM enables LLMs to model graph structures adequately and concisely for node classification tasks. Extensive experiments on three real-world datasets demonstrate that GDL4LLM outperforms description-based and textual attribute embeddings-based baselines by efficiently modeling different orders of graph structure with LLMs.

Submitted 25 May, 2025; v1 submitted 20 January, 2025; originally announced January 2025.

arXiv:2501.09251 [pdf, other]

doi 10.1145/3710848.3710888

Acc-SpMM: Accelerating General-purpose Sparse Matrix-Matrix Multiplication with GPU Tensor Cores

Authors: Haisha Zhao, San Li, Jiaheng Wang, Chunbao Zhou, Jue Wang, Zhikuang Xin, Shunde Li, Zhiqiang Liang, Zhijie Pan, Fang Liu, Yan Zeng, Yangang Wang, Xuebin Chi

Abstract: General-purpose Sparse Matrix-Matrix Multiplication (SpMM) is a fundamental kernel in scientific computing and deep learning. The emergence of new matrix computation units such as Tensor Cores (TCs) brings more opportunities for SpMM acceleration. However, in order to fully unleash the power of hardware performance, systematic optimization is required. In this paper, we propose Acc-SpMM, a high-performance SpMM library on TCs, with multiple optimizations, including data-affinity-based reordering, memory efficient compressed format, high-throughput pipeline, and adaptive sparsity-aware load balancing. In contrast to the state-of-the-art SpMM kernels on various NVIDIA GPU architectures with a diverse range of benchmark matrices, Acc-SpMM achieves significant performance improvements, on average 2.52x (up to 5.11x) speedup on RTX 4090, on average 1.91x (up to 4.68x) speedup on A800, and on average 1.58x (up to 3.60x) speedup on H100 over cuSPARSE.

Submitted 15 January, 2025; originally announced January 2025.

Comments: 11 pages, 15 figures, accepted by PPoPP 2025

MSC Class: 68W10

arXiv:2501.08888 [pdf, other]

A Partial Initialization Strategy to Mitigate the Overfitting Problem in CATE Estimation with Hidden Confounding

Authors: Chuan Zhou, Yaxuan Li, Chunyuan Zheng, Haiteng Zhang, Haoxuan Li, Mingming Gong

Abstract: Estimating the conditional average treatment effect (CATE) from observational data plays a crucial role in areas such as e-commerce, healthcare, and economics. Existing studies mainly rely on the strong ignorability assumption that there are no hidden confounders, whose existence cannot be tested from observational data and can invalidate any causal conclusion. In contrast, data collected from randomized controlled trials (RCT) do not suffer from confounding but are usually limited by a small sample size. To avoid overfitting caused by the small-scale RCT data, we propose a novel two-stage pretraining-finetuning (TSPF) framework with a partial parameter initialization strategy to estimate the CATE in the presence of hidden confounding. In the first stage, a foundational representation of covariates is trained to estimate counterfactual outcomes through large-scale observational data. In the second stage, we propose to train an augmented representation of the covariates, which is concatenated with the foundational representation obtained in the first stage to adjust for the hidden confounding. Rather than training a separate network from scratch, part of the prediction heads are initialized from the first stage. The superiority of our approach is validated on two datasets with extensive experiments.

Submitted 25 January, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

Comments: Presented as a poster in the 2nd Workshop on Causal Inference and Machine Learning in Practice at KDD 2024

Showing 151–200 of 1,735 results for author: Zhou, C