doi 10.1088/0256-307X/42/6/067504

Electric Field Induced Superconductivity in Bilayer Octagraphene

Authors: Yitong Yao, Jun Li, Jiacheng Ye, Fan Yang, Dao-Xin Yao

Abstract: We investigate the energy bands, magnetism, and superconductivity of bilayer octagraphene with A-A stacking under a perpendicular electric field. A tight-binding model is used to analyze the band structure of the system. The doubling of the unit cell results in each band of the single layer splitting into two. We find that applying a perpendicular electric field increases the band splitting. As the electric field strength increases, the nesting of the Fermi surface (FS) weakens, eventually disrupting the antiferromagnetic order, and bilayer octagraphene then exhibits superconductivity. Spin fluctuations can induce unconventional superconductivity with $s^\pm$-wave pairing. Applying a perpendicular electric field to the parent bilayer octagraphene weakens the nesting of the FS, ultimately killing the spin-density-wave (SDW) ordered state and transitioning it into the superconducting state, which works as a doping effect. We use the random-phase approximation approach to obtain the pairing eigenvalues and pairing symmetries of the perpendicular electric field-tuned bilayer octagraphene in the weak coupling limit. By tuning the strength of the perpendicular electric field, the critical interaction strength for SDW order can be modified, which in turn may promote the emergence of unconventional superconductivity.

Submitted 3 July, 2025; originally announced July 2025.

Comments: 7 pages, 6 figures

Journal ref: Chin. Phys. Lett., 2025, 42(6): 067504

arXiv:2506.23757 [pdf, ps, other]

Training of Spiking Neural Networks with Expectation-Propagation

Authors: Dan Yao, Steve McLaughlin, Yoann Altmann

Abstract: In this paper, we propose a unifying message-passing framework for training spiking neural networks (SNNs) using Expectation-Propagation. Our gradient-free method is capable of learning the marginal distributions of network parameters and simultaneously marginalizes nuisance parameters, such as the outputs of hidden layers. This framework allows, for the first time, training of discrete and continuous weights, for deterministic and stochastic spiking networks, using batches of training samples. Although its convergence is not ensured, the algorithm converges in practice faster than gradient-based methods, without requiring a large number of passes through the training data. The classification and regression results presented pave the way for new efficient training methods for deep Bayesian networks.

Submitted 30 June, 2025; originally announced June 2025.

Comments: 10 pages

arXiv:2506.23071 [pdf, ps, other]

Text2VectorSQL: Bridging Text-to-SQL and Vector Search for Unified Natural Language Queries

Authors: Zhengren Wang, Bozhou Li, Dongwen Yao, Wentao Zhang

Abstract: While Text-to-SQL enables natural language interaction with structured databases, its effectiveness diminishes with unstructured data or ambiguous queries due to rigid syntax and limited expressiveness. Concurrently, vector search has emerged as a powerful paradigm for semantic retrieval, particularly for unstructured data. However, existing VectorSQL implementations still rely heavily on manual crafting and lack tailored evaluation frameworks, leaving a significant gap between theoretical potential and practical deployment. To bridge these complementary paradigms, we introduce Text2VectorSQL, a novel framework unifying Text-to-SQL and vector search to overcome expressiveness constraints and support more diverse and holistic natural language queries. Specifically, Text2VectorSQL enables semantic filtering, multi-modal matching, and retrieval acceleration. For evaluation, we build vector indexes on appropriate columns, extend user queries with semantic search, and annotate ground truths via an automatic pipeline with expert review. Furthermore, we develop dedicated Text2VectorSQL models with synthetic data, demonstrating significant performance improvements over baseline methods. Our work establishes the foundation for the Text2VectorSQL task, paving the way for more versatile and intuitive database interfaces. The repository will be publicly available at https://github.com/Open-DataFlow/Text2VectorSQL.

Submitted 28 June, 2025; originally announced June 2025.

Comments: Work in progress

arXiv:2506.20997 [pdf, ps, other]

A Glimpse of Satellite Galaxies in the Milky Way with the 2.5-meter Wide Field Survey Telescope (WFST): Bootes III and Draco

Authors: Chao Yang, Zhizheng Pan, Min Fang, Xian Zhong Zheng, Binyang Liu, Guoliang Li, Tian-Rui Sun, Ji-An Jiang, Miaomiao Zhang, Zhen Wan, Shuang Liu, Han Qu, Ji Yang, Xu Kong, Wenhao Liu, Yiping Shu, Jiang Chang, Tinggui Wang, Lulu Fan, Yongquan Xue, Wentao Luo, Hongxin Zhang, Zheng Lou, Haibin Zhao, Bin Li, et al. (12 additional authors not shown)

Abstract: We carry out deep imaging of the Milky Way satellite galaxies, Bootes III and Draco, with WFST as a pilot observing program to demonstrate the capability of WFST. Combining catalogs with PS1 DR2 and Gaia DR3, we derive proper motions for candidate member stars in these two satellite galaxies over a 12-year time baseline, yielding uncertainties of ~1.8 mas/yr at 21 mag and ~3.0 mas/yr at 22 mag in the r band. The proper motions derived from bright and faint stars are consistent, indicating no significant variation in proper motion across stellar luminosity as these galaxies undergo tidal interactions with the MW. Meanwhile, we suggest that Bootes III represents the bound remnant of the progenitor galaxy that gave rise to the Styx stream, as evidenced by its elongated density profile and overdensity in both spatial and kinematic space. This is the first paper to use WFST to measure the proper motions of faint stars in Milky Way satellite galaxies. More detailed analyses will be presented in forthcoming papers from the wide field survey (WFS) program.

Submitted 26 June, 2025; originally announced June 2025.

Comments: 17 pages, 12 figures, 3 tables. Accepted for publication in ApJ

arXiv:2506.20727 [pdf, ps, other]

Pairing symmetry and superconductivity in La$_3$Ni$_2$O$_7$ thin films

Authors: Wenyuan Qiu, Zhihui Luo, Xunwu Hu, Dao-Xin Yao

Abstract: The recent discovery of superconductivity with a transition temperature $T_c$ over 40 K in La$_3$Ni$_2$O$_7$ and (La,Pr)$_{3}$Ni$_2$O$_7$ thin films at ambient pressure marks an important step in the field of nickelate superconductors. Here, we perform a renormalized mean-field theory study of the superconductivity in $\mathrm{La_3Ni_2O_7}$ thin films, using a bilayer two-orbital $t-J$ model. Our result reveals an $s_\pm$-wave pairing symmetry driven by the strong interlayer superexchange coupling of the $d_{z^2}$ orbital, resembling the pressurized bulk case. Also, we roughly reproduce the experimentally reported nodeless shape of the superconducting gap at the $β$ pocket and the superconducting $T_c$. To gain insight into the orbital feature of superconductivity, we explore the projection of different pairing bonds on the Fermi surface. We find that the nodeless gap at the $β$ pocket is related to the interlayer pairing within both $d_{z^2}$ and $d_{x^2-y^2}$ orbitals, in which the latter is triggered by the former through hybridization, and both hold the same sign.

Submitted 25 June, 2025; originally announced June 2025.

Comments: 5 pages, 4 figures

arXiv:2506.16096 [pdf, ps, other]

A Brain-to-Population Graph Learning Framework for Diagnosing Brain Disorders

Authors: Qianqian Liao, Wuque Cai, Hongze Sun, Dongze Liu, Duo Chen, Dezhong Yao, Daqing Guo

Abstract: Recently developed graph-based methods for diagnosing brain disorders using functional connectivity rely heavily on predefined brain atlases, but overlook the rich information embedded within atlases and the confounding effects of site and phenotype variability. To address these challenges, we propose a two-stage Brain-to-Population Graph Learning (B2P-GL) framework that integrates the semantic similarity of brain regions and condition-based population graph modeling. In the first stage, termed brain representation learning, we leverage brain atlas knowledge from GPT-4 to enrich the graph representation and refine the brain graph through an adaptive node reassignment graph attention network. In the second stage, termed population disorder diagnosis, phenotypic data is incorporated into population graph construction and feature fusion to mitigate confounding effects and enhance diagnosis performance. Experiments on the ABIDE I, ADHD-200, and Rest-meta-MDD datasets show that B2P-GL outperforms state-of-the-art methods in prediction accuracy while enhancing interpretability. Overall, our proposed framework offers a reliable and personalized approach to brain disorder diagnosis, advancing clinical applicability.

Submitted 19 June, 2025; originally announced June 2025.

Comments: 16 pages, 7 figures, 13 tables; this paper has been submitted for possible publication

arXiv:2506.10078 [pdf, ps, other]

Worldline deconfinement and emergent long-range interaction in entanglement Hamiltonian and entanglement spectrum

Authors: Zenan Liu, Zhe Wang, Dao-Xin Yao, Zheng Yan

Abstract: When a system exhibits a bulk gap but gapless edge states (e.g., a symmetry-protected topological phase), the entanglement spectrum (ES) resembles the energy spectrum on a virtual edge; this is the Li-Haldane conjecture. In this way, the ES serves as an important probe for detecting topological phases according to this bulk-edge correspondence. When a system is fully gapped, both in bulk and edge, the ES still remains similar to the virtual edge spectrum, which can be explained by the recently proposed wormhole effect in the path integral of the reduced density matrix. However, what will happen in the ES when the system is fully gapless? We find that although the ES roughly resembles an edge energy spectrum, it actually contains a relevant long-range interaction which modifies the intrinsic physics of the entanglement Hamiltonian (EH). Moreover, the mechanism of short-/long-range interaction in the EH can be understood as the confinement/deconfinement of worldlines in a path integral of the reduced density matrix. Our work demonstrates that the gapless mode can induce a long-range interaction in the EH.

Submitted 11 June, 2025; originally announced June 2025.

Comments: 9 pages, 6 figures

arXiv:2506.01456 [pdf]

GenDMR: A dynamic multimodal role-swapping network for identifying risk gene phenotypes

Authors: Lina Qin, Cheng Zhu, Chuqi Zhou, Yukun Huang, Jiayi Zhu, Ping Liang, Jinju Wang, Yixing Huang, Cheng Luo, Dezhong Yao, Ying Tan

Abstract: Recent studies have shown that integrating multimodal data fusion techniques for imaging and genetic features is beneficial for the etiological analysis and predictive diagnosis of Alzheimer's disease (AD). However, there are several critical flaws in current deep learning methods. Firstly, there has been insufficient discussion and exploration regarding the selection and encoding of genetic information. Secondly, due to the significantly superior classification value of AD imaging features compared to genetic features, many studies in multimodal fusion emphasize the strengths of imaging features, actively mitigating the influence of weaker features, thereby diminishing the learning of the unique value of genetic features. To address this issue, this study proposes the dynamic multimodal role-swapping network (GenDMR). In GenDMR, we develop a novel approach to encode the spatial organization of single nucleotide polymorphisms (SNPs), enhancing the representation of their genomic context. Additionally, to adaptively quantify the disease risk of SNPs and brain regions, we propose a multi-instance attention module to enhance model interpretability. Furthermore, we introduce a dominant modality selection module and a contrastive self-distillation module, combining them to achieve a dynamic teacher-student role exchange mechanism based on dominant and auxiliary modalities for bidirectional co-updating of different modal data.
Finally, GenDMR achieves state-of-the-art performance on the ADNI public dataset and visualizes attention to different SNPs, focusing on confirming 12 potential high-risk genes related to AD, including the most classic APOE and recently highlighted significant risk genes. This demonstrates GenDMR's interpretable analytical capability in exploring AD genetic features, providing new insights and perspectives for the development of multimodal data fusion techniques.

Submitted 2 June, 2025; originally announced June 2025.

Comments: 31 pages, 9 figures

arXiv:2505.23802 [pdf, ps, other]

MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks

Authors: Suhana Bedi, Hejie Cui, Miguel Fuentes, Alyssa Unell, Michael Wornow, Juan M. Banda, Nikesh Kotecha, Timothy Keyes, Yifan Mai, Mert Oez, Hao Qiu, Shrey Jain, Leonardo Schettini, Mehr Kashyap, Jason Alan Fries, Akshay Swaminathan, Philip Chung, Fateme Nateghi, Asad Aali, Ashwin Nayak, Shivam Vedak, Sneha S. Jain, Birju Patel, Oluseyi Fayanju, Shreya Shah, et al. (56 additional authors not shown)

Abstract: While large language models (LLMs) achieve near-perfect scores on medical licensing exams, these evaluations inadequately reflect the complexity and diversity of real-world clinical practice. We introduce MedHELM, an extensible evaluation framework for assessing LLM performance for medical tasks with three key contributions. First, a clinician-validated taxonomy spanning 5 categories, 22 subcategories, and 121 tasks developed with 29 clinicians. Second, a comprehensive benchmark suite comprising 35 benchmarks (17 existing, 18 newly formulated) providing complete coverage of all categories and subcategories in the taxonomy. Third, a systematic comparison of LLMs with improved evaluation methods (using an LLM-jury) and a cost-performance analysis. Evaluation of 9 frontier LLMs, using the 35 benchmarks, revealed significant performance variation. Advanced reasoning models (DeepSeek R1: 66% win-rate; o3-mini: 64% win-rate) demonstrated superior performance, though Claude 3.5 Sonnet achieved comparable results at 40% lower estimated computational cost. On a normalized accuracy scale (0-1), most models performed strongly in Clinical Note Generation (0.73-0.85) and Patient Communication & Education (0.78-0.83), moderately in Medical Research Assistance (0.65-0.75), and generally lower in Clinical Decision Support (0.56-0.72) and Administration & Workflow (0.53-0.63).
Our LLM-jury evaluation method achieved good agreement with clinician ratings (ICC = 0.47), surpassing both average clinician-clinician agreement (ICC = 0.43) and automated baselines including ROUGE-L (0.36) and BERTScore-F1 (0.44). Claude 3.5 Sonnet achieved comparable performance to top models at lower estimated cost. These findings highlight the importance of real-world, task-specific evaluation for medical use of LLMs and provide an open-source framework to enable this.

Submitted 2 June, 2025; v1 submitted 26 May, 2025; originally announced May 2025.

arXiv:2505.19586 [pdf, ps, other]

TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization

Authors: Dingyu Yao, Bowen Shen, Zheng Lin, Wei Liu, Jian Luan, Bin Wang, Weiping Wang

Abstract: The Key-Value (KV) cache in generative large language models (LLMs) introduces substantial memory overhead. Existing works mitigate this burden by offloading or compressing the KV cache. However, loading the entire cache incurs significant latency due to PCIe bandwidth bottlenecks in CPU-GPU communication, while aggressive compression causes notable performance degradation. We identify that certain layers in the LLM need to maintain global information and are unsuitable for selective loading. In contrast, other layers primarily focus on a few tokens with dominant activations that potentially incur substantial quantization error. This observation leads to a key insight that loading dominant tokens and quantizing all tokens can complement each other. Building on this insight, we propose a hybrid compression method, TailorKV, which seamlessly integrates quantization and offloading. TailorKV develops an inference framework along with a hardware-friendly implementation that leverages these complementary characteristics. Extensive long-context evaluations show that TailorKV achieves nearly lossless performance under aggressive compression settings, outperforming the state-of-the-art. In particular, Llama-3.1-8B with 128k context can be served within a single RTX 3090 GPU, reaching 82 ms per token during decoding.

Submitted 26 May, 2025; v1 submitted 26 May, 2025; originally announced May 2025.

arXiv:2505.17708 [pdf, ps, other]

The Third Pillar of Causal Analysis? A Measurement Perspective on Causal Representations

Authors: Dingling Yao, Shimeng Huang, Riccardo Cadei, Kun Zhang, Francesco Locatello

Abstract: Causal reasoning and discovery, two fundamental tasks of causal analysis, often face challenges in applications due to the complexity, noisiness, and high-dimensionality of real-world data. Despite recent progress in identifying latent causal structures using causal representation learning (CRL), what makes learned representations useful for causal downstream tasks and how to evaluate them are still not well understood. In this paper, we reinterpret CRL using a measurement model framework, where the learned representations are viewed as proxy measurements of the latent causal variables. Our approach clarifies the conditions under which learned representations support downstream causal reasoning and provides a principled basis for quantitatively assessing the quality of representations using a new Test-based Measurement EXclusivity (T-MEX) score. We validate T-MEX across diverse causal inference scenarios, including numerical simulations and real-world ecological video analysis, demonstrating that the proposed framework and corresponding score effectively assess the identification of learned representations and their usefulness for causal downstream tasks.

Submitted 27 May, 2025; v1 submitted 23 May, 2025; originally announced May 2025.

Comments: 22 pages, 12 figures, 2 tables

arXiv:2505.15906 [pdf, ps, other]

Pairing mechanism and superconductivity in pressurized La$_5$Ni$_3$O$_{11}$

Authors: Ming Zhang, Cui-Qun Chen, Dao-Xin Yao, Fan Yang

Abstract: The discovery of superconductivity (SC) with critical temperature $T_c$ above the boiling point of liquid nitrogen in pressurized La$_3$Ni$_2$O$_{7}$ has sparked a surge of exploration of high-$T_c$ superconductors in the Ruddlesden-Popper (RP) phase nickelates. More recently, the RP phase nickelate La$_5$Ni$_3$O$_{11}$, which hosts a layered structure with alternating bilayer and single-layer NiO$_2$ planes, has been reported to accommodate SC under pressure, exhibiting a dome-shaped pressure dependence with highest $T_c\approx 64$ K and attracting considerable interest. Here, using density functional theory (DFT) and random phase approximation (RPA) calculations, we systematically study the electronic properties and superconducting mechanism of this material. Our DFT calculations yield a band structure including two nearly decoupled sets of sub-band structures, with one set originating from the bilayer subsystem and the other from the single-layer one. RPA-based analysis demonstrates that SC in this material occurs primarily within the bilayer subsystem exhibiting an $s^\pm$ wave pairing symmetry similar to that observed in pressurized La$_3$Ni$_2$O$_{7}$, while the single-layer subsystem mainly serves as a bridge facilitating the inter-bilayer phase coherence through the interlayer Josephson coupling (IJC). Since the IJC thus attained is extremely weak, it experiences a prominent enhancement under pressure, leading to the increase of the bulk $T_c$ with pressure initially.
When the pressure is high enough, the $T_c$ gradually decreases due to the reduced density of states on the $γ$-pocket. In this way, the dome-shaped pressure dependence of $T_c$ observed experimentally is naturally understood.

Submitted 21 May, 2025; originally announced May 2025.

Comments: 11 pages, 6 figures

arXiv:2505.14910 [pdf, ps, other]

TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis

Authors: Yu Zhang, Wenxiang Guo, Changhao Pan, Dongyu Yao, Zhiyuan Zhu, Ziyue Jiang, Yuhan Wang, Tao Jin, Zhou Zhao

Abstract: Customizable multilingual zero-shot singing voice synthesis (SVS) has various potential applications in music composition and short video dubbing. However, existing SVS models overly depend on phoneme and note boundary annotations, limiting their robustness in zero-shot scenarios and producing poor transitions between phonemes and notes. Moreover, they also lack effective multi-level style control via diverse prompts. To overcome these challenges, we introduce TCSinger 2, a multi-task multilingual zero-shot SVS model with style transfer and style control based on various prompts. TCSinger 2 mainly includes three key modules: 1) the Blurred Boundary Content (BBC) Encoder, which predicts duration, extends content embedding, and applies masking to the boundaries to enable smooth transitions; 2) the Custom Audio Encoder, which uses contrastive learning to extract aligned representations from singing, speech, and textual prompts; 3) the Flow-based Custom Transformer, which leverages Cus-MOE, with F0 supervision, enhancing both the synthesis quality and style modeling of the generated singing voice. Experimental results show that TCSinger 2 outperforms baseline models in both subjective and objective metrics across multiple related tasks. Singing voice samples are available at https://aaronz345.github.io/TCSinger2Demo/.

Submitted 30 May, 2025; v1 submitted 20 May, 2025; originally announced May 2025.

Comments: Accepted by Findings of ACL 2025

arXiv:2505.07296 [pdf, ps, other]

Karmarkar-Tolman Embedded Charged Anisotropic Stars in f(R) Gravity

Authors: W. U. Rahman, M. Ilyas, Yi Zhong, De-Liang Yao

Abstract: We investigate various anisotropic spherical distributions of charged celestial bodies within the context of f(R) gravity, where R represents the Ricci scalar. The properties of specific charged compact objects are analyzed by using the Karmarkar-Tolman spacetime and three distinct gravitational models. The behavior of the structural parameters is examined via graphical methods. Energy constraints are applied to assess how well the results align with the Karmarkar-Tolman spacetime model. The physical acceptability of the stellar models is evaluated by checking the energy conditions and the equation of state parameter. Additionally, we explore the influence of anisotropy on the stability and internal structure of the models. Our findings are compared with predictions from general relativity to highlight the effects of f(R) gravity on charged compact stars. The obtained results are useful to enhance our understanding of how modified gravity theories affect the properties of compact astrophysical objects.

Submitted 12 May, 2025; originally announced May 2025.

Comments: 25 pages, 18 figures, 2 tables

arXiv:2505.04298 [pdf, other]

Magnetization-resolved density of states and quasi-first order transition in the two-dimensional random bond Ising model: an entropic sampling study

Authors: Yi Liu, Ding Wang, Xin Wang, Dao-Xin Yao, Lei-Han Tang

Abstract: Systems with quenched disorder possess complex energy landscapes that are challenging to explore under the conventional Monte Carlo method. In this work, we implement an efficient entropy sampling scheme for accurate computation of the entropy function in low-energy regions. The method is applied to the two-dimensional $\pm J$ random-bond Ising model, where frustration is controlled by the fraction $p$ of ferromagnetic bonds. We investigate the low-temperature paramagnetic--ferromagnetic phase boundary below the multicritical point at $T_N = 0.9530(4)$, $P_N = 0.89078(8)$, as well as the zero-temperature ferromagnetic--spin-glass transition. Finite-size scaling analysis reveals that the phase boundary for $T < T_N$ exhibits reentrant behavior. By analyzing the evolution of the magnetization-resolved density of states $g(E, M)$ and ground-state spin configurations against increasing frustration, we provide strong evidence that the zero-temperature transition is quasi-first order. Finite-size scaling conducted on the spin-glass side supports the validity of $β = 0$, with a correlation length exponent $ν = 1.50(8)$. Our results provide new insights into the nature of the ferromagnetic-to-spin-glass phase transition in an extensively degenerate ground state.

Submitted 7 May, 2025; originally announced May 2025.

Comments: 27 pages, 10 figures

arXiv:2505.02030 [pdf, other]

Exact diagonalization study of triangular Heisenberg model with four-spin ring-exchange interaction

Authors: Yuchao Zheng, Muwei Wu, Dao-Xin Yao, Han-Qing Wu

Abstract: Using Lanczos exact diagonalization (ED), we study the spin-1/2 $J_1$-$J_2$ Heisenberg model with the four-spin ring-exchange interaction $J_r$ on the triangular lattice. We mainly use the level spectroscopic technique of two 36-site tori to investigate the ground-state phase diagram, and further characterize phases by spin, dimer and chiral correlation functions. The ground state has rich phases including several magnetically ordered phases such as the zigzag phase and the tetrahedral phase, as well as several novel nonmagnetic phases, some of which exhibit valence bond solid behavior in their dimer correlation functions. However, we do not find direct evidence of a quantum spin liquid phase with spinon Fermi surface in this model. Our results can give a better understanding of the ground-state properties of the triangular Heisenberg model with ring-exchange interaction, and help to understand the relevant triangular materials.

Submitted 4 May, 2025; originally announced May 2025.

Comments: 12 pages, 11 figures

arXiv:2505.01329 [pdf, ps, other]

Semileptonic Decays of $D \to ρl^+ ν$ and $D_{(s)} \to K^\ast l^+ ν$ from Light-Cone Sum Rules

Authors: Wang Lin, Xiao-En Huang, Shan Cheng, De-Liang Yao

Abstract: We investigate the semileptonic decays of charmed mesons to light vector mesons within the framework of light-cone sum rules. Our calculation is performed at leading order in the QCD coupling, incorporating contributions up to twist-five accuracy from both two-particle and three-particle light-cone distribution amplitudes. Transition form factors are predicted twist by twist to assess the convergence property of the operator product expansion. It is verified that the twist-four and twist-five contributions are indeed negligible for all the decays under consideration. Twist-three dominance is observed for some of the form factors, subject to heavy quark effective field theory interpretation. Branching ratios for the decays $D^+ \to ρ^0 \ell^+ν_\ell$, $D_s^+ \to K^{\ast0} \ell^+ν_\ell$, $D^0 \to K^{\ast-} \ell^+ν_\ell$ and $D^+ \to \bar{K}^{\ast0} \ell^+ν_\ell$ are obtained, and a $10\%$--$20\%$ discrepancy from experimental measurements is found. Our finding indicates that the resonant-width and non-resonant QCD background effects are potentially significant, implying the necessity of including their contributions in future precision studies of the semileptonic charm decays.

Submitted 4 June, 2025; v1 submitted 2 May, 2025; originally announced May 2025.

Comments: 25 pages, 7 figures, and 6 tables, accepted for publication in Phys.Rev.D

arXiv:2504.15354 [pdf, other]

Multimagnon and multispinon $L_3$-edge RIXS spectra of an effective $\tilde{J}_1-\tilde{J}_2-\tilde{J}_3$ square lattice Heisenberg model

Authors: Kai-Yuan Qi, Shangjian Jin, Trinanjan Datta, Dao-Xin Yao

Abstract: We investigate the multimagnon and the multispinon $L_3$-edge resonant inelastic x-ray scattering (RIXS) spectra of a spin-1/2 effective $\tilde{J}_1-\tilde{J}_2-\tilde{J}_3$ square lattice Heisenberg model in its Néel ordered phase. Motivated by the observation of satellite intensity peaks above the single magnon dispersion in the $L$-edge RIXS spectrum, we propose a resonating valence bond (RVB) inspired RIXS mechanism that incorporates the local site ultrashort core-hole lifetime (UCL) expansion. We compute the multimagnon and the multispinon excitations using $\mathcal{O}(1/S)$ interacting spin wave theory and Schwinger boson mean-field theory (SBMFT) formalism, respectively. We treat the x-ray scattering process up to second order in the UCL expansion. Our calculations of two-magnon, bimagnon, and three-magnon RIXS intensities reveal that interacting spin wave theory fails to fully capture all the quantum correlations in the antiferromagnetic ordered phase. However, utilizing the SBMFT framework, with a ground state that combines Néel order and fluctuating RVB components, we demonstrate that a RIXS bond-flipping mechanism provides an alternative, deeper physical explanation of the satellite intensities. Specifically, we find that the spin correlation spectra predicted by the fluctuating RVB mechanism align with higher order UCL expansion results. We further show that the satellite intensity above the single-magnon mode can originate both from a one-to-three-magnon hybridization vertex process and from condensed spinons exhibiting the Higgs mechanism. These features reflect the interplay of quantum fluctuation, entanglement, and gauge interaction effects of quantum magnetism probed by RIXS.

Submitted 21 April, 2025; originally announced April 2025.

Comments: 23 pages, 10 figures

arXiv:2504.04709 [pdf, other]

ChPT and lattice QCD studies of doubly charmed baryons

Authors: Ze-Rui Liang, Jing-Yu Yi, Liuming Liu, De-Liang Yao

Abstract: The scattering lengths of the interactions between the spin-$1/2$ doubly charmed baryons and Nambu-Goldstone bosons are of great importance for the investigation of the spectroscopy of heavy flavored baryons. To that end, we have conducted a systematic analysis of the low-energy dynamics of doubly charmed baryons within the frameworks of chiral perturbation theory (ChPT) and lattice quantum chromodynamics (QCD). On the one hand, the S- and P-wave scattering lengths are predicted in a manifestly relativistic baryon ChPT at leading one-loop order. On the other hand, results of the S-wave scattering lengths for four elastic single-channel scattering processes are obtained in lattice QCD for the first time.

Submitted 6 April, 2025; originally announced April 2025.

Comments: 10 pages, 2 figures, 5 tables. Proceedings of the 11th International Workshop on Chiral Dynamics (CD2024)

arXiv:2504.00095 [pdf, other]

Spin order, spin excitations, and RIXS spectra of spin-1/2 tetramer chains

Authors: Junli Li, Jun-Qing Cheng, Trinanjan Datta, Dao-Xin Yao

Abstract: We investigate the spin dynamics of a 1D spin-1/2 Heisenberg tetramer chain. Employing a combination of Density Matrix Renormalization Group, quantum renormalization group, and perturbation theory techniques, we compute the energy levels and the quantum phase diagram, analyze the phase transitions, and evaluate the $L$- and $K$-edge resonant inelastic x-ray scattering (RIXS) spectrum of fractionalized and collective (single and multi-particle) excitations. Our calculations suggest that the chain can transition between a hidden $Z_2\times Z_2$ discrete symmetry preserving tetramer phase and a Haldane phase with non-vanishing string order that breaks the hidden symmetry. These two gapped phases are separated by an intermediate deconfined quantum critical state comprising free spins and three-site doublets, which is a gapless critical phase with deconfined spinons. We find that the tetramer chain can support fractionalized (spinon) and collective (triplon and quinton) excitations. In the ferromagnetic intra-tetramer limit, the chain can support a quinton excitation which has a five-fold degenerate excited state. String order parameter calculations suggest CuInVO$_5$ to be in a Haldane-like phase whose $L$-edge RIXS spectrum can support observable triplon and quinton excitations. We also identify possible two-particle excitations (two-singlon, two-triplon, triplon-quinton, and two-quinton excitations) resulting from the double spin-flip effect in the $K$-edge RIXS spectrum.

Submitted 31 March, 2025; originally announced April 2025.

Comments: 13 pages, 7 figures

arXiv:2503.24226 [pdf, other]

Asymptotic Freedom and Finite-size Scaling of Two-dimensional Classical Heisenberg Model

Authors: Dingyun Yao, Chao Zhang, Z. Y. Xie, Zhijie Fan, Youjin Deng

Abstract: The classical Heisenberg model is one of the most fundamental models in statistical and condensed matter physics. Extensive theoretical and numerical studies suggest that, in two dimensions, this model does not exhibit a finite-temperature phase transition but instead manifests asymptotic freedom. However, some research has also proposed the possibility of a Berezinskii-Kosterlitz-Thouless (BKT) phase transition over the years. In this study, we revisit the classical two-dimensional (2D) Heisenberg model through large-scale simulations with linear system sizes up to $L=16384$. Our Monte-Carlo data, without any extrapolation, clearly reveal an exponential divergence of the correlation length $ξ$ as a function of inverse temperature $β$, a hallmark of asymptotic freedom. Moreover, extrapolating $ξ$ to the thermodynamic limit in the low-temperature regime achieves close agreement with the three-loop perturbative calculations. We further propose a finite-size scaling (FSS) ansatz for $ξ$, demonstrating that the pseudo-critical point $β_L$ diverges logarithmically with $L$. The thermodynamic and finite-size scaling behaviors of the magnetic susceptibility $χ$ are also investigated and corroborate the prediction of asymptotic freedom. Our work provides solid evidence for asymptotic freedom in the 2D Heisenberg model and advances understanding of finite-size scaling in such systems.

Submitted 31 March, 2025; originally announced March 2025.

arXiv:2503.21761 [pdf, other]

Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video

Authors: David Yifan Yao, Albert J. Zhai, Shenlong Wang

Abstract: This paper presents a unified approach to understanding dynamic scenes from casual videos. Large pretrained vision foundation models, such as vision-language, video depth prediction, motion tracking, and segmentation models, offer promising capabilities. However, training a single model for comprehensive 4D understanding remains challenging. We introduce Uni4D, a multi-stage optimization framework that harnesses multiple pretrained models to advance dynamic 3D modeling, including static/dynamic reconstruction, camera pose estimation, and dense 3D motion tracking. Our results show state-of-the-art performance in dynamic 4D modeling with superior visual quality. Notably, Uni4D requires no retraining or fine-tuning, highlighting the effectiveness of repurposing visual foundation models for 4D understanding.

Submitted 27 March, 2025; originally announced March 2025.

Comments: CVPR 2025. Project page (with code): https://davidyao99.github.io/uni4d

arXiv:2503.20600 [pdf]

Efficient second-harmonic emission via strong modal overlap in single-resonant lithium niobate nanocavity

Authors: Zhi Jiang, Danyang Yao, Yu Gao, Xu Ran, Duomao Li, Erqi Zhang, Jianguo Wang, Xuetao Gan, Jinchuan Zhang, Fengqi Liu, Yue Hao

Abstract: High-efficiency second-harmonic generation (SHG) in compact integrated photonic systems is crucial for advancing nonlinear optical technologies. However, achieving exceptional conversion efficiencies while maintaining stable performance remains a significant challenge. Here, we report a high-Q single-resonant photonic crystal nanobeam cavity (PCNBC) on a polymer-loaded lithium niobate on insulator (LNOI) platform, which enables bright second-harmonic (SH) emission. Through synergistic optimization of modal confinement and spatial overlap in a y-cut LN architecture, our device achieves a normalized SHG conversion efficiency of 163%/W, outperforming previous LN-based photonic crystal cavities by over three orders of magnitude. The visible SH emission at 768.77 nm exhibits a single-lobe radiation pattern with precise spectral alignment between fundamental (FH) and second-harmonic (SH) modes, a critical feature for integrated photonic circuits. Remarkably, the conversion efficiency remains stable under thermal variations up to 20°C, addressing a key limitation of multi-resonant systems. High-order cavity modes are directly visualized via CCD imaging, confirming strong spatial overlap. This work establishes a record SHG conversion efficiency for LN microcavities and provides a scalable, temperature-insensitive architecture for nonlinear light sources, with immediate applications in quantum optics and chip-scale interconnects.

Submitted 26 March, 2025; originally announced March 2025.

Comments: 17 pages, 5 figures

arXiv:2503.19716 [pdf, ps, other]

doi 10.15302/frontphys.2025.054501

Magnetic excitations of a trilayer antiferromagnetic Heisenberg model

Authors: Lan-Ye He, Xin-Man Ye, Dao-Xin Yao

Abstract: We investigate the squared sublattice magnetizations and magnetic excitations of a $S=1/2$ trilayer antiferromagnetic Heisenberg model with interlayer interaction $J_{\bot}$ and intralayer interaction $J_{//}$ by employing stochastic series expansion quantum Monte Carlo (SSE-QMC) and stochastic analytic continuation (SAC) methods. Compared with the bilayer model, the trilayer model has one inner layer and two outer layers. The change in its symmetry can lead to special magnetic excitations. Our study reveals that the maximum of the magnetization of the outer sublattice corresponds to a smaller ratio parameter $g={J_{//}}/{J_{\bot}}$, a finding that is verified using finite-size extrapolation. As $g$ decreases, the excitation spectra gradually evolve from a degenerate magnon mode with continua to low-energy and high-energy branches. In particular, when $g$ is small enough, such as $0.02$, the high-energy spectrum further splits into characteristic doublon ($\approx J_{\bot}$) and quarton ($\approx 1.5 J_{\bot}$) spectral bands. Moreover, the accuracy of the magnetic excitations is confirmed through the SpinW software package and the dispersion relations derived from linear spin wave theory. Our results provide an important reference for experiments and can be directly compared with inelastic neutron scattering data to verify and guide experimental detection.

Submitted 22 June, 2025; v1 submitted 25 March, 2025; originally announced March 2025.

Comments: 8 pages, 6 figures

Journal ref: Frontiers of Physics, 2025, 20(5): 054501

arXiv:2503.17223 [pdf, other]

Electronic structures and multi-orbital models of La$_3$Ni$_2$O$_7$ thin films at ambient pressure

Authors: Xunwu Hu, Wenyuan Qiu, Cun-Qun Chen, Zhihui Luo, Dao-Xin Yao

Abstract: The recent discovery of superconductivity with a transition temperature $T_c$ exceeding 40 K in La$_3$Ni$_2$O$_7$ and (La,Pr)$_{3}$Ni$_2$O$_7$ thin films at ambient pressure marks a significant breakthrough in the field of nickelate superconductors. Using density functional theory (DFT), we propose a double-stacked two-orbital effective model for La$_3$Ni$_2$O$_7$ thin film based on the Ni$-e_g$ orbitals. Our analysis of the Fermi surface reveals three electron pockets ($α,α^{\prime},β$) and two hole pockets ($γ,γ^{\prime}$), where the additional $α^{\prime}$ and $γ^{\prime}$ pockets arise from inter-stack interactions. Furthermore, we introduce a high-energy model that incorporates O$-p$ orbitals to facilitate future studies. Calculations of spin susceptibility within the random phase approximation (RPA) indicate that magnetic correlations are enhanced by nesting of the $γ$ pocket, which is predominantly derived from the Ni$-d_{z^2}$ orbital. Our results provide a theoretical foundation for understanding the electronic and magnetic properties of La$_3$Ni$_2$O$_7$ thin films.

Submitted 3 April, 2025; v1 submitted 21 March, 2025; originally announced March 2025.

arXiv:2503.12367 [pdf]

Integrating mobile and fixed monitoring data for high-resolution PM2.5 mapping using machine learning

Authors: Rui Xu, Dawen Yao, Yuzhuang Pian, Ruhui Cao, Yixin Fu, Xinru Yang, Ting Gan, Yonghong Liu

Abstract: Constructing high-resolution air pollution maps at lower cost is crucial for sustainable city management and public health risk assessment. However, traditional fixed-site monitoring lacks spatial coverage, while mobile low-cost sensors exhibit significant data instability. This study integrates PM2.5 data from 320 taxi-mounted mobile low-cost sensors and 52 fixed monitoring stations to address these limitations. By employing machine learning methods, an appropriate mapping relationship was established between fixed and mobile monitoring concentrations. The resulting pollution maps achieved 500-meter spatial and 5-minute temporal resolutions, showing close alignment with fixed monitoring data (+4.35% bias) but significant deviation from raw mobile data (-31.77%). The fused map exhibits the fine-scale spatial variability also observed in the mobile pollution map, while showing stable temporal variability closer to that of the fixed pollution map (fixed: 1.12 ± 0.73%, mobile: 3.15 ± 2.44%, mapped: 1.01 ± 0.65%). These findings demonstrate the potential of large-scale mobile low-cost sensor networks for high-resolution air quality mapping, supporting targeted urban environmental governance and health risk mitigation.

Submitted 16 March, 2025; originally announced March 2025.

arXiv:2503.11720 [pdf, other]

Fine-Tuning Diffusion Generative Models via Rich Preference Optimization

Authors: Hanyang Zhao, Haoxian Chen, Yucheng Guo, Genta Indra Winata, Tingting Ou, Ziyu Huang, David D. Yao, Wenpin Tang

Abstract: We introduce Rich Preference Optimization (RPO), a novel pipeline that leverages rich feedback signals to improve the curation of preference pairs for fine-tuning text-to-image diffusion models. Traditional methods, like Diffusion-DPO, often rely solely on reward model labeling, which can be opaque, offer limited insights into the rationale behind preferences, and are prone to issues such as reward hacking or overfitting. In contrast, our approach begins with generating detailed critiques of synthesized images to extract reliable and actionable image editing instructions. By implementing these instructions, we create refined images, resulting in synthetic, informative preference pairs that serve as enhanced tuning datasets. We demonstrate the effectiveness of our pipeline and the resulting datasets in fine-tuning state-of-the-art diffusion models.

Submitted 16 April, 2025; v1 submitted 13 March, 2025; originally announced March 2025.

arXiv:2503.10195 [pdf, other]

ST-FlowNet: An Efficient Spiking Neural Network for Event-Based Optical Flow Estimation

Authors: Hongze Sun, Jun Wang, Wuque Cai, Duo Chen, Qianqian Liao, Jiayi He, Yan Cui, Dezhong Yao, Daqing Guo

Abstract: Spiking Neural Networks (SNNs) have emerged as a promising tool for event-based optical flow estimation tasks due to their ability to leverage spatio-temporal information and low-power capabilities. However, the performance of SNN models is often constrained, limiting their application in real-world scenarios. In this work, we address this gap by proposing a novel neural network architecture, ST-FlowNet, specifically tailored for optical flow estimation from event-based data. The ST-FlowNet architecture integrates ConvGRU modules to facilitate cross-modal feature augmentation and temporal alignment of the predicted optical flow, improving the network's ability to capture complex motion dynamics. Additionally, to overcome the challenges associated with training SNNs, we introduce a novel approach to derive SNN models from pre-trained artificial neural networks (ANNs) through ANN-to-SNN conversion or our proposed BISNN method. Notably, the BISNN method alleviates the complexities involved in biological parameter selection, further enhancing the robustness of SNNs in optical flow estimation tasks. Extensive evaluations on three benchmark event-based datasets demonstrate that the SNN-based ST-FlowNet model outperforms state-of-the-art methods, delivering superior performance in accurate optical flow estimation across a diverse range of dynamic visual scenes. Furthermore, the inherent energy efficiency of SNN models is highlighted, establishing a compelling advantage for their practical deployment. Overall, our work presents a novel framework for optical flow estimation using SNNs and event-based data, contributing to the advancement of neuromorphic vision applications.

Submitted 27 April, 2025; v1 submitted 13 March, 2025; originally announced March 2025.

Comments: 13 pages, 6 figures, 6 tables; This work has been submitted to Neural Networks for possible publication

arXiv:2503.07032 [pdf, other]

Multimodal Human-AI Synergy for Medical Imaging Quality Control: A Hybrid Intelligence Framework with Adaptive Dataset Curation and Closed-Loop Evaluation

Authors: Zhi Qin, Qianhui Gui, Mouxiao Bian, Rui Wang, Hong Ge, Dandan Yao, Ziying Sun, Yuan Zhao, Yu Zhang, Hui Shi, Dongdong Wang, Chenxin Song, Shenghong Ju, Lihao Liu, Junjun He, Jie Xu, Yuan-Cheng Wang

Abstract: Medical imaging quality control (QC) is essential for accurate diagnosis, yet traditional QC methods remain labor-intensive and subjective. To address this challenge, in this study, we establish a standardized dataset and evaluation framework for medical imaging QC, systematically assessing large language models (LLMs) in image quality assessment and report standardization. Specifically, we first constructed and anonymized a dataset of 161 chest X-ray (CXR) radiographs and 219 CT reports for evaluation. Then, multiple LLMs, including Gemini 2.0-Flash, GPT-4o, and DeepSeek-R1, were evaluated based on recall, precision, and F1 score to detect technical errors and inconsistencies. Experimental results show that Gemini 2.0-Flash achieved a Macro F1 score of 90 in CXR tasks, demonstrating strong generalization but limited fine-grained performance. DeepSeek-R1 excelled in CT report auditing with a 62.23% recall rate, outperforming other models. However, its distilled variants performed poorly, while InternLM2.5-7B-chat exhibited the highest additional discovery rate, indicating broader but less precise error detection. These findings highlight the potential of LLMs in medical imaging QC, with DeepSeek-R1 and Gemini 2.0-Flash demonstrating superior performance.

Submitted 10 March, 2025; originally announced March 2025.

arXiv:2503.05160 [pdf, other]

A pilot survey on globular clusters with the Wide Field Survey Telescope (WFST)

Authors: Zhen Wan, Lulu Fan, Xuzhi Li, Xu Kong, Tinggui Wang, Qingfeng Zhu, Ji-an Jiang, Minxuan Cai, Zelin Xu, Xianzhong Zheng, Jingquan Cheng, Feng Li, Ming Liang, Hao Liu, Wentao Luo, Jinlong Tang, Hairen Wang, Jian Wang, Yongquan Xue, Dazhi Yao, Hongfei Zhang, Wen Zhao

Abstract: We carry out an imaging survey of six globular clusters (GCs) with a limiting magnitude of 22 mag at the 5 sigma level, reaching down to the main-sequence stars of the respective clusters, as one of the pilot observing programs of the Wide Field Survey Telescope (WFST). This paper presents the early results of this survey, where we investigate the tidal characters at the periphery of the clusters NGC 4147, NGC 5024, NGC 5053, NGC 5272, NGC 5904 and NGC 6341. We present the estimated number density of cluster candidates and their spatial distribution. We confirm the presence of tidal arms in NGC 4147 and NGC 5904, identify several intriguing potential tidal structures in NGC 4147, NGC 5024, and NGC 5272, and corroborate the elliptical morphology of the periphery of NGC 6341. WFST shows its ability to detect faint main-sequence stars of clusters beyond 15 kpc in heliocentric distance. Our findings underscore the WFST's capability for probing faint structural features in GCs, paving the way for future in-depth studies, especially searches for large-scale tidal streams associated with the clusters in future wide-field surveys.

Submitted 29 April, 2025; v1 submitted 7 March, 2025; originally announced March 2025.

Comments: 13 pages, 8 figures. accepted by MNRAS. Comments are welcome

arXiv:2503.04684 [pdf, other]

Propagating Model Uncertainty through Filtering-based Probabilistic Numerical ODE Solvers

Authors: Dingling Yao, Filip Tronarp, Nathanael Bosch

Abstract: Filtering-based probabilistic numerical solvers for ordinary differential equations (ODEs), also known as ODE filters, have been established as efficient methods for quantifying numerical uncertainty in the solution of ODEs. In practical applications, however, the underlying dynamical system often contains uncertain parameters, requiring the propagation of this model uncertainty to the ODE solution. In this paper, we demonstrate that ODE filters, despite their probabilistic nature, do not automatically solve this uncertainty propagation problem. To address this limitation, we present a novel approach that combines ODE filters with numerical quadrature to properly marginalize over uncertain parameters, while accounting for both parameter uncertainty and numerical solver uncertainty. Experiments across multiple dynamical systems demonstrate that the resulting uncertainty estimates closely match reference solutions. Notably, we show how the numerical uncertainty from the ODE solver can help prevent overconfidence in the propagated uncertainty estimates, especially when using larger step sizes. Our results illustrate that probabilistic numerical methods can effectively quantify both numerical and parametric uncertainty in dynamical systems.

Submitted 6 March, 2025; originally announced March 2025.

arXiv:2503.02224 [pdf, other]

Role of $a_0(980)$ in the decays $D^{0} \rightarrow K^{+} K^{-} η$ and $π^{+} π^{-} η$

Authors: Sara Rahmani, Wei Liang, Yu-Wen Peng, Yu Lu, De-Liang Yao, Chu-Wen Xiao

Abstract: In the present work, we study the reactions $D^{0} \rightarrow K^{+} K^{-} η$ and $D^0 \rightarrow π^{+} π^{-} η$, and find that the $a_0(980)$ state plays a dominant role. At the quark level, the external and internal $W$-emission mechanisms are taken into account, which can hadronize into the final states, and then the $a_0(980)$ state is generated from the final state interaction. In addition, the contributions of other intermediate resonances, such as $ρ(770)$ and $φ(1020)$, are also considered. We make a combined fit of the invariant mass spectra measured by the Belle and BESIII Collaborations, where the results are in good agreement with the experiments, and the signal of the $a_0(980)$ shows great significance. Moreover, the antisymmetry data for the production of the $a_0(980)^+$ and $a_0(980)^-$ is described well in the combined fit.

Submitted 3 March, 2025; originally announced March 2025.

Comments: 19 pages, 8 figures

arXiv:2502.19168 [pdf, other]

Chiral Representation of the Nucleon Mass at Leading Two-loop Order

Authors: Ze-Rui Liang, Han-Xue Chen, Feng-Kun Guo, Zhi-Hui Guo, De-Liang Yao

Abstract: We calculate the nucleon mass in a manifestly relativistic baryon chiral perturbation theory up to the leading two-loop order. Through dimensional counting analysis, we perform the chiral expansion and verify the validity of the extended-on-mass-shell scheme at the two-loop level. As a result, we obtain the complete chiral representation of the nucleon mass up to $\mathcal{O}(p^5)$, which preserves the original analytic properties and satisfies the correct power counting. The obtained chiral result is well-suited for chiral extrapolation and provides an excellent description of lattice QCD data across a broad range of pion masses. We find that the $\mathcal{O}(p^5)$ contribution is small, approximately $10$ MeV, and varies only mildly with increasing pion mass, demonstrating good convergence of the nucleon mass up to pion masses of about 350 MeV at two-loop order.

Submitted 21 March, 2025; v1 submitted 26 February, 2025; originally announced February 2025.

Comments: 38 pages, 9 figures, 2 tables, version accepted for publication in JHEP

arXiv:2502.12084 [pdf, ps, other]

VLM2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues

Authors: Jianshu Zhang, Dongyu Yao, Renjie Pi, Paul Pu Liang, Yi R. Fung

Abstract: Visually linking matching cues is a crucial ability in daily life, such as identifying the same person in multiple photos based on their cues, even without knowing who they are. Despite the extensive knowledge that vision-language models (VLMs) possess, it remains largely unexplored whether they are capable of performing this fundamental task. To address this, we introduce VLM2-Bench, a benchmark designed to assess whether VLMs can Visually Link Matching cues, with 9 subtasks and over 3,000 test cases. Comprehensive evaluation across twelve VLMs, along with further analysis of various language-side and vision-side prompting methods, leads to a total of eight key findings. We identify critical challenges in models' ability to link visual cues, highlighting a significant performance gap. Based on these insights, we advocate for (i) enhancing core visual capabilities to improve adaptability and reduce reliance on prior knowledge, (ii) establishing clearer principles for integrating language-based reasoning in vision-centric tasks to prevent unnecessary biases, and (iii) shifting vision-text training paradigms toward fostering models' ability to independently structure and infer relationships among visual cues.

Submitted 2 July, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

Comments: Project Page: https://vlm2-bench.github.io/ Camera Ready version

arXiv:2502.12026 [pdf, other]

Analysis of the Order Flow Auction under Proposer-Builder Separation

Authors: Ruofei Ma, Wenpin Tang, David Yao

Abstract: In this paper, we consider the impact of the order flow auction (OFA) in the context of the proposer-builder separation (PBS) mechanism through a game-theoretic perspective. The OFA is designed to improve user welfare by redistributing maximal extractable value (MEV) to the users, in which two auctions take place: the order flow auction and the block-building auction. We formulate the OFA as a multiplayer game, and focus our analyses on the case of two competing players (builders). We prove the existence and uniqueness of a Nash equilibrium for the two-player game, and derive a closed-form solution by solving a quartic equation. Our result shows that the builder with a competitive advantage pays a relatively lower cost, leading to centralization in the builder space. In contrast, the proposer's shares evolve as a martingale process, which implies decentralization in the proposer (or, validator) space. Our analyses rely on various tools from stochastic processes, convex optimization, and polynomial equations. We also conduct numerical studies to corroborate our findings, and explore other features of the OFA under the PBS mechanism.

Submitted 17 February, 2025; originally announced February 2025.

arXiv:2502.08518 [pdf, other]

FedMHO: Heterogeneous One-Shot Federated Learning Towards Resource-Constrained Edge Devices

Authors: Dezhong Yao, Yuexin Shi, Tongtong Liu, Zhiqiang Xu

Abstract: Federated Learning (FL) is increasingly adopted in edge computing scenarios, where a large number of heterogeneous clients operate under constrained or sufficient resources. The iterative training process in conventional FL introduces significant computation and communication overhead, which is unfriendly for resource-constrained edge devices. One-shot FL has emerged as a promising approach to mitigate communication overhead, and model-heterogeneous FL solves the problem of diverse computing resources across clients. However, existing methods face challenges in effectively managing model-heterogeneous one-shot FL, often leading to unsatisfactory global model performance or reliance on auxiliary datasets. To address these challenges, we propose a novel FL framework named FedMHO, which leverages deep classification models on resource-sufficient clients and lightweight generative models on resource-constrained devices. On the server side, FedMHO involves a two-stage process that includes data generation and knowledge fusion. Furthermore, we introduce FedMHO-MD and FedMHO-SD to mitigate the knowledge-forgetting problem during the knowledge fusion stage, and an unsupervised data optimization solution to improve the quality of synthetic samples. Comprehensive experiments demonstrate the effectiveness of our methods, as they outperform state-of-the-art baselines in various experimental setups.

Submitted 12 February, 2025; originally announced February 2025.

arXiv:2502.04255 [pdf]

The effect of Carrier Doping and Thickness on the Electronic Structures of La$_3$Ni$_2$O$_7$ Thin Films

Authors: Haoliang Shi, Zihao Huo, Guanlin Li, Hao Ma, Tian Cui, Dao-Xin Yao, Defang Duan

Abstract: Recently, the superconductivity of bilayer nickelate La3Ni2O7 has been observed in the thin film at ambient pressure, facilitated by epitaxial strain. Here, we investigate the effects of film thickness and carrier doping on the electronic structure of La3Ni2O7 thin films with thickness of 0.5-3 unit cells (UC) using first-principles calculations. At an optimal doping concentration of 0.4 holes per formula unit for 2UC film, the Ni-$d_{z^2}$ interlayer bonding state metallizes, leading to the formation of γ pockets at the Fermi surface, which quantitatively matches the experimental results of angle-resolved photoemission spectroscopy (ARPES). These findings provide theoretical support for recent experimental observations of ambient-pressure superconductivity in La3Ni2O7 thin films and highlight the crucial role of film thickness and carrier doping in modulating electronic properties.

Submitted 26 March, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

Comments: 12 pages, 4 figures

arXiv:2502.01819 [pdf, other]

Score as Action: Fine-Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning

Authors: Hanyang Zhao, Haoxian Chen, Ji Zhang, David D. Yao, Wenpin Tang

Abstract: Reinforcement learning from human feedback (RLHF), which aligns a diffusion model with the input prompt, has become a crucial step in building reliable generative AI models. Most works in this area use a discrete-time formulation, which is prone to induced errors, and often not applicable to models with higher-order/black-box solvers. The objective of this study is to develop a disciplined approach to fine-tune diffusion models using continuous-time RL, formulated as a stochastic control problem with a reward function that aligns the end result (terminal state) with the input prompt. The key idea is to treat score matching as controls or actions, thereby making connections to policy optimization and regularization in continuous-time RL. To carry out this idea, we lay out a new policy optimization framework for continuous-time RL, and illustrate its potential in enhancing the value networks design space via leveraging the structural property of diffusion models. We validate the advantages of our method by experiments in downstream tasks of fine-tuning large-scale Text2Image models of Stable Diffusion v1.5.

Submitted 16 April, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

Comments: arXiv admin note: text overlap with arXiv:2409.08400

arXiv:2501.18196 [pdf, other]

GDformer: Going Beyond Subsequence Isolation for Multivariate Time Series Anomaly Detection

Authors: Qingxiang Liu, Chenghao Liu, Sheng Sun, Di Yao, Yuxuan Liang

Abstract: Unsupervised anomaly detection of multivariate time series is a challenging task, given the requirements of deriving a compact detection criterion without accessing the anomaly points. The existing methods are mainly based on reconstruction error or association divergence, which are both confined to isolated subsequences with limited horizons and hardly promise a unified series-level criterion. In this paper, we propose the Global Dictionary-enhanced Transformer (GDformer) with a renovated dictionary-based cross attention mechanism to cultivate the global representations shared by all normal points in the entire series. Accordingly, the cross-attention maps reflect the correlation weights between the point and global representations, which naturally leads to the representation-wise similarity-based detection criterion. To foster a more compact detection boundary, prototypes are introduced to capture the distribution of normal point-global correlation weights. GDformer consistently achieves state-of-the-art unsupervised anomaly detection performance on five real-world benchmark datasets. Further experiments validate that the global dictionary has great transferability among various datasets. The code is available at https://github.com/yuppielqx/GDformer.

Submitted 9 May, 2025; v1 submitted 30 January, 2025; originally announced January 2025.

arXiv:2501.10798 [pdf, ps, other]

Critical radii and suprema of random waves over Riemannian manifolds

Authors: Renjie Feng, Dong Yao, Robert J. Adler

Abstract: We study random waves on smooth, compact, Riemannian manifolds under the spherical ensemble. Our first main result shows that there is a positive universal limit for the critical radius of a specific deterministic embedding, defined via the eigenfunctions of the Laplace-Beltrami operator, of such manifolds into higher dimensional Euclidean spaces. This result enables the application of Weyl's tube formula to derive the tail probabilities for the suprema of random waves. Consequently, the estimate for the expectation of the Euler characteristic of the excursion set follows directly.

Submitted 18 January, 2025; originally announced January 2025.

arXiv:2501.10766 [pdf, other]

doi 10.1103/PhysRevB.111.134414

Theory of spin magnetization driven by chiral phonons

Authors: Dapeng Yao, Shuichi Murakami

Abstract: We construct a general theory of spin magnetization driven by chiral phonons under an adiabatic process, in which atoms rotate around their equilibrium positions with a low phonon frequency. Here the spin magnetization originates from the modulated electronic states with spin-orbital coupling by atomic rotations. Under the adiabatic approximation, the time-dependent spin magnetization can be calculated by a Berry-phase method. In this paper, we focus on its time average, which is evaluated by assuming that the phonon displacement is small. As a result, the time average of the spin magnetization is concisely formulated in the form of the Berry curvature defined in the phonon-displacement space as an intrinsic property of atomic rotations. Our formula for spin magnetization reflects the chiral nature of phonons, and is convenient for $ab$ $initio$ calculations.

Submitted 28 March, 2025; v1 submitted 18 January, 2025; originally announced January 2025.

Comments: 9 pages, 4 figures

Journal ref: Phys. Rev. B 111, 134414 (2025)

arXiv:2501.05025 [pdf, other]

Microscopic origin of magnetoferroelectricity in monolayer NiBr$_{2}$ and NiI$_{2}$

Authors: Hui-Shi Yu, Xiao-Sheng Ni, Dao-Xin Yao, Kun Cao

Abstract: We investigate the magnetoelectric properties of the monolayer NiX$_{2}$ (X = Br, I) through first-principles calculations. Our calculations predict that the NiBr$_{2}$ monolayer exhibits a cycloidal magnetic ground state. For the NiI$_{2}$ monolayer, a proper-screw helical magnetic ground state with modulation vector $\boldsymbol{Q} = (q, 0, 0)$ is adopted, approximated based on experimental observations. The electric polarization in NiBr$_{2}$ shows a linear dependence on the spin-orbit coupling strength $λ_{\text{SOC}}$, which can be adequately described by the generalized Katsura-Nagaosa-Balatsky (gKNB) model, considering contributions from up to the third nearest-neighbor spin pairs. In contrast, the electric polarization in NiI$_{2}$ exhibits a distinct dependence on $q$ and $λ_{\text{SOC}}$, which cannot be fully explained by the gKNB mechanism alone. To address this, the $p$-$d$ hybridization mechanism is extended to NiI$_{2}$ to explain the observed behavior. The respective contributions from the $p$-$d$ hybridization and the gKNB mechanism in NiI$_{2}$ are then quantitatively evaluated. Overall, our work elucidates the microscopic mechanisms underlying multiferroicity in NiBr$_{2}$ and NiI$_{2}$ monolayers, with the conclusions readily applicable to their bulk forms.

Submitted 9 January, 2025; originally announced January 2025.

arXiv:2501.02455 [pdf]

Large upper critical fields and strong coupling superconductivity in the medium-entropy alloy (Ti1/3Hf1/3Ta1/3)1-xNbx

Authors: Longfu Li, Hongyan Tian, Xunwu Hu, Lingyong Zeng, Kuan Li, Peifeng Yu, Kangwang Wang, Rui Chen, Zaichen Xiang, Dao-Xin Yao, Huixia Luo

Abstract: Since the discovery of high-entropy superconductors in 2014, superconductivity has remained a focal point of interest in medium- and high-entropy alloys (MEAs-HEAs). Here, we report a series of (Ti0.33Hf0.33Ta0.33)1-xNbx MEA superconductors crystallized in the BCC structure, whose superconductivity was characterized by resistivity, magnetization, and specific heat measurements. The study found that the (Ti0.33Hf0.33Ta0.33)1-xNbx MEAs exhibit bulk superconductivity. With the doping of Nb, the superconducting transition temperature (Tc) increases from 5.31 K to 9.11 K, while the normalized Cel jump at Tc and the logarithmically averaged characteristic phonon frequency exhibit dome-shaped curves. Results from specific heat measurements indicate that the superconductivity is of a strongly coupled s-wave type. Furthermore, at low Nb content, the upper critical field of the samples is larger than the Pauli paramagnetic limit. The strong-coupling behavior and large upper critical field in s-wave type (Ti0.33Hf0.33Ta0.33)1-xNbx MEA superconductors are unusual, as they typically occur in other unconventional superconductors. Thus, (Ti0.33Hf0.33Ta0.33)1-xNbx may have significant potential in the research and understanding of physical mechanisms.

Submitted 5 January, 2025; originally announced January 2025.

Comments: 20 pages, 5 figures

Journal ref: Superconductor Science and Technology, 2025

arXiv:2501.01075 [pdf, other]

Studying the $B^{0} \to J/ψh_{1}$ decays with $h_{1}(1170)-h_{1}(1415)$ mixing in the perturbative QCD approach

Authors: Qin Chang, De-Hua Yao, Xin Liu

Abstract: In this paper, we study the $B^{0} \to J/ψh_{1}$ decays for the first time using the perturbative QCD approach up to the presently known next-to-leading order accuracy. The vertex corrections present a significant contribution to the amplitude. In the calculation, the mixing between two light axial-vector mesons $h_{1}(1170)$ and $h_{1}(1415)$ is also studied in detail. The observables including the branching ratios, polarization fractions and $CP$ asymmetries are predicted and discussed explicitly. It is found that the $B^{0} \to J/ψh_{1}$ decays have relatively large branching fractions, which are generally at the order of ${\cal O}(10^{-6}\sim10^{-3})$, and thus are possible to be observed by the LHCb and Belle-II experiments in the near future. Moreover, they are very sensitive to the mixing angle $θ$ and can be used to test the values of $θ$. In addition, some ratios between the branching fractions of $B^{0} \to J/ψh_{1}$ decays can provide much stronger constraints on $θ$ due to their relatively small theoretical errors. The $B^{0} \to J/ψh_{1}$ decays are generally dominated by the longitudinal polarization contributions, specifically, $f_{L}(B^{0} \to J/ψh_{1})>80\%$, except for the case that $θ\sim 35^\circ$ and $-55^\circ$. Unfortunately, the direct $CP$ asymmetries of $B^{0} \to J/ψh_{1}$ decays are too small to be observed soon even if the effect of $θ$ is considered. The future precise measurements on $B^{0} \to J/ψh_{1}$ decays are expected for testing these theoretical findings and exploring the interesting nature of $h_{1}(1170)$ and $h_{1}(1415)$.

Submitted 18 March, 2025; v1 submitted 2 January, 2025; originally announced January 2025.

Comments: 23 pages, 6 figures, 8 tables

arXiv:2412.18820 [pdf, other]

CausalTAD: Causal Implicit Generative Model for Debiased Online Trajectory Anomaly Detection

Authors: Wenbin Li, Di Yao, Chang Gong, Xiaokai Chu, Quanliang Jing, Xiaolei Zhou, Yuxuan Zhang, Yunxia Fan, Jingping Bi

Abstract: Trajectory anomaly detection, aiming to estimate the anomaly risk of trajectories given the Source-Destination (SD) pairs, has become a critical problem for many real-world applications. Existing solutions directly train a generative model for observed trajectories and calculate the conditional generative probability $P({T}|{C})$ as the anomaly risk, where ${T}$ and ${C}$ represent the trajectory and SD pair respectively. However, we argue that the observed trajectories are confounded by road network preference, which is a common cause of both SD distribution and trajectories. Existing methods ignore this issue, limiting their generalization ability on out-of-distribution trajectories. In this paper, we define the debiased trajectory anomaly detection problem and propose a causal implicit generative model, namely CausalTAD, to solve it. CausalTAD adopts do-calculus to eliminate the confounding bias of road network preference and estimates $P({T}|do({C}))$ as the anomaly criterion. Extensive experiments show that CausalTAD can not only achieve superior performance on trained trajectories but also generally improve the performance of out-of-distribution data, with improvements of $2.1\% \sim 5.7\%$ and $10.6\% \sim 32.7\%$ respectively.

Submitted 25 December, 2024; originally announced December 2024.

Comments: Accepted by ICDE 2024

arXiv:2412.16955 [pdf, other]

NumbOD: A Spatial-Frequency Fusion Attack Against Object Detectors

Authors: Ziqi Zhou, Bowen Li, Yufei Song, Zhifei Yu, Shengshan Hu, Wei Wan, Leo Yu Zhang, Dezhong Yao, Hai Jin

Abstract: With the advancement of deep learning, object detectors (ODs) with various architectures have achieved significant success in complex scenarios like autonomous driving. Previous adversarial attacks against ODs have been focused on designing customized attacks targeting their specific structures (e.g., NMS and RPN), yielding some results but simultaneously constraining their scalability. Moreover, most efforts against ODs stem from image-level attacks originally designed for classification tasks, resulting in redundant computations and disturbances in object-irrelevant areas (e.g., background). Consequently, how to design a model-agnostic efficient attack to comprehensively evaluate the vulnerabilities of ODs remains challenging and unresolved. In this paper, we propose NumbOD, a brand-new spatial-frequency fusion attack against various ODs, aimed at disrupting object detection within images. We directly leverage the features output by the OD without relying on its internal structures to craft adversarial examples. Specifically, we first design a dual-track attack target selection strategy to select high-quality bounding boxes from OD outputs for targeting. Subsequently, we employ directional perturbations to shift and compress predicted boxes and change classification results to deceive ODs. Additionally, we focus on manipulating the high-frequency components of images to confuse ODs' attention on critical objects, thereby enhancing the attack efficiency. Our extensive experiments on nine ODs and two datasets show that NumbOD achieves powerful attack performance and high stealthiness.

Submitted 22 December, 2024; originally announced December 2024.

Comments: Accepted by AAAI 2025

arXiv:2412.16581 [pdf, other]

Effective and Efficient Representation Learning for Flight Trajectories

Authors: Shuo Liu, Wenbin Li, Di Yao, Jingping Bi

Abstract: Flight trajectory data plays a vital role in the traffic management community, especially for downstream tasks such as trajectory prediction, flight recognition, and anomaly detection. Existing works often utilize handcrafted features and design models for different tasks individually, which heavily rely on domain expertise and are hard to extend. We argue that different flight analysis tasks share the same useful features of the trajectory. Jointly learning a unified representation for flight trajectories could be beneficial for improving the performance of various tasks. However, flight trajectory representation learning (TRL) faces two primary challenges, i.e., unbalanced behavior density and 3D spatial continuity, which disable recent general TRL methods. In this paper, we propose Flight2Vec, a flight-specific representation learning method to address these challenges. Specifically, a behavior-adaptive patching mechanism is used to inspire the learned representation to pay more attention to behavior-dense segments. Moreover, we introduce a motion trend learning technique that guides the model to memorize not only the precise locations, but also the motion trend to generate better representations. Extensive experimental results demonstrate that Flight2Vec significantly improves performance in downstream tasks such as flight trajectory prediction, flight recognition, and anomaly detection.

Submitted 21 December, 2024; originally announced December 2024.

Comments: Accepted by AAAI 2025

arXiv:2412.12601 [pdf, other]

Minute-cadence observations on Galactic plane with Wide Field Survey Telescope (WFST): Overview, methodology and early results

Authors: Jie Lin, Tinggui Wang, Minxuan Cai, Zhen Wan, Xuzhi Li, Lulu Fan, Qingfeng Zhu, Ji-an Jiang, Ning Jiang, Xu Kong, Zheyu Lin, Jiazheng Zhu, Zhengyan Liu, Jie Gao, Bin Li, Feng Li, Ming Liang, Hao Liu, Wei Liu, Wentao Luo, Jinlong Tang, Hairen Wang, Jian Wang, Yongquan Xue, Dazhi Yao, et al. (4 additional authors not shown)

Abstract: As the time-domain survey telescope with the highest survey power currently in the northern hemisphere, the Wide Field Survey Telescope (WFST) is scheduled to scan the northern sky hourly/daily/semi-weekly down to ~23 mag in four optical (ugri) bands. Unlike the observation cadences in the forthcoming regular survey missions, WFST performed "staring" observations toward the Galactic plane at a cadence of $\approx$1 minute for a total on-source time of about 13 hours, during the commissioning and pilot observation phases. Such an observation cadence is well suited to producing densely sampled light curves and hunting for stars exhibiting fast stellar variability. Here we introduce the primary methodologies for detecting variability, periodicity, and stellar flares among half a million sources from the minute-cadence observations, and present the WFST g-/r-band light curves generated from periodic variable stars and flaring stars. Benefiting from the high photometric precision and deep detection limits of WFST, the observations have captured several rare variable stars, such as a variable hot white dwarf (WD) and an ellipsoidal WD binary candidate. By surveying the almost unexplored parameter spaces for variables, WFST will lead to new opportunities in discovering unique variable stars in the northern sky.

Submitted 16 March, 2025; v1 submitted 17 December, 2024; originally announced December 2024.

Comments: 20 pages, 12 figures, accepted by ApJS

arXiv:2412.10033 [pdf, other]

Timealign: A multi-modal object detection method for time misalignment fusing in autonomous driving

Authors: Zhihang Song, Lihui Peng, Jianming Hu, Danya Yao, Yi Zhang

Abstract: The multi-modal perception methods are thriving in the autonomous driving field due to their better usage of complementary data from different sensors. Such methods depend on calibration and synchronization between sensors to get accurate environmental information. Space-alignment robustness in autonomous driving object detection has already been studied; however, research on time alignment remains relatively scarce. Because LiDAR point clouds are more challenging to transfer in real time in real-world experiments, our study uses historical LiDAR frames to better align features when LiDAR data lags exist. We designed a Timealign module, based on the SOTA GraphBEV framework, to predict and combine LiDAR features with observation to tackle such time misalignment.

Submitted 13 December, 2024; originally announced December 2024.

Comments: 8 pages, 3 figures

arXiv:2412.09936 [pdf, other]

CaLoRAify: Calorie Estimation with Visual-Text Pairing and LoRA-Driven Visual Language Models

Authors: Dongyu Yao, Keling Yao, Junhong Zhou, Yinghao Zhang

Abstract: The obesity phenomenon, known as the heavy issue, is a leading cause of preventable chronic diseases worldwide. Traditional calorie estimation tools often rely on specific data formats or complex pipelines, limiting their practicality in real-world scenarios. Recently, vision-language models (VLMs) have excelled in understanding real-world contexts and enabling conversational interactions, making them ideal for downstream tasks such as ingredient analysis. However, applying VLMs to calorie estimation requires domain-specific data and alignment strategies. To this end, we curated CalData, a 330K image-text pair dataset tailored for ingredient recognition and calorie estimation, combining a large-scale recipe dataset with detailed nutritional instructions for robust vision-language training. Built upon this dataset, we present CaLoRAify, a novel VLM framework aligning ingredient recognition and calorie estimation via training with visual-text pairs. During inference, users only need a single monocular food image to estimate calories while retaining the flexibility of agent-based conversational interaction. With Low-Rank Adaptation (LoRA) and Retrieval-Augmented Generation (RAG) techniques, our system enhances the performance of foundational VLMs in the vertical domain of calorie estimation. Our code and data are fully open-sourced at https://github.com/KennyYao2001/16824-CaLORAify.

Submitted 13 December, 2024; originally announced December 2024.

Comments: Disclaimer: This work is part of a course project and reflects ongoing exploration in the field of vision-language models and calorie estimation. Findings and conclusions are subject to further validation and refinement

MSC Class: 68T07; 68U35 ACM Class: I.2.10; I.2.6; I.5.4

Showing 1–50 of 466 results for author: Yao, D