Search | arXiv e-print repository

arXiv:2504.19467 [pdf]

BRIDGE: Benchmarking Large Language Models for Understanding Real-world Clinical Practice Text

Authors: Jiageng Wu, Bowen Gu, Ren Zhou, Kevin Xie, Doug Snyder, Yixing Jiang, Valentina Carducci, Richard Wyss, Rishi J Desai, Emily Alsentzer, Leo Anthony Celi, Adam Rodman, Sebastian Schneeweiss, Jonathan H. Chen, Santiago Romero-Brufau, Kueiyu Joshua Lin, Jie Yang

Abstract: Large language models (LLMs) hold great promise for medical applications and are evolving rapidly, with new models being released at an accelerated pace. However, current evaluations of LLMs in clinical contexts remain limited. Most existing benchmarks rely on medical exam-style questions or PubMed-derived text, failing to capture the complexity of real-world electronic health record (EHR) data. Others focus narrowly on specific application scenarios, limiting their generalizability across broader clinical use. To address this gap, we present BRIDGE, a comprehensive multilingual benchmark comprising 87 tasks sourced from real-world clinical data sources across nine languages. We systematically evaluated 52 state-of-the-art LLMs (including DeepSeek-R1, GPT-4o, Gemini, and Llama 4) under various inference strategies. With a total of 13,572 experiments, our results reveal substantial performance variation across model sizes, languages, natural language processing tasks, and clinical specialties. Notably, we demonstrate that open-source LLMs can achieve performance comparable to proprietary models, while medically fine-tuned LLMs based on older architectures often underperform versus updated general-purpose models. The BRIDGE and its corresponding leaderboard serve as a foundational resource and a unique reference for the development and evaluation of new LLMs in real-world clinical text understanding. The BRIDGE leaderboard: https://huggingface.co/spaces/YLab-Open/BRIDGE-Medical-Leaderboard

Submitted 30 April, 2025; v1 submitted 28 April, 2025; originally announced April 2025.

arXiv:2504.19228 [pdf, ps, other]

Full analysis of CP violation induced by the decay angular correlations in four-body cascade decays of heavy hadrons

Authors: Zhen-Hua Zhang, Jian-Yu Yang, Xin-Heng Guo

Abstract: Although violation of the charge-parity (CP) transformation symmetry has been observed in many pure meson decay processes, it was confirmed only very recently by the LHCb collaboration in the four-body decay of the heavy baryon $Λ_b^0$, $Λ_b^0\to p K^- π^+ π^-$, through a comparison of the decay branching ratio with that of the CP-conjugate process. However, the detailed dynamics behind this CP asymmetry are far from clear. In this paper, we propose a formalism for the full analysis of the decay angular correlations in four-body cascade decays of heavy hadrons, which can provide more information about the CP violation in these decays. To illustrate this, we apply the decay angular correlation analysis of CP violation to another four-body decay channel that involves baryons, $B^0\to p\bar{p}K^+π^-$, which has also been investigated by the LHCb collaboration with no evidence of CP violation being found. Surprisingly, with the event yield extracted inversely from the published data of LHCb, we obtain non-zero CP asymmetries of about $10\%$ corresponding to the decay angular correlations at larger than $5σ$ confidence level, which are considerably larger than the CP asymmetries observed in the $Λ_b^0\to p K^- π^+ π^-$ channel, indicating that CP violation could have been observed in processes involving baryons much earlier if the full analysis of angular correlations had been performed. We suggest that our experimental colleagues perform full decay angular correlation analyses of CP violation in four-body decays of heavy hadrons, including the above two decay channels.

Submitted 27 April, 2025; originally announced April 2025.

Comments: 19 pages, 1 figure, 4 tables

arXiv:2504.19213 [pdf, other]

Measurements of branching fractions of $D^0\to K^- 3π^+2π^-$, $D^0\to K^- 2π^+π^-2π^0$ and $D^+\to K^- 3π^+π^-π^0$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (693 additional authors not shown)

Abstract: Utilizing $7.9\,\rm fb^{-1}$ of $e^+e^-$ collision data taken with the BESIII detector at the center-of-mass energy of 3.773 GeV, we report the measurements of absolute branching fractions of the hadronic decays $D^0\to K^- 3π^+2π^-$, $D^0\to K^- 2π^+π^-2π^0$ and $D^+\to K^- 3π^+π^-π^0$. The $D^0\to K^- 3π^+2π^-$ decay is measured with improved precision, while the latter two decays are observed with statistical significance higher than $5σ$ for the first time. The absolute branching fractions of these decays are determined to be ${\mathcal B}(D^0\to K^- 3π^+2π^-)=( 1.35\pm 0.23\pm 0.08 )\times 10^{-4}$, ${\mathcal B}(D^0\to K^- 2π^+π^-2π^0)=( 19.0\pm 1.1\pm 1.5)\times 10^{-4}$, and ${\mathcal B}(D^+\to K^- 3π^+π^-π^0)=( 6.57\pm 0.69\pm 0.33)\times 10^{-4}$, where the first uncertainties are statistical and the second systematic.

Submitted 27 April, 2025; originally announced April 2025.

Comments: 12 pages, 6 figures, 4 tables

Report number: BAM-00843

arXiv:2504.19119 [pdf, other]

MLICv2: Enhanced Multi-Reference Entropy Modeling for Learned Image Compression

Authors: Wei Jiang, Yongqi Zhai, Jiayu Yang, Feng Gao, Ronggang Wang

Abstract: Recent advancements in learned image compression (LIC) have yielded impressive performance gains. Notably, the learned image compression models with multi-reference entropy models (MLIC series) have significantly outperformed existing traditional image codecs such as the Versatile Video Coding (VVC) Intra. In this paper, we present MLICv2 and MLICv2$^+$, enhanced versions of the MLIC series, featuring improved transform techniques, entropy modeling, and instance adaptability. For better transform, we introduce a simple token mixing transform block inspired by the meta transformer architecture, addressing the performance degradation at high bit-rates observed in previous MLIC series while maintaining computational efficiency. To enhance entropy modeling, we propose a hyperprior-guided global correlation prediction, enabling the capture of global contexts in the initial slice of the latent representation. We also develop a channel reweighting module to dynamically prioritize important channels within each context. Additionally, advanced positional embedding for context modeling and selective compression with guided optimization are investigated. To boost instance adaptability, we employ stochastic Gumbel annealing to iteratively refine the latent representation according to the rate-distortion optimization of a specific input image. This approach further enhances performance without impacting decoding speed. Experimental results demonstrate that our MLICv2 and MLICv2$^+$ achieve state-of-the-art performance, reducing Bjontegaard-Delta rate (BD-rate) by 16.54%, 21.61%, 16.05% and 20.46%, 24.35%, 19.14%, respectively, compared to VTM-17.0 Intra on the Kodak, Tecnick, and CLIC Pro Val datasets.

Submitted 27 April, 2025; originally announced April 2025.

Comments: Under Review

arXiv:2504.19108 [pdf, other]

A Multi-Language Perspective on the Robustness of LLM Code Generation

Authors: Fazle Rabbi, Zishuo Ding, Jinqiu Yang

Abstract: Large language models have gained significant traction and popularity in recent times, extending their usage to code-generation tasks. While this field has garnered considerable attention, the exploration of testing and evaluating the robustness of code generation models remains an ongoing endeavor. Previous studies have primarily focused on code generation models specifically for the Python language, overlooking other widely used programming languages. In this research, we conduct a comprehensive comparative analysis to assess the robustness performance of several prominent code generation models. Furthermore, we investigate how their performance varies across different programming languages. To accomplish this, we introduce perturbations in four key areas of the prompt: DocString, function name, syntax, and format. We have compiled and released a dedicated dataset for this purpose. This work presents our experimental findings, shedding light on the performance of code generation models in various scenarios.

Submitted 1 May, 2025; v1 submitted 27 April, 2025; originally announced April 2025.

arXiv:2504.19087 [pdf, ps, other]

Search for $η_{1}(1855)$ in $χ_{cJ}\toηηη^{\prime}$ decays

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (697 additional authors not shown)

Abstract: Based on a sample of $2.7\times10^{9}$ $ψ(3686)$ events collected by the BESIII detector operating at the BEPCII collider, an analysis of the decay $ψ(3686)\toγχ_{cJ}, χ_{cJ}\toηηη^{\prime}$ is performed. The decay modes $χ_{c1}$ and $χ_{c2}\toηηη^{\prime}$ are observed for the first time, and their corresponding branching fractions are determined to be $\mathcal{B}(χ_{c1}\toηηη^{\prime}) = (1.39 \pm 0.13(\text{stat.}) \pm 0.09(\text{sys.})) \times 10^{-4}$ and $\mathcal{B}(χ_{c2}\toηηη^{\prime}) = (4.42 \pm 0.86(\text{stat.}) \pm 0.37(\text{sys.})) \times 10^{-5}$. An upper limit on the branching fraction of $χ_{c0}\toηηη^{\prime}$ is set as $2.64 \times 10^{-5}$ at 90% confidence level (CL). A partial wave analysis (PWA) of the decay $χ_{c1}\toηηη^{\prime}$ is performed to search for the $1^{-+}$ exotic state $η_1(1855)$. The PWA result indicates that the structure in the $ηη^{\prime}$ mass spectrum is mainly attributed to the $f_0(1500)$, while in the $ηη$ mass spectrum, it is primarily the $0^{++}$ phase space. The upper limit of $\mathcal{B}(χ_{c1}\toη_{1}(1855)η) \cdot \mathcal{B}(η_{1}(1855)\toηη^{\prime})< 9.79 \times 10^{-5}$ is set based on the PWA at 90% CL.

Submitted 26 April, 2025; originally announced April 2025.

arXiv:2504.19086 [pdf, other]

Boosting Single-domain Generalized Object Detection via Vision-Language Knowledge Interaction

Authors: Xiaoran Xu, Jiangang Yang, Wenyue Chong, Wenhui Shi, Shichu Sun, Jing Xing, Jian Liu

Abstract: Single-Domain Generalized Object Detection (S-DGOD) aims to train an object detector on a single source domain while generalizing well to diverse unseen target domains, making it suitable for multimedia applications that involve various domain shifts, such as intelligent video surveillance and VR/AR technologies. With the success of large-scale Vision-Language Models, recent S-DGOD approaches exploit pre-trained vision-language knowledge to guide invariant feature learning across visual domains. However, the utilized knowledge remains at a coarse-grained level (e.g., the textual description of adverse weather paired with the image) and serves as an implicit regularization for guidance, struggling to learn accurate region- and object-level features in varying domains. In this work, we propose a new cross-modal feature learning method, which can capture generalized and discriminative regional features for S-DGOD tasks. The core of our method is the mechanism of Cross-modal and Region-aware Feature Interaction, which simultaneously learns both inter-modal and intra-modal regional invariance through dynamic interactions between fine-grained textual and visual features. Moreover, we design a simple but effective strategy called Cross-domain Proposal Refining and Mixing, which aligns the position of region proposals across multiple domains and diversifies them, enhancing the localization ability of detectors in unseen scenarios. Our method achieves new state-of-the-art results on S-DGOD benchmark datasets, with improvements of +8.8% mPC on Cityscapes-C and +7.9% mPC on DWD over baselines, demonstrating its efficacy.

Submitted 26 April, 2025; originally announced April 2025.

arXiv:2504.19054 [pdf, ps, other]

Entrywise Approximate Matrix Inversion

Authors: Mehrdad Ghadiri, Junzhao Yang

Abstract: We study the bit complexity of inverting diagonally dominant matrices, which are associated with random walk quantities such as hitting times and escape probabilities. Such quantities can be exponentially small, even on undirected unit-weighted graphs. However, their nonnegativity suggests that they can be approximated entrywise, leading to a stronger notion of approximation than vector norm-based error. Under this notion of error, existing Laplacian solvers and fast matrix multiplication approaches have bit complexities of $mn^2$ and $n^{ω+1}$, respectively, where $m$ is the number of nonzero entries in the matrix, $n$ is its size, and $ω$ is the matrix multiplication exponent. We present algorithms that compute entrywise $\exp(ε)$-approximate inverses of row diagonally dominant $L$-matrices (RDDL) in two settings: (1) when the matrix entries are given in floating-point representation; (2) when they are given in fixed-point representation. For floating-point inputs, we present a cubic-time algorithm and show that it has an optimal running time under the all-pairs shortest paths (APSP) conjecture. For fixed-point inputs, we present several algorithms for solving linear systems and inverting RDDL and SDDM matrices, all with high probability. Omitting logarithmic factors: (1) For SDDM matrices, we provide an algorithm for solving a linear system with entrywise approximation guarantees using $\tilde{O}(m\sqrt{n})$ bit operations, and another for computing an entrywise approximate inverse using $\tilde{O}(mn)$ bit operations. (2) For RDDL matrices, we present an algorithm for solving a linear system using $\tilde{O}(mn^{1+o(1)})$ bit operations, and two algorithms for computing an entrywise approximate inverse: one using $\tilde{O}(n^{ω+0.5})$ bit operations, and the other using $\tilde{O}(mn^{1.5+o(1)})$ bit operations.

Submitted 26 April, 2025; originally announced April 2025.

Comments: 70 pages

MSC Class: 65Y20; 65F10; 65F05; 15A09

ACM Class: G.1.3; F.2.1

arXiv:2504.18616 [pdf, other]

Insights on Metal Enrichment and Environmental Effect at $z\approx5-7$ with JWST ASPIRE/EIGER and Chemical Evolution Model

Authors: Zihao Li, Koki Kakiichi, Lise Christensen, Zheng Cai, Avishai Dekel, Xiaohui Fan, Emanuele Paolo Farina, Hyunsung D. Jun, Zhaozhou Li, Mingyu Li, Maria Pudoka, Fengwu Sun, Maxime Trebitsch, Fabian Walter, Feige Wang, Jinyi Yang, Huanian Zhang, Siwei Zou

Abstract: We present the mass-metallicity relation (MZR) for a parent sample of 604 galaxies at $z=5.34-6.94$ with [\text{O}~\textsc{iii}] doublets detected, using the deep JWST/NIRCam wide field slitless spectroscopic (WFSS) observations in 26 quasar fields. The sample incorporates the full observations of 25 quasar fields from the JWST Cycle 1 GO program ASPIRE and the quasar SDSS J0100+2802 from the JWST EIGER program. We identify 204 galaxies residing in overdense structures using a friends-of-friends (FoF) algorithm. We estimate the electron temperature of $2.0^{+0.3}_{-0.4}\times10^4$ K from the H$γ$ and $[\text{O}~\textsc{iii}]_{4363}$ lines in the stacked spectrum, indicating a metal-poor sample with median gas phase metallicity 12+$\log(\mathrm{O/H})=7.64^{+0.23}_{-0.11}$. With the most up-to-date strong line calibration based on NIRSpec observations, we find that the MZR shows a metal enhancement of $\sim0.2$ dex at the high-mass end in overdense environments. However, compared to the local Fundamental Metallicity Relation (FMR), our galaxy sample at $z>5$ shows a metal deficiency of $\sim0.2$ dex relative to FMR predictions. We explain the observed trend of FMR with a simple analytical model, and we favor dilution from intense gas accretion over outflow to explain the metallicity properties at $z>5$. Those high redshift galaxies are likely in a rapid gas accretion phase when their metal and gas contents are in a non-equilibrium state. According to model predictions, the protocluster members are closer to the gas equilibrium state than field galaxies and thus have higher metallicity and are closer to the local FMR. Our results suggest that the accelerated star formation during protocluster assembly likely plays a key role in shaping the observed MZR and FMR, indicating a potentially earlier onset of metal enrichment in overdense environments at $z\approx5-7$.

Submitted 25 April, 2025; originally announced April 2025.

Comments: 19 pages, 9 figures. Comments are welcome

arXiv:2504.18184 [pdf, ps, other]

Learning Operators by Regularized Stochastic Gradient Descent with Operator-valued Kernels

Authors: Jia-Qi Yang, Lei Shi

Abstract: This paper investigates regularized stochastic gradient descent (SGD) algorithms for estimating nonlinear operators from a Polish space to a separable Hilbert space. We assume that the regression operator lies in a vector-valued reproducing kernel Hilbert space induced by an operator-valued kernel. Two significant settings are considered: an online setting with polynomially decaying step sizes and regularization parameters, and a finite-horizon setting with constant step sizes and regularization parameters. We introduce regularity conditions on the structure and smoothness of the target operator and the input random variables. Under these conditions, we provide a dimension-free convergence analysis for the prediction and estimation errors, deriving both expectation and high-probability error bounds. Our analysis demonstrates that these convergence rates are nearly optimal. Furthermore, we present a new technique for deriving bounds with high probability for general SGD schemes, which also ensures almost-sure convergence. Finally, we discuss potential extensions to more general operator-valued kernels and the encoder-decoder framework.

Submitted 25 April, 2025; originally announced April 2025.

Comments: 56 pages, 2 figures

arXiv:2504.18127 [pdf, other]

Salient Region-Guided Spacecraft Image Arbitrary-Scale Super-Resolution Network

Authors: Jingfan Yang, Hu Gao, Ying Zhang, Depeng Dang

Abstract: Spacecraft image super-resolution seeks to enhance low-resolution spacecraft images into high-resolution ones. Although existing arbitrary-scale super-resolution methods perform well on general images, they tend to overlook the difference in features between the spacecraft core region and the large black space background, introducing irrelevant noise. In this paper, we propose a salient region-guided spacecraft image arbitrary-scale super-resolution network (SGSASR), which uses features from the spacecraft core salient regions to guide latent modulation and achieve arbitrary-scale super-resolution. Specifically, we design a spacecraft core region recognition block (SCRRB) that identifies the core salient regions in spacecraft images using a pre-trained saliency detection model. Furthermore, we present an adaptive-weighted feature fusion enhancement mechanism (AFFEM) that selectively aggregates the spacecraft core region features with general image features using a dynamic weight parameter, enhancing the response of the core salient regions. Experimental results demonstrate that the proposed SGSASR outperforms state-of-the-art approaches.

Submitted 25 April, 2025; originally announced April 2025.

arXiv:2504.18083 [pdf, other]

Automating Function-Level TARA for Automotive Full-Lifecycle Security

Authors: Yuqiao Yang, Yongzhao Zhang, Wenhao Liu, Jun Li, Pengtao Shi, DingYu Zhong, Jie Yang, Ting Chen, Sheng Cao, Yuntao Ren, Yongyue Wu, Xiaosong Zhang

Abstract: As modern vehicles evolve into intelligent and connected systems, their growing complexity introduces significant cybersecurity risks. Threat Analysis and Risk Assessment (TARA) has therefore become essential for managing these risks under mandatory regulations. However, existing TARA automation methods rely on static threat libraries, limiting their utility in the detailed, function-level analyses demanded by industry. This paper introduces DefenseWeaver, the first system that automates function-level TARA using component-specific details and large language models (LLMs). DefenseWeaver dynamically generates attack trees and risk evaluations from system configurations described in an extended OpenXSAM++ format, then employs a multi-agent framework to coordinate specialized LLM roles for more robust analysis. To further adapt to evolving threats and diverse standards, DefenseWeaver incorporates Low-Rank Adaptation (LoRA) fine-tuning and Retrieval-Augmented Generation (RAG) with expert-curated TARA reports. We validated DefenseWeaver through deployment in four automotive security projects, where it identified 11 critical attack paths, verified through penetration testing, and subsequently reported and remediated by the relevant automakers and suppliers. Additionally, DefenseWeaver demonstrated cross-domain adaptability, successfully applying to unmanned aerial vehicles (UAVs) and marine navigation systems. In comparison to human experts, DefenseWeaver outperformed manual attack tree generation across six assessment scenarios. Integrated into commercial cybersecurity platforms such as UAES and Xiaomi, DefenseWeaver has generated over 8,200 attack trees. These results highlight its ability to significantly reduce processing time, and its scalability and transformative impact on cybersecurity across industries.

Submitted 25 April, 2025; originally announced April 2025.

arXiv:2504.17980 [pdf, other]

Robust Poling and Frequency Conversion on Thin-Film Periodically Poled Lithium Tantalate

Authors: Anna Shelton, C. J. Xin, Keith Powell, Jiayu Yang, Shengyuan Lu, Neil Sinclair, Marko Loncar

Abstract: We explore a robust fabrication process for periodically-poled thin-film lithium tantalate (PP-TFLT) by systematically varying fabrication parameters and confirming the quality of inverted domains with second-harmonic microscopy (SHM). We find a periodic poling recipe that can be applied to both acoustic-grade and optical-grade film, electrode material, and presence of an oxide interlayer. By using a single high-voltage electrical pulse with peak voltage time of 10 ms or less and a ramp-down time of 90 s, rectangular poling domains are established and stabilized in the PP-TFLT. We employ our robust periodic poling process in a controllable pole-after-etch approach to produce PP-TFLT ridge waveguides with normalized second harmonic generation (SHG) conversion efficiencies of 208 %W$^{-1}$cm$^{-2}$ from 1550 nm to 775 nm in line with the theoretical value of 244 %W$^{-1}$cm$^{-2}$. This work establishes a high-performance poling process and demonstrates telecommunications band SHG for thin-film lithium tantalate, expanding the capabilities of the platform for frequency mixing applications in quantum photonics, sensing, and spectroscopy.

Submitted 24 April, 2025; originally announced April 2025.

Comments: 13 pages, 3 figures

arXiv:2504.17811 [pdf, other]

OmniSage: Large Scale, Multi-Entity Heterogeneous Graph Representation Learning

Authors: Anirudhan Badrinath, Alex Yang, Kousik Rajesh, Prabhat Agarwal, Jaewon Yang, Haoyu Chen, Jiajing Xu, Charles Rosenberg

Abstract: Representation learning, a task of learning latent vectors to represent entities, is a key task in improving search and recommender systems in web applications. Various representation learning methods have been developed, including graph-based approaches for relationships among entities, sequence-based methods for capturing the temporal evolution of user activities, and content-based models for leveraging text and visual content. However, the development of a unifying framework that integrates these diverse techniques to support multiple applications remains a significant challenge. This paper presents OmniSage, a large-scale representation framework that learns universal representations for a variety of applications at Pinterest. OmniSage integrates graph neural networks with content-based models and user sequence models by employing multiple contrastive learning tasks to effectively process graph data, user sequence data, and content signals. To support the training and inference of OmniSage, we developed an efficient infrastructure capable of supporting Pinterest graphs with billions of nodes. The universal representations generated by OmniSage have significantly enhanced user experiences on Pinterest, leading to an approximate 2.5% increase in sitewide repins (saves) across five applications. This paper highlights the impact of unifying representation learning methods, and we will open source the OmniSage code by the time of publication.

Submitted 1 May, 2025; v1 submitted 22 April, 2025; originally announced April 2025.

arXiv:2504.17395 [pdf, other]

SDVPT: Semantic-Driven Visual Prompt Tuning for Open-World Object Counting

Authors: Yiming Zhao, Guorong Li, Laiyun Qing, Amin Beheshti, Jian Yang, Michael Sheng, Yuankai Qi, Qingming Huang

Abstract: Open-world object counting leverages the robust text-image alignment of pre-trained vision-language models (VLMs) to enable counting of arbitrary categories in images specified by textual queries. However, widely adopted naive fine-tuning strategies concentrate exclusively on text-image consistency for categories contained in training, which leads to limited generalizability for unseen categories. In this work, we propose a plug-and-play Semantic-Driven Visual Prompt Tuning framework (SDVPT) that transfers knowledge from the training set to unseen categories with minimal overhead in parameters and inference time. First, we introduce a two-stage visual prompt learning strategy composed of Category-Specific Prompt Initialization (CSPI) and Topology-Guided Prompt Refinement (TGPR). The CSPI generates category-specific visual prompts, and then TGPR distills latent structural patterns from the VLM's text encoder to refine these prompts. During inference, we dynamically synthesize the visual prompts for unseen categories based on the semantic correlation between unseen and training categories, facilitating robust text-image alignment for unseen categories. Extensive experiments integrating SDVPT with all available open-world object counting models demonstrate its effectiveness and adaptability across three widely used datasets: FSC-147, CARPK, and PUCPR+.

Submitted 24 April, 2025; originally announced April 2025.

arXiv:2504.17263 [pdf, other]

Precision Neural Network Quantization via Learnable Adaptive Modules

Authors: Wenqiang Zhou, Zhendong Yu, Xinyu Liu, Jiaming Yang, Rong Xiao, Tao Wang, Chenwei Tang, Jiancheng Lv

Abstract: Quantization Aware Training (QAT) is a neural network quantization technique that compresses model size and improves operational efficiency while effectively maintaining model performance. The paradigm of QAT is to introduce fake quantization operators during the training process, allowing the model to autonomously compensate for information loss caused by quantization. Making quantization parameters trainable can significantly improve the performance of QAT, but at the cost of compromising the flexibility during inference, especially when dealing with activation values with substantially different distributions. In this paper, we propose an effective learnable adaptive neural network quantization method, called Adaptive Step Size Quantization (ASQ), to resolve this conflict. Specifically, the proposed ASQ method first dynamically adjusts quantization scaling factors through a trained module capable of accommodating different activations. Then, to address the rigid resolution issue inherent in Power of Two (POT) quantization, we propose an efficient non-uniform quantization scheme. We utilize the Power Of Square root of Two (POST) as the basis for exponential quantization, effectively handling the bell-shaped distribution of neural network weights across various bit-widths while maintaining computational efficiency through a Look-Up Table method (LUT). Extensive experimental results demonstrate that the proposed ASQ method is superior to the state-of-the-art QAT approaches. Notably, ASQ is even competitive with full-precision baselines, with its 4-bit quantized ResNet34 model improving accuracy by 1.2% on ImageNet.
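
The contrast between a Power of Two (POT) grid and a Power Of Square root of Two (POST) grid can be illustrated with a small sketch. This is not the ASQ implementation: the function names, the symmetric grid including zero, and the nearest-level rounding are assumptions made for illustration, and the learned scaling module and the LUT are not modeled. It only shows that a base of sqrt(2) places extra levels between neighboring powers of two.

```python
import math

def exponential_levels(base, num_exponents):
    """Hypothetical helper: symmetric grid {0, +/- base**-k, k = 0..num_exponents-1}."""
    positives = [base ** -k for k in range(num_exponents)]
    return sorted([-p for p in positives] + [0.0] + positives)

def quantize(x, levels):
    """Round x to the nearest level of the grid (illustrative rounding rule)."""
    return min(levels, key=lambda q: abs(q - x))

# Both grids span magnitudes from 1/8 to 1.
pot_levels = exponential_levels(2.0, 4)            # 1, 1/2, 1/4, 1/8
post_levels = exponential_levels(math.sqrt(2), 7)  # adds 1/sqrt(2), 1/(2*sqrt(2)), ...

x = 0.7
print(quantize(x, pot_levels))   # nearest POT level
print(quantize(x, post_levels))  # nearest POST level
```

Under these assumptions, 0.7 rounds to 0.5 on the POT grid but to about 0.707 on the POST grid, which is the finer resolution the abstract alludes to.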

Submitted 24 April, 2025; originally announced April 2025.

arXiv:2504.17220 [pdf, other]

Does Knowledge Distillation Matter for Large Language Model based Bundle Generation?

Authors: Kaidong Feng, Zhu Sun, Jie Yang, Hui Fang, Xinghua Qu, Wenyuan Liu

Abstract: LLMs are increasingly explored for bundle generation, thanks to their reasoning capabilities and knowledge. However, deploying large-scale LLMs introduces significant efficiency challenges, primarily high computational costs during fine-tuning and inference due to their massive parameterization. Knowledge distillation (KD) offers a promising solution, transferring expertise from large teacher models to compact student models. This study systematically investigates knowledge distillation approaches for bundle generation, aiming to minimize computational demands while preserving performance. We explore three critical research questions: (1) how does the format of KD impact bundle generation performance? (2) to what extent does the quantity of distilled knowledge influence performance? and (3) how do different ways of utilizing the distilled knowledge affect performance? We propose a comprehensive KD framework that (i) progressively extracts knowledge (patterns, rules, deep thoughts); (ii) captures varying quantities of distilled knowledge through different strategies; and (iii) exploits complementary LLM adaptation techniques (in-context learning, supervised fine-tuning, combination) to leverage distilled knowledge in small student models for domain-specific adaptation and enhanced efficiency. Extensive experiments provide valuable insights into how knowledge format, quantity, and utilization methodologies collectively shape LLM-based bundle generation performance, exhibiting KD's significant potential for more efficient yet effective LLM-based bundle generation.

Submitted 23 April, 2025; originally announced April 2025.

arXiv:2504.16633 [pdf, other]

Unveiling Solitonic Collisions in Mechanical Metamaterials

Authors: Yasuhiro Miyazawa, Christopher Chong, Panayotis G. Kevrekidis, Jinkyu Yang

Abstract: Interactions between solitary waves have been pivotal to understanding nonlinear phenomena across various disciplines. The dynamics of rarefaction solitary waves holds great potential, yet their fundamental characteristics and interactions remain only partially understood through experimental means in mechanical metamaterials. Previous studies highlighted their existence and proposed applications, such as waveguides, impact mitigation, and energy harvesting. Challenges, including energy dissipation and a lack of precise measurement techniques, have hindered deeper exploration, most notably of solitonic collisions. In this work, we provide a definitive platform for examining pure rarefaction solitons propagating through a strain-softening mechanical lattice, addressing these challenges. Employing a theoretical framework based on the Boussinesq approximation and multiple-scale analysis, we predict soliton behavior, including phase shifts resulting from head-on collisions. These theoretical insights are corroborated through numerical simulations and systematic experiments designed to generate and measure pure rarefaction solitons with high precision. Both symmetric and asymmetric collisions are examined, revealing practically elastic interaction behaviors and amplitude-dependent phase shifts. Furthermore, collision dynamics, such as speed and phase shifts during rarefaction soliton collisions, from the experimental results show agreement with theoretical and numerical models. These results validate our experimental platform and findings, underscoring the potential of mechanical rarefaction solitons as robust, controllable wave packets. This suggests a robust paradigm for exploring nonlinear wave interactions in mechanical systems, opening new application avenues in mechanical metamaterials, such as wave-based computing and advanced signal processing.

Submitted 20 May, 2025; v1 submitted 23 April, 2025; originally announced April 2025.

arXiv:2504.16563 [pdf, other]

Enhancing LLM-Based Agents via Global Planning and Hierarchical Execution

Authors: Junjie Chen, Haitao Li, Jingli Yang, Yiqun Liu, Qingyao Ai

Abstract: Intelligent agent systems based on Large Language Models (LLMs) have shown great potential in real-world applications. However, existing agent frameworks still face critical limitations in task planning and execution, restricting their effectiveness and generalizability. Specifically, current planning methods often lack clear global goals, leading agents to get stuck in local branches, or produce non-executable plans. Meanwhile, existing execution mechanisms struggle to balance complexity and stability, and their limited action space restricts their ability to handle diverse real-world tasks. To address these limitations, we propose GoalAct, a novel agent framework that introduces a continuously updated global planning mechanism and integrates a hierarchical execution strategy. GoalAct decomposes task execution into high-level skills, including searching, coding, writing and more, thereby reducing planning complexity while enhancing the agents' adaptability across diverse task scenarios. We evaluate GoalAct on LegalAgentBench, a benchmark with multiple types of legal tasks that require the use of multiple types of tools. Experimental results demonstrate that GoalAct achieves state-of-the-art (SOTA) performance, with an average improvement of 12.22% in success rate. These findings highlight GoalAct's potential to drive the development of more advanced intelligent agent systems, making them more effective across complex real-world applications. Our code can be found at https://github.com/cjj826/GoalAct.

Submitted 29 April, 2025; v1 submitted 23 April, 2025; originally announced April 2025.

arXiv:2504.16057 [pdf, other]

Automated Static Vulnerability Detection via a Holistic Neuro-symbolic Approach

Authors: Penghui Li, Songchen Yao, Josef Sarfati Korich, Changhua Luo, Jianjia Yu, Yinzhi Cao, Junfeng Yang

Abstract: Static vulnerability detection is still a challenging problem and demands excessive human effort, e.g., manual curation of good vulnerability patterns. No prior work, including classic program analysis and Large Language Model (LLM)-based approaches, has fully automated such vulnerability pattern generation with reasonable detection accuracy. In this paper, we design and implement MoCQ, a novel holistic neuro-symbolic framework that combines the complementary strengths of LLMs and classical static analysis to enable scalable vulnerability detection. The key insight is that MoCQ leverages an LLM to automatically extract vulnerability patterns and translate them into detection queries, and then relies on static analysis to refine such queries in a feedback loop and eventually execute them for analyzing large codebases and mining vulnerabilities. We evaluate MoCQ on seven types of vulnerabilities spanning two programming languages. We found that MoCQ-generated queries uncovered at least 12 patterns that were missed by experts. On a ground truth dataset, MoCQ achieved precision and recall comparable to expert-crafted queries. Moreover, MoCQ has identified seven previously unknown vulnerabilities in real-world applications, demonstrating its practical effectiveness. We have responsibly disclosed them to the corresponding developers.

Submitted 23 April, 2025; v1 submitted 22 April, 2025; originally announced April 2025.

arXiv:2504.15756 [pdf, other]

DSDNet: Raw Domain Demoiréing via Dual Color-Space Synergy

Authors: Qirui Yang, Fangpu Zhang, Yeying Jin, Qihua Cheng, Pengtao Jiang, Huanjing Yue, Jingyu Yang

Abstract: With the rapid advancement of mobile imaging, capturing screens using smartphones has become a prevalent practice in distance learning and conference recording. However, moiré artifacts, caused by frequency aliasing between display screens and camera sensors, are further amplified by the image signal processing pipeline, leading to severe visual degradation. Existing sRGB domain demoiréing methods struggle with irreversible information loss, while recent two-stage raw domain approaches suffer from information bottlenecks and inference inefficiency. To address these limitations, we propose a single-stage raw domain demoiréing framework, Dual-Stream Demoiréing Network (DSDNet), which leverages the synergy of raw and YCbCr images to remove moiré while preserving luminance and color fidelity. Specifically, to guide luminance correction and moiré removal, we design a raw-to-YCbCr mapping pipeline and introduce the Synergic Attention with Dynamic Modulation (SADM) module. This module enriches the raw-to-sRGB conversion with cross-domain contextual features. Furthermore, to better guide color fidelity, we develop a Luminance-Chrominance Adaptive Transformer (LCAT), which decouples luminance and chrominance representations. Extensive experiments demonstrate that DSDNet outperforms state-of-the-art methods in both visual quality and quantitative evaluation, and achieves an inference speed 2.4x faster than the second-best method, highlighting its practical advantages. We provide an anonymous online demo at https://xxxxxxxxdsdnet.github.io/DSDNet/.

Submitted 22 April, 2025; originally announced April 2025.

arXiv:2504.15320 [pdf, other]

Efficient and Safe Planner for Automated Driving on Ramps Considering Unsatisfication

Authors: Qinghao Li, Zhen Tian, Xiaodan Wang, Jinming Yang, Zhihao Lin

Abstract: Automated driving on ramps presents significant challenges due to the need to balance both safety and efficiency during lane changes. This paper proposes an integrated planner for automated vehicles (AVs) on ramps, utilizing an unsatisfactory level metric for efficiency and arrow-cluster-based sampling for safety. The planner identifies optimal times for the AV to change lanes, taking into account the vehicle's velocity as a key factor in efficiency. Additionally, the integrated planner employs arrow-cluster-based sampling to evaluate collision risks and select an optimal lane-changing curve. Extensive simulations were conducted in a ramp scenario to verify the planner's efficient and safe performance. The results demonstrate that the proposed planner can effectively select an appropriate lane-changing time point and a safe lane-changing curve for AVs, without incurring any collisions during the maneuver.

Submitted 20 April, 2025; originally announced April 2025.

Comments: The 45th IEEE International Conference on Distributed Computing Systems Workshop (ICDCSW) has accepted this paper (https://icdcs2025.icdcs.org/accepted-papers/ In Conjunction Events/ Page 4/ Number 174)

arXiv:2504.15284 [pdf, other]

EditLord: Learning Code Transformation Rules for Code Editing

Authors: Weichen Li, Albert Jan, Baishakhi Ray, Chengzhi Mao, Junfeng Yang, Kexin Pei

Abstract: Code editing is a foundational task in software development, where its effectiveness depends on whether it introduces desired code property changes without changing the original code's intended functionality. Existing approaches often formulate code editing as an implicit end-to-end task, omitting the fact that code-editing procedures inherently consist of discrete and explicit steps. Thus, they suffer from suboptimal performance and lack of robustness and generalization. We introduce EditLord, a code editing framework that makes the code transformation steps explicit. Our key insight is to employ a language model (LM) as an inductive learner to extract code editing rules from the training code pairs as concise meta-rule sets. Such rule sets will be manifested for each training sample to augment them for finetuning or assist in prompting- and iterative-based code editing. EditLord outperforms the state-of-the-art by an average of 22.7% in editing performance and 58.1% in robustness while achieving 20.2% higher functional correctness across critical software engineering and security applications, LM models, and editing modes.

Submitted 23 April, 2025; v1 submitted 10 March, 2025; originally announced April 2025.

arXiv:2504.15113 [pdf, ps, other]

Adaptive sieving with semismooth Newton proximal augmented Lagrangian algorithm for multi-task Lasso problems

Authors: Lanyu Lin, Yong-Jin Liu, Bo Wang, Junfeng Yang

Abstract: Multi-task learning enhances model generalization by jointly learning from related tasks. This paper focuses on the $\ell_{1,\infty}$-norm constrained multi-task learning problem, which promotes a shared feature representation while inducing sparsity in task-specific parameters. We propose an adaptive sieving (AS) strategy to efficiently generate a solution path for multi-task Lasso problems. Each subproblem along the path is solved via an inexact semismooth Newton proximal augmented Lagrangian (Ssnpal) algorithm, achieving an asymptotically superlinear convergence rate. By exploiting the Karush-Kuhn-Tucker (KKT) conditions and the inherent sparsity of multi-task Lasso solutions, the Ssnpal algorithm solves a sequence of reduced subproblems with small dimensions. This approach enables our method to scale effectively to large problems. Numerical experiments on synthetic and real-world datasets demonstrate the superior efficiency and robustness of our algorithm compared to state-of-the-art solvers.

Submitted 21 April, 2025; originally announced April 2025.

arXiv:2504.14960 [pdf, other]

MoE Parallel Folding: Heterogeneous Parallelism Mappings for Efficient Large-Scale MoE Model Training with Megatron Core

Authors: Dennis Liu, Zijie Yan, Xin Yao, Tong Liu, Vijay Korthikanti, Evan Wu, Shiqing Fan, Gao Deng, Hongxiao Bai, Jianbin Chang, Ashwath Aithal, Michael Andersch, Mohammad Shoeybi, Jiajie Yao, Chandler Zhou, David Wu, Xipeng Li, June Yang

Abstract: Mixture of Experts (MoE) models enhance neural network scalability by dynamically selecting relevant experts per input token, enabling larger model sizes while maintaining manageable computation costs. However, efficient training of large-scale MoE models across thousands of GPUs presents significant challenges due to limitations in existing parallelism strategies. We introduce an end-to-end training framework for large-scale MoE models that utilizes five-dimensional hybrid parallelism: Tensor Parallelism, Expert Parallelism, Context Parallelism, Data Parallelism, and Pipeline Parallelism. Central to our approach is MoE Parallel Folding, a novel strategy that decouples the parallelization of attention and MoE layers in Transformer models, allowing each layer type to adopt optimal parallel configurations. Additionally, we develop a flexible token-level dispatcher that supports both token-dropping and token-dropless MoE training across all five dimensions of parallelism. This dispatcher accommodates dynamic tensor shapes and coordinates different parallelism schemes for Attention and MoE layers, facilitating complex parallelism implementations. Our experiments demonstrate significant improvements in training efficiency and scalability. We achieve up to 49.3% Model Flops Utilization (MFU) for the Mixtral 8x22B model and 39.0% MFU for the Qwen2-57B-A14B model on H100 GPUs, outperforming existing methods. The framework scales efficiently up to 1,024 GPUs and maintains high performance with sequence lengths up to 128K tokens, validating its effectiveness for large-scale MoE model training. The code is available in Megatron-Core.

Submitted 23 April, 2025; v1 submitted 21 April, 2025; originally announced April 2025.

arXiv:2504.14804 [pdf, ps, other]

Automatic Evaluation Metrics for Document-level Translation: Overview, Challenges and Trends

Authors: Jiaxin GUO, Xiaoyu Chen, Zhiqiang Rao, Jinlong Yang, Zongyao Li, Hengchao Shang, Daimeng Wei, Hao Yang

Abstract: With the rapid development of deep learning technologies, the field of machine translation has witnessed significant progress, especially with the advent of large language models (LLMs) that have greatly propelled the advancement of document-level translation. However, accurately evaluating the quality of document-level translation remains an urgent issue. This paper first introduces the development status of document-level translation and the importance of evaluation, highlighting the crucial role of automatic evaluation metrics in reflecting translation quality and guiding the improvement of translation systems. It then provides a detailed analysis of the current state of automatic evaluation schemes and metrics, including evaluation methods with and without reference texts, as well as traditional metrics, model-based metrics, and LLM-based metrics. Subsequently, the paper explores the challenges faced by current evaluation methods, such as the lack of reference diversity, dependence on sentence-level alignment information, and the bias, inaccuracy, and lack of interpretability of the LLM-as-a-judge method. Finally, the paper looks ahead to the future trends in evaluation methods, including the development of more user-friendly document-level evaluation methods and more robust LLM-as-a-judge methods, and proposes possible research directions, such as reducing the dependency on sentence-level information, introducing multi-level and multi-granular evaluation approaches, and training models specifically for machine translation evaluation. This study aims to provide a comprehensive analysis of automatic evaluation for document-level translation and offer insights into future developments.

Submitted 20 April, 2025; originally announced April 2025.

arXiv:2504.14790 [pdf, other]

Enhanced Data-driven Topology Design Methodology with Multi-level Mesh and Correlation-based Mutation for Stress-related Multi-objective Optimization

Authors: Jun Yang, Shintaro Yamasaki

Abstract: Topology optimization (TO) serves as a widely applied structural design approach to tackle various engineering problems. Nevertheless, sensitivity-based TO methods usually struggle with solving strongly nonlinear optimization problems. By leveraging the high capacity of deep generative models, an influential machine learning technique, the sensitivity-free data-driven topology design (DDTD) methodology is regarded as an effective means of overcoming these issues. The DDTD methodology depends on an initial dataset with a certain regularity, making its results highly sensitive to the quality of the initial dataset. This limits its effectiveness and generalizability, especially for optimization problems without prior information. In this research, we propose a multi-level mesh DDTD-based method with a correlation-based mutation module to remove the dependence of the results on the quality of the initial dataset and to enhance computational efficiency. The core is to employ a correlation-based mutation module to assign new geometric features with physical meaning to the generated data, while utilizing a multi-level mesh strategy to progressively enhance the refinement of the structural representation, thus avoiding the maintenance of a high degree-of-freedom (DOF) representation throughout the iterative process. The proposed multi-level mesh DDTD-based method can be driven by a low-quality initial dataset without the need for time-consuming construction of a specific dataset, thus significantly increasing generality and reducing application difficulty, while further lowering the computational cost of the DDTD methodology. Various comparison experiments with the traditional sensitivity-based TO methods on stress-related strongly nonlinear problems demonstrate the generality and effectiveness of the proposed method.

Submitted 20 April, 2025; originally announced April 2025.

Comments: 23 pages, 22 figures

arXiv:2504.14750 [pdf, other]

Data-Driven Evolutionary Game-Based Model Predictive Control for Hybrid Renewable Energy Dispatch in Autonomous Ships

Authors: Yaoze Liu, Zhen Tian, Jinming Yang, Zhihao Lin

Abstract: In this paper, we propose a data-driven Evolutionary Game-Based Model Predictive Control (EG-MPC) framework for the energy dispatch of a hybrid renewable energy system powering an autonomous ship. The system integrates solar photovoltaic and wind turbine generation with battery energy storage and diesel backup power to ensure reliable operation. Given the uncertainties in renewable generation and dynamic energy demands, an optimal dispatch strategy is crucial to minimize operational costs while maintaining system reliability. To address these challenges, we formulate a cost minimization problem that considers both battery degradation costs and diesel fuel expenses, leveraging real-world data to enhance modeling accuracy. The EG-MPC approach integrates evolutionary game dynamics within a receding-horizon optimization framework, enabling adaptive and near-optimal control solutions in real time. Simulation results based on site-specific data demonstrate that the proposed method achieves cost-effective, reliable, and adaptive energy dispatch, outperforming conventional rule-based and standard MPC approaches, particularly under uncertainty.

Submitted 20 April, 2025; originally announced April 2025.

Comments: This paper has been accepted by the 2025 4th International Conference on New Energy System and Power Engineering (NESP 2025)

arXiv:2504.14747 [pdf, other]

Adaptive Field Effect Planner for Safe Interactive Autonomous Driving on Curved Roads

Authors: Qinghao Li, Zhen Tian, Xiaodan Wang, Jinming Yang, Zhihao Lin

Abstract: Autonomous driving has garnered significant attention for its potential to improve safety, traffic efficiency, and user convenience. However, the dynamic and complex nature of interactive driving poses significant challenges, including the need to navigate non-linear road geometries, handle dynamic obstacles, and meet stringent safety and comfort requirements. Traditional approaches, such as artificial potential fields (APF), often fall short in addressing these complexities independently, necessitating the development of integrated and adaptive frameworks. This paper presents a novel approach to autonomous vehicle navigation that integrates artificial potential fields, Frenet coordinates, and improved particle swarm optimization (IPSO). A dynamic risk field, adapted from traditional APF, is proposed to ensure interactive safety by quantifying risks and dynamically adjusting lane-changing intentions based on surrounding vehicle behavior. Frenet coordinates are utilized to simplify trajectory planning on non-straight roads, while an enhanced quintic polynomial trajectory generator ensures smooth and comfortable path transitions. Additionally, an IPSO algorithm optimizes trajectory selection in real time, balancing safety and user comfort within a feasible input range. The proposed framework is validated through extensive simulations and real-world scenarios, demonstrating its ability to navigate complex traffic environments, maintain safety margins, and generate smooth, dynamically feasible trajectories.
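
For readers unfamiliar with quintic polynomial trajectories, the following is a minimal sketch of the textbook construction, not the paper's "enhanced" generator, whose details the abstract does not give. It assumes a single lateral coordinate (as in a Frenet frame) and boundary conditions on position, velocity, and acceleration at t = 0 and t = T; the function names and the 3.5 m, 4 s lane-change numbers are hypothetical.

```python
def quintic_coefficients(p0, v0, a0, pT, vT, aT, T):
    """Coefficients c0..c5 of p(t) = sum(c_k * t**k) that match position,
    velocity, and acceleration at t = 0 and at t = T (closed-form solution)."""
    dp = pT - p0
    c0, c1, c2 = p0, v0, a0 / 2.0
    c3 = (20 * dp - (8 * vT + 12 * v0) * T - (3 * a0 - aT) * T**2) / (2 * T**3)
    c4 = (-30 * dp + (14 * vT + 16 * v0) * T + (3 * a0 - 2 * aT) * T**2) / (2 * T**4)
    c5 = (12 * dp - 6 * (vT + v0) * T - (a0 - aT) * T**2) / (2 * T**5)
    return [c0, c1, c2, c3, c4, c5]

def evaluate(coeffs, t, order=0):
    """Evaluate the polynomial (order=0) or its derivative of the given order."""
    total = 0.0
    for k, c in enumerate(coeffs):
        if k < order:
            continue  # this term vanishes after differentiating `order` times
        factor = 1
        for j in range(order):
            factor *= (k - j)
        total += c * factor * t ** (k - order)
    return total

# Hypothetical lane change: lateral offset 0 m -> 3.5 m in 4 s,
# starting and ending with zero lateral velocity and acceleration.
coeffs = quintic_coefficients(0.0, 0.0, 0.0, 3.5, 0.0, 0.0, 4.0)
for t in (0.0, 2.0, 4.0):
    print(t, evaluate(coeffs, t), evaluate(coeffs, t, order=1))
```

Matching acceleration at both ends is what keeps the lateral motion smooth at the start and end of the maneuver, which is the usual reason quintics are chosen for comfort-oriented planning.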

Submitted 20 April, 2025; originally announced April 2025.

Comments: The 45th IEEE International Conference on Distributed Computing Systems Workshop (ICDCSW) has accepted this paper (https://icdcs2025.icdcs.org/accepted-papers/ In Conjunction Events/ Page 4/ Number 175)

arXiv:2504.14533 [pdf, other]

An effective finite-range Gogny-type interaction for the quantum molecular dynamics like model

Authors: Meiqi Sun, Dandan Niu, Junping Yang, Ying Cui, Zhuxia Li, Qiang Zhao, Kai Zhao, Yingxun Zhang

Abstract: In this work, we propose an effective finite-range Gogny-type interaction that can be directly used in the quantum molecular dynamics (QMD) like model. Two methods for determining the parameters of the effective interaction are discussed. The first method establishes an approach to connect the conventional Gogny interaction in nuclear structure to that in heavy-ion collisions; the second method allows for the description of the symmetry energy varying from supersoft to stiff, as well as the momentum-dependent symmetry potential, exhibiting behaviors ranging from monotonic to non-monotonic variations. This effective interaction opens up opportunities for a deeper understanding of finite-range interactions and non-monotonic momentum-dependent symmetry potentials in future studies.

Submitted 20 April, 2025; originally announced April 2025.

Comments: 9 pages, 5 figures

arXiv:2504.14369 [pdf, other]

Sensitivity of the CUPID experiment to $0νββ$ decay of $^{100}$Mo

Authors: K. Alfonso, A. Armatol, C. Augier, F. T. Avignone III, O. Azzolini, A. S. Barabash, G. Bari, A. Barresi, D. Baudin, F. Bellini, G. Benato, L. Benussi, V. Berest, M. Beretta, L. Bergé, M. Bettelli, M. Biassoni, J. Billard, F. Boffelli, V. Boldrini, E. D. Brandani, C. Brofferio, C. Bucci, M. Buchynska, J. Camilleri, et al. (167 additional authors not shown)

Abstract: CUPID is a next-generation bolometric experiment to search for neutrinoless double-beta decay ($0νββ$) of $^{100}$Mo using Li$_2$MoO$_4$ scintillating crystals. It will operate 1596 crystals at $\sim$10 mK in the CUORE cryostat at the Laboratori Nazionali del Gran Sasso in Italy. Each crystal will be facing two Ge-based bolometric light detectors for $α$ rejection. We compute the discovery and the exclusion sensitivity of CUPID to $0νββ$ in a Frequentist and a Bayesian framework. This computation is done numerically based on pseudo-experiments. For the CUPID baseline scenario, with a background and an energy resolution of $1.0 \times 10^{-4}$ counts/keV/kg/yr and 5 keV FWHM at the Q-value, respectively, this results in a Bayesian exclusion sensitivity (90% c.i.) of $\hat{T}_{1/2} > 1.6^{+0.6}_{-0.5} \times 10^{27} \ \mathrm{yr}$, corresponding to the effective Majorana neutrino mass of $\hat{m}_{ββ} < \ 9.6$ -- $16.3 \ \mathrm{meV}$. The Frequentist discovery sensitivity (3$σ$) is $\hat{T}_{1/2}= 1.0 \times 10^{27} \ \mathrm{yr}$, corresponding to $\hat{m}_{ββ}= \ 12.2$ -- $20.6 \ \mathrm{meV}$.

Submitted 19 April, 2025; originally announced April 2025.

arXiv:2504.14246 [pdf, ps, other]

Logarithmic Crystalline Representations

Authors: Zhenmou Liu, Jinbang Yang, Kang Zuo

Abstract: In 1989, Faltings proved the comparison theorem between étale cohomology and crystalline cohomology by studying Fontaine-Faltings modules and crystalline representations. In his paper, he mentioned these modules and representations can be extended to the logarithmic context, but without detail. This note aims to explicitly present the construction of logarithmic Fontaine-Faltings modules and logarithmic crystalline representations.

Submitted 19 April, 2025; originally announced April 2025.

Comments: 21 pages

arXiv:2504.13825 [pdf, other]

Feature Alignment and Representation Transfer in Knowledge Distillation for Large Language Models

Authors: Junjie Yang, Junhao Song, Xudong Han, Ziqian Bi, Tianyang Wang, Chia Xin Liang, Xinyuan Song, Yichao Zhang, Qian Niu, Benji Peng, Keyu Chen, Ming Liu

Abstract: Knowledge distillation (KD) is a technique for transferring knowledge from complex teacher models to simpler student models, significantly enhancing model efficiency and accuracy. It has demonstrated substantial advancements in various applications including image classification, object detection, language modeling, text classification, and sentiment analysis. Recent innovations in KD methods, such as attention-based approaches, block-wise logit distillation, and decoupling distillation, have notably improved student model performance. These techniques focus on stimulus complexity, attention mechanisms, and global information capture to optimize knowledge transfer. In addition, KD has proven effective in compressing large language models while preserving accuracy, reducing computational overhead, and improving inference speed. This survey synthesizes the latest literature, highlighting key findings, contributions, and future directions in knowledge distillation to provide insights for researchers and practitioners on its evolving role in artificial intelligence and machine learning.