Showing 101–150 of 10,719 results for author: Chen, J

  1. arXiv:2506.01829  [pdf, ps, other]

    cs.CL cs.AI cs.IR

    CiteEval: Principle-Driven Citation Evaluation for Source Attribution

    Authors: Yumo Xu, Peng Qi, Jifan Chen, Kunlun Liu, Rujun Han, Lan Liu, Bonan Min, Vittorio Castelli, Arshit Gupta, Zhiguo Wang

    Abstract: Citation quality is crucial in information-seeking systems, directly influencing trust and the effectiveness of information access. Current evaluation frameworks, both human and automatic, mainly rely on Natural Language Inference (NLI) to assess binary or ternary supportiveness from cited sources, which we argue is a suboptimal proxy for citation evaluation. In this work we introduce CiteEval, a…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: ACL 2025

  2. arXiv:2506.01738  [pdf, ps, other]

    cs.CV

    STORM: Benchmarking Visual Rating of MLLMs with a Comprehensive Ordinal Regression Dataset

    Authors: Jinhong Wang, Shuo Tong, Jian Liu, Dongqi Tang, Jintai Chen, Haochao Ying, Hongxia Xu, Danny Chen, Jian Wu

    Abstract: Visual rating is an essential capability of artificial intelligence (AI) for multi-dimensional quantification of visual content, primarily applied in ordinal regression (OR) tasks such as image quality assessment, facial age estimation, and medical image grading. However, current multi-modal large language models (MLLMs) under-perform in such visual rating ability while also suffering the lack of…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Under review at the NeurIPS 2025 D&B track

  3. arXiv:2506.01616  [pdf, ps, other]

    cs.AI

    MLA-Trust: Benchmarking Trustworthiness of Multimodal LLM Agents in GUI Environments

    Authors: Xiao Yang, Jiawei Chen, Jun Luo, Zhengwei Fang, Yinpeng Dong, Hang Su, Jun Zhu

    Abstract: The emergence of multimodal LLM-based agents (MLAs) has transformed interaction paradigms by seamlessly integrating vision, language, action and dynamic environments, enabling unprecedented autonomous capabilities across GUI applications ranging from web automation to mobile systems. However, MLAs introduce critical trustworthiness challenges that extend far beyond traditional language models' lim…

    Submitted 2 June, 2025; originally announced June 2025.

  4. arXiv:2506.01559  [pdf, ps, other]

    quant-ph

    hqQUBO: A Hybrid-querying Quantum Optimization Model Validated with 16-qubits on an Ion Trap Quantum Computer for Life Science Applications

    Authors: Rong Chen, Quan-Xin Mei, Wen-Ding Zhao, Lin Yao, Hao-Xiang Yang, Shun-Yao Zhang, Jiao Chen, Hong-Lin Li

    Abstract: AlphaFold has achieved groundbreaking advancements in protein structure prediction, exerting profound influence across biology, medicine, and drug discovery. However, its reliance on multiple sequence alignment (MSA) is inherently time-consuming due to the NP-hard nature of constructing MSAs. Quantum computing emerges as a promising alternative, compared to classical computers, offering the potent…

    Submitted 2 June, 2025; originally announced June 2025.

  5. arXiv:2506.01502  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning of Population Dynamics: Inverse Optimization Meets JKO Scheme

    Authors: Mikhail Persiianov, Jiawei Chen, Petr Mokrov, Alexander Tyurin, Evgeny Burnaev, Alexander Korotin

    Abstract: Learning population dynamics involves recovering the underlying process that governs particle evolution, given evolutionary snapshots of samples at discrete time points. Recent methods frame this as an energy minimization problem in probability space and leverage the celebrated JKO scheme for efficient time discretization. In this work, we introduce $\texttt{iJKOnet}$, an approach that combines th…

    Submitted 2 June, 2025; originally announced June 2025.

  6. arXiv:2506.01316  [pdf, ps, other]

    math.PR

    Non-conformality of large deviations of moving average of the random walk in strongly mixing environment

    Authors: Jiaming Chen

    Abstract: The quenched and annealed large deviations of the random walk in random environment are shown to conform on any compact set whenever the level of disorder is sufficiently low. In this work, we show that these two large deviations always disagree at some interior point of the natural domain of the random walk in strongly mixing environment, regardless of the level of disorder.

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: arXiv admin note: substantial text overlap with arXiv:2409.06581

  7. arXiv:2506.01103  [pdf, ps, other]

    cs.CV

    DeepVerse: 4D Autoregressive Video Generation as a World Model

    Authors: Junyi Chen, Haoyi Zhu, Xianglong He, Yifan Wang, Jianjun Zhou, Wenzheng Chang, Yang Zhou, Zizun Li, Zhoujie Fu, Jiangmiao Pang, Tong He

    Abstract: World models serve as essential building blocks toward Artificial General Intelligence (AGI), enabling intelligent agents to predict future states and plan actions by simulating complex physical interactions. However, existing interactive models primarily predict visual observations, thereby neglecting crucial hidden states like geometric structures and spatial coherence. This leads to rapid error…

    Submitted 1 June, 2025; originally announced June 2025.

  8. arXiv:2506.01097  [pdf, other]

    cs.CV

    Generic Token Compression in Multimodal Large Language Models from an Explainability Perspective

    Authors: Lei Lei, Jie Gu, Xiaokang Ma, Chu Tang, Jingmin Chen, Tong Xu

    Abstract: Existing Multimodal Large Language Models (MLLMs) process a large number of visual tokens, leading to significant computational costs and inefficiency. Previous works generally assume that all visual tokens are necessary in the shallow layers of LLMs, and therefore token compression typically occurs in intermediate layers. In contrast, our study reveals an interesting insight: with proper selectio…

    Submitted 1 June, 2025; originally announced June 2025.

  9. arXiv:2506.01077  [pdf, other]

    cs.GR cs.HC

    TRiMM: Transformer-Based Rich Motion Matching for Real-Time multi-modal Interaction in Digital Humans

    Authors: Yueqian Guo, Tianzhao Li, Xin Lyu, Jiehaolin Chen, Zhaohan Wang, Sirui Xiao, Yurun Chen, Yezi He, Helin Li, Fan Zhang

    Abstract: Large Language Model (LLM)-driven digital humans have sparked a series of recent studies on co-speech gesture generation systems. However, existing approaches struggle with real-time synthesis and long-text comprehension. This paper introduces Transformer-Based Rich Motion Matching (TRiMM), a novel multi-modal framework for real-time 3D gesture generation. Our method incorporates three modules: 1)…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: 24 pages, 12 figures

    MSC Class: 68U05 (Primary); 62M45 (Secondary)

  10. arXiv:2506.01064  [pdf, ps, other]

    cs.CV cs.AI

    Fighting Fire with Fire (F3): A Training-free and Efficient Visual Adversarial Example Purification Method in LVLMs

    Authors: Yudong Zhang, Ruobing Xie, Yiqing Huang, Jiansheng Chen, Xingwu Sun, Zhanhui Kang, Di Wang, Yu Wang

    Abstract: Recent advances in large vision-language models (LVLMs) have showcased their remarkable capabilities across a wide range of multimodal vision-language tasks. However, these models remain vulnerable to visual adversarial attacks, which can substantially compromise their performance. Despite their potential impact, the development of effective methods for purifying such adversarial examples has rece…

    Submitted 10 June, 2025; v1 submitted 1 June, 2025; originally announced June 2025.

    Comments: 14 pages, 5 figures

  11. arXiv:2506.01039  [pdf, ps, other]

    eess.AS cs.SD

    PseudoVC: Improving One-shot Voice Conversion with Pseudo Paired Data

    Authors: Songjun Cao, Qinghua Wu, Jie Chen, Jin Li, Long Ma

    Abstract: As parallel training data is scarce for one-shot voice conversion (VC) tasks, waveform reconstruction is typically performed by various VC systems. A typical one-shot VC system comprises a content encoder and a speaker encoder. However, two types of mismatches arise: one for the inputs to the content encoder during training and inference, and another for the inputs to the speaker encoder. To addre…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: 5 pages, 3 figures

  12. arXiv:2506.01033  [pdf]

    cond-mat.mes-hall quant-ph

    Electrically tunable quantum interference of atomic spins on surfaces

    Authors: Hao Wang, Jing Chen, Peng Fan, Yelko del Castillo, Alejandro Ferrón, Lili Jiang, Zilong Wu, Shijie Li, Hong-Jun Gao, Heng Fan, Joaquín Fernández-Rossier, Kai Yang

    Abstract: Controlling quantum interference near avoided energy-level crossings is crucial for fast and reliable coherent manipulation in quantum information processing. However, achieving tunable quantum interference in atomically-precise engineered structures remains challenging. Here, we demonstrate electrical control of quantum interference using atomic spins on an insulating film in a scanning tunneling…

    Submitted 1 June, 2025; originally announced June 2025.

  13. arXiv:2506.01015  [pdf, ps, other]

    cs.CV

    AuralSAM2: Enabling SAM2 Hear Through Pyramid Audio-Visual Feature Prompting

    Authors: Yuyuan Liu, Yuanhong Chen, Chong Wang, Junlin Han, Junde Wu, Can Peng, Jingkun Chen, Yu Tian, Gustavo Carneiro

    Abstract: Segment Anything Model 2 (SAM2) exhibits strong generalisation for promptable segmentation in video clips; however, its integration with the audio modality remains underexplored. Existing approaches mainly follow two directions: (1) injecting adapters into the image encoder to receive audio signals, which incurs efficiency costs during prompt engineering, and (2) leveraging additional foundation m…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: 18 pages, 18 figures and 7 tables

  14. arXiv:2506.01012  [pdf, ps, other]

    math.DG

    Stability and rigidity results of space-like hypersurface in the Minkowski space

    Authors: Jianhua Chen, Haiyun Deng, Haiqin Xie, Jiabin Yin

    Abstract: In this paper, we establish some rigidity theorems for space-like hypersurfaces in Minkowski space by using a Weinberger-type approach with P-functions and integral identities. Firstly, for space-like hypersurfaces $M$ represented as graphs $x_{n+1}=u(x)$ over domain $Ω\subset\mathbb R^n$, if higher-order mean curvature ratio $\frac{H_{k}}{H_l}(l<k)$ is constant and the boundary $\partial M$ lies…

    Submitted 1 June, 2025; originally announced June 2025.

  15. arXiv:2506.01005  [pdf, ps, other]

    hep-ph

    $λ$ and $ρ$ Regge trajectories for the pentaquark $P_{cc\bar{c}bb}$ in the diquark-triquark picture

    Authors: He Song, Xin-Ru Liu, Jia-Qi Xie, Jiao-Kai Chen

    Abstract: We propose the Regge trajectory relations for the fully heavy pentaquark $P_{cc\bar{c}bb}$ utilizing both diquark and triquark Regge trajectory relations. Using these new relations, we discuss four series of Regge trajectories: the $ρ_1$-, $ρ_2$-, $λ_1$-, and $λ_2$-trajectories. We provide rough estimates for the masses of the $ρ_1$-, $ρ_2$-, $λ_1$-, and $λ_2$-excited states. Except for the $λ_1$-…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: 11 pages, 3 figures, 5 tables

  16. arXiv:2506.00930  [pdf, ps, other]

    cs.AI cs.CL

    Aligning VLM Assistants with Personalized Situated Cognition

    Authors: Yongqi Li, Shen Zhou, Xiaohu Li, Xin Miao, Jintao Wen, Mayi Xu, Jianhao Chen, Birong Pan, Hankun Kang, Yuanyuan Zhu, Ming Zhong, Tieyun Qian

    Abstract: Vision-language models (VLMs) aligned with general human objectives, such as being harmless and hallucination-free, have become valuable assistants of humans in managing visual tasks. However, people with diversified backgrounds have different cognition even in the same situation. Consequently, they may have personalized expectations for VLM assistants. This highlights the urgent need to align VLM…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: Accepted to ACL 2025 (main), camera-ready version

  17. arXiv:2506.00864  [pdf, ps, other]

    nucl-th nucl-ex

    Perspectives for hyperon and hypernuclei physics

    Authors: Jin-Hui Chen, Li-Sheng Geng, Emiko Hiyama, Zhi-Wei Liu, Josef Pochodzalla

    Abstract: Hypernuclei, nuclei containing one or more hyperons, serve as unique laboratories for probing the non-perturbative quantum chromodynamics (QCD). Recent progress in hypernuclear physics, driven by advanced experimental techniques and theoretical innovations, is briefly reviewed with a focus on key findings and unresolved challenges, such as the precise determination of the hypertriton binding energ…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: 10 pages, 3 figures; Comments and suggestions are welcome

  18. arXiv:2506.00816  [pdf, ps, other]

    cs.CV cs.AI

    L3A: Label-Augmented Analytic Adaptation for Multi-Label Class Incremental Learning

    Authors: Xiang Zhang, Run He, Jiao Chen, Di Fang, Ming Li, Ziqian Zeng, Cen Chen, Huiping Zhuang

    Abstract: Class-incremental learning (CIL) enables models to learn new classes continually without forgetting previously acquired knowledge. Multi-label CIL (MLCIL) extends CIL to a real-world scenario where each sample may belong to multiple classes, introducing several challenges: label absence, which leads to incomplete historical information due to missing labels, and class imbalance, which results in t…

    Submitted 31 May, 2025; originally announced June 2025.

    Comments: Accepted by ICML 2025

  19. arXiv:2506.00652  [pdf, ps, other]

    cs.CV cs.CR

    Video Signature: In-generation Watermarking for Latent Video Diffusion Models

    Authors: Yu Huang, Junhao Chen, Qi Zheng, Hanqian Li, Shuliang Liu, Xuming Hu

    Abstract: The rapid development of Artificial Intelligence Generated Content (AIGC) has led to significant progress in video generation but also raises serious concerns about intellectual property protection and reliable content tracing. Watermarking is a widely adopted solution to this issue, but existing methods for video generation mainly follow a post-generation paradigm, which introduces additional com…

    Submitted 31 May, 2025; originally announced June 2025.

  20. arXiv:2506.00539  [pdf, ps, other]

    cs.CL

    ARIA: Training Language Agents with Intention-Driven Reward Aggregation

    Authors: Ruihan Yang, Yikai Zhang, Aili Chen, Xintao Wang, Siyu Yuan, Jiangjie Chen, Deqing Yang, Yanghua Xiao

    Abstract: Large language models (LLMs) have enabled agents to perform complex reasoning and decision-making through free-form language interactions. However, in open-ended language action environments (e.g., negotiation or question-asking games), the action space can be formulated as a joint distribution over tokens, resulting in an exponentially large action space. Sampling actions in such a space can lead…

    Submitted 4 June, 2025; v1 submitted 31 May, 2025; originally announced June 2025.

  21. arXiv:2506.00534  [pdf, ps, other]

    cs.CR cs.AI

    The Security Threat of Compressed Projectors in Large Vision-Language Models

    Authors: Yudong Zhang, Ruobing Xie, Xingwu Sun, Jiansheng Chen, Zhanhui Kang, Di Wang, Yu Wang

    Abstract: The choice of a suitable visual language projector (VLP) is critical to the successful training of large visual language models (LVLMs). Mainstream VLPs can be broadly categorized into compressed and uncompressed projectors, and each offering distinct advantages in performance and computational efficiency. However, their security implications have not been thoroughly examined. Our comprehensive ev…

    Submitted 31 May, 2025; originally announced June 2025.

  22. arXiv:2506.00468  [pdf, ps, other]

    cs.NE

    Regionalized Metric Framework: A Novel Approach for Evaluating Multimodal Multi-Objective Optimization Algorithms

    Authors: Jintai Chen, Fangqing Liu, Xueming Yan, Han Huang

    Abstract: This study aims to optimize the evaluation metric of multimodal multi-objective optimization problems using a Regionalized Metric Framework, which provides a certain boost to research in this field. Existing evaluation metrics usually use the reference set as the evaluation basis, which inevitably leads to reference set dependence. To optimize this problem, this study proposes an evaluation metric…

    Submitted 31 May, 2025; originally announced June 2025.

    Comments: conference

  23. arXiv:2506.00385  [pdf, ps, other]

    cs.SD cs.AI cs.LG eess.AS

    MagiCodec: Simple Masked Gaussian-Injected Codec for High-Fidelity Reconstruction and Generation

    Authors: Yakun Song, Jiawei Chen, Xiaobin Zhuang, Chenpeng Du, Ziyang Ma, Jian Wu, Jian Cong, Dongya Jia, Zhuo Chen, Yuping Wang, Yuxuan Wang, Xie Chen

    Abstract: Neural audio codecs have made significant strides in efficiently mapping raw audio waveforms into discrete token representations, which are foundational for contemporary audio generative models. However, most existing codecs are optimized primarily for reconstruction quality, often at the expense of the downstream modelability of the encoded tokens. Motivated by the need to overcome this bottlenec…

    Submitted 31 May, 2025; originally announced June 2025.

    Comments: 18 pages, 3 figures. The code and pre-trained models are available at https://github.com/Ereboas/MagiCodec

  24. arXiv:2506.00356  [pdf]

    cs.LG cs.AI

    Exploring the Performance of Perforated Backpropagation through Further Experiments

    Authors: Rorry Brenner, Evan Davis, Rushi Chaudhari, Rowan Morse, Jingyao Chen, Xirui Liu, Zhaoyi You, Laurent Itti

    Abstract: Perforated Backpropagation is a neural network optimization technique based on modern understanding of the computational importance of dendrites within biological neurons. This paper explores further experiments from the original publication, generated from a hackathon held at the Carnegie Mellon Swartz Center in February 2025. Students and local Pittsburgh ML practitioners were brought together t…

    Submitted 30 May, 2025; originally announced June 2025.

    Comments: 10 pages, 7 figures, 1 table

  25. arXiv:2506.00191  [pdf, ps, other]

    cs.CR cs.AI cs.LG

    Heterogeneous Graph Backdoor Attack

    Authors: Jiawei Chen, Lusi Li, Daniel Takabi, Masha Sosonkina, Rui Ning

    Abstract: Heterogeneous Graph Neural Networks (HGNNs) excel in modeling complex, multi-typed relationships across diverse domains, yet their vulnerability to backdoor attacks remains unexplored. To address this gap, we conduct the first investigation into the susceptibility of HGNNs to existing graph backdoor attacks, revealing three critical issues: (1) high attack budget required for effective backdoor in…

    Submitted 30 May, 2025; originally announced June 2025.

  26. arXiv:2506.00044  [pdf, ps, other]

    stat.AP cs.LG stat.ML

    Probabilistic intraday electricity price forecasting using generative machine learning

    Authors: Jieyu Chen, Sebastian Lerch, Melanie Schienle, Tomasz Serafin, Rafał Weron

    Abstract: The growing importance of intraday electricity trading in Europe calls for improved price forecasting and tailored decision-support tools. In this paper, we propose a novel generative neural network model to generate probabilistic path forecasts for intraday electricity prices and use them to construct effective trading strategies for Germany's continuous-time intraday market. Our method demonstra…

    Submitted 28 May, 2025; originally announced June 2025.

  27. arXiv:2505.24667  [pdf, ps, other]

    cs.CV

    Decoupled Competitive Framework for Semi-supervised Medical Image Segmentation

    Authors: Jiahe Chen, Jiahe Ying, Shen Wang, Jianwei Zheng

    Abstract: Confronting the critical challenge of insufficiently annotated samples in medical domain, semi-supervised medical image segmentation (SSMIS) emerges as a promising solution. Specifically, most methodologies following the Mean Teacher (MT) or Dual Students (DS) architecture have achieved commendable results. However, to date, these approaches face a performance bottleneck due to two inherent limita…

    Submitted 30 May, 2025; originally announced May 2025.

    Comments: Published in ECAI 2024

  28. arXiv:2505.24586  [pdf, ps, other]

    astro-ph.HE

    All-sky search for individual Primordial Black Hole bursts with LHAASO

    Authors: Zhen Cao, F. Aharonian, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, C. M. Cai, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, G. H. Chen, H. X. Chen, Liang Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, S. H. Chen, et al. (293 additional authors not shown)

    Abstract: Primordial Black Holes (PBHs) are hypothetical black holes with a wide range of masses that formed in the early universe. As a result, they may play an important cosmological role and provide a unique probe of the early universe. A PBH with an initial mass of approximately $10^{15}$ g is expected to explode today in a final burst of Hawking radiation. In this work, we conduct an all-sky search for…

    Submitted 2 June, 2025; v1 submitted 30 May, 2025; originally announced May 2025.

    Comments: 8 pages, 2 figures

  29. arXiv:2505.24346  [pdf, ps, other]

    cs.CV

    VUDG: A Dataset for Video Understanding Domain Generalization

    Authors: Ziyi Wang, Zhi Gao, Boxuan Yu, Zirui Dai, Yuxiang Song, Qingyuan Lu, Jin Chen, Xinxiao Wu

    Abstract: Video understanding has made remarkable progress in recent years, largely driven by advances in deep models and the availability of large-scale annotated datasets. However, existing works typically ignore the inherent domain shifts encountered in real-world video applications, leaving domain generalization (DG) in video understanding underexplored. Hence, we propose Video Understanding Domain Gene…

    Submitted 30 May, 2025; originally announced May 2025.

  30. arXiv:2505.24189  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows

    Authors: Orlando Marquez Ayala, Patrice Bechard, Emily Chen, Maggie Baird, Jingfei Chen

    Abstract: Large Language Models (LLMs) such as GPT-4o can handle a wide range of complex tasks with the right prompt. As per token costs are reduced, the advantages of fine-tuning Small Language Models (SLMs) for real-world applications -- faster inference, lower costs -- may no longer be clear. In this work, we present evidence that, for domain-specific tasks that require structured outputs, SLMs still hav…

    Submitted 29 May, 2025; originally announced May 2025.

  31. arXiv:2505.24167  [pdf, ps, other]

    cs.CV

    Pretraining Deformable Image Registration Networks with Random Images

    Authors: Junyu Chen, Shuwen Wei, Yihao Liu, Aaron Carass, Yong Du

    Abstract: Recent advances in deep learning-based medical image registration have shown that training deep neural networks (DNNs) does not necessarily require medical images. Previous work showed that DNNs trained on randomly generated images with carefully designed noise and contrast properties can still generalize well to unseen medical data. Building on this insight, we propose using registration between…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: Accepted by MIDL 2025. Code available at https://github.com/junyuchen245/Pretraining_Image_Registration_DNNs

  32. arXiv:2505.24166  [pdf, ps, other]

    eess.IV

    Deep learning-derived arterial input function

    Authors: Junyu Chen, Zirui Jiang, Jennifer M. Coughlin, Martin G. Pomper, Yong Du

    Abstract: Dynamic positron emission tomography (PET) imaging combined with radiotracer kinetic modeling is a powerful technique for visualizing biological processes in the brain, offering valuable insights into brain functions and neurological disorders such as Alzheimer's and Parkinson's diseases. Accurate kinetic modeling relies heavily on the use of a metabolite-corrected arterial input function (AIF), w…

    Submitted 29 May, 2025; originally announced May 2025.

  33. arXiv:2505.24160  [pdf, ps, other]

    eess.IV cs.CV

    Beyond the LUMIR challenge: The pathway to foundational registration models

    Authors: Junyu Chen, Shuwen Wei, Joel Honkamaa, Pekka Marttinen, Hang Zhang, Min Liu, Yichao Zhou, Zuopeng Tan, Zhuoyuan Wang, Yi Wang, Hongchao Zhou, Shunbo Hu, Yi Zhang, Qian Tao, Lukas Förner, Thomas Wendler, Bailiang Jian, Benedikt Wiestler, Tim Hable, Jin Kim, Dan Ruan, Frederic Madesta, Thilo Sentker, Wiebke Heyer, Lianrui Zuo, et al. (11 additional authors not shown)

    Abstract: Medical image challenges have played a transformative role in advancing the field, catalyzing algorithmic innovation and establishing new performance standards across diverse clinical applications. Image registration, a foundational task in neuroimaging pipelines, has similarly benefited from the Learn2Reg initiative. Building on this foundation, we introduce the Large-scale Unsupervised Brain MRI…

    Submitted 29 May, 2025; originally announced May 2025.

  34. arXiv:2505.24144  [pdf, ps, other]

    math.PR math.ST

    Sharp Concentration of Simple Random Tensors II: Asymmetry

    Authors: Jiaheng Chen, Daniel Sanz-Alonso

    Abstract: This paper establishes sharp concentration inequalities for simple random tensors. Our theory unveils a phenomenon that arises only for asymmetric tensors of order $p \ge 3$: when the effective ranks of the covariances of the component random variables lie on both sides of a critical threshold, an additional logarithmic factor emerges that is not present in sharp bounds for symmetric tensors. To e…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 36 pages

  35. arXiv:2505.24031  [pdf]

    cond-mat.mtrl-sci

    Electrical Detection of Single-Domain Néel Vector Reorientation across the Spin-Flop Transition in Cr2O3 Crystals

    Authors: Wei-Cheng Liao, Haoyu Liu, Weilun Tan, Josiah Keagy, Jia-mou Chen, Jing Shi

    Abstract: Electrical transport measurements in heterostructures of antiferromagnetic Cr2O3 bulk crystals and a thin Pt layer exhibit sharp responses as the Néel vector of the Cr2O3 undergoes the spin-flop transition. This abrupt change can arise from several distinct mechanisms including magnetostriction, proximity-induced anomalous Hall, spin Hall anomalous Hall, and spin Hall planar Hall effects. While la…

    Submitted 29 May, 2025; originally announced May 2025.

  36. arXiv:2505.23967  [pdf, ps, other]

    cs.LG cs.DS

    Improved Approximations for Hard Graph Problems using Predictions

    Authors: Anders Aamand, Justin Y. Chen, Siddharth Gollapudi, Sandeep Silwal, Hao Wu

    Abstract: We design improved approximation algorithms for NP-hard graph problems by incorporating predictions (e.g., learned from past data). Our prediction model builds upon and extends the $\varepsilon$-prediction framework by Cohen-Addad, d'Orsi, Gupta, Lee, and Panigrahi (NeurIPS 2024). We consider an edge-based version of this model, where each edge provides two bits of information, corresponding to pr…

    Submitted 29 May, 2025; originally announced May 2025.

  37. arXiv:2505.23946  [pdf, ps, other]

    cs.AI cs.LG cs.MA cs.SE

    Lessons Learned: A Multi-Agent Framework for Code LLMs to Learn and Improve

    Authors: Yuanzhe Liu, Ryan Deng, Tim Kaler, Xuhao Chen, Charles E. Leiserson, Yao Ma, Jie Chen

    Abstract: Recent studies show that LLMs possess different skills and specialize in different tasks. In fact, we observe that their varied performance occur in several levels of granularity. For example, in the code optimization task, code LLMs excel at different optimization categories and no one dominates others. This observation prompts the question of how one leverages multiple LLM agents to solve a codi…

    Submitted 29 May, 2025; originally announced May 2025.

  38. arXiv:2505.23826  [pdf, ps, other]

    cs.SI

    FinRipple: Aligning Large Language Models with Financial Market for Event Ripple Effect Awareness

    Authors: Yuanjian Xu, Jianing Hao, Kunsheng Tang, Jingnan Chen, Anxian Liu, Peng Liu, Guang Zhang

    Abstract: Financial markets exhibit complex dynamics where localized events trigger ripple effects across entities. Previous event studies, constrained by static single-company analyses and simplistic assumptions, fail to capture these ripple effects. While large language models (LLMs) offer emergent reasoning capabilities, their direct application falters due to structural market unawareness and limited ca…

    Submitted 28 May, 2025; originally announced May 2025.

  39. arXiv:2505.23802  [pdf, ps, other]

    cs.CL cs.AI

    MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks

    Authors: Suhana Bedi, Hejie Cui, Miguel Fuentes, Alyssa Unell, Michael Wornow, Juan M. Banda, Nikesh Kotecha, Timothy Keyes, Yifan Mai, Mert Oez, Hao Qiu, Shrey Jain, Leonardo Schettini, Mehr Kashyap, Jason Alan Fries, Akshay Swaminathan, Philip Chung, Fateme Nateghi, Asad Aali, Ashwin Nayak, Shivam Vedak, Sneha S. Jain, Birju Patel, Oluseyi Fayanju, Shreya Shah, et al. (56 additional authors not shown)

    Abstract: While large language models (LLMs) achieve near-perfect scores on medical licensing exams, these evaluations inadequately reflect the complexity and diversity of real-world clinical practice. We introduce MedHELM, an extensible evaluation framework for assessing LLM performance for medical tasks with three key contributions. First, a clinician-validated taxonomy spanning 5 categories, 22 subcatego…

    Submitted 2 June, 2025; v1 submitted 26 May, 2025; originally announced May 2025.

  40. Position Dependent Prediction Combination For Intra-Frame Video Coding

    Authors: Amir Said, Xin Zhao, Marta Karczewicz, Jianle Chen, Feng Zou

    Abstract: Intra-frame prediction in the High Efficiency Video Coding (HEVC) standard can be empirically improved by applying sets of recursive two-dimensional filters to the predicted values. However, this approach does not allow (or complicates significantly) the parallel computation of pixel predictions. In this work we analyze why the recursive filters are effective, and use the results to derive sets of…

    Submitted 29 May, 2025; originally announced May 2025.

    Journal ref: 2016 IEEE International Conference on Image Processing

  41. arXiv:2505.23646  [pdf, ps, other]

    cs.CL cs.LG

    Are Reasoning Models More Prone to Hallucination?

    Authors: Zijun Yao, Yantao Liu, Yanxu Chen, Jianhui Chen, Junfeng Fang, Lei Hou, Juanzi Li, Tat-Seng Chua

    Abstract: Recently evolved large reasoning models (LRMs) show powerful performance in solving complex tasks with long chain-of-thought (CoT) reasoning capability. As these LRMs are mostly developed by post-training on formal reasoning tasks, whether they generalize the reasoning capability to help reduce hallucination in fact-seeking tasks remains unclear and debated. For instance, DeepSeek-R1 reports incre…

    Submitted 29 May, 2025; originally announced May 2025.

  42. arXiv:2505.23627  [pdf, ps, other]

    cs.LG

    Prompting Whisper for Improved Verbatim Transcription and End-to-end Miscue Detection

    Authors: Griffin Dietz Smith, Dianna Yee, Jennifer King Chen, Leah Findlater

    Abstract: Identifying mistakes (i.e., miscues) made while reading aloud is commonly approached post-hoc by comparing automatic speech recognition (ASR) transcriptions to the target reading text. However, post-hoc methods perform poorly when ASR inaccurately transcribes verbatim speech. To improve on current methods for reading error annotation, we propose a novel end-to-end architecture that incorporates th…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: Interspeech 2025

  43. arXiv:2505.23530  [pdf, ps, other]

    hep-ex

    Measurement of the Lund plane for light- and beauty-quark jets

    Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis, L. An, et al. (1133 additional authors not shown)

    Abstract: The substructure of jets in quantum chromodynamics (QCD) has garnered significant attention with the advent of infrared- and collinear-safe clustering algorithms and observables. A key question emerging from these studies is how in-jet emissions at soft and hard energy scales, across collinear and wide angles relative to the emitter, differ with the mass of the emitting parton. The Lund jet plane…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2025-010.html (LHCb public pages)

    Report number: LHCb-PAPER-2025-010,CERN-EP-2025-093

  44. arXiv:2505.23440  [pdf, ps, other]

    math.DG

    Comparison of total $σ_k$-curvature

    Authors: Jiaqi Chen, Yufei Shan, Yinghui Ye

    Abstract: Volume comparison theorems are fundamental results in Riemannian geometry. In this article, we extend the volume comparison result in \cite{Besse2008} to the comparison of total $σ_l$-curvature with respect to $σ_k$-curvature ($l<k$). In particular, we prove the comparison holds for metrics close to a strictly stable positive Einstein metric with $l<\frac{n}{2}$. As for negative Einstein met…

    Submitted 9 June, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

  45. arXiv:2505.23298  [pdf, ps, other]

    cs.SD cs.IR eess.AS

    Bridging the Gap Between Semantic and User Preference Spaces for Multi-modal Music Representation Learning

    Authors: Xiaofeng Pan, Jing Chen, Haitong Zhang, Menglin Xing, Jiayi Wei, Xuefeng Mu, Zhongqian Xie

    Abstract: Recent work on music representation learning mainly focuses on learning acoustic music representations from unlabeled audio, or further attempts to acquire multi-modal music representations from scarce annotated audio-text pairs. These approaches either ignore the language semantics or rely on labeled audio datasets that are difficult and expensive to create. Moreover, merely modeling semantic space usually fai…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: ICMR 2025

  46. arXiv:2505.23221  [pdf, ps, other]

    astro-ph.GA

    The Chemical Clock of High-mass Star-forming Regions: N2H+/CCS

    Authors: J. L. Chen, J. S. Zhang, J. X. Ge, Y. X. Wang, H. Z. Yu, Y. P. Zou, Y. T. Yan, X. Y. Wang, D. Y. Wei

    Abstract: Using the IRAM 30 m telescope, we presented observations of N2H+ J = 1-0, CCS $J_N = 8_7$-$7_6$ and $7_7$-$6_6$ lines toward a large sample of ultracompact HII regions (UC HIIs). Among our 88 UC HIIs, 87 and 33 sources were detected in the N2H+ J = 1-0 and CCS $J_N = 8_7$-$7_6$ lines, respectively. For the CCS $7_7$-$6_6$ transition, we detected emission in 10 out of 82 targeted sources, all of which also exhibited emissio…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: This paper has been accepted for publication in The Astronomical Journal. 36 pages, 10 figures

  47. arXiv:2505.23201  [pdf, ps, other]

    cs.CV

    WTEFNet: Real-Time Low-Light Object Detection for Advanced Driver Assistance Systems

    Authors: Hao Wu, Junzhou Chen, Ronghui Zhang, Nengchao Lyu, Hongyu Hu, Yanyong Guo, Tony Z. Qiu

    Abstract: Object detection is a cornerstone of environmental perception in advanced driver assistance systems (ADAS). However, most existing methods rely on RGB cameras, which suffer from significant performance degradation under low-light conditions due to poor image quality. To address this challenge, we propose WTEFNet, a real-time object detection framework specifically designed for low-light scenarios,…

    Submitted 29 May, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

    Comments: This paper is expected to be submitted to IEEE Transactions on Instrumentation and Measurement

  48. arXiv:2505.23143  [pdf, ps, other]

    cs.CV

    Interpreting Chest X-rays Like a Radiologist: A Benchmark with Clinical Reasoning

    Authors: Jinquan Guan, Qi Chen, Lizhou Liang, Yuhang Liu, Vu Minh Hieu Phan, Minh-Son To, Jian Chen, Yutong Xie

    Abstract: Artificial intelligence (AI)-based chest X-ray (CXR) interpretation assistants have demonstrated significant progress and are increasingly being applied in clinical settings. However, contemporary medical AI models often adhere to a simplistic input-to-output paradigm, directly processing an image and an instruction to generate a result, where the instructions may be integral to the model's archit…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: 10 pages (main text), 18 pages (appendix)

  49. arXiv:2505.23015  [pdf, ps, other]

    cs.CL

    Detecting Stealthy Backdoor Samples based on Intra-class Distance for Large Language Models

    Authors: Jinwen Chen, Hainan Zhang, Fei Sun, Qinnan Zhang, Sijia Wen, Ziwei Wang, Zhiming Zheng

    Abstract: Fine-tuning LLMs with datasets containing stealthy backdoors from publishers poses security risks to downstream applications. Mainstream detection methods either identify poisoned samples by analyzing the prediction probability of poisoned classification models or rely on the rewriting model to eliminate the stealthy triggers. However, the former cannot be applied to generation tasks, while the la…

    Submitted 28 May, 2025; originally announced May 2025.

  50. arXiv:2505.22977  [pdf, ps, other]

    cs.CV

    HyperMotion: DiT-Based Pose-Guided Human Image Animation of Complex Motions

    Authors: Shuolin Xu, Siming Zheng, Ziyi Wang, HC Yu, Jinwei Chen, Huaqi Zhang, Bo Li, Peng-Tao Jiang

    Abstract: Recent advances in diffusion models have significantly improved conditional video generation, particularly in the pose-guided human image animation task. Although existing methods are capable of generating high-fidelity and time-consistent animation sequences in regular motions and static scenes, there are still obvious limitations when facing complex human body motions (Hypermotion) that contain…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: 17 pages, 7 figures