
Showing 1–50 of 10,651 results for author: Yue

  1. Real-time Terrain Analysis for Off-road Autonomous Vehicles

    Authors: Edwina Lewis, Aditya Parameshwaran, Laura Redmond, Yue Wang

    Abstract: This research addresses critical autonomous vehicle control challenges arising from road roughness variation, which induces course deviations and potential loss of road contact during steering operations. We present a novel real-time road roughness estimation system employing Bayesian calibration methodology that processes axle accelerations to predict terrain roughness with quantifiable confidenc… ▽ More

    Submitted 26 June, 2025; originally announced June 2025.

    Journal ref: SAE Technical Papers 2025-01-8343

  2. arXiv:2506.21289  [pdf, ps, other]

    physics.acc-ph

    Dynamic Focusing to Suppress Emittance Transfer in Crab-Crossing Flat Beam Collisions

    Authors: Derong Xu, J Scott Berg, Michael M Blaskiewicz, Yue Hao, Yun Luo, Christoph Montag, Sergei Nagaitsev, Boris Podobedov, Vadim Ptitsyn, Ferdinand Willeke, Binping Xiao

    Abstract: Flat hadron beam collisions, though expected to enhance peak luminosity by about an order of magnitude, have not yet been demonstrated. Our study reveals a critical limitation: realistic fluctuations, when amplified by synchro-betatron resonance, lead to transverse emittance transfer in flat-beam collisions. Using beam-beam simulations based on Electron-Ion Collider design parameters, we show that… ▽ More

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: 5 figures

  3. arXiv:2506.21030  [pdf, ps, other]

    cs.RO

    STEP Planner: Constructing cross-hierarchical subgoal tree as an embodied long-horizon task planner

    Authors: Zhou Tianxing, Wang Zhirui, Ao Haojia, Chen Guangyan, Xing Boyang, Cheng Jingwen, Yang Yi, Yue Yufeng

    Abstract: The ability to perform reliable long-horizon task planning is crucial for deploying robots in real-world environments. However, directly employing Large Language Models (LLMs) as action sequence generators often results in low success rates due to their limited reasoning ability for long-horizon embodied tasks. In the STEP framework, we construct a subgoal tree through a pair of closed-loop models… ▽ More

    Submitted 26 June, 2025; originally announced June 2025.

  4. Statistical Strong Lensing as a Test of Conformal Gravity

    Authors: Li-Xue Yue, Da-Ming Chen

    Abstract: As an alternative gravitational theory to General Relativity (GR), Conformal Gravity (CG) can be verified through astronomical observations. Currently, Mannheim and Kazanas have provided vacuum solutions for cosmological and local gravitational systems, and these solutions may resolve the dark matter and dark energy issues encountered in GR, making them particularly valuable. For static, spherical… ▽ More

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: 19 pages, 3 figures, 2 tables, Published in Universe Journal

    Journal ref: Universe 2025, 11(6), 178

  5. arXiv:2506.20406  [pdf, ps, other]

    stat.ML cs.IT cs.LG stat.ME

    POLAR: A Pessimistic Model-based Policy Learning Algorithm for Dynamic Treatment Regimes

    Authors: Ruijia Zhang, Zhengling Qi, Yue Wu, Xiangyu Zhang, Yanxun Xu

    Abstract: Dynamic treatment regimes (DTRs) provide a principled framework for optimizing sequential decision-making in domains where decisions must adapt over time in response to individual trajectories, such as healthcare, education, and digital interventions. However, existing statistical methods often rely on strong positivity assumptions and lack robustness under partial data coverage, while offline rei… ▽ More

    Submitted 25 June, 2025; originally announced June 2025.

  6. Semantic-enhanced Modality-asymmetric Retrieval for Online E-commerce Search

    Authors: Zhigong Zhou, Ning Ding, Xiaochuan Fan, Yue Shang, Yiming Qiu, Jingwei Zhuo, Zhiwei Ge, Songlin Wang, Lin Liu, Sulong Xu, Han Zhang

    Abstract: Semantic retrieval, which retrieves semantically matched items given a textual query, has been an essential component to enhance system effectiveness in e-commerce search. In this paper, we study the multimodal retrieval problem, where the visual information (e.g., an image) of an item is leveraged as a supplement to textual information to enrich item representation and further improve retrieval perform… ▽ More

    Submitted 25 June, 2025; originally announced June 2025.

    Comments: Published in SIGIR 2023

  7. arXiv:2506.20225  [pdf]

    cs.DL physics.soc-ph

    The role of preprints in open science: Accelerating knowledge transfer from science to technology

    Authors: Zhiqi Wang, Yue Chen, Chun Yang

    Abstract: Preprints have become increasingly essential in the landscape of open science, facilitating not only the exchange of knowledge within the scientific community but also bridging the gap between science and technology. However, the impact of preprints on technological innovation, given their unreviewed nature, remains unclear. This study fills this gap by conducting a comprehensive scientometric ana… ▽ More

    Submitted 26 June, 2025; v1 submitted 25 June, 2025; originally announced June 2025.

    Comments: Accepted manuscript for publication in Journal of Informetrics. The final version is available at DOI:10.1016/j.joi.2025.101663

    Journal ref: Journal of Informetrics (2025)

  8. arXiv:2506.20167  [pdf, ps, other]

    cs.CL cs.AI

    SEED: A Structural Encoder for Embedding-Driven Decoding in Time Series Prediction with LLMs

    Authors: Fengze Li, Yue Wang, Yangle Liu, Ming Huang, Dou Hong, Jieming Ma

    Abstract: Multivariate time series forecasting requires models to simultaneously capture variable-wise structural dependencies and generalize across diverse tasks. While structural encoders are effective in modeling feature interactions, they lack the capacity to support semantic-level reasoning or task adaptation. Conversely, large language models (LLMs) possess strong generalization capabilities but remai… ▽ More

    Submitted 25 June, 2025; originally announced June 2025.

  9. arXiv:2506.19918  [pdf, ps, other]

    hep-ph astro-ph.CO

    QCD Axion Domain Walls from Super-Cooling First Order Phase Transition

    Authors: Kun-Feng Lyu, Yue Zhao

    Abstract: The QCD axion is a well-motivated hypothetical particle beyond the Standard Model (SM) and a compelling dark matter candidate. Its relic abundance is highly sensitive to the thermal history of the universe when the temperature is around the QCD confinement scale. Meanwhile, the NANOGrav Collaboration has reported evidence for a stochastic gravitational wave background, which could originate from a… ▽ More

    Submitted 24 June, 2025; originally announced June 2025.

    Comments: 7 pages, 6 figures

  10. arXiv:2506.19694  [pdf, ps, other]

    cs.CV

    UltraAD: Fine-Grained Ultrasound Anomaly Classification via Few-Shot CLIP Adaptation

    Authors: Yue Zhou, Yuan Bi, Wenjuan Tong, Wei Wang, Nassir Navab, Zhongliang Jiang

    Abstract: Precise anomaly detection in medical images is critical for clinical decision-making. While recent unsupervised or semi-supervised anomaly detection methods trained on large-scale normal data show promising results, they lack fine-grained differentiation, such as benign vs. malignant tumors. Additionally, ultrasound (US) imaging is highly sensitive to devices and acquisition parameter variations,… ▽ More

    Submitted 24 June, 2025; originally announced June 2025.

  11. arXiv:2506.19563  [pdf, ps, other]

    cs.CR cs.AI

    PrivacyXray: Detecting Privacy Breaches in LLMs through Semantic Consistency and Probability Certainty

    Authors: Jinwen He, Yiyang Lu, Zijin Lin, Kai Chen, Yue Zhao

    Abstract: Large Language Models (LLMs) are widely used in sensitive domains, including healthcare, finance, and legal services, raising concerns about potential private information leaks during inference. Privacy extraction attacks, such as jailbreaking, expose vulnerabilities in LLMs by crafting inputs that force the models to output sensitive information. However, these attacks cannot verify whether the e… ▽ More

    Submitted 24 June, 2025; originally announced June 2025.

  12. arXiv:2506.19474  [pdf, ps, other]

    cs.CV

    HMSViT: A Hierarchical Masked Self-Supervised Vision Transformer for Corneal Nerve Segmentation and Diabetic Neuropathy Diagnosis

    Authors: Xin Zhang, Liangxiu Han, Yue Shi, Yanlin Zheng, Alam Uazman, Maryam Ferdousi, Rayaz Malik

    Abstract: Diabetic Peripheral Neuropathy (DPN) affects nearly half of diabetes patients, requiring early detection. Corneal Confocal Microscopy (CCM) enables non-invasive diagnosis, but automated methods suffer from inefficient feature extraction, reliance on handcrafted priors, and data limitations. We propose HMSViT, a novel Hierarchical Masked Self-Supervised Vision Transformer (HMSViT) designed for corn… ▽ More

    Submitted 24 June, 2025; originally announced June 2025.

  13. arXiv:2506.19358  [pdf, ps, other]

    eess.SP cs.AI

    From High-SNR Radar Signal to ECG: A Transfer Learning Model with Cardio-Focusing Algorithm for Scenarios with Limited Data

    Authors: Yuanyuan Zhang, Haocheng Zhao, Sijie Xiong, Rui Yang, Eng Gee Lim, Yutao Yue

    Abstract: Electrocardiogram (ECG), as a crucial fine-grained cardiac feature, has been successfully recovered from radar signals in the literature, but the performance heavily relies on the high-quality radar signal and numerous radar-ECG pairs for training, restricting the applications in new scenarios due to data scarcity. Therefore, this work will focus on radar-based ECG recovery in new scenarios with l… ▽ More

    Submitted 24 June, 2025; originally announced June 2025.

  14. arXiv:2506.19324  [pdf, ps, other]

    cs.CV

    Memory-Augmented Incomplete Multimodal Survival Prediction via Cross-Slide and Gene-Attentive Hypergraph Learning

    Authors: Mingcheng Qu, Guang Yang, Donglin Di, Yue Gao, Tonghua Su, Yang Song, Lei Fan

    Abstract: Multimodal pathology-genomic analysis is critical for cancer survival prediction. However, existing approaches predominantly integrate formalin-fixed paraffin-embedded (FFPE) slides with genomic data, while neglecting the availability of other preservation slides, such as Fresh Frozen (FF) slides. Moreover, as the high-resolution spatial nature of pathology data tends to dominate the cross-modality… ▽ More

    Submitted 24 June, 2025; originally announced June 2025.

    Comments: Accepted by MICCAI 2025. Code: https://github.com/MCPathology/M2Surv

  15. arXiv:2506.19288  [pdf, ps, other]

    cs.CV cs.RO

    Da Yu: Towards USV-Based Image Captioning for Waterway Surveillance and Scene Understanding

    Authors: Runwei Guan, Ningwei Ouyang, Tianhao Xu, Shaofeng Liang, Wei Dai, Yafeng Sun, Shang Gao, Songning Lai, Shanliang Yao, Xuming Hu, Ryan Wen Liu, Yutao Yue, Hui Xiong

    Abstract: Automated waterway environment perception is crucial for enabling unmanned surface vessels (USVs) to understand their surroundings and make informed decisions. Most existing waterway perception models primarily focus on instance-level object perception paradigms (e.g., detection, segmentation). However, due to the complexity of waterway environments, current perception datasets and models fail to… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: 14 pages, 13 figures

  16. arXiv:2506.19257  [pdf, ps, other]

    cs.CV cs.CL

    MSR-Align: Policy-Grounded Multimodal Alignment for Safety-Aware Reasoning in Vision-Language Models

    Authors: Yinan Xia, Yilei Jiang, Yingshui Tan, Xiaoyong Zhu, Xiangyu Yue, Bo Zheng

    Abstract: Vision-Language Models (VLMs) have achieved remarkable progress in multimodal reasoning tasks through enhanced chain-of-thought capabilities. However, this advancement also introduces novel safety risks, as these models become increasingly vulnerable to harmful multimodal prompts that can trigger unethical or unsafe behaviors. Existing safety alignment approaches, primarily designed for unimodal l… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

  17. arXiv:2506.19201  [pdf, ps, other]

    cs.RO

    The MOTIF Hand: A Robotic Hand for Multimodal Observations with Thermal, Inertial, and Force Sensors

    Authors: Hanyang Zhou, Haozhe Lou, Wenhao Liu, Enyu Zhao, Yue Wang, Daniel Seita

    Abstract: Advancing dexterous manipulation with multi-fingered robotic hands requires rich sensory capabilities, while existing designs lack onboard thermal and torque sensing. In this work, we propose the MOTIF hand, a novel multimodal and versatile robotic hand that extends the LEAP hand by integrating: (i) dense tactile information across the fingers, (ii) a depth sensor, (iii) a thermal camera, (iv), IM… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

  18. arXiv:2506.18951  [pdf, ps, other]

    cs.DB cs.AI

    SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications

    Authors: Jinyang Li, Xiaolong Li, Ge Qu, Per Jacobsson, Bowen Qin, Binyuan Hui, Shuzheng Si, Nan Huo, Xiaohan Xu, Yue Zhang, Ziwei Tang, Yuanshuai Li, Florensia Widjaja, Xintong Zhu, Feige Zhou, Yongfeng Huang, Yannis Papakonstantinou, Fatma Ozcan, Chenhao Ma, Reynold Cheng

    Abstract: Resolution of complex SQL issues persists as a significant bottleneck in real-world database applications. Current Large Language Models (LLMs), while adept at text-to-SQL translation, have not been rigorously evaluated on the more challenging task of debugging SQL issues. To address this gap, we introduce BIRD-CRITIC, a new SQL issue debugging benchmark comprising 530 PostgreSQL tasks (BIRD-CRITI… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: 26 pages, 9 figures

  19. arXiv:2506.18898  [pdf, ps, other]

    cs.CV cs.AI cs.CL cs.MM

    Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

    Authors: Jiaming Han, Hao Chen, Yang Zhao, Hanyu Wang, Qi Zhao, Ziyan Yang, Hao He, Xiangyu Yue, Lu Jiang

    Abstract: This paper presents a multimodal framework that attempts to unify visual understanding and generation within a shared discrete semantic representation. At its core is the Text-Aligned Tokenizer (TA-Tok), which converts images into discrete tokens using a text-aligned codebook projected from a large language model's (LLM) vocabulary. By integrating vision and text into a unified space with an expan… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: Project page: https://tar.csuhan.com

  20. arXiv:2506.18737  [pdf, ps, other]

    cs.CV cs.RO

    USVTrack: USV-Based 4D Radar-Camera Tracking Dataset for Autonomous Driving in Inland Waterways

    Authors: Shanliang Yao, Runwei Guan, Yi Ni, Sen Xu, Yong Yue, Xiaohui Zhu, Ryan Wen Liu

    Abstract: Object tracking in inland waterways plays a crucial role in safe and cost-effective applications, including waterborne transportation, sightseeing tours, environmental monitoring and surface rescue. Our Unmanned Surface Vehicle (USV), equipped with a 4D radar, a monocular camera, a GPS, and an IMU, delivers robust tracking capabilities in complex waterborne environments. By leveraging these sensor… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: Accepted by IROS

  21. arXiv:2506.18586  [pdf]

    cs.AI cs.CE cs.CL

    Airalogy: AI-empowered universal data digitization for research automation

    Authors: Zijie Yang, Qiji Zhou, Fang Guo, Sijie Zhang, Yexun Xi, Jinglei Nie, Yudian Zhu, Liping Huang, Chou Wu, Yonghe Xia, Xiaoyu Ma, Yingming Pu, Panzhong Lu, Junshu Pan, Mingtao Chen, Tiannan Guo, Yanmei Dou, Hongyu Chen, Anping Zeng, Jiaxing Huang, Tian Xu, Yue Zhang

    Abstract: Research data are the foundation of Artificial Intelligence (AI)-driven science, yet current AI applications remain limited to a few fields with readily available, well-structured, digitized datasets. Achieving comprehensive AI empowerment across multiple disciplines is still out of reach. Present-day research data collection is often fragmented, lacking unified standards, inefficiently managed, a… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: 146 pages, 6 figures, 49 supplementary figures

  22. arXiv:2506.18512  [pdf, ps, other]

    eess.IV cs.CL cs.CV q-bio.QM

    MedTVT-R1: A Multimodal LLM Empowering Medical Reasoning and Diagnosis

    Authors: Yuting Zhang, Kaishen Yuan, Hao Lu, Yutao Yue, Jintai Chen, Kaishun Wu

    Abstract: Accurate and interpretable multi-disease diagnosis remains a critical challenge in medical research, particularly when leveraging heterogeneous multimodal medical data. Current approaches often rely on single-modal data, limiting their ability to comprehensively understand complex diseases. To address this, we propose MedTVT-R1, a novel Multimodal Large Language Model (MLLM) framework designed to… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

  23. arXiv:2506.18333  [pdf, ps, other]

    cond-mat.mtrl-sci cond-mat.stat-mech

    Doping-induced Polyamorphic Transitions in Fluorite Oxides

    Authors: Hao Yang, Qiaotong Luan, Qing Zhang, Yuhao Yue, Yawen Xu, Xiaohui Liu, Zheng Wen, Zhaoru Sun

    Abstract: Fluorite oxides such as HfO$_2$ exhibit rich and tunable phase behavior, making them promising candidates for next generation electronic devices. A key challenge is to design amorphous HfO$_2$-based high-$k$ materials with both structural and performance stability. Here, using molecular dynamics simulations supported by experimental measurements, we reveal that Ba doping stimulates a polyamorphic… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: 4 figures

  24. arXiv:2506.18310  [pdf]

    physics.optics physics.app-ph

    Programmable electro-optic frequency comb empowers integrated parallel convolution processing

    Authors: Jinze He, Junzhe Qiang, Yiying Dong, Jingyi Wang, Tian Dong, Gongcheng Yue, Rongjin Zhuang, Mingze Lv, Siyuan Yu, Zhongjin Lin, Xinlun Cai, Yuanmu Yang, Guanhao Wu, Yang Li

    Abstract: Integrated photonic convolution processors make optical neural networks (ONNs) a transformative solution for artificial intelligence applications such as machine vision. To enhance the parallelism, throughput, and energy efficiency of ONNs, wavelength multiplexing is widely applied. However, it often encounters the challenges of low compactness, limited scalability, and high weight reconstruction… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

  25. arXiv:2506.18285  [pdf, ps, other]

    cs.LG cs.AI

    Learning Causal Graphs at Scale: A Foundation Model Approach

    Authors: Naiyu Yin, Tian Gao, Yue Yu

    Abstract: Due to its human-interpretability and invariance properties, the Directed Acyclic Graph (DAG) has been a foundational tool across various areas of AI research, leading to significant advancements. However, DAG learning remains highly challenging, due to its super-exponential growth in computational cost and identifiability issues, particularly in small-sample regimes. To address these two challenges,… ▽ More

    Submitted 23 June, 2025; originally announced June 2025.

  26. arXiv:2506.18234  [pdf, ps, other]

    cs.CV cs.RO

    Drive-R1: Bridging Reasoning and Planning in VLMs for Autonomous Driving with Reinforcement Learning

    Authors: Yue Li, Meng Tian, Dechang Zhu, Jiangtong Zhu, Zhenyu Lin, Zhiwei Xiong, Xinhai Zhao

    Abstract: Large vision-language models (VLMs) for autonomous driving (AD) are evolving beyond perception and cognition tasks toward motion planning. However, we identify two critical challenges in this direction: (1) VLMs tend to learn shortcuts by relying heavily on history input information, achieving seemingly strong planning results without genuinely understanding the visual inputs; and (2) the chain-of… ▽ More

    Submitted 22 June, 2025; originally announced June 2025.

  27. arXiv:2506.18222  [pdf]

    physics.flu-dyn math-ph math.AP nlin.CD physics.ao-ph

    The mechanism of tornadogenesis from the perspective of vortex tubes

    Authors: Peng Yue, Y. Charles Li, Jiamin Dang, Leigh Orf, Grace Yan

    Abstract: In this paper, we propose a new theory on tornadogenesis from the perspective of vortex tubes based on Kelvin-Helmholtz Theorems. When the pressure difference between the lowest pressure line from the wall cloud down to the ground and its surroundings is large enough, the increase of vorticity inside the squeezed vortex tube can reach the tornado level, and thus a tornado is born. When the pressur… ▽ More

    Submitted 22 June, 2025; originally announced June 2025.

  28. arXiv:2506.18028  [pdf, ps, other]

    cs.CV

    MiCo: Multiple Instance Learning with Context-Aware Clustering for Whole Slide Image Analysis

    Authors: Junjian Li, Hulin Kuang, Jin Liu, Hailin Yue, Mengshen He, Jianxin Wang

    Abstract: Multiple instance learning (MIL) has shown significant promise in histopathology whole slide image (WSI) analysis for cancer diagnosis and prognosis. However, the inherent spatial heterogeneity of WSIs presents critical challenges, as morphologically similar tissue types are often dispersed across distant anatomical regions. Conventional MIL methods struggle to model these scattered tissue distrib… ▽ More

    Submitted 25 June, 2025; v1 submitted 22 June, 2025; originally announced June 2025.

    Comments: MICCAI 2025

  29. arXiv:2506.17964  [pdf, ps, other]

    cs.CE

    Learning from the Storm: A Multivariate Machine Learning Approach to Predicting Hurricane-Induced Economic Losses

    Authors: Bolin Shen, Eren Erman Ozguven, Yue Zhao, Guang Wang, Yiqun Xie, Yushun Dong

    Abstract: Florida is particularly vulnerable to hurricanes, which frequently cause substantial economic losses. While prior studies have explored specific contributors to hurricane-induced damage, few have developed a unified framework capable of integrating a broader range of influencing factors to comprehensively assess the sources of economic loss. In this study, we propose a comprehensive modeling frame… ▽ More

    Submitted 22 June, 2025; originally announced June 2025.

  30. arXiv:2506.17924  [pdf, ps, other]

    math.OC eess.SY

    Inverse Chance Constrained Optimal Power Flow

    Authors: Shenglu Wang, Kairui Feng, Mengqi Xue, Yue Song

    Abstract: The chance constrained optimal power flow (CC-OPF) essentially finds the low-cost generation dispatch scheme ensuring operational constraints are met with a specified probability, termed the security level. While the security level is a crucial input parameter, how it shapes the CC-OPF feasibility boundary has not been revealed. Changing the security level from a parameter to a decision variable,… ▽ More

    Submitted 22 June, 2025; originally announced June 2025.

    Comments: 3 pages, 1 figure

  31. arXiv:2506.17733  [pdf, ps, other]

    cs.CV

    YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception

    Authors: Mengqi Lei, Siqi Li, Yihong Wu, Han Hu, You Zhou, Xinhu Zheng, Guiguang Ding, Shaoyi Du, Zongze Wu, Yue Gao

    Abstract: The YOLO series models reign supreme in real-time object detection due to their superior accuracy and computational efficiency. However, both the convolutional architectures of YOLO11 and earlier versions and the area-based self-attention mechanism introduced in YOLOv12 are limited to local information aggregation and pairwise correlation modeling, lacking the capability to capture global multi-to… ▽ More

    Submitted 21 June, 2025; originally announced June 2025.

  32. arXiv:2506.17611  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    OpusLM: A Family of Open Unified Speech Language Models

    Authors: Jinchuan Tian, William Chen, Yifan Peng, Jiatong Shi, Siddhant Arora, Shikhar Bharadwaj, Takashi Maekaku, Yusuke Shinohara, Keita Goto, Xiang Yue, Huck Yang, Shinji Watanabe

    Abstract: This paper presents Open Unified Speech Language Models (OpusLMs), a family of open foundational speech language models (SpeechLMs) up to 7B. Initialized from decoder-only text language models, the OpusLMs are continuously pre-trained on 213K hours of speech-text pairs and 292B text-only tokens. We demonstrate our OpusLMs achieve comparable (or even superior) performance with existing SpeechLMs in… ▽ More

    Submitted 21 June, 2025; originally announced June 2025.

  33. arXiv:2506.17609  [pdf, ps, other]

    cs.CL cs.LG

    TyphoFormer: Language-Augmented Transformer for Accurate Typhoon Track Forecasting

    Authors: Lincan Li, Eren Erman Ozguven, Yue Zhao, Guang Wang, Yiqun Xie, Yushun Dong

    Abstract: Accurate typhoon track forecasting is crucial for early system warning and disaster response. While Transformer-based models have demonstrated strong performance in modeling the temporal dynamics of dense trajectories of humans and vehicles in smart cities, they usually lack access to broader contextual knowledge that enhances the forecasting reliability of sparse meteorological trajectories, such… ▽ More

    Submitted 21 June, 2025; originally announced June 2025.

  34. arXiv:2506.17587  [pdf, ps, other]

    cs.CV cs.AI cs.LG

    HalluRNN: Mitigating Hallucinations via Recurrent Cross-Layer Reasoning in Large Vision-Language Models

    Authors: Le Yu, Kaishen Wang, Jianlong Xiong, Yue Cao, Tao He

    Abstract: Though Large Vision-Language Models (LVLMs) have achieved remarkable performance across various tasks, they are still prone to hallucinations: generating outputs that are textually plausible but visually ungrounded. While prior approaches generally address this issue through data-centric fine-tuning or innovative decoding strategies, these methods often require substantial resources or task-specifi… ▽ More

    Submitted 21 June, 2025; originally announced June 2025.

    Comments: 6 figures, 9 tables

  35. arXiv:2506.17554  [pdf, ps, other]

    astro-ph.GA

    Dynamics of Multiphase Carbon in the Turbulent Circumgalactic Medium

    Authors: Yue Hu, Evan Scannapieco, Edward Buie II, Siyao Xu, Samuel T Sebastian, Om Biswal

    Abstract: The circumgalactic medium (CGM) plays a crucial role in regulating material and energy exchange between galaxies and their environments. The best means of observing this medium is through absorption-line spectroscopy, but we have yet to develop a consistent physical model that fully explains these results. Here we investigate the impact of turbulence and non-equilibrium chemistry on the properties… ▽ More

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: 15 pages, 10 figures, accepted for publication in ApJ

  36. arXiv:2506.17335  [pdf, ps, other]

    cs.SE cs.AI

    LMR-BENCH: Evaluating LLM Agent's Ability on Reproducing Language Modeling Research

    Authors: Shuo Yan, Ruochen Li, Ziming Luo, Zimu Wang, Daoyang Li, Liqiang Jing, Kaiyu He, Peilin Wu, George Michalopoulos, Yue Zhang, Ziyang Zhang, Mian Zhang, Zhiyu Chen, Xinya Du

    Abstract: Large language model (LLM) agents have demonstrated remarkable potential in advancing scientific discovery. However, their capability in the fundamental yet crucial task of reproducing code from research papers, especially in the NLP domain, remains underexplored. This task includes unique complex reasoning challenges in the intellectual synthesis of abstract concepts and the comprehension of code… ▽ More

    Submitted 19 June, 2025; originally announced June 2025.

  37. arXiv:2506.17311  [pdf, ps, other]

    cs.CY

    Can Large Language Models Be Trusted Paper Reviewers? A Feasibility Study

    Authors: Chuanlei Li, Xu Hu, Minghui Xu, Kun Li, Yue Zhang, Xiuzhen Cheng

    Abstract: Academic paper review typically requires substantial time, expertise, and human resources. Large Language Models (LLMs) present a promising method for automating the review process due to their extensive training data, broad knowledge base, and relatively low usage cost. This work explores the feasibility of using LLMs for academic paper review by proposing an automated review system. The system i… ▽ More

    Submitted 18 June, 2025; originally announced June 2025.

  38. arXiv:2506.17219  [pdf, ps, other]

    cs.LG cs.AI

    No Free Lunch: Rethinking Internal Feedback for LLM Reasoning

    Authors: Yanzhi Zhang, Zhaoxi Zhang, Haoxiang Guan, Yilin Cheng, Yitong Duan, Chen Wang, Yue Wang, Shuxin Zheng, Jiyan He

    Abstract: Reinforcement learning has emerged as a powerful paradigm for post-training large language models (LLMs) to improve reasoning. Approaches like Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning with Verifiable Rewards (RLVR) have shown strong results, but they require extensive external supervision. We investigate an alternative class of methods, Reinforcement Learning fr… ▽ More

    Submitted 25 June, 2025; v1 submitted 20 June, 2025; originally announced June 2025.

  39. arXiv:2506.17113  [pdf, ps, other]

    cs.CV cs.AI cs.CL

    MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation

    Authors: Shoubin Yu, Yue Zhang, Ziyang Wang, Jaehong Yoon, Mohit Bansal

    Abstract: Combining pre-trained expert models offers substantial potential for scalable multimodal reasoning, but building a unified framework remains challenging due to the increasing diversity of input modalities and task complexity. For instance, medical diagnosis requires precise reasoning over structured clinical tables, while financial forecasting depends on interpreting plot-based data to make inform… ▽ More

    Submitted 20 June, 2025; originally announced June 2025.

    Comments: The first two authors contributed equally; Github link: https://github.com/Yui010206/MEXA

  40. arXiv:2506.16878  [pdf, ps, other]

    cs.SE

    Quantum Optimization for Software Engineering: A Survey

    Authors: Man Zhang, Yuechen Li, Tao Yue, Kai-Yuan Cai

    Abstract: Quantum computing, particularly in the area of quantum optimization, is steadily progressing toward practical applications, supported by an expanding range of hardware platforms and simulators. While Software Engineering (SE) optimization has a strong foundation, which is exemplified by the active Search-Based Software Engineering (SBSE) community and numerous classical optimization methods, the g… ▽ More

    Submitted 20 June, 2025; originally announced June 2025.

  41. arXiv:2506.16691  [pdf, ps, other]

    cs.CV

    LaVi: Efficient Large Vision-Language Models via Internal Feature Modulation

    Authors: Tongtian Yue, Longteng Guo, Yepeng Tang, Zijia Zhao, Xinxin Zhu, Hua Huang, Jing Liu

    Abstract: Despite the impressive advancements of Large Vision-Language Models (LVLMs), existing approaches suffer from a fundamental bottleneck: inefficient visual-language integration. Current methods either disrupt the model's inherent structure or introduce severe long-context computational burden, severely limiting scalability and efficiency. In this paper, we rethink multimodal integration and present… ▽ More

    Submitted 19 June, 2025; originally announced June 2025.

  42. arXiv:2506.16690  [pdf, ps, other]

    cs.CV

    DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches

    Authors: Yun Xing, Yue Cao, Nhat Chung, Jie Zhang, Ivor Tsang, Ming-Ming Cheng, Yang Liu, Lei Ma, Qing Guo

    Abstract: Stereo Depth estimation is a critical task in autonomous driving and robotics, where inaccuracies (such as misidentifying nearby objects as distant) can lead to dangerous situations. Adversarial attacks against stereo depth estimation can help reveal vulnerabilities before deployment. Previous work has shown that repeating optimized textures can effectively mislead stereo depth estimation in digit… ▽ More

    Submitted 19 June, 2025; originally announced June 2025.

  43. arXiv:2506.16683  [pdf, ps, other]

    cs.IR cs.AI

    A Simple Contrastive Framework Of Item Tokenization For Generative Recommendation

    Authors: Penglong Zhai, Yifang Yuan, Fanyi Di, Jie Li, Yue Liu, Chen Li, Jie Huang, Sicong Wang, Yao Xu, Xin Li

    Abstract: Generative retrieval-based recommendation has emerged as a promising paradigm aiming at directly generating the identifiers of the target candidates. However, in large-scale recommendation systems, this approach becomes increasingly cumbersome due to the redundancy and sheer scale of the token space. To overcome these limitations, recent research has explored the use of semantic tokens as an alter…

    Submitted 19 June, 2025; originally announced June 2025.

    Comments: 12 pages, 7 figures

  44. arXiv:2506.16255  [pdf, ps, other]

    astro-ph.IM cs.AI

    Category-based Galaxy Image Generation via Diffusion Models

    Authors: Xingzhong Fan, Hongming Tang, Yue Zeng, M. B. N. Kouwenhoven, Guangquan Zeng

    Abstract: Conventional galaxy generation methods rely on semi-analytical models and hydrodynamic simulations, which are highly dependent on physical assumptions and parameter tuning. In contrast, data-driven generative models do not have explicit physical parameters pre-determined, and instead learn them efficiently from observational data, making them alternative solutions to galaxy generation. Among these…

    Submitted 19 June, 2025; originally announced June 2025.

    Comments: 18 pages, 6 figures. Submitted to AAS Astronomical Journal (AJ) and is under revision. See another independent work for further reference -- Can AI Dream of Unseen Galaxies? Conditional Diffusion Model for Galaxy Morphology Augmentation (Ma, Sun et al.). Comments are welcome

  45. arXiv:2506.16210  [pdf, ps, other]

    eess.IV cs.CV

    From Coarse to Continuous: Progressive Refinement Implicit Neural Representation for Motion-Robust Anisotropic MRI Reconstruction

    Authors: Zhenxuan Zhang, Lipei Zhang, Yanqi Cheng, Zi Wang, Fanwen Wang, Haosen Zhang, Yue Yang, Yinzhe Wu, Jiahao Huang, Angelica I Aviles-Rivero, Zhifan Gao, Guang Yang, Peter J. Lally

    Abstract: In motion-robust magnetic resonance imaging (MRI), slice-to-volume reconstruction is critical for recovering anatomically consistent 3D brain volumes from 2D slices, especially under accelerated acquisitions or patient motion. However, this task remains challenging due to hierarchical structural disruptions. It includes local detail loss from k-space undersampling, global structural aliasing cause…

    Submitted 24 June, 2025; v1 submitted 19 June, 2025; originally announced June 2025.

  46. arXiv:2506.16093  [pdf, ps, other]

    cond-mat.mes-hall

    Finite Thickness Effects on Metallization Vs. Chiral Majorana Fermions

    Authors: Xin Yue, Guo-Jian Qiao, C. P. Sun

    Abstract: In heterostructures composed of quantum anomalous Hall insulators and \textit{s}-wave superconductors (SCs), metallization hinders the identification of chiral Majorana fermions (CMFs). In this Letter, we study how the thickness of SC affects the competition between metallization and CMFs by a holistic approach previously developed for hybrid nanowire systems [Phys. Rev. Lett. 133, 266605 (2024)].…

    Submitted 19 June, 2025; originally announced June 2025.

  47. arXiv:2506.16084  [pdf]

    q-bio.BM

    Aptamer-protein interaction prediction model based on transformer

    Authors: Zhichao Yan, Yue Kang, Buyong Ma

    Abstract: Aptamers are single-stranded DNA/RNAs or short peptides with unique tertiary structures that selectively bind to specific targets. They have great potential in the detection and medical fields. Here, we present SelfTrans-Ensemble, a deep learning model that integrates sequence information models and structural information models to extract multi-scale features for predicting aptamer-protein intera…

    Submitted 19 June, 2025; originally announced June 2025.

  48. arXiv:2506.16012  [pdf, ps, other]

    cs.RO

    DualTHOR: A Dual-Arm Humanoid Simulation Platform for Contingency-Aware Planning

    Authors: Boyu Li, Siyuan He, Hang Xu, Haoqi Yuan, Yu Zang, Liwei Hu, Junpeng Yue, Zhenxiong Jiang, Pengbo Hu, Börje F. Karlsson, Yehui Tang, Zongqing Lu

    Abstract: Developing embodied agents capable of performing complex interactive tasks in real-world scenarios remains a fundamental challenge in embodied AI. Although recent advances in simulation platforms have greatly enhanced task diversity to train embodied Vision Language Models (VLMs), most platforms rely on simplified robot morphologies and bypass the stochastic nature of low-level execution, which li…

    Submitted 19 June, 2025; originally announced June 2025.

  49. arXiv:2506.15533  [pdf, ps, other]

    hep-ex

    Measurements of the absolute branching fractions of the doubly Cabibbo-suppressed decays $D^+\to K^+π^0$, $D^+\to K^+η$ and $D^+\to K^+η^{\prime}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, H. Cai, et al. (697 additional authors not shown)

    Abstract: Using $20.3\,\rm fb^{-1}$ of $e^+e^-$ collision data collected at a center-of-mass energy of 3.773\,GeV with the BESIII detector, we present improved measurements of the absolute branching fractions of the doubly Cabibbo-suppressed decays $D^+\to K^+π^0$, $D^+\to K^+η$ and $ D^+ \to K^+ η^{\prime}$ with the double-tag method. The statistical significance of each signal decay exceeds $10σ$. The bra…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: 20 pages, 4 figures

  50. arXiv:2506.15490  [pdf, ps, other]

    quant-ph

    Symmetry in Multi-Qubit Correlated Noise Errors Enhances Surface Code Thresholds

    Authors: SiYing Wang, Yue Yan, ZhiXin Xia, Xiang-Bin Wang

    Abstract: Surface codes are promising for practical quantum error correction due to their high threshold and experimental feasibility. However, their performance under realistic noise conditions, particularly those involving correlated errors, requires further investigation. In this study, we investigate the impact of correlated errors on the error threshold. In particular, we focus on several distinct type…

    Submitted 18 June, 2025; originally announced June 2025.