Showing 1–50 of 373 results for author: Gao, K

  1. arXiv:2507.03263  [pdf, ps, other]

    cs.SE

    Analyzing C/C++ Library Migrations at the Package-level: Prevalence, Domains, Targets and Rationals across Seven Package Management Tools

    Authors: Haiqiao Gu, Yiliang Zhao, Kai Gao, Minghui Zhou

    Abstract: Library migration happens when a library cannot meet the project's requirements and is non-trivial to accomplish. To mitigate the problem, substantial efforts have been devoted to understanding its characteristics and recommending alternative libraries, especially for programming language (PL) ecosystems with a central package hosting platform, such as Python (PyPI). However, to the best of our k…

    Submitted 3 July, 2025; originally announced July 2025.

  2. arXiv:2506.21085  [pdf, ps, other]

    q-bio.BM cs.AI cs.LG

    CovDocker: Benchmarking Covalent Drug Design with Tasks, Datasets, and Solutions

    Authors: Yangzhe Peng, Kaiyuan Gao, Liang He, Yuheng Cong, Haiguang Liu, Kun He, Lijun Wu

    Abstract: Molecular docking plays a crucial role in predicting the binding mode of ligands to target proteins, and covalent interactions, which involve the formation of a covalent bond between the ligand and the target, are particularly valuable due to their strong, enduring binding nature. However, most existing docking methods and deep learning approaches hardly account for the formation of covalent bonds…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: Accepted to KDD 2025 Research Track

  3. arXiv:2506.17423  [pdf, ps, other]

    physics.atom-ph physics.optics

    A Liquid-Nitrogen-Cooled Ca+ Ion Optical Clock with a Systematic Uncertainty of 4.6E-19

    Authors: Baolin Zhang, Zixiao Ma, Yao Huang, Huili Han, Ruming Hu, Yuzhuo Wang, Huaqing Zhang, Liyan Tang, Tingyun Shi, Hua Guan, Kelin Gao

    Abstract: We report a single-ion optical clock based on the 4S_1/2-3D_5/2 transition of the 40Ca+ ion, operated in a liquid nitrogen cryogenic environment, achieving a total systematic uncertainty of 4.6E-19. We employ a refined temperature evaluation scheme to reduce the frequency uncertainty due to blackbody radiation (BBR), and the 3D sideband cooling has been implemented to minimize the second-order Dopp…

    Submitted 3 July, 2025; v1 submitted 20 June, 2025; originally announced June 2025.

    Comments: 12 pages, 14 figures

  4. arXiv:2506.15755  [pdf, ps, other]

    cs.CV cs.CL

    VLMInferSlow: Evaluating the Efficiency Robustness of Large Vision-Language Models as a Service

    Authors: Xiasi Wang, Tianliang Yao, Simin Chen, Runqi Wang, Lei YE, Kuofeng Gao, Yi Huang, Yuan Yao

    Abstract: Vision-Language Models (VLMs) have demonstrated great potential in real-world applications. While existing research primarily focuses on improving their accuracy, the efficiency remains underexplored. Given the real-time demands of many applications and the high inference overhead of VLMs, efficiency robustness is a critical issue. However, previous studies evaluate efficiency robustness under unr…

    Submitted 18 June, 2025; originally announced June 2025.

    Comments: Accepted by ACL 2025

  5. arXiv:2506.12355  [pdf, ps, other]

    cs.LG cs.CL

    QiMeng-Attention: SOTA Attention Operator is generated by SOTA Attention Algorithm

    Authors: Qirui Zhou, Shaohui Peng, Weiqiang Xiong, Haixin Chen, Yuanbo Wen, Haochen Li, Ling Li, Qi Guo, Yongwei Zhao, Ke Gao, Ruizhi Chen, Yanjun Wu, Chen Zhao, Yunji Chen

    Abstract: The attention operator remains a critical performance bottleneck in large language models (LLMs), particularly for long-context scenarios. While FlashAttention is the most widely used and effective GPU-aware acceleration algorithm, it requires time-consuming and hardware-specific manual implementation, limiting adaptability across GPU architectures. Existing LLMs have shown a lot of promise in…

    Submitted 14 June, 2025; originally announced June 2025.

    ACM Class: I.2.7

  6. arXiv:2505.24586  [pdf, ps, other]

    astro-ph.HE

    All-sky search for individual Primordial Black Hole bursts with LHAASO

    Authors: Zhen Cao, F. Aharonian, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, C. M. Cai, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, G. H. Chen, H. X. Chen, Liang Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, S. H. Chen , et al. (293 additional authors not shown)

    Abstract: Primordial Black Holes (PBHs) are hypothetical black holes with a wide range of masses that formed in the early universe. As a result, they may play an important cosmological role and provide a unique probe of the early universe. A PBH with an initial mass of approximately $10^{15}$ g is expected to explode today in a final burst of Hawking radiation. In this work, we conduct an all-sky search for…

    Submitted 2 June, 2025; v1 submitted 30 May, 2025; originally announced May 2025.

    Comments: 8 pages, 2 figures

  7. arXiv:2505.23177  [pdf, other]

    cs.CL

    Infinite-Instruct: Synthesizing Scaling Code instruction Data with Bidirectional Synthesis and Static Verification

    Authors: Wenjing Xing, Wenke Lu, Yeheng Duan, Bing Zhao, Zhenghui kang, Yaolong Wang, Kai Gao, Lei Qiao

    Abstract: Traditional code instruction data synthesis methods suffer from limited diversity and poor logic. We introduce Infinite-Instruct, an automated framework for synthesizing high-quality question-answer pairs, designed to enhance the code generation capabilities of large language models (LLMs). The framework focuses on improving the internal logic of synthesized problems and the quality of synthesized…

    Submitted 29 May, 2025; originally announced May 2025.

  8. arXiv:2505.19678  [pdf, other]

    cs.CL cs.CV

    Grounding Language with Vision: A Conditional Mutual Information Calibrated Decoding Strategy for Reducing Hallucinations in LVLMs

    Authors: Hao Fang, Changle Zhou, Jiawei Kong, Kuofeng Gao, Bin Chen, Tao Liang, Guojun Ma, Shu-Tao Xia

    Abstract: Large Vision-Language Models (LVLMs) are susceptible to hallucinations, where generated responses seem semantically plausible yet exhibit little or no relevance to the input image. Previous studies reveal that this issue primarily stems from LVLMs' over-reliance on language priors while disregarding the visual information during decoding. To alleviate this issue, we introduce a novel Conditional P…

    Submitted 26 May, 2025; originally announced May 2025.

  9. arXiv:2505.17601  [pdf, other]

    cs.CL

    Wolf Hidden in Sheep's Conversations: Toward Harmless Data-Based Backdoor Attacks for Jailbreaking Large Language Models

    Authors: Jiawei Kong, Hao Fang, Xiaochen Yang, Kuofeng Gao, Bin Chen, Shu-Tao Xia, Yaowei Wang, Min Zhang

    Abstract: Supervised fine-tuning (SFT) aligns large language models (LLMs) with human intent by training them on labeled task-specific data. Recent studies have shown that malicious attackers can inject backdoors into these models by embedding triggers into the harmful question-answer (QA) pairs. However, existing poisoning attacks face two critical limitations: (1) they are easily detected and filtered by…

    Submitted 28 May, 2025; v1 submitted 23 May, 2025; originally announced May 2025.

  10. arXiv:2505.15337  [pdf, other]

    cs.CL cs.AI

    Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text Detectors

    Authors: Hao Fang, Jiawei Kong, Tianqu Zhuang, Yixiang Qiu, Kuofeng Gao, Bin Chen, Shu-Tao Xia, Yaowei Wang, Min Zhang

    Abstract: The misuse of large language models (LLMs), such as academic plagiarism, has driven the development of detectors to identify LLM-generated texts. To bypass these detectors, paraphrase attacks have emerged to purposely rewrite these texts to evade detection. Despite the success, existing methods require substantial data and computational budgets to train a specialized paraphraser, and their attack…

    Submitted 26 May, 2025; v1 submitted 21 May, 2025; originally announced May 2025.

  11. arXiv:2505.14447  [pdf, ps, other]

    astro-ph.HE hep-ex

    First Identification and Precise Spectral Measurement of the Proton Component in the Cosmic-Ray 'Knee'

    Authors: The LHAASO Collaboration, Zhen Cao, F. Aharonian, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, C. M. Cai, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, G. H. Chen, H. X. Chen, Liang Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen , et al. (292 additional authors not shown)

    Abstract: We report the first high-purity identification of cosmic-ray (CR) protons and a precise measurement of their energy spectrum from 0.15 to 12 PeV using the Large High Altitude Air Shower Observatory (LHAASO). Abundant event statistics, combined with the simultaneous detection of electrons/photons, muons, and Cherenkov light in air showers, enable spectroscopic measurements with statistical and syst…

    Submitted 20 May, 2025; originally announced May 2025.

  12. arXiv:2505.06302  [pdf, other]

    cs.LG cs.AI

    QiMeng-TensorOp: Automatically Generating High-Performance Tensor Operators with Hardware Primitives

    Authors: Xuzhi Zhang, Shaohui Peng, Qirui Zhou, Yuanbo Wen, Qi Guo, Ruizhi Chen, Xinguo Zhu, Weiqiang Xiong, Haixin Chen, Congying Ma, Ke Gao, Chen Zhao, Yanjun Wu, Yunji Chen, Ling Li

    Abstract: Computation-intensive tensor operators constitute over 90% of the computations in Large Language Models (LLMs) and Deep Neural Networks. Automatically and efficiently generating high-performance tensor operators with hardware primitives is crucial for diverse and ever-evolving hardware architectures like RISC-V, ARM, and GPUs, as manually optimized implementation takes at least months and lacks po…

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: 10 pages, 5 figures

    ACM Class: I.2.2

  13. arXiv:2505.02824  [pdf, ps, other]

    cs.CV cs.AI cs.CR

    Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models

    Authors: Kuofeng Gao, Yufei Zhu, Yiming Li, Jiawang Bai, Yong Yang, Zhifeng Li, Shu-Tao Xia

    Abstract: Text-to-image (T2I) diffusion models have rapidly advanced, enabling high-quality image generation conditioned on textual prompts. However, the growing trend of fine-tuning pre-trained models for personalization raises serious concerns about unauthorized dataset usage. To combat this, dataset ownership verification (DOV) has emerged as a solution, embedding watermarks into the fine-tuning datasets…

    Submitted 5 May, 2025; originally announced May 2025.

  14. arXiv:2504.21101  [pdf, other]

    cond-mat.supr-con

    Enhanced superconductivity in X4H15 compounds via hole-doping at ambient pressure

    Authors: Kun Gao, Wenwen Cui, Tiago F. T. Cerqueira, Hai-Chen Wang, Silvana Botti, Miguel A. L. Marques

    Abstract: This study presents a computational investigation of X4H15 compounds (where X represents a metal) as potential superconductors at ambient conditions or under pressure. Through systematic density functional theory calculations and electron-phonon coupling analysis, we demonstrate that electronic structure engineering via hole doping dramatically enhances the superconducting properties of these mate…

    Submitted 29 April, 2025; originally announced April 2025.

  15. arXiv:2504.20094  [pdf, ps, other]

    cs.IR cs.CL cs.HC

    MATCHA: Can Multi-Agent Collaboration Build a Trustworthy Conversational Recommender?

    Authors: Zheng Hui, Xiaokai Wei, Yexi Jiang, Kevin Gao, Chen Wang, Frank Ong, Se-eun Yoon, Rachit Pareek, Michelle Gong

    Abstract: In this paper, we propose a multi-agent collaboration framework called MATCHA for conversational recommendation system, leveraging large language models (LLMs) to enhance personalization and user engagement. Users can request recommendations via free-form text and receive curated lists aligned with their interests, preferences, and constraints. Our system introduces specialized agents for intent a…

    Submitted 25 April, 2025; originally announced April 2025.

  16. arXiv:2504.19182  [pdf]

    physics.atom-ph

    Coulomb Crystallization of Highly Charged Ni^12+ Ions in a Linear Paul Trap

    Authors: Shaolong Chen, Zhiqiang Zhou, Guosheng Zhang, Jun Xiao, Yao Huang, Kelin Gao, Hua Guan

    Abstract: Optical clocks have garnered widespread attention due to their unparalleled precision in time-frequency standards, geodetic measurements, and fundamental physics research. Among emerging developments, highly charged ion (HCI)-based optical clocks have attracted significant scientific interest owing to their exceptional resilience against electromagnetic perturbations and enhanced sensitivity to va…

    Submitted 27 April, 2025; originally announced April 2025.

    Comments: 18 pages, 8 figures

  17. arXiv:2504.12100  [pdf, other]

    cs.CV

    Generalized Visual Relation Detection with Diffusion Models

    Authors: Kaifeng Gao, Siqi Chen, Hanwang Zhang, Jun Xiao, Yueting Zhuang, Qianru Sun

    Abstract: Visual relation detection (VRD) aims to identify relationships (or interactions) between object pairs in an image. Although recent VRD models have achieved impressive performance, they are all restricted to pre-defined relation categories, while failing to consider the semantic ambiguity characteristic of visual relations. Unlike objects, the appearance of visual relations is always subtle and can…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: Under review at IEEE TCSVT. The Appendix is provided additionally

  18. arXiv:2504.10812  [pdf, other]

    cs.RO cs.AI

    E2E Parking Dataset: An Open Benchmark for End-to-End Autonomous Parking

    Authors: Kejia Gao, Liguo Zhou, Mingjun Liu, Alois Knoll

    Abstract: End-to-end learning has shown great potential in autonomous parking, yet the lack of publicly available datasets limits reproducibility and benchmarking. While prior work introduced a visual-based parking model and a pipeline for data generation, training, and close-loop test, the dataset itself was not released. To bridge this gap, we create and open-source a high-quality dataset for end-to-end a…

    Submitted 14 April, 2025; originally announced April 2025.

  19. arXiv:2503.22747  [pdf, other]

    cs.LG cs.AI cs.ET

    LeForecast: Enterprise Hybrid Forecast by Time Series Intelligence

    Authors: Zheng Tan, Yiwen Nie, Wenfa Wu, Guanyu Zhang, Yanze Liu, Xinyuan Tian, Kailin Gao, Mengya Liu, Qijiang Cheng, Haipeng Jiang, Yingzheng Ma, Wei Zheng, Yuci Zhu, Yuanyuan Sun, Xiangyu Lei, Xiyu Guan, Wanqing Huang, Shouming Liu, Xiangquan Meng, Pengzhan Qu, Chao Yang, Jiaxuan Fan, Yuan He, Hongsheng Qi, Yangzhou Du

    Abstract: Demand is spiking in industrial fields for multidisciplinary forecasting, where a broad spectrum of sectors needs planning and forecasts to streamline intelligent business management, such as demand forecasting, product planning, inventory optimization, etc. Specifically, these tasks expect intelligent approaches to learn from sequentially collected historical data and then foresee most possibl…

    Submitted 26 March, 2025; originally announced March 2025.

  20. arXiv:2503.21824  [pdf, other]

    cs.CV cs.CR

    Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations

    Authors: Haitong Liu, Kuofeng Gao, Yang Bai, Jinmin Li, Jinxiao Shan, Tao Dai, Shu-Tao Xia

    Abstract: Recently, video-based large language models (video-based LLMs) have achieved impressive performance across various video comprehension tasks. However, this rapid advancement raises significant privacy and security concerns, particularly regarding the unauthorized use of personal video data in automated annotation by video-based LLMs. These unauthorized annotated video-text pairs can then be used t…

    Submitted 26 March, 2025; originally announced March 2025.

    Comments: Accepted by CVPR 2025

  21. arXiv:2503.11580  [pdf, other]

    quant-ph

    Complementary Collective Spin Descriptions of Superradiant Ramsey Spectroscopy

    Authors: Ke-Xin Gao, Yuan Zhang, Shi-Lei Su, Gang Chen, Chongxin Shan, Klaus Mølmer

    Abstract: A recent experiment demonstrated delayed superradiance from strontium-88 atoms, which are coupled to a longitudinal mode of a cavity while being excited by laser pulses propagating along a transversal direction [Nat. Commun. 15, 1084 (2024)]. A coherent picture of the atomic ensemble dynamics in this experiment requires complementary representations of the external driving dynamics and the superra…

    Submitted 15 May, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

    Comments: 12 pages, 10 figures

  22. arXiv:2503.11240  [pdf, other]

    cs.CV cs.LG

    Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards

    Authors: Zijing Hu, Fengda Zhang, Long Chen, Kun Kuang, Jiahui Li, Kaifeng Gao, Jun Xiao, Xin Wang, Wenwu Zhu

    Abstract: Diffusion models have achieved remarkable success in text-to-image generation. However, their practical applications are hindered by the misalignment between generated images and corresponding text prompts. To tackle this issue, reinforcement learning (RL) has been considered for diffusion model fine-tuning. Yet, RL's effectiveness is limited by the challenge of sparse reward, where feedback is on…

    Submitted 26 March, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

    Comments: Accepted to CVPR 2025, add references

  23. arXiv:2503.07203  [pdf]

    q-bio.MN

    POINT: a web-based platform for pharmacological investigation enhanced by multi-omics networks and knowledge graphs

    Authors: Zihao He, Liu Liu, Dongchen Han, Kai Gao, Lei Dong, Dechao Bu, Peipei Huo, Zhihao Wang, Wenxin Deng, Jingjia Liu, Jin-cheng Guo, Yi Zhao, Yang Wu

    Abstract: Network pharmacology (NP) explores pharmacological mechanisms through biological networks. Multi-omics data enable multi-layer network construction under diverse conditions, requiring integration into NP analyses. We developed POINT, a novel NP platform enhanced by multi-omics biological networks, advanced algorithms, and knowledge graphs (KGs) featuring network-based and KG-based analytical funct…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: 45 pages, 7 figures

  24. arXiv:2503.06687  [pdf, other]

    cs.LG cond-mat.mtrl-sci cs.AI physics.bio-ph physics.chem-ph

    UniGenX: Unified Generation of Sequence and Structure with Autoregressive Diffusion

    Authors: Gongbo Zhang, Yanting Li, Renqian Luo, Pipi Hu, Zeru Zhao, Lingbo Li, Guoqing Liu, Zun Wang, Ran Bi, Kaiyuan Gao, Liya Guo, Yu Xie, Chang Liu, Jia Zhang, Tian Xie, Robert Pinsler, Claudio Zeni, Ziheng Lu, Yingce Xia, Marwin Segler, Maik Riechert, Li Yuan, Lei Chen, Haiguang Liu, Tao Qin

    Abstract: Unified generation of sequence and structure for scientific data (e.g., materials, molecules, proteins) is a critical task. Existing approaches primarily rely on either autoregressive sequence models or diffusion models, each offering distinct advantages and facing notable limitations. Autoregressive models, such as GPT, Llama, and Phi-4, have demonstrated remarkable success in natural language ge…

    Submitted 9 March, 2025; originally announced March 2025.

  25. arXiv:2503.06353  [pdf]

    cs.CY cs.AI

    The AI Pentad, the CHARME$^{2}$D Model, and an Assessment of Current-State AI Regulation

    Authors: Di Kevin Gao, Sudip Mittal, Jiming Wu, Hongwei Du, Jingdao Chen, Shahram Rahimi

    Abstract: Artificial Intelligence (AI) has made remarkable progress in the past few years with AI-enabled applications beginning to permeate every aspect of our society. Despite the widespread consensus on the need to regulate AI, there remains a lack of a unified approach to framing, developing, and assessing AI regulations. Many of the existing methods take a value-based approach, for example, accountabil…

    Submitted 8 March, 2025; originally announced March 2025.

  26. arXiv:2503.01873  [pdf, other]

    cs.LG cs.AI cs.PF math.NA

    Online Pseudo-average Shifting Attention (PASA) for Robust Low-precision LLM Inference: Algorithms and Numerical Analysis

    Authors: Long Cheng, Qichen Liao, Fan Wu, Junlin Mu, Tengfei Han, Zhe Qiu, Lianqiang Li, Tianyi Liu, Fangzheng Miao, Keming Gao, Liang Wang, Zhen Zhang, Qiande Yin

    Abstract: Attention calculation is extremely time-consuming for long-sequence inference tasks, such as text or image/video generation, in large models. To accelerate this process, we developed a low-precision, mathematically-equivalent algorithm called PASA, based on Flash Attention. PASA introduces two novel techniques: online pseudo-average shifting and global recovering. These techniques enable the use o…

    Submitted 25 February, 2025; originally announced March 2025.

    Comments: 21 Pages, 14 figures, conference paper

  27. arXiv:2503.01853  [pdf]

    physics.chem-ph physics.flu-dyn

    Direct detonation initiation and propagation in methane/air mixtures containing coal particles

    Authors: Shengnan Li, Shangpeng Li, Shumeng Xie, Yong Xu, Ke Gao, Huangwei Zhang

    Abstract: The mechanisms of direct detonation initiation (DDI) in methane/air mixtures containing coal particles are investigated through simulations conducted using the Eulerian-Lagrangian method in a two-dimensional configuration. Methane-air combustion is modelled with a detailed chemical mechanism involving 36 species and 219 reactions, while coal particle surface reactions are computed using a kinetic/…

    Submitted 21 February, 2025; originally announced March 2025.

  28. arXiv:2503.00566  [pdf, ps, other]

    cs.AI cs.CL

    Instructor-Worker Large Language Model System for Policy Recommendation: a Case Study on Air Quality Analysis of the January 2025 Los Angeles Wildfires

    Authors: Kyle Gao, Dening Lu, Liangzhi Li, Nan Chen, Hongjie He, Linlin Xu, Jonathan Li

    Abstract: The Los Angeles wildfires of January 2025 caused more than 250 billion dollars in damage and lasted for nearly an entire month before containment. Following our previous work, the Digital Twin Building, we modify and leverage the multi-agent large language model framework as well as the cloud-mapping integration to study the air quality during the Los Angeles wildfires. Recent advances in large la…

    Submitted 5 June, 2025; v1 submitted 1 March, 2025; originally announced March 2025.

  29. arXiv:2502.18281  [pdf, other]

    cond-mat.supr-con

    The Maximum $T_c$ of Conventional Superconductors at Ambient Pressure

    Authors: Kun Gao, Tiago F. T. Cerqueira, Antonio Sanna, Yue-Wen Fang, Đorđe Dangić, Ion Errea, Hai-Chen Wang, Silvana Botti, Miguel A. L. Marques

    Abstract: The theoretical maximum critical temperature ($T_c$) for conventional superconductors at ambient pressure remains a fundamental question in condensed matter physics. Through analysis of electron-phonon calculations for over 20,000 metals, we critically examine this question. We find that while hydride metals can exhibit maximum phonon frequencies of more than 5000 K, the crucial logarithmic averag…

    Submitted 25 February, 2025; originally announced February 2025.

  30. arXiv:2502.15447  [pdf, other]

    astro-ph.HE hep-ph

    Ultra-high-energy $γ$-ray emission associated with the tail of a bow-shock pulsar wind nebula

    Authors: Zhen Cao, F. Aharonian, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, C. M. Cai, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, S. H. Chen, S. Z. Chen , et al. (274 additional authors not shown)

    Abstract: In this study, we present a comprehensive analysis of an unidentified point-like ultra-high-energy (UHE) $γ$-ray source, designated as 1LHAASO J1740+0948u, situated in the vicinity of the middle-aged pulsar PSR J1740+1000. The detection significance reached 17.1$σ$ (9.4$σ$) above 25 TeV (100 TeV). The source energy spectrum extended up to 300 TeV, which was well fitted by a log-parabola f…

    Submitted 24 February, 2025; v1 submitted 21 February, 2025; originally announced February 2025.

    Comments: Corrected spelling errors in several author names

    Journal ref: The Innovation (2025), 100802

  31. arXiv:2502.14934  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Fast and Accurate Blind Flexible Docking

    Authors: Zizhuo Zhang, Lijun Wu, Kaiyuan Gao, Jiangchao Yao, Tao Qin, Bo Han

    Abstract: Molecular docking, which predicts the bound structures of small molecules (ligands) to their protein targets, plays a vital role in drug discovery. However, existing docking methods often face limitations: they either overlook crucial structural changes by assuming protein rigidity or suffer from low computational efficiency due to their reliance on generative models for structure sampling. To addre…

    Submitted 20 February, 2025; originally announced February 2025.

    Comments: 25 pages, Accepted by ICLR 2025

  32. arXiv:2502.11184  [pdf, ps, other]

    cs.CL cs.AI cs.CV cs.MM

    Can't See the Forest for the Trees: Benchmarking Multimodal Safety Awareness for Multimodal LLMs

    Authors: Wenxuan Wang, Xiaoyuan Liu, Kuiyi Gao, Jen-tse Huang, Youliang Yuan, Pinjia He, Shuai Wang, Zhaopeng Tu

    Abstract: Multimodal Large Language Models (MLLMs) have expanded the capabilities of traditional language models by enabling interaction through both text and images. However, ensuring the safety of these models remains a significant challenge, particularly in accurately identifying whether multimodal content is safe or unsafe, a capability we term safety awareness. In this paper, we introduce MMSafeAware, t…

    Submitted 3 June, 2025; v1 submitted 16 February, 2025; originally announced February 2025.

    Comments: Accepted by ACL 2025

  33. arXiv:2502.09723  [pdf, other]

    cs.CR cs.AI cs.CL

    QueryAttack: Jailbreaking Aligned Large Language Models Using Structured Non-natural Query Language

    Authors: Qingsong Zou, Jingyu Xiao, Qing Li, Zhi Yan, Yuhang Wang, Li Xu, Wenxuan Wang, Kuofeng Gao, Ruoyu Li, Yong Jiang

    Abstract: Recent advances in large language models (LLMs) have demonstrated remarkable potential in the field of natural language processing. Unfortunately, LLMs face significant security and ethical risks. Although techniques such as safety alignment are developed for defense, prior research reveals the possibility of bypassing such defenses through well-designed jailbreak attacks. In this paper, we propo…

    Submitted 26 May, 2025; v1 submitted 13 February, 2025; originally announced February 2025.

    Comments: To appear in ACL 2025

  34. arXiv:2502.07527  [pdf, ps, other]

    cs.AI cs.LG

    Nature Language Model: Deciphering the Language of Nature for Scientific Discovery

    Authors: Yingce Xia, Peiran Jin, Shufang Xie, Liang He, Chuan Cao, Renqian Luo, Guoqing Liu, Yue Wang, Zequn Liu, Yuan-Jyue Chen, Zekun Guo, Yeqi Bai, Pan Deng, Yaosen Min, Ziheng Lu, Hongxia Hao, Han Yang, Jielan Li, Chang Liu, Jia Zhang, Jianwei Zhu, Ran Bi, Kehan Wu, Wei Zhang, Kaiyuan Gao , et al. (21 additional authors not shown)

    Abstract: Foundation models have revolutionized natural language processing and artificial intelligence, significantly enhancing how machines comprehend and generate human languages. Inspired by the success of these foundation models, researchers have developed foundation models for individual scientific domains, including small molecules, materials, proteins, DNA, RNA and even cells. However, these models…

    Submitted 20 June, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

    Comments: 95 pages

  35. arXiv:2502.06802  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    Solving the Content Gap in Roblox Game Recommendations: LLM-Based Profile Generation and Reranking

    Authors: Chen Wang, Xiaokai Wei, Yexi Jiang, Frank Ong, Kevin Gao, Xiao Yu, Zheng Hui, Se-eun Yoon, Philip Yu, Michelle Gong

    Abstract: With the vast and dynamic user-generated content on Roblox, creating effective game recommendations requires a deep understanding of game content. Traditional recommendation models struggle with the inconsistent and sparse nature of game text features such as titles and descriptions. Recent advancements in large language models (LLMs) offer opportunities to enhance recommendation systems by analyz…

    Submitted 1 February, 2025; originally announced February 2025.

  36. arXiv:2502.05822  [pdf, other]

    cs.IR

    HCMRM: A High-Consistency Multimodal Relevance Model for Search Ads

    Authors: Guobing Gan, Kaiming Gao, Li Wang, Shen Jiang, Peng Jiang

    Abstract: Search advertising is essential for merchants to reach the target users on short video platforms. Short video ads aligned with user search intents are displayed through relevance matching and bid ranking mechanisms. This paper focuses on improving query-to-video relevance matching to enhance the effectiveness of ranking in ad systems. Recent vision-language pre-training models have demonstrated pr…

    Submitted 9 February, 2025; originally announced February 2025.

    Comments: Accepted by WWW 2025 (Industry Track)

  37. arXiv:2502.05769  [pdf, other]

    cs.CV

    Digital Twin Buildings: 3D Modeling, GIS Integration, and Visual Descriptions Using Gaussian Splatting, ChatGPT/Deepseek, and Google Maps Platform

    Authors: Kyle Gao, Dening Lu, Liangzhi Li, Nan Chen, Hongjie He, Linlin Xu, Jonathan Li

    Abstract: Urban digital twins are virtual replicas of cities that use multi-source data and data analytics to optimize urban planning, infrastructure management, and decision-making. Towards this, we propose a framework focused on the single-building scale. By connecting to cloud mapping platforms such as Google Map Platforms APIs, by leveraging state-of-the-art multi-agent Large Language Models data analys…

    Submitted 20 April, 2025; v1 submitted 8 February, 2025; originally announced February 2025.

    Comments: -Fixed minor typo

  38. arXiv:2502.04848  [pdf, other]

    astro-ph.HE

    Broadband $γ$-ray spectrum of supernova remnant Cassiopeia A

    Authors: Zhen Cao, F. Aharonian, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, C. M. Cai, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, S. H. Chen, S. Z. Chen , et al. (293 additional authors not shown)

    Abstract: The core-collapse supernova remnant (SNR) Cassiopeia A (Cas A) is one of the brightest galactic radio sources with an angular radius of $\sim$ 2.5 $\arcmin$. Although no extension of this source has been detected in the $γ$-ray band, using more than 1000 days of LHAASO data above $\sim 0.8$ TeV, we find that its spectrum is significantly softer than those obtained with Imaging Air Cherenkov Telesc…

    Submitted 7 February, 2025; originally announced February 2025.

  39. arXiv:2502.02753  [pdf, other]

    cs.RO

    MuST: Multi-Head Skill Transformer for Long-Horizon Dexterous Manipulation with Skill Progress

    Authors: Kai Gao, Fan Wang, Erica Aduh, Dylan Randle, Jane Shi

    Abstract: Robot picking and packing tasks require dexterous manipulation skills, such as rearranging objects to establish a good grasping pose, or placing and pushing items to achieve tight packing. These tasks are challenging for robots due to the complexity and variability of the required actions. To tackle the difficulty of learning and executing long-horizon tasks, we propose a novel framework called th…

    Submitted 4 February, 2025; originally announced February 2025.

    Comments: Accepted by ICRA 2025 (2025 IEEE International Conference on Robotics & Automation)

  40. arXiv:2502.02493  [pdf, ps, other]

    cs.LG

    EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization

    Authors: Yize Wu, Ke Gao, Yanjun Wu

    Abstract: Speculative decoding is an effective and lossless method for Large Language Model (LLM) inference acceleration. It employs a smaller model to generate a draft token sequence, which is then verified by the original base model. In multi-GPU systems, inference latency can be further reduced through tensor parallelism (TP), while the optimal TP size of the draft model is typically smaller than that of…

    Submitted 4 February, 2025; originally announced February 2025.

    MSC Class: I.2.11

  41. arXiv:2501.19143  [pdf, other]

    cs.AI cs.CR cs.CV

    Imitation Game for Adversarial Disillusion with Multimodal Generative Chain-of-Thought Role-Play

    Authors: Ching-Chun Chang, Fan-Yun Chen, Shih-Hong Gu, Kai Gao, Hanrui Wang, Isao Echizen

    Abstract: As the cornerstone of artificial intelligence, machine perception confronts a fundamental threat posed by adversarial illusions. These adversarial attacks manifest in two primary forms: deductive illusion, where specific stimuli are crafted based on the victim model's general decision logic, and inductive illusion, where the victim model's general decision logic is shaped by specific stimuli. The…

    Submitted 31 January, 2025; originally announced January 2025.

  42. arXiv:2501.16663  [pdf, other]

    cs.CR cs.AI

    Data Duplication: A Novel Multi-Purpose Attack Paradigm in Machine Unlearning

    Authors: Dayong Ye, Tianqing Zhu, Jiayang Li, Kun Gao, Bo Liu, Leo Yu Zhang, Wanlei Zhou, Yang Zhang

    Abstract: Duplication is a prevalent issue within datasets. Existing research has demonstrated that the presence of duplicated data in training datasets can significantly influence both model performance and data privacy. However, the impact of data duplication on the unlearning process remains largely unexplored. This paper addresses this gap by pioneering a comprehensive investigation into the role of dat…

    Submitted 11 March, 2025; v1 submitted 27 January, 2025; originally announced January 2025.

    Comments: Accepted at USENIX Security 2025

  43. arXiv:2501.15396  [pdf]

    q-bio.QM cs.CV cs.LG eess.IV stat.AP

    Foundations of a Knee Joint Digital Twin from qMRI Biomarkers for Osteoarthritis and Knee Replacement

    Authors: Gabrielle Hoyer, Kenneth T Gao, Felix G Gassert, Johanna Luitjens, Fei Jiang, Sharmila Majumdar, Valentina Pedoia

    Abstract: This study forms the basis of a digital twin system of the knee joint, using advanced quantitative MRI (qMRI) and machine learning to advance precision health in osteoarthritis (OA) management and knee replacement (KR) prediction. We combined deep learning-based segmentation of knee joint structures with dimensionality reduction to create an embedded feature space of imaging biomarkers. Through cr…

    Submitted 25 January, 2025; originally announced January 2025.

    Comments: This manuscript builds on an earlier preprint version available on Research Square: https://doi.org/10.21203/rs.3.rs-4317958/v1

    Journal ref: npj Digit. Med. 8, 118 (2025)

  44. arXiv:2501.13340  [pdf, other]

    cs.CV

    Retrievals Can Be Detrimental: A Contrastive Backdoor Attack Paradigm on Retrieval-Augmented Diffusion Models

    Authors: Hao Fang, Xiaohang Sui, Hongyao Yu, Kuofeng Gao, Jiawei Kong, Sijin Yu, Bin Chen, Hao Wu, Shu-Tao Xia

    Abstract: Diffusion models (DMs) have recently demonstrated remarkable generation capability. However, their training generally requires huge computational resources and large-scale datasets. To solve these, recent studies empower DMs with the advanced Retrieval-Augmented Generation (RAG) technique and propose retrieval-augmented diffusion models (RDMs). By incorporating rich knowledge from an auxiliary dat…

    Submitted 9 March, 2025; v1 submitted 22 January, 2025; originally announced January 2025.

  45. arXiv:2501.12948  [pdf, other]

    cs.CL cs.AI cs.LG

    DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

    Authors: DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, Xiaokang Zhang, Xingkai Yu, Yu Wu, Z. F. Wu, Zhibin Gou, Zhihong Shao, Zhuoshu Li, Ziyi Gao, Aixin Liu, Bing Xue, Bingxuan Wang, Bochao Wu, Bei Feng, Chengda Lu, et al. (175 additional authors not shown)

    Abstract: We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing reasoning behaviors. However, it encounters…

    Submitted 22 January, 2025; originally announced January 2025.

  46. arXiv:2501.01843  [pdf, other]

    quant-ph cond-mat.quant-gas physics.optics

    A stable phase-locking-free single beam optical lattice with multiple configurations

    Authors: Yirong Wang, Xiaoyu Dai, Xue Zhao, Guangren Sun, Kuiyi Gao, Wei Zhang

    Abstract: Optical lattices formed by interfering laser beams are widely used to trap and manipulate atoms for quantum simulation, metrology, and computation. To stabilize optical lattices in experiments, it is usually challenging to implement delicate phase-locking systems with complicated optics and electronics to reduce the relative phase fluctuation of multiple laser beams. Here we report a phase-locking…

    Submitted 3 January, 2025; originally announced January 2025.

  47. arXiv:2501.00625  [pdf, ps, other]

    cs.CV cs.GR

    Gaussian Building Mesh (GBM): Extract a Building's 3D Mesh with Google Earth and Gaussian Splatting

    Authors: Kyle Gao, Liangzhi Li, Hongjie He, Dening Lu, Linlin Xu, Jonathan Li

    Abstract: Recently released open-source pre-trained foundational image segmentation and object detection models (SAM2+GroundingDINO) allow for geometrically consistent segmentation of objects of interest in multi-view 2D images. Users can use text-based or click-based prompts to segment objects of interest without requiring labeled training datasets. Gaussian Splatting allows for the learning of the 3D repr…

    Submitted 5 June, 2025; v1 submitted 31 December, 2024; originally announced January 2025.

  48. arXiv:2412.19437  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V3 Technical Report

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fucong Dai, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Han Bao, et al. (175 additional authors not shown)

    Abstract: We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for loa…

    Submitted 18 February, 2025; v1 submitted 26 December, 2024; originally announced December 2024.

  49. arXiv:2412.15398  [pdf, other]

    cs.RO

    Tabletop Object Rearrangement: Structure, Complexity, and Efficient Combinatorial Search-Based Solutions

    Authors: Kai Gao

    Abstract: This thesis provides an in-depth structural analysis and efficient algorithmic solutions for tabletop object rearrangement with overhand grasps (TORO), a foundational task in advancing intelligent robotic manipulation. Rearranging multiple objects in a confined workspace presents two primary challenges: sequencing actions to minimize pick-and-place operations - an NP-hard problem in TORO - and det…

    Submitted 31 January, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

    Comments: PhD Thesis of Kai Gao, written under the direction of Prof. Jingjin Yu

  50. arXiv:2412.12453  [pdf, other]

    cs.MM

    Multimodal Classification and Out-of-distribution Detection for Multimodal Intent Understanding

    Authors: Hanlei Zhang, Qianrui Zhou, Hua Xu, Jianhua Su, Roberto Evans, Kai Gao

    Abstract: Multimodal intent understanding is a significant research area that requires effective leveraging of multiple modalities to analyze human language. Existing methods face two main challenges in this domain. Firstly, they have limitations in capturing the nuanced and high-level semantics underlying complex in-distribution (ID) multimodal intents. Secondly, they exhibit poor generalization when confr…

    Submitted 23 May, 2025; v1 submitted 16 December, 2024; originally announced December 2024.

    Comments: Accepted by IEEE Transactions on Multimedia