Showing 1–50 of 557 results for author: Ye, Q

  1. arXiv:2507.02870  [pdf, ps, other]

    cs.CL

    Loki's Dance of Illusions: A Comprehensive Survey of Hallucination in Large Language Models

    Authors: Chaozhuo Li, Pengbo Wang, Chenxu Wang, Litian Zhang, Zheng Liu, Qiwei Ye, Yuanbo Xu, Feiran Huang, Xi Zhang, Philip S. Yu

    Abstract: Edgar Allan Poe noted, "Truth often lurks in the shadow of error," highlighting the deep complexity intrinsic to the interplay between truth and falsehood, notably under conditions of cognitive and informational asymmetry. This dynamic is strikingly evident in large language models (LLMs). Despite their impressive linguistic generation capabilities, LLMs sometimes produce information that appears…

    Submitted 6 June, 2025; originally announced July 2025.

  2. arXiv:2507.02757  [pdf, ps, other]

    astro-ph.EP astro-ph.GA astro-ph.IM

    Discovery and Preliminary Characterization of a Third Interstellar Object: 3I/ATLAS

    Authors: Darryl Z. Seligman, Marco Micheli, Davide Farnocchia, Larry Denneau, John W. Noonan, Henry H. Hsieh, Toni Santana-Ros, John Tonry, Katie Auchettl, Luca Conversi, Maxime Devogèle, Laura Faggioli, Adina D. Feinstein, Marco Fenucci, Marin Ferrais, Tessa Frincke, Olivier R. Hainaut, Kyle Hart, Andrew Hoffman, Carrie E. Holt, Willem B. Hoogendam, Mark E. Huber, Emmanuel Jehin, Theodore Kareta, Jacqueline V. Keane, et al. (20 additional authors not shown)

    Abstract: We report initial observations aimed at the characterization of a third interstellar object candidate. This object, 3I/ATLAS or C/2025 N1 (ATLAS), was discovered on 2025 July 1 UT and has an orbital eccentricity of $e\sim6.1$, perihelion of $q\sim 1.36$ au, inclination of $\sim175^\circ$, and hyperbolic velocity of $V_\infty\sim 58$ km s$^{-1}$. We report deep stacked images obtained using the Can…

    Submitted 7 July, 2025; v1 submitted 3 July, 2025; originally announced July 2025.

    Comments: Submitted to AAS Journals. 13 pages, 8 figures, 1 table. Community follow-up organization can be found here: https://3i-atlas.github.io/ The showyourwork! version of the manuscript can be found here: https://github.com/3I-ATLAS/discovery-paper

  3. arXiv:2507.00008  [pdf, other]

    cs.AI cs.CV cs.HC

    DiMo-GUI: Advancing Test-time Scaling in GUI Grounding via Modality-Aware Visual Reasoning

    Authors: Hang Wu, Hongkai Chen, Yujun Cai, Chang Liu, Qingwen Ye, Ming-Hsuan Yang, Yiwei Wang

    Abstract: Grounding natural language queries in graphical user interfaces (GUIs) poses unique challenges due to the diversity of visual elements, spatial clutter, and the ambiguity of language. In this paper, we introduce DiMo-GUI, a training-free framework for GUI grounding that leverages two core strategies: dynamic visual grounding and modality-aware optimization. Instead of treating the GUI as a monolit…

    Submitted 11 June, 2025; originally announced July 2025.

    Comments: 8 pages, 6 figures

  4. arXiv:2506.21600  [pdf, ps, other]

    cs.CL cs.AI cs.IR

    Structured Attention Matters to Multimodal LLMs in Document Understanding

    Authors: Chang Liu, Hongkai Chen, Yujun Cai, Hang Wu, Qingwen Ye, Ming-Hsuan Yang, Yiwei Wang

    Abstract: Document understanding remains a significant challenge for multimodal large language models (MLLMs). While previous research has primarily focused on locating evidence pages through precise multimodal queries, our work investigates a fundamental yet overlooked aspect: how input format influences document comprehension performance. Through systematic analysis, we discover that raw OCR text often im…

    Submitted 19 June, 2025; originally announced June 2025.

  5. arXiv:2506.19027  [pdf, ps, other]

    astro-ph.EP

    A Large Outburst, Coma Asymmetries, and the Color of Comet 243P/NEAT

    Authors: Michael S. P. Kelley, Silvia Protopapa, Dennis Bodewits, Aren N. Heinze, Youssef Moulane, Quanzhi Ye, Bryce Bolin, Simon Conseil, Tony L. Farnham, Lori Feaga, Xing Gao, Chih-Hao Hsia, Emmanuel Jehin, Shrinivas R. Kulkarni, Russ R. Laher, Tim Lister, Frank J. Masci, Josiah Purdum, Bin Yang

    Abstract: Water ice is a fundamental building material of comets and other bodies in the outer solar system. Yet, the properties of cometary water ice are challenging to study, due to its volatility and the typical distances at which comets are observed. Cometary outbursts, impulsive mass-loss events that can liberate large amounts of material, offer opportunities to directly observe and characterize cometa…

    Submitted 23 June, 2025; originally announced June 2025.

    Comments: 41 pages, 14 figures, 5 tables; accepted for publication in The Planetary Science Journal

  6. arXiv:2506.18324  [pdf, ps, other]

    eess.SP

    ARSAR-Net: Intelligent SAR Imaging with Adaptive Regularization

    Authors: Shiping Fu, Yufan Chen, Zhe Zhang, Xiaolan Qiu, Qixiang Ye

    Abstract: Deep unfolding networks have recently emerged as a promising approach for synthetic aperture radar (SAR) imaging. However, baseline unfolding networks, typically derived from iterative reconstruction algorithms such as the alternating direction method of multipliers (ADMM), lack generalization capability across scenes, primarily because their regularizers are empirically designed rather than learn…

    Submitted 26 June, 2025; v1 submitted 23 June, 2025; originally announced June 2025.

  7. arXiv:2506.15215  [pdf, ps, other]

    cs.CL

    MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs

    Authors: Yongqi Fan, Yating Wang, Guandong Wang, Jie Zhai, Jingping Liu, Qi Ye, Tong Ruan

    Abstract: Open-ended question answering (QA) is a key task for evaluating the capabilities of large language models (LLMs). Compared to closed-ended QA, it demands longer answer statements, more nuanced reasoning processes, and diverse expressions, making refined and interpretable automatic evaluation both crucial and challenging. Traditional metrics like ROUGE and BERTScore struggle to capture semantic sim…

    Submitted 18 June, 2025; originally announced June 2025.

  8. arXiv:2506.15150  [pdf, ps, other]

    cs.RO eess.SY

    Human Locomotion Implicit Modeling Based Real-Time Gait Phase Estimation

    Authors: Yuanlong Ji, Xingbang Yang, Ruoqi Zhao, Qihan Ye, Quan Zheng, Yubo Fan

    Abstract: Gait phase estimation based on inertial measurement unit (IMU) signals facilitates precise adaptation of exoskeletons to individual gait variations. However, challenges remain in achieving high accuracy and robustness, particularly during periods of terrain changes. To address this, we develop a gait phase estimation neural network based on implicit modeling of human locomotion, which combines tem…

    Submitted 18 June, 2025; originally announced June 2025.

  9. arXiv:2506.06915  [pdf]

    q-bio.BM cs.LG

    Graph Neural Networks in Modern AI-aided Drug Discovery

    Authors: Odin Zhang, Haitao Lin, Xujun Zhang, Xiaorui Wang, Zhenxing Wu, Qing Ye, Weibo Zhao, Jike Wang, Kejun Ying, Yu Kang, Chang-yu Hsieh, Tingjun Hou

    Abstract: Graph neural networks (GNNs), as topology/structure-aware models within deep learning, have emerged as powerful tools for AI-aided drug discovery (AIDD). By directly operating on molecular graphs, GNNs offer an intuitive and expressive framework for learning the complex topological and geometric features of drug-like molecules, cementing their role in modern molecular modeling. This review provide…

    Submitted 7 June, 2025; originally announced June 2025.

  10. arXiv:2506.05105  [pdf, ps, other]

    cond-mat.mtrl-sci physics.comp-ph

    Classification and enumeration of solid-solid phase transition mechanisms

    Authors: Fang-Cheng Wang, Qi-Jun Ye, Yu-Cheng Zhu, Xin-Zheng Li

    Abstract: Crystal-structure match (CSM), the atom-to-atom correspondence between two crystalline phases, is used extensively to describe solid-solid phase transition (SSPT) mechanisms. However, existing computational methods cannot account for all possible CSMs. Here, we propose a formalism to classify all CSMs into a tree structure, which is independent of the choices of unit cell and supercell. We rigorou…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: 22 pages, 14 figures

  11. arXiv:2506.03800  [pdf, ps, other]

    q-bio.BM

    STELLA: Towards Protein Function Prediction with Multimodal LLMs Integrating Sequence-Structure Representations

    Authors: Hongwang Xiao, Wenjun Lin, Xi Chen, Hui Wang, Kai Chen, Jiashan Li, Yuancheng Sun, Sicheng Dai, Boya Wu, Qiwei Ye

    Abstract: Protein biology focuses on the intricate relationships among sequences, structures, and functions. Deciphering protein functions is crucial for understanding biological processes, advancing drug discovery, and enabling synthetic biology applications. Since protein sequences determine tertiary structures, which in turn govern functions, integrating sequence and structure information is essential fo…

    Submitted 4 June, 2025; originally announced June 2025.

  12. arXiv:2505.23885  [pdf, ps, other]

    cs.AI cs.CL

    OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation

    Authors: Mengkang Hu, Yuhang Zhou, Wendong Fan, Yuzhou Nie, Bowei Xia, Tao Sun, Ziyu Ye, Zhaoxuan Jin, Yingru Li, Qiguang Chen, Zeyu Zhang, Yifeng Wang, Qianshuo Ye, Bernard Ghanem, Ping Luo, Guohao Li

    Abstract: Large Language Model (LLM)-based multi-agent systems show promise for automating real-world tasks but struggle to transfer across domains due to their domain-specific nature. Current approaches face two critical shortcomings: they require complete architectural redesign and full retraining of all components when applied to new domains. We introduce Workforce, a hierarchical multi-agent framework t…

    Submitted 10 June, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

    Comments: Project Page: https://github.com/camel-ai/owl

  13. arXiv:2505.23213  [pdf]

    physics.chem-ph

    Transparent and heat-insulation bionic hydrogel-based smart window system for long-term cooling and waste heat collection

    Authors: Qianwang Ye, Hanqing Dai, Yukun Yan, Liwei Wang, Xinlin Du, Yimeng Wang, Zhile Han, Wanlu Zhang, Ruiqian Guo

    Abstract: With the energy crisis and climate warming, the position of a new generation of smart windows is becoming increasingly important, and materials or systems that can have high blocking of near-infrared (NIR) and ultraviolet (UV) and high transmittance of visible light (VIS) are needed. Currently, it is difficult for smart heat-insulation materials to achieve high transmittance of VIS, good UV isolat…

    Submitted 29 May, 2025; originally announced May 2025.

  14. arXiv:2505.20989  [pdf, ps, other]

    cond-mat.stat-mech

    Determination of melting temperature of hexagonal ice using Lee-Yang phase transition theory

    Authors: Ling Liu, Yihua Dong, Qijun Ye, Xin-Zheng Li

    Abstract: Lee-Yang phase transition theory is a milestone in statistical physics. Its applications in realistic systems, however, had been substantially hindered by availability of practical schemes to calculate the Lee-Yang zeros. In this manuscript, we extend the scheme we have designed earlier [Phys. Rev. E 109, 024118 (2024)] and report simulation results for the melting temperature (T) of ice Ih under…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: 10 pages, 8 figures

  15. arXiv:2505.19656  [pdf, ps, other]

    cs.CV

    ReDDiT: Rehashing Noise for Discrete Visual Generation

    Authors: Tianren Ma, Xiaosong Zhang, Boyu Yang, Junlan Feng, Qixiang Ye

    Abstract: Discrete diffusion models are gaining traction in the visual generative area for their efficiency and compatibility. However, the pioneered attempts still fall behind the continuous counterparts, which we attribute to the noise (absorbing state) design and sampling heuristics. In this study, we propose the rehashing noise framework for discrete diffusion transformer, termed ReDDiT, to extend absor…

    Submitted 29 May, 2025; v1 submitted 26 May, 2025; originally announced May 2025.

    Comments: Preprint. Check out our project page at github.com/martian422/ReDDiT

  16. arXiv:2505.18947  [pdf, other]

    cs.CV

    OpenHOI: Open-World Hand-Object Interaction Synthesis with Multimodal Large Language Model

    Authors: Zhenhao Zhang, Ye Shi, Lingxiao Yang, Suting Ni, Qi Ye, Jingya Wang

    Abstract: Understanding and synthesizing realistic 3D hand-object interactions (HOI) is critical for applications ranging from immersive AR/VR to dexterous robotics. Existing methods struggle with generalization, performing well on closed-set objects and predefined tasks but failing to handle unseen objects or open-vocabulary instructions. We introduce OpenHOI, the first framework for open-world HOI synthes…

    Submitted 24 May, 2025; originally announced May 2025.

  17. arXiv:2505.16831  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Unlearning Isn't Deletion: Investigating Reversibility of Machine Unlearning in LLMs

    Authors: Xiaoyu Xu, Xiang Yue, Yang Liu, Qingqing Ye, Haibo Hu, Minxin Du

    Abstract: Unlearning in large language models (LLMs) is intended to remove the influence of specific data, yet current evaluations rely heavily on token-level metrics such as accuracy and perplexity. We show that these metrics can be misleading: models often appear to forget, but their original behavior can be rapidly restored with minimal fine-tuning, revealing that unlearning may obscure information rathe…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: 44 pages

  18. arXiv:2505.14451  [pdf, ps, other]

    cs.LG cs.AI

    RefiDiff: Refinement-Aware Diffusion for Efficient Missing Data Imputation

    Authors: Md Atik Ahamed, Qiang Ye, Qiang Cheng

    Abstract: Missing values in high-dimensional, mixed-type datasets pose significant challenges for data imputation, particularly under Missing Not At Random (MNAR) mechanisms. Existing methods struggle to integrate local and global data characteristics, limiting performance in MNAR and high-dimensional settings. We propose an innovative framework, RefiDiff, combining local machine learning predictions with a…

    Submitted 20 May, 2025; originally announced May 2025.

  19. arXiv:2505.13410  [pdf, ps, other]

    math.ST math.PR stat.ML

    Joint stochastic localization and applications

    Authors: Tom Alberts, Yiming Xu, Qiang Ye

    Abstract: Stochastic localization is a pathwise analysis technique originating from convex geometry. This paper explores certain algorithmic aspects of stochastic localization as a computational tool. First, we unify various existing stochastic localization schemes and discuss their localization rates and regularization. We then introduce a joint stochastic localization framework for constructing couplings…

    Submitted 19 May, 2025; originally announced May 2025.

    Comments: 34 pages

  20. arXiv:2505.12871  [pdf, other]

    cs.LG cs.AI cs.CL cs.CR

    Does Low Rank Adaptation Lead to Lower Robustness against Training-Time Attacks?

    Authors: Zi Liang, Haibo Hu, Qingqing Ye, Yaxin Xiao, Ronghua Li

    Abstract: Low rank adaptation (LoRA) has emerged as a prominent technique for fine-tuning large language models (LLMs) thanks to its superb efficiency gains over previous methods. While extensive studies have examined the performance and structural properties of LoRA, its behavior upon training-time attacks remain underexplored, posing significant security risks. In this paper, we theoretically investigate…

    Submitted 19 May, 2025; originally announced May 2025.

    Comments: To appear at ICML 25

  21. arXiv:2505.11932  [pdf, other]

    cs.CL cs.IR

    Neuro-Symbolic Query Compiler

    Authors: Yuyao Zhang, Zhicheng Dou, Xiaoxi Li, Jiajie Jin, Yongkang Wu, Zhonghua Li, Qi Ye, Ji-Rong Wen

    Abstract: Precise recognition of search intent in Retrieval-Augmented Generation (RAG) systems remains a challenging goal, especially under resource constraints and for complex queries with nested structures and dependencies. This paper presents QCompiler, a neuro-symbolic framework inspired by linguistic grammar rules and compiler design, to bridge this gap. It theoretically designs a minimal yet sufficien…

    Submitted 17 May, 2025; originally announced May 2025.

    Comments: Findings of ACL2025, codes are available at this url: https://github.com/YuyaoZhangQAQ/Query_Compiler

  22. arXiv:2505.10413  [pdf, other]

    cs.CL

    Hierarchical Document Refinement for Long-context Retrieval-augmented Generation

    Authors: Jiajie Jin, Xiaoxi Li, Guanting Dong, Yuyao Zhang, Yutao Zhu, Yongkang Wu, Zhonghua Li, Qi Ye, Zhicheng Dou

    Abstract: Real-world RAG applications often encounter long-context input scenarios, where redundant information and noise results in higher inference costs and reduced performance. To address these challenges, we propose LongRefiner, an efficient plug-and-play refiner that leverages the inherent structural characteristics of long documents. LongRefiner employs dual-level query analysis, hierarchical documen…

    Submitted 15 May, 2025; originally announced May 2025.

  23. arXiv:2505.09684  [pdf, ps, other]

    quant-ph

    Demonstration of low-overhead quantum error correction codes

    Authors: Ke Wang, Zhide Lu, Chuanyu Zhang, Gongyu Liu, Jiachen Chen, Yanzhe Wang, Yaozu Wu, Shibo Xu, Xuhao Zhu, Feitong Jin, Yu Gao, Ziqi Tan, Zhengyi Cui, Ning Wang, Yiren Zou, Aosai Zhang, Tingting Li, Fanhao Shen, Jiarun Zhong, Zehang Bao, Zitian Zhu, Yihang Han, Yiyang He, Jiayuan Shen, Han Wang, et al. (17 additional authors not shown)

    Abstract: Quantum computers hold the potential to surpass classical computers in solving complex computational problems. However, the fragility of quantum information and the error-prone nature of quantum operations make building large-scale, fault-tolerant quantum computers a prominent challenge. To combat errors, pioneering experiments have demonstrated a variety of quantum error correction codes. Yet, mo…

    Submitted 14 May, 2025; originally announced May 2025.

  24. arXiv:2505.07062  [pdf, ps, other]

    cs.CV cs.AI

    Seed1.5-VL Technical Report

    Authors: Dong Guo, Faming Wu, Feida Zhu, Fuxing Leng, Guang Shi, Haobin Chen, Haoqi Fan, Jian Wang, Jianyu Jiang, Jiawei Wang, Jingji Chen, Jingjia Huang, Kang Lei, Liping Yuan, Lishu Luo, Pengfei Liu, Qinghao Ye, Rui Qian, Shen Yan, Shixiong Zhao, Shuai Peng, Shuangye Li, Sihang Yuan, Sijin Wu, Tianheng Cheng, et al. (172 additional authors not shown)

    Abstract: We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning. Seed1.5-VL is composed with a 532M-parameter vision encoder and a Mixture-of-Experts (MoE) LLM of 20B active parameters. Despite its relatively compact architecture, it delivers strong performance across a wide spectrum of public VLM benchmarks and internal evaluati…

    Submitted 11 May, 2025; originally announced May 2025.

  25. arXiv:2505.04416  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models

    Authors: Xiaoyu Xu, Minxin Du, Qingqing Ye, Haibo Hu

    Abstract: Large language models (LLMs) trained over extensive corpora risk memorizing sensitive, copyrighted, or toxic content. To address this, we propose OBLIVIATE, a robust unlearning framework that removes targeted data while preserving model utility. The framework follows a structured process: extracting target tokens, building retain sets, and fine-tuning with a tailored loss function comprising three…

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: 18 pages, 2 figures

  26. arXiv:2505.00929  [pdf, other]

    cs.LG stat.ML

    Compact Recurrent Transformer with Persistent Memory

    Authors: Edison Mucllari, Zachary Daniels, David Zhang, Qiang Ye

    Abstract: The Transformer architecture has shown significant success in many language processing and visual tasks. However, the method faces challenges in efficiently scaling to long sequences because the self-attention computation is quadratic with respect to the input length. To overcome this limitation, several approaches scale to longer sequences by breaking long sequences into a series of segments, res…

    Submitted 1 May, 2025; originally announced May 2025.

  27. arXiv:2504.19314  [pdf, other]

    cs.CL

    BrowseComp-ZH: Benchmarking Web Browsing Ability of Large Language Models in Chinese

    Authors: Peilin Zhou, Bruce Leon, Xiang Ying, Can Zhang, Yifan Shao, Qichen Ye, Dading Chong, Zhiling Jin, Chenxuan Xie, Meng Cao, Yuxin Gu, Sixin Hong, Jing Ren, Jian Chen, Chao Liu, Yining Hua

    Abstract: As large language models (LLMs) evolve into tool-using agents, the ability to browse the web in real-time has become a critical yardstick for measuring their reasoning and retrieval competence. Existing benchmarks such as BrowseComp concentrate on English and overlook the linguistic, infrastructural, and censorship-related complexities of other major information ecosystems -- most notably Chinese.…

    Submitted 1 May, 2025; v1 submitted 27 April, 2025; originally announced April 2025.

    Comments: Under Review

  28. arXiv:2504.17523  [pdf, other]

    cs.DB cs.CR

    From Randomized Response to Randomized Index: Answering Subset Counting Queries with Local Differential Privacy

    Authors: Qingqing Ye, Liantong Yu, Kai Huang, Xiaokui Xiao, Weiran Liu, Haibo Hu

    Abstract: Local Differential Privacy (LDP) is the predominant privacy model for safeguarding individual data privacy. Existing perturbation mechanisms typically require perturbing the original values to ensure acceptable privacy, which inevitably results in value distortion and utility deterioration. In this work, we propose an alternative approach -- instead of perturbing values, we apply randomization to…

    Submitted 24 April, 2025; originally announced April 2025.

    Comments: This paper is accepted by IEEE S&P 2025

  29. arXiv:2504.17431  [pdf, other]

    hep-ph

    The resonance parameters of the vector charmonium-like state $G(3900)$

    Authors: Quanxing Ye, Zhenyu Zhang, Meng-Lin Du, Ulf-G. Meißner, Peng-Yu Niu, Qian Wang

    Abstract: Motivated by the updated analysis of the $G(3900)$ by the BESIII collaboration, we perform a global analysis of the cross sections of the $e^+e^-\to D\bar{D}$, $e^+e^-\to D\bar{D}^*+c.c.$, $e^+e^-\to D^*\bar{D}^*$ processes, especially focusing on the properties of the $G(3900)$. As the energy region of interest is limited by the next opening threshold, i.e. the $D_1\bar{D}$ threshold, we focus on…

    Submitted 24 April, 2025; originally announced April 2025.

    Comments: 22 pages, 8 figures

  30. arXiv:2504.14993  [pdf, other]

    cs.CR cs.DB

    Dual Utilization of Perturbation for Stream Data Publication under Local Differential Privacy

    Authors: Rong Du, Qingqing Ye, Yaxin Xiao, Liantong Yu, Yue Fu, Haibo Hu

    Abstract: Stream data from real-time distributed systems such as IoT, tele-health, and crowdsourcing has become an important data source. However, the collection and analysis of user-generated stream data raise privacy concerns due to the potential exposure of sensitive information. To address these concerns, local differential privacy (LDP) has emerged as a promising standard. Nevertheless, applying LDP to…

    Submitted 21 April, 2025; originally announced April 2025.

  31. arXiv:2504.13526  [pdf, other]

    cs.CR

    Multi-class Item Mining under Local Differential Privacy

    Authors: Yulian Mao, Qingqing Ye, Rong Du, Qi Wang, Kai Huang, Haibo Hu

    Abstract: Item mining, a fundamental task for collecting statistical data from users, has raised increasing privacy concerns. To address these concerns, local differential privacy (LDP) was proposed as a privacy-preserving technique. Existing LDP item mining mechanisms primarily concentrate on global statistics, i.e., those from the entire dataset. Nevertheless, they fall short of user-tailored tasks such a…

    Submitted 18 April, 2025; originally announced April 2025.

  32. Deep learning to improve the discovery of near-Earth asteroids in the Zwicky Transient Facility

    Authors: Belén Yu Irureta-Goyena, George Helou, Jean-Paul Kneib, Frank Masci, Thomas Prince, Kumar Venkataramani, Quanzhi Ye, Joseph Masiero, Frédéric Dux, Mathieu Salzmann

    Abstract: We present a novel pipeline that uses a convolutional neural network (CNN) to improve the detection capability of near-Earth asteroids (NEAs) in the context of planetary defense. Our work aims to minimize the dependency on human intervention of the current approach adopted by the Zwicky Transient Facility (ZTF). The target NEAs have a high proper motion of up to tens of degrees per day and thus ap…

    Submitted 30 May, 2025; v1 submitted 16 April, 2025; originally announced April 2025.

    Comments: Published in Publications of the Astronomical Society of the Pacific (Open Access)

    Journal ref: Publications of the Astronomical Society of the Pacific, 137:054503 (13pp), 2025 May

  33. arXiv:2504.10409  [pdf, other]

    cs.CV

    GPS: Distilling Compact Memories via Grid-based Patch Sampling for Efficient Online Class-Incremental Learning

    Authors: Mingchuan Ma, Yuhao Zhou, Jindi Lv, Yuxin Tian, Dan Si, Shujian Li, Qing Ye, Jiancheng Lv

    Abstract: Online class-incremental learning aims to enable models to continuously adapt to new classes with limited access to past data, while mitigating catastrophic forgetting. Replay-based methods address this by maintaining a small memory buffer of previous samples, achieving competitive performance. For effective replay under constrained storage, recent approaches leverage distilled data to enhance the…

    Submitted 14 April, 2025; v1 submitted 14 April, 2025; originally announced April 2025.

    Comments: 10 pages, 10 figures

  34. arXiv:2504.05953  [pdf, ps, other]

    math.CO

    On walk domination: Between different types of walks and $m_3$-path

    Authors: Hangdi Chen, Yuhan Ma, Qingjie Ye

    Abstract: This paper investigates the domination relationships among various types of walks connecting two non-adjacent vertices in a graph. In particular, we center our attention on the problem which is proposed in [S. B. Tondato, Graphs Combin. 40 (2024)]. A \textit{\( uv \)-\( m_3 \) path} is a \( uv \)-induced path of length at least three. A walk between two non-adjacent vertices in a graph $G$ is call…

    Submitted 8 April, 2025; originally announced April 2025.

  35. arXiv:2504.05618  [pdf, other]

    cs.LG cs.AI cs.CV cs.DB

    Technical Report: Full Version of Analyzing and Optimizing Perturbation of DP-SGD Geometrically

    Authors: Jiawei Duan, Haibo Hu, Qingqing Ye, Xinyue Sun

    Abstract: Differential privacy (DP) has become a prevalent privacy model in a wide range of machine learning tasks, especially after the debut of DP-SGD. However, DP-SGD, which directly perturbs gradients in the training iterations, fails to mitigate the negative impacts of noise on gradient direction. As a result, DP-SGD is often inefficient. Although various solutions (e.g., clipping to reduce the sensiti…

    Submitted 7 April, 2025; originally announced April 2025.

    Comments: This is the full version of our paper "Analyzing and Optimizing Perturbation of DP-SGD Geometrically", which will appear in ICDE 2025 as a regular research paper

    Journal ref: International Conference of Data Engineering (ICDE 2025)

  36. arXiv:2504.01482  [pdf, other]

    cs.LG math.NA

    A Robust Model-Based Approach for Continuous-Time Policy Evaluation with Unknown Lévy Process Dynamics

    Authors: Qihao Ye, Xiaochuan Tian, Yuhua Zhu

    Abstract: This paper develops a model-based framework for continuous-time policy evaluation (CTPE) in reinforcement learning, incorporating both Brownian and Lévy noise to model stochastic dynamics influenced by rare and extreme events. Our approach formulates the policy evaluation problem as solving a partial integro-differential equation (PIDE) for the value function with unknown coefficients. A key chall…

    Submitted 24 April, 2025; v1 submitted 2 April, 2025; originally announced April 2025.

    Comments: 28 pages, 9 figures

    MSC Class: 65R20; 62M05; 35R09; 60H35; 93E35; 90C40; 68T05

  37. arXiv:2503.21426  [pdf, other]

    cs.LG cs.CR

    AdvSGM: Differentially Private Graph Learning via Adversarial Skip-gram Model

    Authors: Sen Zhang, Qingqing Ye, Haibo Hu, Jianliang Xu

    Abstract: The skip-gram model (SGM), which employs a neural network to generate node vectors, serves as the basis for numerous popular graph embedding techniques. However, since the training datasets contain sensitive linkage information, the parameters of a released SGM may encode private information and pose significant privacy risks. Differential privacy (DP) is a rigorous standard for protecting individ…

    Submitted 27 March, 2025; originally announced March 2025.

    Comments: Accepted by ICDE 2025

  38. arXiv:2503.21269  [pdf, other]

    cs.CV

    Delving Deep into Semantic Relation Distillation

    Authors: Zhaoyi Yan, Kangjun Liu, Qixiang Ye

    Abstract: Knowledge distillation has become a cornerstone technique in deep learning, facilitating the transfer of knowledge from complex models to lightweight counterparts. Traditional distillation approaches focus on transferring knowledge at the instance level, but fail to capture nuanced semantic relationships within the data. In response, this paper introduces a novel methodology, Semantics-based Relat…

    Submitted 27 March, 2025; originally announced March 2025.

  39. arXiv:2503.12053  [pdf, other]

    cs.LG cs.AI

    Ferret: An Efficient Online Continual Learning Framework under Varying Memory Constraints

    Authors: Yuhao Zhou, Yuxin Tian, Jindi Lv, Mingjia Shi, Yuanxi Li, Qing Ye, Shuhao Zhang, Jiancheng Lv

    Abstract: In the realm of high-frequency data streams, achieving real-time learning within varying memory constraints is paramount. This paper presents Ferret, a comprehensive framework designed to enhance online accuracy of Online Continual Learning (OCL) algorithms while dynamically adapting to varying memory budgets. Ferret employs a fine-grained pipeline parallelism strategy combined with an iterative g…

    Submitted 15 March, 2025; originally announced March 2025.

    Comments: CVPR 2025

  40. arXiv:2503.08670  [pdf, other]

    astro-ph.EP

    In Search of the Potentially Hazardous Asteroids in the Taurid Resonant Swarm

    Authors: Jasmine Li, Quanzhi Ye, Denis Vida, David L. Clark, Eric C. Bellm, Richard Dekany, Matthew J. Graham, Frank J. Masci, Josiah Purdum, Benjamin Racine, Avery Wold

    Abstract: The Taurid Complex is a large interplanetary system that contains comet 2P/Encke, several meteoroid streams, and possibly a number of near-Earth asteroids. The size and nature of the system has led to the speculation that it was formed through a large-scale cometary breakup. Numerical investigations have suggested that planetary dynamics can create a resonant region with a large number of objects…

    Submitted 11 March, 2025; originally announced March 2025.

    Comments: PSJ in press

  41. arXiv:2503.08297  [pdf, other]

    cs.CR

    Privacy for Free: Leveraging Local Differential Privacy Perturbed Data from Multiple Services

    Authors: Rong Du, Qingqing Ye, Yue Fu, Haibo Hu

    Abstract: Local Differential Privacy (LDP) has emerged as a widely adopted privacy-preserving technique in modern data analytics, enabling users to share statistical insights while maintaining robust privacy guarantees. However, current LDP applications assume a single service gathering perturbed information from users. In reality, multiple services may be interested in collecting users' data, which poses p…

    Submitted 11 March, 2025; originally announced March 2025.

  42. arXiv:2503.07906  [pdf, other]

    cs.CV

    Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning

    Authors: Qinghao Ye, Xianhan Zeng, Fu Li, Chunyuan Li, Haoqi Fan

    Abstract: Image captioning has long been a pivotal task in visual understanding, with recent advancements in vision-language models (VLMs) significantly enhancing the ability to generate detailed image captions. However, the evaluation of detailed image captioning remains underexplored due to outdated evaluation metrics and coarse annotations. In this paper, we introduce DeCapBench along with a novel metric…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: Accepted by ICLR 2025

  43. arXiv:2503.05499  [pdf, other]

    cs.LG

    Mol-CADiff: Causality-Aware Autoregressive Diffusion for Molecule Generation

    Authors: Md Atik Ahamed, Qiang Ye, Qiang Cheng

    Abstract: The design of novel molecules with desired properties is a key challenge in drug discovery and materials science. Traditional methods rely on trial-and-error, while recent deep learning approaches have accelerated molecular generation. However, existing models struggle with generating molecules based on specific textual descriptions. We introduce Mol-CADiff, a novel diffusion-based framework that…

    Submitted 7 March, 2025; originally announced March 2025.

  44. arXiv:2503.02101  [pdf, ps, other]

    cs.CV

    Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection

    Authors: Boyong He, Yuxiang Ji, Qianwen Ye, Zhuoyue Tan, Liaoni Wu

    Abstract: Domain generalization (DG) for object detection aims to enhance detectors' performance in unseen scenarios. This task remains challenging due to complex variations in real-world applications. Recently, diffusion models have demonstrated remarkable capabilities in diverse scene generation, which inspires us to explore their potential for improving DG tasks. Instead of generating images, our method…

    Submitted 4 June, 2025; v1 submitted 3 March, 2025; originally announced March 2025.

    Comments: CVPR 2025 camera-ready version with supplementary material

  45. arXiv:2502.21271  [pdf, other]

    cs.CV cs.AI cs.LG

    Adaptive Keyframe Sampling for Long Video Understanding

    Authors: Xi Tang, Jihao Qiu, Lingxi Xie, Yunjie Tian, Jianbin Jiao, Qixiang Ye

    Abstract: Multimodal large language models (MLLMs) have enabled open-world visual understanding by injecting visual input as extra tokens into large language models (LLMs) as contexts. However, when the visual input changes from a single image to a long video, the above paradigm encounters difficulty because the vast amount of video tokens has significantly exceeded the maximal capacity of MLLMs. Therefore,…

    Submitted 28 February, 2025; originally announced February 2025.

    Comments: CVPR 2025

  46. arXiv:2502.19070  [pdf, other]

    cs.LG cs.CR

    A Sample-Level Evaluation and Generative Framework for Model Inversion Attacks

    Authors: Haoyang Li, Li Bai, Qingqing Ye, Haibo Hu, Yaxin Xiao, Huadi Zheng, Jianliang Xu

    Abstract: Model Inversion (MI) attacks, which reconstruct the training dataset of neural networks, pose significant privacy concerns in machine learning. Recent MI attacks have managed to reconstruct realistic label-level private data, such as the general appearance of a target person from all training images labeled on him. Beyond label-level privacy, in this paper we show sample-level privacy, the private…

    Submitted 26 February, 2025; originally announced February 2025.

    Comments: Accepted to appear at the 39th Annual AAAI Conference on Artificial Intelligence (AAAI-25)

  47. arXiv:2502.12524  [pdf, other]

    cs.CV cs.AI

    YOLOv12: Attention-Centric Real-Time Object Detectors

    Authors: Yunjie Tian, Qixiang Ye, David Doermann

    Abstract: Enhancing the network architecture of the YOLO framework has been crucial for a long time, but has focused on CNN-based improvements despite the proven superiority of attention mechanisms in modeling capabilities. This is because attention-based models cannot match the speed of CNN-based models. This paper proposes an attention-centric YOLO framework, namely YOLOv12, that matches the speed of prev…

    Submitted 17 February, 2025; originally announced February 2025.

    Comments: https://github.com/sunsmarterjie/yolov12

  48. arXiv:2502.11900  [pdf, ps, other]

    quant-ph cs.IT cs.LG

    Ansatz-free Hamiltonian learning with Heisenberg-limited scaling

    Authors: Hong-Ye Hu, Muzhou Ma, Weiyuan Gong, Qi Ye, Yu Tong, Steven T. Flammia, Susanne F. Yelin

    Abstract: Learning the unknown interactions that govern a quantum system is crucial for quantum information processing, device benchmarking, and quantum sensing. The problem, known as Hamiltonian learning, is well understood under the assumption that interactions are local, but this assumption may not hold for arbitrary Hamiltonians. Previous methods all require high-order inverse polynomial dependency with…

    Submitted 30 June, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

    Comments: Updated version with expanded explanations, added pseudocode, and new numerical demonstrations. 10 pages, 4 figures. HYH and MM contributed equally

  49. arXiv:2502.11703  [pdf, other]

    cs.CL

    CMQCIC-Bench: A Chinese Benchmark for Evaluating Large Language Models in Medical Quality Control Indicator Calculation

    Authors: Guangya Yu, Yanhao Li, Zongying Jiang, Yuxiong Jin, Li Dai, Yupian Lin, Ruihui Hou, Weiyan Zhang, Yongqi Fan, Qi Ye, Jingping Liu, Tong Ruan

    Abstract: Medical quality control indicators are essential to assess the qualifications of healthcare institutions for medical services. With the impressive performance of large language models (LLMs) like GPT-4 in the medical field, leveraging these technologies for the Medical Quality Control Indicator Calculation (MQCIC) presents a promising approach. In this work, (1) we introduce a real-world task MQCI…

    Submitted 17 February, 2025; originally announced February 2025.

    Comments: 16 pages

  50. arXiv:2502.10734  [pdf, other]

    cs.RO

    Motion planning for highly-dynamic unconditioned reflexes based on chained Signed Distance Functions

    Authors: Ken Lin, Qi Ye, Tin Lun Lam, Zhibin Li, Jiming Chen, Gaofeng Li

    Abstract: The unconditioned reflex (e.g., protective reflex), which is the innate reaction of the organism and is usually performed through the spinal cord rather than the brain, can enable organisms to escape harm from their environments. In this paper, we propose an online, highly-dynamic motion planning algorithm to endow manipulators with highly-dynamic unconditioned reflexes to humans and/or environments. Our m…

    Submitted 18 February, 2025; v1 submitted 15 February, 2025; originally announced February 2025.