
Showing 51–100 of 558 results for author: Ye, Q

  1. arXiv:2502.10734  [pdf, other]

    cs.RO

    Motion planning for highly-dynamic unconditioned reflexes based on chained Signed Distance Functions

    Authors: Ken Lin, Qi Ye, Tin Lun Lam, Zhibin Li, Jiming Chen, Gaofeng Li

    Abstract: The unconditioned reflex (e.g., protective reflex), which is the innate reaction of the organism and is usually performed through the spinal cord rather than the brain, can enable organisms to escape harm from their environments. In this paper, we propose an online, highly-dynamic motion planning algorithm to endow manipulators with highly-dynamic unconditioned reflexes to humans and/or environments. Our m…

    Submitted 18 February, 2025; v1 submitted 15 February, 2025; originally announced February 2025.

  2. arXiv:2502.05264  [pdf, other]

    quant-ph cs.AI cs.LG

    Quantum automated learning with provable and explainable trainability

    Authors: Qi Ye, Shuangyue Geng, Zizhao Han, Weikang Li, L. -M. Duan, Dong-Ling Deng

    Abstract: Machine learning is widely believed to be one of the most promising practical applications of quantum computing. Existing quantum machine learning schemes typically employ a quantum-classical hybrid approach that relies crucially on gradients of model parameters. Such an approach lacks provable convergence to global minima and will become infeasible as quantum learning models scale up. Here, we in…

    Submitted 7 February, 2025; originally announced February 2025.

    Comments: 21 pages, 7 figures

  3. arXiv:2502.03092  [pdf, other]

    cs.LG cs.AI cs.DC

    E-3SFC: Communication-Efficient Federated Learning with Double-way Features Synthesizing

    Authors: Yuhao Zhou, Yuxin Tian, Mingjia Shi, Yuanxi Li, Yanan Sun, Qing Ye, Jiancheng Lv

    Abstract: The exponential growth in model sizes has significantly increased the communication burden in Federated Learning (FL). Existing methods to alleviate this burden by transmitting compressed gradients often face high compression errors, which slow down the model's convergence. To simultaneously achieve high compression effectiveness and lower compression errors, we study the gradient compression prob…

    Submitted 5 February, 2025; originally announced February 2025.

    Comments: Accepted by TNNLS. arXiv admin note: text overlap with arXiv:2302.13562

  4. arXiv:2502.00282  [pdf, other]

    cs.LG

    GraphMinNet: Learning Dependencies in Graphs with Light Complexity Minimal Architecture

    Authors: Md Atik Ahamed, Andrew Cheng, Qiang Ye, Qiang Cheng

    Abstract: Graph Neural Networks (GNNs) have demonstrated remarkable success in various applications, yet they often struggle to capture long-range dependencies (LRD) effectively. This paper introduces GraphMinNet, a novel GNN architecture that generalizes the idea of minimal Gated Recurrent Units to graph-structured data. Our approach achieves efficient LRD modeling with linear computational complexity whil…

    Submitted 31 January, 2025; originally announced February 2025.

  5. arXiv:2501.16614  [pdf, other]

    cs.LG

    FUNU: Boosting Machine Unlearning Efficiency by Filtering Unnecessary Unlearning

    Authors: Zitong Li, Qingqing Ye, Haibo Hu

    Abstract: Machine unlearning is an emerging field that selectively removes specific data samples from a trained model. This capability is crucial for addressing privacy concerns, complying with data protection regulations, and correcting errors or biases introduced by certain data. Unlike traditional machine learning, where models are typically static once trained, machine unlearning facilitates dynamic upd…

    Submitted 27 January, 2025; originally announced January 2025.

    Comments: This paper has been accepted by WWW'25

  6. arXiv:2501.08801  [pdf, other]

    cond-mat.other cond-mat.mtrl-sci

    Quantum disorder induced by nuclear tunneling in lattice

    Authors: Yu-Cheng Zhu, Jia-Xi Zeng, Qi-Jun Ye, Xin-Zheng Li

    Abstract: Lattice degrees of freedom (DoFs) may induce quantum disorder (QD) when nuclear tunneling outvies long-range order, but conventional phonon theory is incapable of describing such QD phases. Here we develop a method based on path-integral molecular dynamics to solve this problem. Its accuracy is verified in a double-well chain model and it is applied to a real material from first principles. A quan…

    Submitted 15 January, 2025; originally announced January 2025.

  7. arXiv:2501.05835  [pdf, other]

    cs.LG cs.CR

    Fine-tuning is Not Fine: Mitigating Backdoor Attacks in GNNs with Limited Clean Data

    Authors: Jiale Zhang, Bosen Rao, Chengcheng Zhu, Xiaobing Sun, Qingming Li, Haibo Hu, Xiapu Luo, Qingqing Ye, Shouling Ji

    Abstract: Graph Neural Networks (GNNs) have achieved remarkable performance through their message-passing mechanism. However, recent studies have highlighted the vulnerability of GNNs to backdoor attacks, which can lead the model to misclassify graphs with attached triggers as the target class. The effectiveness of recent promising defense techniques, such as fine-tuning or distillation, is heavily continge…

    Submitted 10 January, 2025; originally announced January 2025.

  8. arXiv:2501.03606  [pdf, other]

    cs.RO cs.CV

    VTAO-BiManip: Masked Visual-Tactile-Action Pre-training with Object Understanding for Bimanual Dexterous Manipulation

    Authors: Zhengnan Sun, Zhaotai Shi, Jiayin Chen, Qingtao Liu, Yu Cui, Qi Ye, Jiming Chen

    Abstract: Bimanual dexterous manipulation remains a significant challenge in robotics due to the high DoFs of each hand and their coordination. Existing single-hand manipulation techniques often leverage human demonstrations to guide RL methods but fail to generalize to complex bimanual tasks involving multiple sub-skills. In this paper, we introduce VTAO-BiManip, a novel framework that combines visual-tacti…

    Submitted 7 January, 2025; originally announced January 2025.

  9. arXiv:2501.03451  [pdf, other]

    stat.ML cs.LG cs.SI

    Structure-Preference Enabled Graph Embedding Generation under Differential Privacy

    Authors: Sen Zhang, Qingqing Ye, Haibo Hu

    Abstract: Graph embedding generation techniques aim to learn low-dimensional vectors for each node in a graph and have recently gained increasing research attention. Publishing low-dimensional node vectors enables various graph analysis tasks, such as structural equivalence and link prediction. Yet, improper publication opens a backdoor to malicious attackers, who can infer sensitive information of individu…

    Submitted 6 January, 2025; originally announced January 2025.

    Comments: Accepted by ICDE 25

  10. arXiv:2501.02415  [pdf, other]

    cond-mat.mes-hall

    Electron-Phonon Temperature Inversion in Nanostructures under Pulsed Photoexcitation

    Authors: Qian Ye, Stephen K. Sanders, Andrea Schirato, Alessandro Alabastri

    Abstract: Photoexcitation of metallic nanostructures with short optical pulses can drive non-thermal electronic states, which, upon decay, lead to elevated electronic temperatures ($T_e \gtrapprox 1000\,\mathrm{K}$) eventually equilibrating with the lattice ($T_p$) through electron-phonon scattering. Here, we show that, in spatially extended nanostructures, the lattice temperature can locally exceed that of…

    Submitted 4 January, 2025; originally announced January 2025.

  11. arXiv:2501.02354  [pdf, other]

    cs.DB cs.CR

    PrivDPR: Synthetic Graph Publishing with Deep PageRank under Differential Privacy

    Authors: Sen Zhang, Haibo Hu, Qingqing Ye, Jianliang Xu

    Abstract: The objective of privacy-preserving synthetic graph publishing is to safeguard individuals' privacy while retaining the utility of original data. Most existing methods focus on graph neural networks under differential privacy (DP), and yet two fundamental problems in generating synthetic graphs remain open. First, the current research often encounters high sensitivity due to the intricate relation…

    Submitted 4 January, 2025; originally announced January 2025.

    Comments: Accepted by KDD 25

  12. arXiv:2412.20378  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Tri-Ergon: Fine-grained Video-to-Audio Generation with Multi-modal Conditions and LUFS Control

    Authors: Bingliang Li, Fengyu Yang, Yuxin Mao, Qingwen Ye, Hongkai Chen, Yiran Zhong

    Abstract: Video-to-audio (V2A) generation utilizes visual-only video features to produce realistic sounds that correspond to the scene. However, current V2A models often lack fine-grained control over the generated audio, especially in terms of loudness variation and the incorporation of multi-modal conditions. To overcome these limitations, we introduce Tri-Ergon, a diffusion-based V2A model that incorpora…

    Submitted 29 December, 2024; originally announced December 2024.

    Comments: AAAI 2025 Accepted

  13. arXiv:2412.19837  [pdf, other]

    cs.CR cs.DB

    Data Poisoning Attacks to Local Differential Privacy Protocols for Graphs

    Authors: Xi He, Kai Huang, Qingqing Ye, Haibo Hu

    Abstract: Graph analysis has become increasingly popular with the prevalence of big data and machine learning. Traditional graph data analysis methods often assume the existence of a trusted third party to collect and store the graph data, which does not align with real-world situations. To address this, some research has proposed utilizing Local Differential Privacy (LDP) to collect graph data or graph met…

    Submitted 23 December, 2024; originally announced December 2024.

  14. arXiv:2412.14832  [pdf, other]

    cs.CR cs.DB

    Federated Heavy Hitter Analytics with Local Differential Privacy

    Authors: Yuemin Zhang, Qingqing Ye, Haibo Hu

    Abstract: Federated heavy hitter analytics enables service providers to better understand the preferences of cross-party users by analyzing the most frequent items. As with federated learning, it faces challenges of privacy concerns, statistical heterogeneity, and expensive communication. Local differential privacy (LDP), as the de facto standard for privacy-preserving data collection, solves the privacy ch…

    Submitted 2 January, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

    Comments: Accepted by SIGMOD 2025

  15. arXiv:2412.11919  [pdf, other]

    cs.CL cs.AI cs.IR

    RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation

    Authors: Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yongkang Wu, Zhonghua Li, Qi Ye, Zhicheng Dou

    Abstract: Large language models (LLMs) exhibit remarkable generative capabilities but often suffer from hallucinations. Retrieval-augmented generation (RAG) offers an effective solution by incorporating external knowledge, but existing methods still face several limitations: additional deployment costs of separate retrievers, redundant input tokens from retrieved text chunks, and the lack of joint optimizat…

    Submitted 16 December, 2024; originally announced December 2024.

  16. arXiv:2412.08464  [pdf, other]

    cs.CV

    CC-Diff: Enhancing Contextual Coherence in Remote Sensing Image Synthesis

    Authors: Mu Zhang, Yunfan Liu, Yue Liu, Yuzhong Zhao, Qixiang Ye

    Abstract: Existing image synthesis methods for natural scenes focus primarily on foreground control, often reducing the background to simplistic textures. Consequently, these approaches tend to overlook the intrinsic correlation between foreground and background, which may lead to incoherent and unrealistic synthesis results in remote sensing (RS) scenarios. In this paper, we introduce CC-Diff, a…

    Submitted 10 March, 2025; v1 submitted 11 December, 2024; originally announced December 2024.

  17. arXiv:2412.06157  [pdf, other]

    cs.CR

    Membership Inference Attacks and Defenses in Federated Learning: A Survey

    Authors: Li Bai, Haibo Hu, Qingqing Ye, Haoyang Li, Leixia Wang, Jianliang Xu

    Abstract: Federated learning is a decentralized machine learning approach where clients train models locally and share model updates to develop a global model. This enables low-resource devices to collaboratively build a high-quality model without requiring direct access to the raw training data. However, despite only sharing model updates, federated learning still faces several privacy vulnerabilities. One…

    Submitted 8 December, 2024; originally announced December 2024.

    Comments: To be published in ACM Computing Surveys

  18. arXiv:2412.05674  [pdf, other]

    quant-ph cs.AI cs.DS cs.LG stat.ML

    No-Free-Lunch Theories for Tensor-Network Machine Learning Models

    Authors: Jing-Chuan Wu, Qi Ye, Dong-Ling Deng, Li-Wei Yu

    Abstract: Tensor network machine learning models have shown remarkable versatility in tackling complex data-driven tasks, ranging from quantum many-body problems to classical pattern recognition. Despite their promising performance, a comprehensive understanding of the underlying assumptions and limitations of these models is still lacking. In this work, we focus on the rigorous formulation of their no-fre…

    Submitted 7 December, 2024; originally announced December 2024.

    Comments: 7+23 pages, comments welcome

  19. arXiv:2412.02759  [pdf, other]

    cs.CV

    Mixture of Physical Priors Adapter for Parameter-Efficient Fine-Tuning

    Authors: Zhaozhi Wang, Conghu Li, Qixiang Ye, Tong Zhang

    Abstract: Most parameter-efficient fine-tuning (PEFT) methods rely on low-rank representations to adapt models. However, these approaches often oversimplify representations, particularly when the underlying data has high-rank or high-frequency components. This limitation hinders the model's ability to capture complex data interactions effectively. In this paper, we propose a novel approach that models netwo…

    Submitted 3 December, 2024; originally announced December 2024.

    Comments: 14 pages, 7 figures, 9 tables

  20. arXiv:2412.02249  [pdf, other]

    cs.RO cs.CV

    Multi-robot autonomous 3D reconstruction using Gaussian splatting with Semantic guidance

    Authors: Jing Zeng, Qi Ye, Tianle Liu, Yang Xu, Jin Li, Jinming Xu, Liang Li, Jiming Chen

    Abstract: Implicit neural representations and 3D Gaussian splatting (3DGS) have shown great potential for scene reconstruction. Recent studies have expanded their applications in autonomous reconstruction through task assignment methods. However, these methods are mainly limited to a single robot, and rapid reconstruction of large-scale scenes remains challenging. Additionally, task-driven planning based on s…

    Submitted 3 December, 2024; originally announced December 2024.

  21. arXiv:2412.00452  [pdf, other]

    cs.LG cs.CV

    Learning Locally, Revising Globally: Global Reviser for Federated Learning with Noisy Labels

    Authors: Yuxin Tian, Mouxing Yang, Yuhao Zhou, Jian Wang, Qing Ye, Tongliang Liu, Gang Niu, Jiancheng Lv

    Abstract: The success of most federated learning (FL) methods heavily depends on label quality, which is often inaccessible in real-world scenarios, such as medicine, leading to the federated label-noise (F-LN) problem. In this study, we observe that the global model of FL memorizes the noisy labels slowly. Based on the observations, we propose a novel approach dubbed Global Reviser for Federated Learning w…

    Submitted 30 November, 2024; originally announced December 2024.

    Comments: 19 pages

  22. arXiv:2411.19108  [pdf, other]

    cs.CV

    Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

    Authors: Feng Liu, Shiwei Zhang, Xiaofeng Wang, Yujie Wei, Haonan Qiu, Yuzhong Zhao, Yingya Zhang, Qixiang Ye, Fang Wan

    Abstract: As a fundamental backbone for video generation, diffusion models are challenged by low inference speed due to the sequential nature of denoising. Previous methods speed up the models by caching and reusing model outputs at uniformly selected timesteps. However, such a strategy neglects the fact that differences among model outputs are not uniform across timesteps, which hinders selecting the appro…

    Submitted 18 March, 2025; v1 submitted 28 November, 2024; originally announced November 2024.

    Comments: Accepted in CVPR 2025. Project: https://liewfeng.github.io/TeaCache

  23. arXiv:2411.17984  [pdf, ps, other]

    cs.CV

    RS-vHeat: Heat Conduction Guided Efficient Remote Sensing Foundation Model

    Authors: Huiyang Hu, Peijin Wang, Hanbo Bi, Boyuan Tong, Zhaozhi Wang, Wenhui Diao, Hao Chang, Yingchao Feng, Ziqi Zhang, Yaowei Wang, Qixiang Ye, Kun Fu, Xian Sun

    Abstract: Remote sensing foundation models largely break away from the traditional paradigm of designing task-specific models, offering greater scalability across multiple tasks. However, they face challenges such as low computational efficiency and limited interpretability, especially when dealing with large-scale remote sensing images. To overcome these, we draw inspiration from heat conduction, a physica…

    Submitted 25 June, 2025; v1 submitted 26 November, 2024; originally announced November 2024.

    Comments: 19 pages, 8 figures and 10 tables

  24. arXiv:2411.16830  [pdf, ps, other]

    physics.optics cond-mat.mes-hall quant-ph

    Cavity-Quantum Electrodynamics with Moiré Flatband Photonic Crystals

    Authors: Yu-Tong Wang, Qi-Hang Ye, Jun-Yong Yan, Yufei Qiao, Chen Chen, Xiao-Tian Cheng, Chen-Hui Li, Zi-Jian Zhang, Cheng-Nian Huang, Yun Meng, Kai Zou, Wen-Kang Zhan, Chao Zhao, Xiaolong Hu, Clarence Augustine T H Tee, Wei E. I. Sha, Zhixiang Huang, Huiyun Liu, Chao-Yuan Jin, Lei Ying, Feng Liu

    Abstract: Quantum emitters are a key component in photonic quantum technologies. Enhancing their single-photon emission by engineering the photonic environment using cavities can significantly improve the overall efficiency in quantum information processing. However, this enhancement is often constrained by the need for precise nanoscale control over the emitter's position within micro- or nano-cavities. In…

    Submitted 6 June, 2025; v1 submitted 25 November, 2024; originally announced November 2024.

    Journal ref: Sci. Adv. 11 (2025) eadv8115

  25. arXiv:2411.13183  [pdf, other]

    cs.CV

    ClickTrack: Towards Real-time Interactive Single Object Tracking

    Authors: Kuiran Wang, Xuehui Yu, Wenwen Yu, Guorong Li, Xiangyuan Lan, Qixiang Ye, Jianbin Jiao, Zhenjun Han

    Abstract: Single object tracking (SOT) relies on precise object bounding box initialization. In this paper, we reconsider the deficiencies in current approaches to initializing single object trackers and propose ClickTrack, a new paradigm for single object tracking algorithms that uses clicking interaction for real-time scenarios. Moreover, click as an input type inherently lacks hierarchica…

    Submitted 24 November, 2024; v1 submitted 20 November, 2024; originally announced November 2024.

  26. arXiv:2411.07435  [pdf, other]

    astro-ph.EP

    The Volatile Composition and Activity Evolution of Main-Belt Comet 358P/PANSTARRS

    Authors: Henry H. Hsieh, John W. Noonan, Michael S. P. Kelley, Dennis Bodewits, Jana Pittichova, Audrey Thirouin, Marco Micheli, Matthew M. Knight, Michele T. Bannister, Colin O. Chandler, Carrie E. Holt, Matthew J. Hopkins, Yaeji Kim, Nicholas A. Moskovitz, William J. Oldroyd, Jack Patterson, Scott S. Sheppard, Nicole Tan, Chadwick A. Trujillo, Quanzhi Ye

    Abstract: We report the detection of water vapor associated with main-belt comet 358P/PANSTARRS on UT 2024 January 8-9 using the NIRSPEC instrument aboard JWST. We derive a water production rate of $Q(\mathrm{H_2O}) = (5.0 \pm 0.2) \times 10^{25}$ molecules/s, marking only the second direct detection of sublimation products of any kind from a main-belt comet, after 238P/Read. Similar to 238P, we find a remarkable absence of hypervo…

    Submitted 11 November, 2024; originally announced November 2024.

    Comments: 32 pages, 13 figures. Accepted for publication in The Planetary Science Journal

  27. arXiv:2411.03313  [pdf, other]

    cs.CV

    Classification Done Right for Vision-Language Pre-Training

    Authors: Zilong Huang, Qinghao Ye, Bingyi Kang, Jiashi Feng, Haoqi Fan

    Abstract: We introduce SuperClass, a super simple classification method for vision-language pre-training on image-text data. Unlike its contrastive counterpart CLIP, which contrasts with a text encoder, SuperClass directly utilizes tokenized raw text as supervised classification labels, without the need for additional text filtering or selection. Due to the absence of the text encoding as contrastive target, Su…

    Submitted 6 November, 2024; v1 submitted 5 November, 2024; originally announced November 2024.

    Comments: NeurIPS 2024

  28. arXiv:2410.13264  [pdf, other]

    cs.LG cs.AI

    The Latent Road to Atoms: Backmapping Coarse-grained Protein Structures with Latent Diffusion

    Authors: Xu Han, Yuancheng Sun, Kai Chen, Kang Liu, Qiwei Ye

    Abstract: Coarse-grained (CG) molecular dynamics simulations offer computational efficiency for exploring protein conformational ensembles and thermodynamic properties. Though coarse representations enable large-scale simulations across extended temporal and spatial ranges, the sacrifice of atomic-level details limits their utility in tasks such as ligand docking and protein-protein interaction prediction. B…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: Paper under review

  29. arXiv:2410.12712  [pdf, other]

    quant-ph cs.DS cs.IT cs.LG

    On the sample complexity of purity and inner product estimation

    Authors: Weiyuan Gong, Jonas Haferkamp, Qi Ye, Zhihan Zhang

    Abstract: We study the sample complexity of the prototypical tasks quantum purity estimation and quantum inner product estimation. In purity estimation, we are to estimate $tr(ρ^2)$ of an unknown quantum state $ρ$ to additive error $ε$. Meanwhile, for quantum inner product estimation, Alice and Bob are to estimate $tr(ρσ)$ to additive error $ε$ given copies of unknown quantum states $ρ$ and $σ$ using classic…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 33 pages, 1 figure

  30. arXiv:2410.12671  [pdf, other]

    cs.LG

    New Paradigm of Adversarial Training: Releasing Accuracy-Robustness Trade-Off via Dummy Class

    Authors: Yanyun Wang, Li Liu, Zi Liang, Yi R. Fung, Qingqing Ye, Haibo Hu

    Abstract: Adversarial Training (AT) is one of the most effective methods to enhance the robustness of Deep Neural Networks (DNNs). However, existing AT methods suffer from an inherent accuracy-robustness trade-off. Previous works have studied this issue under the current AT paradigm, but still face over 10% accuracy reduction without significant robustness improvement over simple baselines such as PGD-AT. T…

    Submitted 26 May, 2025; v1 submitted 16 October, 2024; originally announced October 2024.

    Comments: Preprint. Under review

    ACM Class: I.2.6

  31. arXiv:2410.02712  [pdf, other]

    cs.CV cs.CL

    LLaVA-Critic: Learning to Evaluate Multimodal Models

    Authors: Tianyi Xiong, Xiyao Wang, Dong Guo, Qinghao Ye, Haoqi Fan, Quanquan Gu, Heng Huang, Chunyuan Li

    Abstract: We introduce LLaVA-Critic, the first open-source large multimodal model (LMM) designed as a generalist evaluator to assess performance across a wide range of multimodal tasks. LLaVA-Critic is trained using a high-quality critic instruction-following dataset that incorporates diverse evaluation criteria and scenarios. Our experiments demonstrate the model's effectiveness in two key areas: (1) LMM-a…

    Submitted 3 March, 2025; v1 submitted 3 October, 2024; originally announced October 2024.

    Comments: Accepted by CVPR 2025; Project Page: https://llava-vl.github.io/blog/2024-10-03-llava-critic

  32. arXiv:2410.01110   

    cs.CV

    RobustEMD: Domain Robust Matching for Cross-domain Few-shot Medical Image Segmentation

    Authors: Yazhou Zhu, Minxian Li, Qiaolin Ye, Shidong Wang, Tong Xin, Haofeng Zhang

    Abstract: Few-shot medical image segmentation (FSMIS) aims to learn from limited annotated data in medical image analysis. Despite the progress achieved, current FSMIS models are all trained and deployed on the same data domain, which is inconsistent with the clinical reality that medical imaging data always spans different data domains (e.g. imaging modalities, institutions…

    Submitted 25 March, 2025; v1 submitted 1 October, 2024; originally announced October 2024.

    Comments: More details should be included, and more experiments

  33. arXiv:2410.00232  [pdf, ps, other]

    cs.LG math.NA stat.ML

    Preconditioning for Accelerated Gradient Descent Optimization and Regularization

    Authors: Qiang Ye

    Abstract: Accelerated training algorithms, such as adaptive learning rates and various normalization methods, are widely used but not fully understood. When regularization is introduced, standard optimizers like adaptive learning rates may not perform effectively. This raises the need for alternative regularization approaches and the question of how to properly combine regularization with preconditioning. I…

    Submitted 30 September, 2024; originally announced October 2024.

    Comments: 7 pages

  34. arXiv:2409.09540  [pdf, other]

    astro-ph.EP

    Minor planets, asteroids, comets and interplanetary dust within 30 au

    Authors: Quanzhi Ye

    Abstract: Our Solar System includes the Sun, eight major planets and their moons, along with numerous asteroids, comets, and dust particles, collectively known as the small Solar System bodies. Small bodies are relics from the birth of the Solar System and offer valuable insights into planetary formation and the origins of life. This chapter explores this important component of our Solar System, discussing…

    Submitted 14 September, 2024; originally announced September 2024.

    Comments: Preprint of a chapter for the 'Encyclopedia of Astrophysics' (Editor-in-Chief Ilya Mandel, Section Editor Dimitri Veras) to be published by Elsevier as a Reference Module

  35. arXiv:2409.04851  [pdf, other]

    cs.CV

    AdaptiveFusion: Adaptive Multi-Modal Multi-View Fusion for 3D Human Body Reconstruction

    Authors: Anjun Chen, Xiangyu Wang, Zhi Xu, Kun Shi, Yan Qin, Yuchi Huo, Jiming Chen, Qi Ye

    Abstract: Recent advancements in sensor technology and deep learning have led to significant progress in 3D human body reconstruction. However, most existing approaches rely on data from a specific sensor, which can be unreliable due to the inherent limitations of individual sensing modalities. Additionally, existing multi-modal fusion methods generally require customized designs based on the specific senso…

    Submitted 13 March, 2025; v1 submitted 7 September, 2024; originally announced September 2024.

    Comments: TMM 2025, Project Page: https://chen3110.github.io/adaptivefusion/index.html

  36. arXiv:2409.02718  [pdf, other]

    cs.CR cs.CL

    "Yes, My LoRD." Guiding Language Model Extraction with Locality Reinforced Distillation

    Authors: Zi Liang, Qingqing Ye, Yanyun Wang, Sen Zhang, Yaxin Xiao, Ronghua Li, Jianliang Xu, Haibo Hu

    Abstract: Model extraction attacks (MEAs) on large language models (LLMs) have received increasing attention in recent research. However, existing attack methods typically adapt the extraction strategies originally developed for deep neural networks (DNNs). They neglect the underlying inconsistency between the training tasks of MEA and LLM alignment, leading to suboptimal attack performance. To tackle this…

    Submitted 19 May, 2025; v1 submitted 4 September, 2024; originally announced September 2024.

    Comments: To appear at ACL 25 main conference

  37. arXiv:2408.11424  [pdf, other]

    cs.CV

    EMO-LLaMA: Enhancing Facial Emotion Understanding with Instruction Tuning

    Authors: Bohao Xing, Zitong Yu, Xin Liu, Kaishen Yuan, Qilang Ye, Weicheng Xie, Huanjing Yue, Jingyu Yang, Heikki Kälviäinen

    Abstract: Facial expression recognition (FER) is an important research topic in emotional artificial intelligence. In recent decades, researchers have made remarkable progress. However, current FER paradigms face challenges in generalization, lack semantic information aligned with natural language, and struggle to process both images and videos within a unified framework, making their application in multimo…

    Submitted 21 August, 2024; originally announced August 2024.

  38. arXiv:2408.10599  [pdf, other]

    hep-ex cs.CV

    Vision Calorimeter: Migrating Visual Object Detector to High-energy Particle Images

    Authors: Hongtian Yu, Yangu Li, Yunfan Liu, Yunxuan Song, Xiaorui Lyu, Qixiang Ye

    Abstract: In high-energy physics, accurately estimating the kinematic parameters (position and momentum) of anti-neutrons ($\bar{n}$) is essential for exploring the fundamental governing principles. However, this process is particularly challenging when using an electromagnetic calorimeter (EMC) as the energy detector, due to its limited accuracy and efficiency in interacting with $\bar{n}$. To address th…

    Submitted 16 February, 2025; v1 submitted 20 August, 2024; originally announced August 2024.

  39. arXiv:2408.09097  [pdf, other]

    cs.CV cs.AI

    Depth-guided Texture Diffusion for Image Semantic Segmentation

    Authors: Wei Sun, Yuan Li, Qixiang Ye, Jianbin Jiao, Yanzhao Zhou

    Abstract: Depth information provides valuable insights into the 3D structure, especially the outline of objects, which can be utilized to improve semantic segmentation. However, a naive fusion of depth information can disrupt features and compromise accuracy due to the modality gap between depth and vision. In this work, we introduce a Depth-guided Texture Diffusion approach that effectively…

    Submitted 17 August, 2024; originally announced August 2024.

  40. arXiv:2408.08723  [pdf, other]

    cs.CV cs.AI

    Correspondence-Guided SfM-Free 3D Gaussian Splatting for NVS

    Authors: Wei Sun, Xiaosong Zhang, Fang Wan, Yanzhao Zhou, Yuan Li, Qixiang Ye, Jianbin Jiao

    Abstract: Novel View Synthesis (NVS) without Structure-from-Motion (SfM) pre-processed camera poses--referred to as SfM-free methods--is crucial for promoting rapid response capabilities and enhancing robustness against variable operating conditions. Recent SfM-free methods have integrated pose optimization, designing end-to-end frameworks for joint camera pose estimation and NVS. However, most existing wor…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: arXiv admin note: text overlap with arXiv:2312.07504 by other authors

  41. arXiv:2408.06967  [pdf, ps, other]

    quant-ph cs.CC cs.DS cs.LG

    Stabilizer bootstrapping: A recipe for efficient agnostic tomography and magic estimation

    Authors: Sitan Chen, Weiyuan Gong, Qi Ye, Zhihan Zhang

    Abstract: We study the task of agnostic tomography: given copies of an unknown $n$-qubit state $ρ$ which has fidelity $τ$ with some state in a given class $C$, find a state which has fidelity $\ge τ - ε$ with $ρ$. We give a new framework, stabilizer bootstrapping, for designing computationally efficient protocols for this task, and use this to get new agnostic tomography protocols for the following classes:…

    Submitted 4 December, 2024; v1 submitted 13 August, 2024; originally announced August 2024.

    Comments: 68 pages

  42. arXiv:2408.02416  [pdf, other]

    cs.CL cs.CR

    Why Are My Prompts Leaked? Unraveling Prompt Extraction Threats in Customized Large Language Models

    Authors: Zi Liang, Haibo Hu, Qingqing Ye, Yaxin Xiao, Haoyang Li

    Abstract: The drastic increase in large language models' (LLMs) parameters has led to a new research direction of fine-tuning-free downstream customization by prompts, i.e., task descriptions. While these prompt-based services (e.g., OpenAI's GPTs) play an important role in many businesses, growing concerns have emerged about prompt leakage, which undermines the intellectual property of these serv…

    Submitted 12 February, 2025; v1 submitted 5 August, 2024; originally announced August 2024.

    Comments: Source Code: https://github.com/liangzid/PromptExtractionEval

  43. arXiv:2407.16695  [pdf, other]

    cs.CL cs.AI cs.LG

    Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack

    Authors: Xiaoyue Xu, Qinyuan Ye, Xiang Ren

    Abstract: We introduce Lifelong ICL, a problem setting that challenges long-context language models (LMs) to learn a sequence of language tasks through in-context learning (ICL). We further introduce Task Haystack, an evaluation suite dedicated to assessing and diagnosing how long-context LMs utilize contexts in Lifelong ICL. When given a task instruction and test inputs, long-context LMs are expected to l…

    Submitted 2 December, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

    Comments: NeurIPS 2024 (Datasets and Benchmarks Track). Code: https://github.com/INK-USC/Lifelong-ICL Website: https://inklab.usc.edu/lifelong-icl/

  44. arXiv:2407.15315  [pdf, other]

    math.NA

    A Fast and Accurate Solver for the Fractional Fokker-Planck Equation with Dirac-Delta Initial Conditions

    Authors: Qihao Ye, Xiaochuan Tian, Dong Wang

    Abstract: The classical Fokker-Planck equation (FPE) is a key tool in physics for describing systems influenced by drag forces and Gaussian noise, with applications spanning multiple fields. We consider the fractional Fokker-Planck equation (FFPE), which models the time evolution of probability densities for systems driven by Lévy processes, relevant in scenarios where Gaussian assumptions fail. The paper p…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: 34 pages, 8 figures

    MSC Class: 34K37; 44A35; 35Q84; 65D40; 33C10

  45. arXiv:2407.13532  [pdf, other]

    cs.CR cs.DB

    PriPL-Tree: Accurate Range Query for Arbitrary Distribution under Local Differential Privacy

    Authors: Leixia Wang, Qingqing Ye, Haibo Hu, Xiaofeng Meng

    Abstract: Answering range queries in the context of Local Differential Privacy (LDP) is a widely studied problem in Online Analytical Processing (OLAP). Existing LDP solutions all assume a uniform data distribution within each domain partition, which may not align with real-world scenarios where data distribution is varied, resulting in inaccurate estimates. To address this problem, we introduce PriPL-Tree,…

    Submitted 24 August, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

    Comments: To appear in VLDB 2024

  46. arXiv:2407.11086  [pdf, other]

    cs.LG cs.AI physics.chem-ph

    Pre-training with Fractional Denoising to Enhance Molecular Property Prediction

    Authors: Yuyan Ni, Shikun Feng, Xin Hong, Yuancheng Sun, Wei-Ying Ma, Zhi-Ming Ma, Qiwei Ye, Yanyan Lan

    Abstract: Deep learning methods have been considered promising for accelerating molecular screening in drug discovery and material design. Due to the limited availability of labelled data, various self-supervised molecular pre-training methods have been presented. While many existing methods utilize common pre-training tasks in computer vision (CV) and natural language processing (NLP), they often overlook…

    Submitted 14 July, 2024; originally announced July 2024.

  47. arXiv:2407.06939  [pdf, other]

    cs.RO cs.CV

    Towards Open-World Mobile Manipulation in Homes: Lessons from the Neurips 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge

    Authors: Sriram Yenamandra, Arun Ramachandran, Mukul Khanna, Karmesh Yadav, Jay Vakil, Andrew Melnik, Michael Büttner, Leon Harz, Lyon Brown, Gora Chand Nandi, Arjun PS, Gaurav Kumar Yadav, Rahul Kala, Robert Haschke, Yang Luo, Jinxin Zhu, Yansen Han, Bingyi Lu, Xuan Gu, Qinyuan Liu, Yaping Zhao, Qiting Ye, Chenxiao Dou, Yansong Chua, Volodymyr Kuzma, et al. (20 additional authors not shown)

    Abstract: In order to develop robots that can effectively serve as versatile and capable home assistants, it is crucial for them to reliably perceive and interact with a wide variety of objects across diverse environments. To this end, we proposed Open Vocabulary Mobile Manipulation as a key benchmark task for robotics: finding any object in a novel environment and placing it on any receptacle surface withi…

    Submitted 9 July, 2024; originally announced July 2024.

  48. arXiv:2407.04842  [pdf, other]

    cs.CV cs.CL cs.LG

    MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

    Authors: Zhaorun Chen, Yichao Du, Zichen Wen, Yiyang Zhou, Chenhang Cui, Zhenzhen Weng, Haoqin Tu, Chaoqi Wang, Zhengwei Tong, Qinglan Huang, Canyu Chen, Qinghao Ye, Zhihong Zhu, Yuqing Zhang, Jiawei Zhou, Zhuokai Zhao, Rafael Rafailov, Chelsea Finn, Huaxiu Yao

    Abstract: While text-to-image models like DALLE-3 and Stable Diffusion are rapidly proliferating, they often encounter challenges such as hallucination, bias, and the production of unsafe, low-quality output. To effectively address these issues, it is crucial to align these models with desired behaviors based on feedback from a multimodal judge. Despite their significance, current multimodal judges frequent…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 42 pages, 13 figures, 33 tables

  49. arXiv:2407.01094  [pdf, other]

    cs.CV

    Evaluation of Text-to-Video Generation Models: A Dynamics Perspective

    Authors: Mingxiang Liao, Hannan Lu, Xinyu Zhang, Fang Wan, Tianyu Wang, Yuzhong Zhao, Wangmeng Zuo, Qixiang Ye, Jingdong Wang

    Abstract: Comprehensive and constructive evaluation protocols play an important role in the development of sophisticated text-to-video (T2V) generation models. Existing evaluation protocols primarily focus on temporal consistency and content continuity, yet largely ignore the dynamics of video content. Dynamics are an essential dimension for measuring the visual vividness and the honesty of video content to…

    Submitted 1 July, 2024; originally announced July 2024.

  50. arXiv:2406.14862  [pdf, other]

    cs.LG cs.CL cs.CV

    LatentExplainer: Explaining Latent Representations in Deep Generative Models with Multimodal Large Language Models

    Authors: Mengdan Zhu, Raasikh Kanjiani, Jiahui Lu, Andrew Choi, Qirui Ye, Liang Zhao

    Abstract: Deep generative models like VAEs and diffusion models have advanced various generation tasks by leveraging latent variables to learn data distributions and generate high-quality samples. Despite the field of explainable AI making strides in interpreting machine learning models, understanding latent variables in generative models remains challenging. This paper introduces LatentExplainer,…

    Submitted 27 May, 2025; v1 submitted 21 June, 2024; originally announced June 2024.