Showing 1–50 of 222 results for author: Zeng, D

  1. arXiv:2507.07246  [pdf, ps, other]

    cs.CR

    Disa: Accurate Learning-based Static Disassembly with Attentions

    Authors: Peicheng Wang, Monika Santra, Mingyu Liu, Cong Sun, Dongrui Zeng, Gang Tan

    Abstract: For reverse engineering related security domains, such as vulnerability detection, malware analysis, and binary hardening, disassembly is crucial yet challenging. The fundamental challenge of disassembly is to identify instruction and function boundaries. Classic approaches rely on file-format assumptions and architecture-specific heuristics to guess the boundaries, resulting in incomplete and inc…

    Submitted 9 July, 2025; originally announced July 2025.

    Comments: To appear at ACM CCS 2025

  2. arXiv:2506.23460  [pdf, ps, other]

    cs.CV

    Contrastive Learning with Diffusion Features for Weakly Supervised Medical Image Segmentation

    Authors: Dewen Zeng, Xinrong Hu, Yu-Jen Chen, Yawen Wu, Xiaowei Xu, Yiyu Shi

    Abstract: Weakly supervised semantic segmentation (WSSS) methods using class labels often rely on class activation maps (CAMs) to localize objects. However, traditional CAM-based methods struggle with partial activations and imprecise object boundaries due to optimization discrepancies between classification and segmentation. Recently, the conditional diffusion model (CDM) has been used as an alternative fo…

    Submitted 29 June, 2025; originally announced June 2025.

  3. arXiv:2506.22012  [pdf, ps, other]

    eess.IV cs.CV

    Noise-Inspired Diffusion Model for Generalizable Low-Dose CT Reconstruction

    Authors: Qi Gao, Zhihao Chen, Dong Zeng, Junping Zhang, Jianhua Ma, Hongming Shan

    Abstract: The generalization of deep learning-based low-dose computed tomography (CT) reconstruction models to doses unseen in the training data is important and remains challenging. Previous efforts heavily rely on paired data to improve the generalization performance and robustness through collecting either diverse CT data for re-training or a few test data for fine-tuning. Recently, diffusion models have…

    Submitted 27 June, 2025; originally announced June 2025.

    Comments: Accepted for publication in Medical Image Analysis, 2025

  4. arXiv:2506.19300  [pdf, ps, other]

    cs.CV

    Open-Vocabulary Camouflaged Object Segmentation with Cascaded Vision Language Models

    Authors: Kai Zhao, Wubang Yuan, Zheng Wang, Guanyi Li, Xiaoqiang Zhu, Deng-ping Fan, Dan Zeng

    Abstract: Open-Vocabulary Camouflaged Object Segmentation (OVCOS) seeks to segment and classify camouflaged objects from arbitrary categories, presenting unique challenges due to visual ambiguity and unseen categories. Recent approaches typically adopt a two-stage paradigm: first segmenting objects, then classifying the segmented regions using Vision Language Models (VLMs). However, these methods (1) suffer f…

    Submitted 24 June, 2025; originally announced June 2025.

  5. arXiv:2505.21003  [pdf, ps, other]

    cs.CL

    Uncertainty Unveiled: Can Exposure to More In-context Examples Mitigate Uncertainty for Large Language Models?

    Authors: Yifei Wang, Yu Sheng, Linjing Li, Daniel Zeng

    Abstract: Recent advances in handling long sequences have facilitated the exploration of long-context in-context learning (ICL). While much of the existing research emphasizes performance improvements driven by additional in-context examples, the influence on the trustworthiness of generated responses remains underexplored. This paper addresses this gap by investigating how increased examples influence pred…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: Camera-ready versions for ACL 2025 Findings

  6. arXiv:2505.12380  [pdf, ps, other]

    cs.LG cs.DB cs.PL

    Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward

    Authors: Han Weng, Puzhen Wu, Cui Longjie, Yi Zhan, Boyi Liu, Yuanfeng Song, Dun Zeng, Yingxiang Yang, Qianru Zhang, Dong Huang, Xiaoming Yin, Yang Sun, Xing Chen

    Abstract: Reinforcement learning (RL) has been widely adopted to enhance the performance of large language models (LLMs) on Text-to-SQL tasks. However, existing methods often rely on execution-based or LLM-based Bradley-Terry reward models. The former suffers from high execution latency caused by repeated database calls, whereas the latter imposes substantial GPU memory overhead, both of which significantly…

    Submitted 27 June, 2025; v1 submitted 18 May, 2025; originally announced May 2025.

  7. arXiv:2505.07796  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Learning Dynamics in Continual Pre-Training for Large Language Models

    Authors: Xingjin Wang, Howe Tissue, Lu Wang, Linjing Li, Daniel Dajun Zeng

    Abstract: Continual Pre-Training (CPT) has become a popular and effective method to apply strong foundation models to specific downstream tasks. In this work, we explore the learning dynamics throughout the CPT process for large language models. We specifically focus on how general and downstream domain performance evolves at each training step, with domain performance measured via validation losses. We hav…

    Submitted 19 June, 2025; v1 submitted 12 May, 2025; originally announced May 2025.

    Comments: Accepted to ICML 2025 (Oral)

  8. arXiv:2505.05034  [pdf, ps, other]

    cs.LG stat.ML

    Dequantified Diffusion-Schrödinger Bridge for Density Ratio Estimation

    Authors: Wei Chen, Shigui Li, Jiacheng Li, Junmei Yang, John Paisley, Delu Zeng

    Abstract: Density ratio estimation is fundamental to tasks involving f-divergences, yet existing methods often fail under significantly different distributions or inadequately overlapping supports -- the density-chasm and the support-chasm problems. Additionally, prior approaches yield divergent time scores near boundaries, leading to instability. We design $\textbf{D}^3\textbf{RE}$, a unified framework for…

    Submitted 29 May, 2025; v1 submitted 8 May, 2025; originally announced May 2025.

    Journal ref: ICML 2025: Proceedings of the 42nd International Conference on Machine Learning, 2025

  9. arXiv:2505.02639  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Enhancing Chemical Reaction and Retrosynthesis Prediction with Large Language Model and Dual-task Learning

    Authors: Xuan Lin, Qingrui Liu, Hongxin Xiang, Daojian Zeng, Xiangxiang Zeng

    Abstract: Chemical reaction and retrosynthesis prediction are fundamental tasks in drug discovery. Recently, large language models (LLMs) have shown potential in many domains. However, directly applying LLMs to these tasks faces two major challenges: (i) lacking a large-scale chemical synthesis-related instruction dataset; (ii) ignoring the close correlation between reaction and retrosynthesis prediction fo…

    Submitted 5 May, 2025; originally announced May 2025.

    Comments: Accepted for publication at IJCAI 2025

  10. arXiv:2505.01768  [pdf, ps, other]

    eess.IV cs.CV

    Continuous Filtered Backprojection by Learnable Interpolation Network

    Authors: Hui Lin, Dong Zeng, Qi Xie, Zerui Mao, Jianhua Ma, Deyu Meng

    Abstract: Accurate reconstruction of computed tomography (CT) images is crucial in the medical imaging field. However, there are unavoidable interpolation errors in the backprojection step of the conventional reconstruction methods, i.e., filtered-back-projection based methods, which are detrimental to the accurate reconstruction. In this study, to address this issue, we propose a novel deep learning model, nam…

    Submitted 3 May, 2025; originally announced May 2025.

    Comments: 14 pages, 10 figures

  11. arXiv:2504.13904  [pdf, other]

    cs.HC cs.AI cs.CL

    Generative Framework for Personalized Persuasion: Inferring Causal, Counterfactual, and Latent Knowledge

    Authors: Donghuo Zeng, Roberto Legaspi, Yuewen Sun, Xinshuai Dong, Kazushi Ikeda, Peter Spirtes, Kun Zhang

    Abstract: We hypothesize that optimal system responses emerge from adaptive strategies grounded in causal and counterfactual knowledge. Counterfactual inference allows us to create hypothetical scenarios to examine the effects of alternative system responses. We enhance this process through causal discovery, which identifies the strategies informed by the underlying causal structure that govern system behav…

    Submitted 28 May, 2025; v1 submitted 8 April, 2025; originally announced April 2025.

    Comments: 12 pages, 10 figures, 1 table. Accepted by ACM UMAP 2025

  12. arXiv:2504.09228  [pdf, other]

    cs.CV

    Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking

    Authors: You Wu, Xucheng Wang, Xiangyang Yang, Mengyuan Liu, Dan Zeng, Hengzhou Ye, Shuiwang Li

    Abstract: Single-stream architectures using Vision Transformer (ViT) backbones have recently shown great potential for real-time UAV tracking. However, frequent occlusions from obstacles like buildings and trees expose a major drawback: these models often lack strategies to handle occlusions effectively. New methods are needed to enhance the occlusion resilience of single-stream ViT models in aerial tracking. In…

    Submitted 12 April, 2025; originally announced April 2025.

  13. arXiv:2504.02248  [pdf, other]

    cs.LG

    CRC-SGAD: Conformal Risk Control for Supervised Graph Anomaly Detection

    Authors: Songran Bai, Xiaolong Zheng, Daniel Dajun Zeng

    Abstract: Graph Anomaly Detection (GAD) is critical in security-sensitive domains, yet faces reliability challenges: miscalibrated confidence estimation (underconfidence in normal nodes, overconfidence in anomalies), adversarial vulnerability of derived confidence score under structural perturbations, and limited efficacy of conventional calibration methods for sparse anomaly patterns. Thus we propose CRC-S…

    Submitted 2 April, 2025; originally announced April 2025.

  14. arXiv:2504.00721  [pdf, other]

    cs.LG

    Alleviating Performance Disparity in Adversarial Spatiotemporal Graph Learning Under Zero-Inflated Distribution

    Authors: Songran Bai, Yuheng Ji, Yue Liu, Xingwei Zhang, Xiaolong Zheng, Daniel Dajun Zeng

    Abstract: Spatiotemporal Graph Learning (SGL) under Zero-Inflated Distribution (ZID) is crucial for urban risk management tasks, including crime prediction and traffic accident profiling. However, SGL models are vulnerable to adversarial attacks, compromising their practical utility. While adversarial training (AT) has been widely used to bolster model robustness, our study finds that traditional AT exacerb…

    Submitted 1 April, 2025; originally announced April 2025.

  15. arXiv:2503.16544  [pdf, other]

    cs.CL cs.AI cs.HC

    Causal Discovery and Counterfactual Reasoning to Optimize Persuasive Dialogue Policies

    Authors: Donghuo Zeng, Roberto Legaspi, Yuewen Sun, Xinshuai Dong, Kazushi Ikeda, Peter Spirtes, Kun Zhang

    Abstract: Tailoring persuasive conversations to users leads to more effective persuasion. However, existing dialogue systems often struggle to adapt to dynamically evolving user states. This paper presents a novel method that leverages causal discovery and counterfactual reasoning for optimizing system persuasion capability and outcomes. We employ the Greedy Relaxation of the Sparsest Permutation (GRaSP) al…

    Submitted 19 March, 2025; originally announced March 2025.

    Comments: 21 pages, 8 figures

  16. Data Insights as Data: Quick Overview and Exploration of Automated Data Insights

    Authors: Shangxuan Wu, Wendi Luan, Yong Wang, Dan Zeng, Qiaomu Shen, Bo Tang

    Abstract: Automated data insight mining and visualization have been widely used in various business intelligence applications (e.g., market analysis and product promotion). However, automated insight mining techniques often output the same mining results to different analysts without considering their personal preferences, while interactive insight discovery requires significant manual effort. This paper fi…

    Submitted 10 March, 2025; originally announced March 2025.

  17. arXiv:2502.06194  [pdf, other]

    cs.CV

    Multimodal Task Representation Memory Bank vs. Catastrophic Forgetting in Anomaly Detection

    Authors: You Zhou, Jiangshan Zhao, Deyu Zeng, Zuo Zuo, Weixiang Liu, Zongze Wu

    Abstract: Unsupervised Continuous Anomaly Detection (UCAD) faces significant challenges in multi-task representation learning, with existing methods suffering from incomplete representation and catastrophic forgetting. Unlike supervised models, unsupervised scenarios lack prior information, making it difficult to effectively distinguish redundant and complementary multimodal features. To address this, we pr…

    Submitted 10 February, 2025; originally announced February 2025.

  18. arXiv:2502.05761  [pdf, other]

    cs.CV

    3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly

    Authors: Enquan Yang, Peng Xing, Hanyang Sun, Wenbo Guo, Yuanwei Ma, Zechao Li, Dan Zeng

    Abstract: Industrial anomaly detection achieves progress thanks to datasets such as MVTec-AD and VisA. However, they suffer from limitations in terms of the number of defect samples, types of defects, and availability of real-world scenes. These constraints inhibit researchers from further exploring the performance of industrial detection with higher accuracy. To this end, we propose a new large-scale anoma…

    Submitted 8 February, 2025; originally announced February 2025.

    Comments: Accepted by AAAI 2025; GitHub: https://github.com/EnquanYang2022/3CAD

  19. arXiv:2501.18154  [pdf, other]

    cs.CL

    Mixed-Precision Graph Neural Quantization for Low Bit Large Language Models

    Authors: Wanlong Liu, Yichen Xiao, Dingyi Zeng, Hongyang Zhao, Wenyu Chen, Malu Zhang

    Abstract: Post-Training Quantization (PTQ) is pivotal for deploying large language models (LLMs) within resource-limited settings by significantly reducing resource demands. However, existing PTQ strategies underperform at low bit levels < 3 bits due to the significant difference between the quantized and original weights. To enhance the quantization performance at low bit widths, we introduce a Mixed-preci…

    Submitted 30 January, 2025; originally announced January 2025.

    Comments: ICASSP 2025

  20. arXiv:2501.14005  [pdf, other]

    cs.CV cs.AI

    Device-aware Optical Adversarial Attack for a Portable Projector-camera System

    Authors: Ning Jiang, Yanhong Liu, Dingheng Zeng, Yue Feng, Weihong Deng, Ying Li

    Abstract: Deep-learning-based face recognition (FR) systems are susceptible to adversarial examples in both digital and physical domains. Physical attacks present a greater threat to deployed systems as adversaries can easily access the input channel, allowing them to provide malicious inputs to impersonate a victim. This paper addresses the limitations of existing projector-camera-based adversarial light a…

    Submitted 23 January, 2025; originally announced January 2025.

  21. arXiv:2501.09608  [pdf, other]

    cs.SD cs.AI cs.CV cs.IR cs.MM eess.AS

    Metric Learning with Progressive Self-Distillation for Audio-Visual Embedding Learning

    Authors: Donghuo Zeng, Kazushi Ikeda

    Abstract: Metric learning projects samples into an embedded space, where similarities and dissimilarities are quantified based on their learned representations. However, existing methods often rely on label-guided representation learning, where representations of different modalities, such as audio and visual data, are aligned based on annotated labels. This approach tends to underutilize latent complex fea…

    Submitted 16 January, 2025; originally announced January 2025.

    Comments: 5 pages, 3 figures, 2 tables. Accepted by ICASSP 2025

  22. arXiv:2501.06366  [pdf, other]

    stat.ML cs.CY cs.LG stat.ME

    Counterfactually Fair Reinforcement Learning via Sequential Data Preprocessing

    Authors: Jitao Wang, Chengchun Shi, John D. Piette, Joshua R. Loftus, Donglin Zeng, Zhenke Wu

    Abstract: When applied in healthcare, reinforcement learning (RL) seeks to dynamically match the right interventions to subjects to maximize population benefit. However, the learned policy may disproportionately allocate efficacious actions to one subpopulation, creating or exacerbating disparities in other socioeconomically-disadvantaged subgroups. These biases tend to occur in multi-stage decision making…

    Submitted 13 January, 2025; v1 submitted 10 January, 2025; originally announced January 2025.

  23. arXiv:2501.01456  [pdf, other]

    eess.IV cs.CV cs.LG

    SS-CTML: Self-Supervised Cross-Task Mutual Learning for CT Image Reconstruction

    Authors: Gaofeng Chen, Yaoduo Zhang, Li Huang, Pengfei Wang, Wenyu Zhang, Dong Zeng, Jianhua Ma, Ji He

    Abstract: Supervised deep-learning (SDL) techniques with paired training datasets have been widely studied for X-ray computed tomography (CT) image reconstruction. However, due to the difficulties of obtaining paired training datasets in clinical routine, the SDL methods are still far from common use in clinical practice. In recent years, self-supervised deep-learning (SSDL) techniques have shown great p…

    Submitted 30 December, 2024; originally announced January 2025.

  24. arXiv:2501.01164  [pdf, other]

    cs.CV

    Towards Interactive Deepfake Analysis

    Authors: Lixiong Qin, Ning Jiang, Yang Zhang, Yuhan Qiu, Dingheng Zeng, Jiani Hu, Weihong Deng

    Abstract: Existing deepfake analysis methods are primarily based on discriminative models, which significantly limit their application scenarios. This paper aims to explore interactive deepfake analysis by performing instruction tuning on multi-modal large language models (MLLMs). This will face challenges such as the lack of datasets and benchmarks, and low training efficiency. To address these issues, we…

    Submitted 2 January, 2025; originally announced January 2025.

  25. arXiv:2412.20002  [pdf, other]

    cs.CV

    Learning an Adaptive and View-Invariant Vision Transformer for Real-Time UAV Tracking

    Authors: You Wu, Yongxin Li, Mengyuan Liu, Xucheng Wang, Xiangyang Yang, Hengzhou Ye, Dan Zeng, Qijun Zhao, Shuiwang Li

    Abstract: Visual tracking has made significant strides due to the adoption of transformer-based models. Most state-of-the-art trackers struggle to meet real-time processing demands on mobile platforms with constrained computing resources, particularly for real-time unmanned aerial vehicle (UAV) tracking. To achieve a better balance between performance and efficiency, we introduce AVTrack, an adaptive comput…

    Submitted 8 May, 2025; v1 submitted 27 December, 2024; originally announced December 2024.

  26. arXiv:2412.00626  [pdf, other]

    cs.CV

    MambaNUT: Nighttime UAV Tracking via Mamba-based Adaptive Curriculum Learning

    Authors: You Wu, Xiangyang Yang, Xucheng Wang, Hengzhou Ye, Dan Zeng, Shuiwang Li

    Abstract: Harnessing low-light enhancement and domain adaptation, nighttime UAV tracking has made substantial strides. However, over-reliance on image enhancement, limited high-quality nighttime data, and a lack of integration between daytime and nighttime trackers hinder the development of an end-to-end trainable framework. Additionally, current ViT-based trackers demand heavy computational resources due t…

    Submitted 9 May, 2025; v1 submitted 30 November, 2024; originally announced December 2024.

  27. arXiv:2411.16303  [pdf, other]

    cs.LG stat.ML

    Understanding Generalization of Federated Learning: the Trade-off between Model Stability and Optimization

    Authors: Dun Zeng, Zheshun Wu, Shiyu Liu, Yu Pan, Xiaoying Tang, Zenglin Xu

    Abstract: Federated Learning (FL) is a distributed learning approach that trains machine learning models across multiple devices while keeping their local data private. However, FL often faces challenges due to data heterogeneity, leading to inconsistent local optima among clients. These inconsistencies can cause unfavorable convergence behavior and generalization performance degradation. Existing studies m…

    Submitted 5 February, 2025; v1 submitted 25 November, 2024; originally announced November 2024.

  28. arXiv:2411.07050  [pdf, other]

    eess.SP cs.AI cs.LG

    FedCVD: The First Real-World Federated Learning Benchmark on Cardiovascular Disease Data

    Authors: Yukun Zhang, Guanzhong Chen, Zenglin Xu, Jianyong Wang, Dun Zeng, Junfan Li, Jinghua Wang, Yuan Qi, Irwin King

    Abstract: Cardiovascular diseases (CVDs) are currently the leading cause of death worldwide, highlighting the critical need for early diagnosis and treatment. Machine learning (ML) methods can help diagnose CVDs early, but their performance relies on access to substantial data with high quality. However, the sensitive nature of healthcare data often restricts individual clinical institutions from sharing da…

    Submitted 27 October, 2024; originally announced November 2024.

    Comments: 10 pages, 4 figures

  29. arXiv:2410.21982  [pdf, other]

    cs.CV

    A Survey on RGB, 3D, and Multimodal Approaches for Unsupervised Industrial Image Anomaly Detection

    Authors: Yuxuan Lin, Yang Chang, Xuan Tong, Jiawen Yu, Antonio Liotta, Guofan Huang, Wei Song, Deyu Zeng, Zongze Wu, Yan Wang, Wenqiang Zhang

    Abstract: In the advancement of industrial informatization, unsupervised anomaly detection technology effectively overcomes the scarcity of abnormal samples and significantly enhances the automation and reliability of smart manufacturing. As an important branch, industrial image anomaly detection focuses on automatically identifying visual anomalies in industrial scenarios (such as product surface defects,…

    Submitted 21 March, 2025; v1 submitted 29 October, 2024; originally announced October 2024.

    Comments: Accepted by Information Fusion

  30. arXiv:2410.17802  [pdf, other]

    cs.CV cs.GR

    GenUDC: High Quality 3D Mesh Generation with Unsigned Dual Contouring Representation

    Authors: Ruowei Wang, Jiaqi Li, Dan Zeng, Xueqi Ma, Zixiang Xu, Jianwei Zhang, Qijun Zhao

    Abstract: Generating high-quality meshes with complex structures and realistic surfaces is the primary goal of 3D generative models. Existing methods typically employ sequence data or deformable tetrahedral grids for mesh generation. However, sequence-based methods have difficulty producing complex structures with many faces due to memory limits. The deformable tetrahedral grid-based method MeshDiffusion fa…

    Submitted 23 October, 2024; originally announced October 2024.

    Comments: ACM MM 2024, code: https://github.com/TrepangCat/GenUDC

  31. arXiv:2410.16711  [pdf]

    cs.CV cs.AI

    Development of CNN Architectures using Transfer Learning Methods for Medical Image Classification

    Authors: Ganga Prasad Basyal, David Zeng, Bhaskar Pm Rimal

    Abstract: The application of deep learning-based architecture has seen a tremendous rise in recent years. For example, medical image classification using deep learning achieved breakthrough results. Convolutional Neural Networks (CNNs) are implemented predominantly in medical image classification and segmentation. On the other hand, transfer learning has emerged as a prominent supporting tool for enhancing…

    Submitted 22 October, 2024; originally announced October 2024.

  32. arXiv:2410.15607  [pdf, other]

    cs.RO cs.AI

    Reinforced Imitative Trajectory Planning for Urban Automated Driving

    Authors: Di Zeng, Ling Zheng, Xiantong Yang, Yinong Li

    Abstract: Reinforcement learning (RL) faces challenges in trajectory planning for urban automated driving due to the poor convergence of RL and the difficulty in designing reward functions. The convergence problem is alleviated by combining RL with supervised learning. However, most existing approaches only reason one step ahead and lack the capability to plan for multiple future steps. Besides, although in…

    Submitted 20 October, 2024; originally announced October 2024.

    Comments: 19 pages, 9 figures

  33. arXiv:2410.11550  [pdf, other]

    cs.AI cs.CL

    Y-Mol: A Multiscale Biomedical Knowledge-Guided Large Language Model for Drug Development

    Authors: Tengfei Ma, Xuan Lin, Tianle Li, Chaoyi Li, Long Chen, Peng Zhou, Xibao Cai, Xinyu Yang, Daojian Zeng, Dongsheng Cao, Xiangxiang Zeng

    Abstract: Large Language Models (LLMs) have recently demonstrated remarkable performance in general tasks across various fields. However, their effectiveness within specific domains such as drug development remains a challenge. To address this, we introduce Y-Mol, forming a well-established LLM paradigm for the flow of drug development. Y-Mol is a multiscale biomedical knowledge-guided LLM…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: 12 pages, Under Review

  34. arXiv:2409.15636  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV

    Personalized Federated Learning via Backbone Self-Distillation

    Authors: Pengju Wang, Bochao Liu, Dan Zeng, Chenggang Yan, Shiming Ge

    Abstract: In practical scenarios, federated learning frequently necessitates training personalized models for each client using heterogeneous data. This paper proposes a backbone self-distillation approach to facilitate personalized federated learning. In this approach, each client trains its local model and only sends the backbone weights to the server. These weights are then aggregated to create a global…

    Submitted 23 September, 2024; originally announced September 2024.

    Comments: Published in ACM MMAsia 2023

  35. arXiv:2409.12384  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV

    Privacy-Preserving Student Learning with Differentially Private Data-Free Distillation

    Authors: Bochao Liu, Jianghu Lu, Pengju Wang, Junjie Zhang, Dan Zeng, Zhenxing Qian, Shiming Ge

    Abstract: Deep learning models can achieve high inference accuracy by extracting rich knowledge from massive well-annotated data, but may pose the risk of data privacy leakage in practical deployment. In this paper, we present an effective teacher-student learning approach to train privacy-preserving deep learning models via differentially private data-free distillation. The main idea is generating syntheti…

    Submitted 18 September, 2024; originally announced September 2024.

    Comments: Published by IEEE MMSP 2022

  36. arXiv:2409.09322  [pdf, other]

    cs.CL

    A Compressive Memory-based Retrieval Approach for Event Argument Extraction

    Authors: Wanlong Liu, Enqi Zhang, Li Zhou, Dingyi Zeng, Shaohuan Cheng, Chen Zhang, Malu Zhang, Wenyu Chen

    Abstract: Recent works have demonstrated the effectiveness of retrieval augmentation in the Event Argument Extraction (EAE) task. However, existing retrieval-based EAE methods have two main limitations: (1) input length constraints and (2) the gap between the retriever and the inference model. These issues limit the diversity and quality of the retrieved information. In this paper, we propose a Compressive…

    Submitted 14 September, 2024; originally announced September 2024.

    Comments: 15 pages

  37. arXiv:2409.07748  [pdf, other]

    cs.CV cs.AI cs.CL

    Top-down Activity Representation Learning for Video Question Answering

    Authors: Yanan Wang, Shuichiro Haruta, Donghuo Zeng, Julio Vizcarra, Mori Kurokawa

    Abstract: Capturing complex hierarchical human activities, from atomic actions (e.g., picking up one present, moving to the sofa, unwrapping the present) to contextual events (e.g., celebrating Christmas) is crucial for achieving high-performance video question answering (VideoQA). Recent works have expanded multimodal models (e.g., CLIP, LLaVA) to process continuous video sequences, enhancing the model's t…

    Submitted 12 September, 2024; originally announced September 2024.

    Comments: presented at MIRU2024

  38. arXiv:2409.07747  [pdf, other]

    cs.CV cs.AI cs.CL

    Multi-object event graph representation learning for Video Question Answering

    Authors: Yanan Wang, Shuichiro Haruta, Donghuo Zeng, Julio Vizcarra, Mori Kurokawa

    Abstract: Video question answering (VideoQA) is a task to predict the correct answer to questions posed about a given video. The system must comprehend spatial and temporal relationships among objects extracted from videos to perform causal and temporal reasoning. While prior works have focused on modeling individual object movements using transformer-based methods, they falter when capturing complex scenar…

    Submitted 12 September, 2024; originally announced September 2024.

    Comments: presented at MIRU2024

  39. arXiv:2409.02555  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Low-Resolution Object Recognition with Cross-Resolution Relational Contrastive Distillation

    Authors: Kangkai Zhang, Shiming Ge, Ruixin Shi, Dan Zeng

    Abstract: Recognizing objects in low-resolution images is a challenging task due to the lack of informative details. Recent studies have shown that knowledge distillation approaches can effectively transfer knowledge from a high-resolution teacher model to a low-resolution student model by aligning cross-resolution representations. However, these approaches still face limitations in adapting to the situatio…

    Submitted 4 September, 2024; originally announced September 2024.

    Comments: This paper is accepted by IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

  40. arXiv:2409.02404  [pdf, other]

    cs.LG cs.AI cs.CR

    Learning Privacy-Preserving Student Networks via Discriminative-Generative Distillation

    Authors: Shiming Ge, Bochao Liu, Pengju Wang, Yong Li, Dan Zeng

    Abstract: While deep models have proved successful in learning rich knowledge from massive well-annotated data, they may pose a privacy leakage risk in practical deployment. It is necessary to find an effective trade-off between high utility and strong privacy. In this work, we propose a discriminative-generative distillation approach to learn privacy-preserving deep models. Our key idea is taking models as…

    Submitted 3 September, 2024; originally announced September 2024.

    Comments: This paper is accepted by IEEE Transactions on Image Processing (TIP)

  41. arXiv:2408.16965  [pdf, other]

    cs.CV

    Contrastive Learning with Synthetic Positives

    Authors: Dewen Zeng, Yawen Wu, Xinrong Hu, Xiaowei Xu, Yiyu Shi

    Abstract: Contrastive learning with the nearest neighbor has proved to be one of the most efficient self-supervised learning (SSL) techniques by utilizing the similarity of multiple instances within the same class. However, its efficacy is constrained as the nearest neighbor algorithm primarily identifies "easy" positive pairs, where the representations are already closely located in the embedding space. In…

    Submitted 24 April, 2025; v1 submitted 29 August, 2024; originally announced August 2024.

    Comments: 8 pages, conference

  42. arXiv:2408.06710  [pdf, other]

    cs.LG cs.AI stat.ML

    Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling

    Authors: Jian Xu, Shian Du, Junmei Yang, Qianli Ma, Delu Zeng

    Abstract: Gaussian Process Latent Variable Models (GPLVMs) have become increasingly popular for unsupervised tasks such as dimensionality reduction and missing data recovery due to their flexibility and non-linear nature. An importance-weighted version of the Bayesian GPLVMs has been proposed to obtain a tighter variational bound. However, this version of the approach is primarily limited to analyzing simpl…

    Submitted 13 August, 2024; originally announced August 2024.

  43. arXiv:2408.06699  [pdf, other]

    cs.LG cs.AI

    Information Geometry and Beta Link for Optimizing Sparse Variational Student-t Processes

    Authors: Jian Xu, Delu Zeng, John Paisley

    Abstract: Recently, a sparse version of Student-t Processes, termed sparse variational Student-t Processes, has been proposed to enhance computational efficiency and flexibility for real-world datasets using stochastic gradient descent. However, traditional gradient descent methods like Adam may not fully exploit the parameter space geometry, potentially leading to slower convergence and suboptimal performa…

    Submitted 13 August, 2024; originally announced August 2024.

  44. arXiv:2408.06069  [pdf, other]

    cs.LG cs.AI

    Fully Bayesian Differential Gaussian Processes through Stochastic Differential Equations

    Authors: Jian Xu, Zhiqi Lin, Min Chen, Junmei Yang, Delu Zeng, John Paisley

    Abstract: Traditional deep Gaussian processes model the data evolution using a discrete hierarchy, whereas differential Gaussian processes (DIFFGPs) represent the evolution as an infinitely deep Gaussian process. However, prior DIFFGP methods often overlook the uncertainty of kernel hyperparameters and assume them to be fixed and time-invariant, failing to leverage the unique synergy between continuous-time…

    Submitted 12 August, 2024; originally announced August 2024.

  45. arXiv:2408.05889  [pdf, other]

    cs.CV

    Enhancing 3D Transformer Segmentation Model for Medical Image with Token-level Representation Learning

    Authors: Xinrong Hu, Dewen Zeng, Yawen Wu, Xueyang Li, Yiyu Shi

    Abstract: In the field of medical images, although various works find Swin Transformer has promising effectiveness on pixelwise dense prediction, whether pre-training these models without using extra datasets can further boost the performance for the downstream semantic segmentation remains unexplored. Applications of previous representation learning methods are hindered by the limited number of 3D volumes an…

    Submitted 11 August, 2024; originally announced August 2024.

  46. arXiv:2408.05705  [pdf, other]

    eess.IV cs.AI cs.CV

    TC-KANRecon: High-Quality and Accelerated MRI Reconstruction via Adaptive KAN Mechanisms and Intelligent Feature Scaling

    Authors: Ruiquan Ge, Xiao Yu, Yifei Chen, Guanyu Zhou, Fan Jia, Shenghao Zhu, Junhao Jia, Chenyan Zhang, Yifei Sun, Dong Zeng, Changmiao Wang, Qiegen Liu, Shanzhou Niu

    Abstract: Magnetic Resonance Imaging (MRI) has become essential in clinical diagnosis due to its high resolution and multiple contrast mechanisms. However, the relatively long acquisition time limits its broader application. To address this issue, this study presents an innovative conditional guided diffusion model, named TC-KANRecon, which incorporates the Multi-Free U-KAN (MF-UKAN) module and a dynamic…

    Submitted 6 January, 2025; v1 submitted 11 August, 2024; originally announced August 2024.

    Comments: 11 pages, 3 figures

  47. arXiv:2408.03746  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Flexible Bayesian Last Layer Models Using Implicit Priors and Diffusion Posterior Sampling

    Authors: Jian Xu, Zhiqi Lin, Shigui Li, Min Chen, Junmei Yang, Delu Zeng, John Paisley

    Abstract: Bayesian Last Layer (BLL) models focus solely on uncertainty in the output layer of neural networks, demonstrating comparable performance to more complex Bayesian models. However, the use of Gaussian priors for last layer weights in BLL models limits their expressive capacity when faced with non-Gaussian, outlier-rich, or high-dimensional datasets. To address this shortfall,…

    Submitted 7 August, 2024; originally announced August 2024.

  48. arXiv:2408.03247  [pdf, other]

    cs.CL cs.AI

    Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons

    Authors: Yifei Wang, Yuheng Chen, Wanting Wen, Yu Sheng, Linjing Li, Daniel Dajun Zeng

    Abstract: In this paper, we investigate whether Large Language Models (LLMs) actively recall or retrieve their internal repositories of factual knowledge when faced with reasoning tasks. Through an analysis of LLMs' internal factual recall at each reasoning step via Knowledge Neurons, we reveal that LLMs fail to harness the critical factual associations under certain circumstances. Instead, they tend to opt…

    Submitted 30 September, 2024; v1 submitted 6 August, 2024; originally announced August 2024.

  49. arXiv:2407.17033  [pdf, other]

    cs.LG cs.AI stat.ML

    Sparse Inducing Points in Deep Gaussian Processes: Enhancing Modeling with Denoising Diffusion Variational Inference

    Authors: Jian Xu, Delu Zeng, John Paisley

    Abstract: Deep Gaussian processes (DGPs) provide a robust paradigm for Bayesian deep learning. In DGPs, a set of sparse integration locations called inducing points are selected to approximate the posterior distribution of the model. This is done to reduce computational complexity and improve model efficiency. However, inferring the posterior distribution of inducing points is not straightforward. Tradition…

    Submitted 24 July, 2024; originally announced July 2024.

  50. arXiv:2407.11518  [pdf, other]

    stat.ML cs.LG stat.OT

    Ensemble Transport Filter via Optimized Maximum Mean Discrepancy

    Authors: Dengfei Zeng, Lijian Jiang

    Abstract: In this paper, we present a new ensemble-based filter method by reconstructing the analysis step of the particle filter through a transport map, which directly transports prior particles to posterior particles. The transport map is constructed through an optimization problem described by the Maximum Mean Discrepancy loss function, which matches the expectation information of the approximated poste…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 27 pages, 14 figures