Showing 1–33 of 33 results for author: Zhan, C

Searching in archive cs.
  1. arXiv:2506.13629  [pdf, ps, other]

    cs.CV

    FreeQ-Graph: Free-form Querying with Semantic Consistent Scene Graph for 3D Scene Understanding

    Authors: Chenlu Zhan, Gaoang Wang, Hongwei Wang

    Abstract: Semantic querying in complex 3D scenes through free-form language presents a significant challenge. Existing 3D scene understanding methods use large-scale training data and CLIP to align text queries with 3D semantic features. However, their reliance on predefined vocabulary priors from training data hinders free-form semantic querying. Besides, recent advanced methods rely on LLMs for scene unde…

    Submitted 16 June, 2025; originally announced June 2025.

  2. arXiv:2506.06822  [pdf, ps, other]

    cs.CV cs.AI

    Hi-LSplat: Hierarchical 3D Language Gaussian Splatting

    Authors: Chenlu Zhan, Yufei Zhang, Gaoang Wang, Hongwei Wang

    Abstract: Modeling 3D language fields with Gaussian Splatting for open-ended language queries has recently garnered increasing attention. However, recent 3DGS-based models leverage view-dependent 2D foundation models to refine 3D semantics but lack a unified 3D representation, leading to view inconsistencies. Additionally, inherent open-vocabulary challenges cause inconsistencies in object and relational de…

    Submitted 7 June, 2025; originally announced June 2025.

  3. arXiv:2504.16834  [pdf]

    cs.LG cs.AI physics.ao-ph

    Improving Significant Wave Height Prediction Using Chronos Models

    Authors: Yilin Zhai, Hongyuan Shi, Chao Zhan, Qing Wang, Zaijin You, Nan Wang

    Abstract: Accurate wave height prediction is critical for maritime safety and coastal resilience, yet conventional physics-based models and traditional machine learning methods face challenges in computational efficiency and nonlinear dynamics modeling. This study introduces Chronos, the first implementation of a large language model (LLM)-powered temporal architecture (Chronos) optimized for wave forecasti…

    Submitted 25 April, 2025; v1 submitted 23 April, 2025; originally announced April 2025.

    Comments: arXiv admin note: text overlap with arXiv:2403.07815 by other authors

  4. arXiv:2503.07956  [pdf, other]

    cs.CL cs.AI

    EFPC: Towards Efficient and Flexible Prompt Compression

    Authors: Yun-Hao Cao, Yangsong Wang, Shuzheng Hao, Zhenxing Li, Chengjun Zhan, Sichao Liu, Yi-Qi Hu

    Abstract: The emergence of large language models (LLMs) like GPT-4 has revolutionized natural language processing (NLP), enabling diverse, complex tasks. However, extensive token counts lead to high computational and financial burdens. To address this, we propose Efficient and Flexible Prompt Compression (EFPC), a novel method unifying task-aware and task-agnostic compression for a favorable accuracy-effici…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: 10 pages, 6 figures

  5. arXiv:2502.01964  [pdf, other]

    cs.NI quant-ph

    Design and Simulation of the Adaptive Continuous Entanglement Generation Protocol

    Authors: Caitao Zhan, Joaquin Chung, Allen Zang, Alexander Kolar, Rajkumar Kettimuthu

    Abstract: Generating and distributing remote entangled pairs (EPs) is a primary function of quantum networks, as entanglement is the fundamental resource for key quantum network applications. A critical performance metric for quantum networks is the time-to-serve (TTS) for users' EP requests, which is the time to distribute EPs between the requested nodes. Minimizing the TTS is essential given the limited q…

    Submitted 16 February, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

    Comments: 8 pages, 10 figures, accepted at QCNC 2025

  6. arXiv:2501.11102  [pdf, other]

    cs.CV

    RDG-GS: Relative Depth Guidance with Gaussian Splatting for Real-time Sparse-View 3D Rendering

    Authors: Chenlu Zhan, Yufei Zhang, Yu Lin, Gaoang Wang, Hongwei Wang

    Abstract: Efficiently synthesizing novel views from sparse inputs while maintaining accuracy remains a critical challenge in 3D reconstruction. While advanced techniques like radiance fields and 3D Gaussian Splatting achieve rendering quality and impressive efficiency with dense view inputs, they suffer from significant geometric reconstruction errors when applied to sparse input views. Moreover, although r…

    Submitted 19 January, 2025; originally announced January 2025.

    Comments: 24 pages, 12 figures

  7. arXiv:2411.17372  [pdf, other]

    cs.LG cs.SI

    Epidemiology-informed Graph Neural Network for Heterogeneity-aware Epidemic Forecasting

    Authors: Yufan Zheng, Wei Jiang, Alexander Zhou, Nguyen Quoc Viet Hung, Choujun Zhan, Tong Chen

    Abstract: Among various spatio-temporal prediction tasks, epidemic forecasting plays a critical role in public health management. Recent studies have demonstrated the strong potential of spatio-temporal graph neural networks (STGNNs) in extracting heterogeneous spatio-temporal patterns for epidemic forecasting. However, most of these methods bear an over-simplified assumption that two locations (e.g., citie…

    Submitted 26 November, 2024; originally announced November 2024.

    Comments: 14 pages, 6 figures, 3 tables

  8. arXiv:2411.11031  [pdf, other]

    quant-ph cs.NI

    Simulation of Entanglement-Enabled Connectivity in QLANs using SeQUeNCe

    Authors: Francesco Mazza, Caitao Zhan, Joaquin Chung, Rajkumar Kettimuthu, Marcello Caleffi, Angela Sara Cacciapuoti

    Abstract: Quantum Local Area Networks (QLANs) represent a promising building block for larger scale quantum networks with the ambitious goal -- in a long time horizon -- of realizing a Quantum Internet. Surprisingly, the physical topology of a QLAN can be enriched by a set of artificial links, enabled by shared multipartite entangled states among the nodes of the network. This novel concept of artificial to…

    Submitted 17 November, 2024; originally announced November 2024.

  9. arXiv:2410.10122  [pdf, other]

    cs.CV

    MuseTalk: Real-Time High-Fidelity Video Dubbing via Spatio-Temporal Sampling

    Authors: Yue Zhang, Zhizhou Zhong, Minhao Liu, Zhaokang Chen, Bin Wu, Yubin Zeng, Chao Zhan, Yingjie He, Junxin Huang, Wenjiang Zhou

    Abstract: Real-time video dubbing that preserves identity consistency while achieving accurate lip synchronization remains a critical challenge. Existing approaches face a trilemma: diffusion-based methods achieve high visual fidelity but suffer from prohibitive computational costs, while GAN-based solutions sacrifice lip-sync accuracy or dental details for real-time performance. We present MuseTalk, a nove…

    Submitted 26 March, 2025; v1 submitted 13 October, 2024; originally announced October 2024.

    Comments: 15 pages, 4 figures

    Report number: RV-10-16

  10. arXiv:2410.02644  [pdf, ps, other]

    cs.CR cs.AI

    Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents

    Authors: Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, Yongfeng Zhang

    Abstract: Although LLM-based agents, powered by Large Language Models (LLMs), can use external tools and memory mechanisms to solve complex real-world tasks, they may also introduce critical security vulnerabilities. However, the existing literature does not comprehensively evaluate attacks and defenses against LLM-based agents. To address this, we introduce Agent Security Bench (ASB), a comprehensive frame…

    Submitted 29 May, 2025; v1 submitted 3 October, 2024; originally announced October 2024.

    Comments: Accepted by ICLR 2025

  11. arXiv:2409.17480  [pdf, other]

    cs.AI

    What Would Happen Next? Predicting Consequences from An Event Causality Graph

    Authors: Chuanhong Zhan, Wei Xiang, Chao Liang, Bang Wang

    Abstract: The existing script event prediction task forecasts the subsequent event based on an event script chain. However, the evolution of historical events is more complicated in real-world scenarios, and the limited information provided by the event script chain also makes it difficult to accurately predict subsequent events. This paper introduces a Causality Graph Event Prediction (CGEP) task that forecasting…

    Submitted 25 September, 2024; originally announced September 2024.

  12. arXiv:2405.14672  [pdf, other]

    cs.CV

    Invisible Backdoor Attack against Self-supervised Learning

    Authors: Hanrong Zhang, Zhenting Wang, Boheng Li, Fulin Lin, Tingxu Han, Mingyu Jin, Chenlu Zhan, Mengnan Du, Hongwei Wang, Shiqing Ma

    Abstract: Self-supervised learning (SSL) models are vulnerable to backdoor attacks. Existing backdoor attacks that are effective in SSL often involve noticeable triggers, like colored patches or visible noise, which are vulnerable to human inspection. This paper proposes an imperceptible and effective backdoor attack against self-supervised models. We first find that existing imperceptible triggers designed…

    Submitted 3 April, 2025; v1 submitted 23 May, 2024; originally announced May 2024.

  13. arXiv:2405.14040  [pdf, other]

    cs.MM

    Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline

    Authors: Dingyi Yang, Chunru Zhan, Ziheng Wang, Biao Wang, Tiezheng Ge, Bo Zheng, Qin Jin

    Abstract: Video storytelling is engaging multimedia content that utilizes video and its accompanying narration to attract the audience, where a key challenge is creating narrations for recorded visual scenes. Previous studies on dense video captioning and video story generation have made some progress. However, in practical applications, we typically require synchronized narrations for ongoing visual scenes…

    Submitted 30 December, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: 15 pages, 13 figures

    Journal ref: https://aclanthology.org/2024.acl-long.513/

  14. arXiv:2405.00222  [pdf, other]

    quant-ph cs.NI

    Optimized Distribution of Entanglement Graph States in Quantum Networks

    Authors: Xiaojie Fan, Caitao Zhan, Himanshu Gupta, C. R. Ramakrishnan

    Abstract: Building large-scale quantum computers, essential to demonstrating quantum advantage, is a key challenge. Quantum Networks (QNs) can help address this challenge by enabling the construction of large, robust, and more capable quantum computing platforms by connecting smaller quantum computers. Moreover, unlike classical systems, QNs can enable fully secured long-distance communication. Thus, quantu…

    Submitted 18 March, 2025; v1 submitted 30 April, 2024; originally announced May 2024.

    Comments: 16 pages, 20 figures

  15. arXiv:2403.04290  [pdf, other]

    eess.IV cs.CV cs.LG

    MedM2G: Unifying Medical Multi-Modal Generation via Cross-Guided Diffusion with Visual Invariant

    Authors: Chenlu Zhan, Yu Lin, Gaoang Wang, Hongwei Wang, Jian Wu

    Abstract: Medical generative models, acknowledged for their high-quality sample generation ability, have accelerated the fast growth of medical applications. However, recent works concentrate on separate medical generation models for distinct medical tasks and are restricted to inadequate medical multi-modal knowledge, constraining medical comprehensive diagnosis. In this paper, we propose MedM2G, a Medical…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  16. arXiv:2312.11171  [pdf, other]

    cs.CV cs.AI

    UniDCP: Unifying Multiple Medical Vision-language Tasks via Dynamic Cross-modal Learnable Prompts

    Authors: Chenlu Zhan, Yufei Zhang, Yu Lin, Gaoang Wang, Hongwei Wang

    Abstract: Medical vision-language pre-training (Med-VLP) models have recently accelerated the fast-growing medical diagnostics application. However, most Med-VLP models learn task-specific representations independently from scratch, thereby leading to great inflexibility when they work across multiple fine-tuning tasks. In this work, we propose UniDCP, a Unified medical vision-language model with Dynamic Cr…

    Submitted 18 December, 2023; originally announced December 2023.

  17. arXiv:2307.09813  [pdf, other]

    cs.CL

    DAPrompt: Deterministic Assumption Prompt Learning for Event Causality Identification

    Authors: Wei Xiang, Chuanhong Zhan, Bang Wang

    Abstract: Event Causality Identification (ECI) aims at determining whether there is a causal relation between two event mentions. Conventional prompt learning designs a prompt template to first predict an answer word and then maps it to the final decision. Unlike conventional prompts, we argue that predicting an answer word may not be a necessary prerequisite for the ECI task. Instead, we can first make a d…

    Submitted 19 July, 2023; originally announced July 2023.

  18. arXiv:2306.14701  [pdf, other]

    cs.LG cs.AI

    Hard Sample Mining Enabled Supervised Contrastive Feature Learning for Wind Turbine Pitch System Fault Diagnosis

    Authors: Zixuan Wang, Bo Qin, Mengxuan Li, Chenlu Zhan, Mark D. Butala, Peng Peng, Hongwei Wang

    Abstract: The efficient utilization of wind power by wind turbines relies on the ability of their pitch systems to adjust blade pitch angles in response to varying wind speeds. However, the presence of multiple health conditions in the pitch system due to the long-term wear and tear poses challenges in accurately classifying them, thus increasing the maintenance cost of wind turbines or even damaging them.…

    Submitted 10 August, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

  19. arXiv:2306.07484  [pdf, other]

    cs.LG q-bio.BM

    Multi-objective Molecular Optimization for Opioid Use Disorder Treatment Using Generative Network Complex

    Authors: Hongsong Feng, Rui Wang, Chang-Guo Zhan, Guo-Wei Wei

    Abstract: Opioid Use Disorder (OUD) has emerged as a significant global public health issue, with complex multifaceted conditions. Due to the lack of effective treatment options for various conditions, there is a pressing need for the discovery of new medications. In this study, we propose a deep generative model that combines a stochastic differential equation (SDE)-based diffusion modeling with the latent…

    Submitted 12 June, 2023; originally announced June 2023.

  20. arXiv:2212.10729  [pdf, other]

    cs.CV cs.AI cs.LG

    UnICLAM: Contrastive Representation Learning with Adversarial Masking for Unified and Interpretable Medical Vision Question Answering

    Authors: Chenlu Zhan, Peng Peng, Hongsen Wang, Tao Chen, Hongwei Wang

    Abstract: Medical Visual Question Answering (Medical-VQA) aims to answer clinical questions regarding radiology images, assisting doctors with decision-making options. Nevertheless, current Medical-VQA models learn cross-modal representations through residing vision and texture encoders in dual separate spaces, which leads to indirect semantic alignment. In this paper, we propose UnICLAM, a Unified and In…

    Submitted 27 September, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  21. arXiv:2212.02871  [pdf, other]

    cs.CV

    Video Object of Interest Segmentation

    Authors: Siyuan Zhou, Chunru Zhan, Biao Wang, Tiezheng Ge, Yuning Jiang, Li Niu

    Abstract: In this work, we present a new computer vision task named video object of interest segmentation (VOIS). Given a video and a target image of interest, our objective is to simultaneously segment and track all objects in the video that are relevant to the target image. This problem combines the traditional video object segmentation task with an additional image indicating the content that users are c…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: 13 pages, 8 figures

  22. Joint Task Offloading and Resource Optimization in NOMA-based Vehicular Edge Computing: A Game-Theoretic DRL Approach

    Authors: Xincao Xu, Kai Liu, Penglin Dai, Feiyu Jin, Hualing Ren, Choujun Zhan, Songtao Guo

    Abstract: Vehicular edge computing (VEC) has become a promising paradigm for the development of emerging intelligent transportation systems. Nevertheless, the limited resources and massive transmission demands pose great challenges for implementing vehicular applications with stringent deadline requirements. This work presents a non-orthogonal multiple access (NOMA) based architecture in VEC, where heterogeneo…

    Submitted 24 October, 2022; v1 submitted 26 September, 2022; originally announced September 2022.

    Journal ref: Journal of Systems Architecture 134 (2023) 102780

  23. DeepAlloc: CNN-Based Approach to Efficient Spectrum Allocation in Shared Spectrum Systems

    Authors: Mohammad Ghaderibaneh, Caitao Zhan, Himanshu Gupta

    Abstract: Shared spectrum systems facilitate spectrum allocation to unlicensed users without harming the licensed users; they offer great promise in optimizing spectrum utility, but their management (in particular, efficient spectrum allocation to unlicensed users) is challenging. A significant shortcoming of current allocation methods is that they are either done very conservatively to ensure correctness,…

    Submitted 4 April, 2024; v1 submitted 19 January, 2022; originally announced January 2022.

    Comments: 15 pages, 16 figures

  24. arXiv:2201.01772  [pdf, other]

    cs.LG

    Neural Architecture Search for Inversion

    Authors: Cheng Zhan, Licheng Zhang, Xin Zhao, Chang-Chun Lee, Shujiao Huang

    Abstract: Over the years, people have been using deep learning to tackle inversion problems, and we see the framework has been applied to build a relationship between the recorded wavefield and velocity (Yang et al., 2016). Here we will extend the work from two perspectives. One is deriving a more appropriate loss function; as we know, pixel-2-pixel comparison might not be the best choice to characterize image struc…

    Submitted 5 January, 2022; originally announced January 2022.

  25. DeepMTL Pro: Deep Learning Based Multiple Transmitter Localization and Power Estimation

    Authors: Caitao Zhan, Mohammad Ghaderibaneh, Pranjal Sahu, Himanshu Gupta

    Abstract: In this paper, we address the problem of Multiple Transmitter Localization (MTL). The goal of MTL is to determine the locations of potential multiple transmitters in a field, based on readings from a distributed set of sensors. In contrast to the widely studied single transmitter localization problem, the MTL problem has only been studied recently in a few works. MTL is of great significance in many applicati…

    Submitted 22 March, 2022; v1 submitted 24 December, 2021; originally announced December 2021.

    Comments: 38 pages, 27 figures. This is the final revised version of a journal paper submitted to Pervasive and Mobile Computing (PMC). It is an extension of a paper accepted at the IEEE International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM 2021)

  26. Efficient Quantum Network Communication using Optimized Entanglement-Swapping Trees

    Authors: Mohammad Ghaderibaneh, Caitao Zhan, Himanshu Gupta, C. R. Ramakrishnan

    Abstract: Quantum network communication is challenging, as the no-cloning theorem in the quantum regime makes many classical techniques inapplicable. For long-distance communication, the only viable communication approach is teleportation of quantum states, which requires a prior distribution of entangled pairs (EPs) of qubits. Establishment of EPs across remote nodes can incur significant latency due to the lo…

    Submitted 4 April, 2024; v1 submitted 21 December, 2021; originally announced December 2021.

  27. arXiv:2101.04264  [pdf]

    cs.LG

    HighAir: A Hierarchical Graph Neural Network-Based Air Quality Forecasting Method

    Authors: Jiahui Xu, Ling Chen, Mingqi Lv, Chaoqun Zhan, Sanjian Chen, Jian Chang

    Abstract: Accurately forecasting air quality is critical to protecting the general public from lung and heart diseases. This is a challenging task due to the complicated interactions among distinct pollution sources and various other influencing factors. Existing air quality forecasting methods cannot effectively model the diffusion processes of air pollutants between cities and monitoring stations, which may s…

    Submitted 11 January, 2021; originally announced January 2021.

  28. arXiv:2001.09021  [pdf]

    cs.CV

    Dense Residual Network: Enhancing Global Dense Feature Flow for Character Recognition

    Authors: Zhao Zhang, Zemin Tang, Yang Wang, Zheng Zhang, Choujun Zhan, Zhengjun Zha, Meng Wang

    Abstract: Deep Convolutional Neural Networks (CNNs), such as Dense Convolutional Networks (DenseNet), have achieved great success for image representation by discovering deep hierarchical information. However, most existing networks simply stack the convolutional layers and hence fail to fully discover local and global feature information among layers. In this paper, we mainly explore how to enhance the…

    Submitted 8 February, 2021; v1 submitted 23 January, 2020; originally announced January 2020.

    Comments: Please cite this work as: Zhao Zhang, Zemin Tang, Yang Wang, Zheng Zhang, Choujun Zhan, Zhengjun Zha and Meng Wang, "Dense Residual Network: Enhancing Global Dense Feature Flow for Character Recognition," Neural Networks (NN), Feb 2021. arXiv admin note: text overlap with arXiv:1912.07016

  29. arXiv:1812.07367  [pdf]

    cs.LG stat.ML

    Deep Learning Approach in Automatic Iceberg - Ship Detection with SAR Remote Sensing Data

    Authors: Cheng Zhan, Licheng Zhang, Zhenzhen Zhong, Sher Didi-Ooi, Youzuo Lin, Yunxi Zhang, Shujiao Huang, Changchun Wang

    Abstract: Deep learning is gaining traction in the geophysics community for understanding subsurface structures, such as fault detection or salt body in seismic data. This study describes using a deep learning method for iceberg or ship recognition with synthetic aperture radar (SAR) data. Drifting icebergs pose a potential threat to activities offshore around the Arctic, including for both ship navigation and oil…

    Submitted 9 December, 2018; originally announced December 2018.

  30. arXiv:1810.07075  [pdf]

    cs.CV

    A Multi-stage Framework with Context Information Fusion Structure for Skin Lesion Segmentation

    Authors: Yujiao Tang, Feng Yang, Shaofeng Yuan, Chang'an Zhan

    Abstract: Computer-aided diagnosis (CAD) systems can greatly improve the reliability and efficiency of melanoma recognition. As a crucial step of CAD, skin lesion segmentation has unsatisfactory accuracy in existing methods due to large variability in lesion appearance and artifacts. In this work, we propose a framework employing multi-stage UNets (MS-UNet) in the auto-context scheme to segment skin…

    Submitted 16 October, 2018; originally announced October 2018.

    Comments: 4 pages, 3 figures, 1 table

  31. arXiv:1805.04364  [pdf, ps, other]

    cs.IT eess.SP

    Trajectory Design for Distributed Estimation in UAV Enabled Wireless Sensor Network

    Authors: Cheng Zhan, Yong Zeng, Rui Zhang

    Abstract: In this paper, we study an unmanned aerial vehicle (UAV)-enabled wireless sensor network, where a UAV is dispatched to collect the sensed data from distributed sensor nodes (SNs) for estimating an unknown parameter. It is revealed that in order to minimize the mean square error (MSE) for the estimation, the UAV should collect the data from as many SNs as possible, based on which an optimization pro…

    Submitted 11 May, 2018; originally announced May 2018.

    Comments: 5 pages, 4 figures, submitted for possible journal publication

  32. arXiv:1708.00221  [pdf, ps, other]

    cs.IT

    Energy-Efficient Data Collection in UAV Enabled Wireless Sensor Network

    Authors: Cheng Zhan, Yong Zeng, Rui Zhang

    Abstract: In wireless sensor networks (WSNs), utilizing the unmanned aerial vehicle (UAV) as a mobile data collector for the ground sensor nodes (SNs) is an energy-efficient technique to prolong the network lifetime. Specifically, the UAV can sequentially move close to each of the SNs when collecting data from them and thus reduce the link distance, saving the SNs' transmission energy. In this lett…

    Submitted 1 August, 2017; originally announced August 2017.

    Comments: Submitted for possible journal publication

  33. arXiv:1706.08217  [pdf, other]

    stat.ML cs.LG

    An Effective Way to Improve YouTube-8M Classification Accuracy in Google Cloud Platform

    Authors: Zhenzhen Zhong, Shujiao Huang, Cheng Zhan, Licheng Zhang, Zhiwei Xiao, Chang-Chun Wang, Pei Yang

    Abstract: Large-scale datasets have played a significant role in the progress of neural networks and deep learning. YouTube-8M is such a benchmark dataset for general multi-label video classification. It was created from over 7 million YouTube videos (450,000 hours of video) and includes video labels from a vocabulary of 4716 classes (3.4 labels/video on average). It also comes with pre-extracted audio & v…

    Submitted 25 June, 2017; originally announced June 2017.

    Comments: 5 pages, 2 figures