Showing 1–50 of 81 results for author: Yan, H

Searching in archive eess.
  1. arXiv:2507.02445  [pdf, ps, other]

    cs.CV eess.IV

    IGDNet: Zero-Shot Robust Underexposed Image Enhancement via Illumination-Guided and Denoising

    Authors: Hailong Yan, Junjian Huang, Tingwen Huang

    Abstract: Current methods for restoring underexposed images typically rely on supervised learning with paired underexposed and well-illuminated images. However, collecting such datasets is often impractical in real-world scenarios. Moreover, these methods can lead to over-enhancement, distorting well-illuminated regions. To address these issues, we propose IGDNet, a Zero-Shot enhancement method that operate…

    Submitted 3 July, 2025; originally announced July 2025.

    Comments: Submitted to IEEE Transactions on Artificial Intelligence (TAI) on Oct. 31, 2024

  2. arXiv:2506.12537  [pdf, ps, other]

    cs.CL cs.AI eess.AS

    Speech-Language Models with Decoupled Tokenizers and Multi-Token Prediction

    Authors: Xiaoran Fan, Zhichao Sun, Yangfan Gao, Jingfei Xiong, Hang Yan, Yifei Cao, Jiajun Sun, Shuo Li, Zhihao Zhang, Zhiheng Xi, Yuhao Zhou, Senjie Jin, Changhao Jiang, Junjie Ye, Ming Zhang, Rui Zheng, Zhenhua Han, Yunke Zhang, Demei Yan, Shaokang Dong, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang

    Abstract: Speech-language models (SLMs) offer a promising path toward unifying speech and text understanding and generation. However, challenges remain in achieving effective cross-modal alignment and high-quality speech generation. In this work, we systematically investigate the impact of key components (i.e., speech tokenizers, speech heads, and speaker modeling) on the performance of LLM-centric SLMs. We…

    Submitted 14 June, 2025; originally announced June 2025.

  3. arXiv:2506.11616  [pdf, ps, other]

    cs.CV eess.SP

    Wi-CBR: WiFi-based Cross-domain Behavior Recognition via Multimodal Collaborative Awareness

    Authors: Ruobei Zhang, Shengeng Tang, Huan Yan, Xiang Zhang, Richang Hong

    Abstract: WiFi-based human behavior recognition aims to recognize gestures and activities by analyzing wireless signal variations. However, existing methods typically focus on a single type of data, neglecting the interaction and fusion of multiple features. To this end, we propose a novel multimodal collaborative awareness method. By leveraging phase data reflecting changes in dynamic path length and Doppl…

    Submitted 13 June, 2025; originally announced June 2025.

  4. arXiv:2506.09377  [pdf, ps, other]

    eess.IV

    An Interpretable Two-Stage Feature Decomposition Method for Deep Learning-based SAR ATR

    Authors: Chenwei Wang, Renjie Xu, Congwen Wu, Cunyi Yin, Ziyun Liao, Deqing Mao, Sitong Zhang, Hong Yan

    Abstract: Synthetic aperture radar automatic target recognition (SAR ATR) has seen significant performance improvements with deep learning. However, the black-box nature of deep SAR ATR introduces low confidence and high risks in decision-critical SAR applications, hindering practical deployment. To address this issue, deep SAR ATR should provide an interpretable reasoning basis $r_b$ and logic $λ_w$, formi…

    Submitted 11 June, 2025; originally announced June 2025.

  5. arXiv:2505.24576  [pdf, ps, other]

    eess.AS

    A Composite Predictive-Generative Approach to Monaural Universal Speech Enhancement

    Authors: Jie Zhang, Haoyin Yan, Xiaofei Li

    Abstract: It is promising to design a single model that can suppress various distortions and improve speech quality, i.e., universal speech enhancement (USE). Compared to supervised learning-based predictive methods, diffusion-based generative models have shown greater potential due to the generative capacities from degraded speech with severely damaged information. However, artifacts may be introduced in h…

    Submitted 30 May, 2025; originally announced May 2025.

    Comments: Accepted by IEEE Transactions on Audio, Speech and Language Processing

  6. arXiv:2505.24151  [pdf]

    eess.SP

    Channel Knowledge Maps for 6G Wireless Networks: Construction, Applications, and Future Challenges

    Authors: Xingchen Liu, Shu Sun, Meixia Tao, Aryan Kaushik, Hangsong Yan

    Abstract: The advent of 6G wireless networks promises unprecedented connectivity, supporting ultra-high data rates, low latency, and massive device connectivity. However, these ambitious goals introduce significant challenges, particularly in channel estimation due to complex and dynamic propagation environments. This paper explores the concept of channel knowledge maps (CKMs) as a solution to these challen…

    Submitted 29 May, 2025; originally announced May 2025.

  7. arXiv:2505.21216  [pdf, ps, other]

    eess.SP

    CiUAV: A Multi-Task 3D Indoor Localization System for UAVs based on Channel State Information

    Authors: Cunyi Yin, Chenwei Wang, Jing Chen, Hao Jiang, Xiren Miao, Shaocong Zheng, Zhenghua Chen, Hong Yan

    Abstract: Accurate indoor positioning for unmanned aerial vehicles (UAVs) is critical for logistics, surveillance, and emergency response applications, particularly in GPS-denied environments. Existing indoor localization methods, including optical tracking, ultra-wideband, and Bluetooth-based systems, face cost, accuracy, and robustness trade-offs, limiting their practicality for UAV navigation. This paper…

    Submitted 27 May, 2025; originally announced May 2025.

  8. arXiv:2505.09521  [pdf, ps, other]

    eess.IV cs.CV

    Spec2VolCAMU-Net: A Spectrogram-to-Volume Model for EEG-to-fMRI Reconstruction based on Multi-directional Time-Frequency Convolutional Attention Encoder and Vision-Mamba U-Net

    Authors: Dongyi He, Shiyang Li, Bin Jiang, He Yan

    Abstract: High-resolution functional magnetic resonance imaging (fMRI) is essential for mapping human brain activity; however, it remains costly and logistically challenging. If comparable volumes could be generated directly from widely available scalp electroencephalography (EEG), advanced neuroimaging would become significantly more accessible. Existing EEG-to-fMRI generators rely on plain CNNs that fail…

    Submitted 14 May, 2025; originally announced May 2025.

  9. arXiv:2505.08038  [pdf, other]

    eess.SP

    Statistical CSI-Based Distributed Precoding Design for OFDM-Cooperative Multi-Satellite Systems

    Authors: Yafei Wang, Vu Nguyen Ha, Konstantinos Ntontin, Hong Yan, Wenjin Wang, Symeon Chatzinotas, Björn Ottersten

    Abstract: This paper investigates the design of distributed precoding for multi-satellite massive MIMO transmissions. We first conduct a detailed analysis of the transceiver model, in which delay and Doppler precompensation is introduced to ensure coherent transmission. In this analysis, we examine the impact of precompensation errors on the transmission model, emphasize the near-independence of inter-satel…

    Submitted 12 May, 2025; originally announced May 2025.

  10. arXiv:2503.22867  [pdf, other]

    eess.SY

    Markov Potential Game Construction and Multi-Agent Reinforcement Learning with Applications to Autonomous Driving

    Authors: Huiwen Yan, Mushuang Liu

    Abstract: Markov games (MGs) serve as the mathematical foundation for multi-agent reinforcement learning (MARL), enabling self-interested agents to learn their optimal policies while interacting with others in a shared environment. However, due to the complexities of an MG problem, seeking (Markov perfect) Nash equilibrium (NE) is often very challenging for a general-sum MG. Markov potential games (MPGs), w…

    Submitted 28 March, 2025; originally announced March 2025.

  11. arXiv:2503.18353  [pdf, other]

    eess.SY

    Contact Plan Design for Cross-Linked GNSSs: An ILP Approach for Extended Applications

    Authors: Huan Yan, Juan A. Fraire, Ziqi Yang, Kanglian Zhao, Wenfeng Li, Xiyun Hou, Haohan Li, Yuxuan Miao, Jinjun Zheng, Chengbin Kang, Huichao Zhou, Xinuo Chang, Lu Wang

    Abstract: Global Navigation Satellite Systems (GNSS) employ inter-satellite links (ISLs) to reduce dependency on ground stations, enabling precise ranging and communication across satellites. Beyond their traditional role, ISLs can support extended applications, including providing navigation and communication services to external entities. However, designing effective contact plan design (CPD) schemes for…

    Submitted 24 March, 2025; originally announced March 2025.

    Comments: 18 pages, 13 figures

  12. arXiv:2503.18340  [pdf, other]

    eess.SY

    Optimized Contact Plan Design for Reflector and Phased Array Terminals in Cislunar Space Networks

    Authors: Huan Yan, Juan A. Fraire, Ziqi Yang, Kanglian Zhao, Wenfeng Li, Yuan Fang, Jinjun Zheng, Chengbin Kang, Huichao Zhou, Xinuo Chang, Lu Wang, Linshan Xue

    Abstract: Cislunar space is emerging as a critical domain for human exploration, requiring robust infrastructure to support spatial users - spacecraft with navigation and communication demands. Deploying satellites at Earth-Moon libration points offers an effective solution. This paper introduces a novel Contact Plan Design (CPD) scheme that considers two classes of cislunar transponders: Reflector Links (R…

    Submitted 24 March, 2025; originally announced March 2025.

    Comments: 16 pages, 14 figures

  13. arXiv:2503.02261  [pdf, other]

    eess.IV cs.CV

    Volume Tells: Dual Cycle-Consistent Diffusion for 3D Fluorescence Microscopy De-noising and Super-Resolution

    Authors: Zelin Li, Chenwei Wang, Zhaoke Huang, Yiming MA, Cunmin Zhao, Zhongying Zhao, Hong Yan

    Abstract: 3D fluorescence microscopy is essential for understanding fundamental life processes through long-term live-cell imaging. However, due to inherent issues in imaging principles, it faces significant challenges including spatially varying noise and anisotropic resolution, where the axial resolution lags behind the lateral resolution up to 4.5 times. Meanwhile, laser power is kept low to maintain cel…

    Submitted 3 March, 2025; originally announced March 2025.

    Comments: Accepted at CVPR 2025

  14. arXiv:2501.13006  [pdf, other]

    eess.SP

    Terahertz Integrated Sensing Communications and Powering for 6G Wireless Networks

    Authors: Hua Yan, Yunfei Chen

    Abstract: The terahertz (THz) band has attracted significant interest for future wireless networks. In this paper, a THz integrated sensing communications and powering (THz-ISCAP) system, where sensing is leveraged to enhance communications and powering, is studied. For a given total amount of time, we aim to determine an optimal time allocation for sensing to improve the efficiency of communications and po…

    Submitted 22 January, 2025; originally announced January 2025.

    Comments: 12 pages

  15. arXiv:2501.06098  [pdf, other]

    eess.IV

    ELFATT: Efficient Linear Fast Attention for Vision Transformers

    Authors: Chong Wu, Maolin Che, Renjie Xu, Zhuoheng Ran, Hong Yan

    Abstract: The attention mechanism is the key to the success of transformers in different machine learning tasks. However, the quadratic complexity with respect to the sequence length of the vanilla softmax-based attention mechanism becomes the major bottleneck for the application of long sequence tasks, such as vision tasks. Although various efficient linear attention mechanisms have been proposed, they nee…

    Submitted 10 February, 2025; v1 submitted 10 January, 2025; originally announced January 2025.

    Comments: 22 pages, 5 figures, 13 tables

  16. arXiv:2501.00378  [pdf, other]

    eess.IV cs.CV cs.LG

    STARFormer: A Novel Spatio-Temporal Aggregation Reorganization Transformer of FMRI for Brain Disorder Diagnosis

    Authors: Wenhao Dong, Yueyang Li, Weiming Zeng, Lei Chen, Hongjie Yan, Wai Ting Siok, Nizhuan Wang

    Abstract: Many existing methods that use functional magnetic resonance imaging (fMRI) to classify brain disorders, such as autism spectrum disorder (ASD) and attention deficit hyperactivity disorder (ADHD), often overlook the integration of spatial and temporal dependencies of the blood oxygen level-dependent (BOLD) signals, which may lead to inaccurate or imprecise classification results. To solve this proble…

    Submitted 31 December, 2024; originally announced January 2025.

  17. arXiv:2412.12651  [pdf, other]

    cs.LG cs.AI eess.SP q-bio.NC

    Shared Attention-based Autoencoder with Hierarchical Fusion-based Graph Convolution Network for sEEG SOZ Identification

    Authors: Huachao Yan, Kailing Guo, Shiwei Song, Yihai Dai, Xiaoqiang Wei, Xiaofen Xing, Xiangmin Xu

    Abstract: Diagnosing seizure onset zone (SOZ) is a challenge in neurosurgery, where stereoelectroencephalography (sEEG) serves as a critical technique. In sEEG SOZ identification, the existing studies focus solely on the intra-patient representation of epileptic information, overlooking the general features of epilepsy across patients and feature interdependencies between feature elements in each contact si…

    Submitted 17 December, 2024; originally announced December 2024.

  18. arXiv:2412.00999  [pdf]

    eess.SY

    A Compact Hybrid Battery Thermal Management System for Enhanced Cooling

    Authors: Zhipeng Lyu, Jinrong Su, Zhe Li, Xiang Li, Hanghang Yan, Lei Chen

    Abstract: Hybrid battery thermal management systems (HBTMS) combining active liquid cooling and passive phase change materials (PCM) cooling have shown a potential for the thermal management of lithium-ion batteries. However, the fill volume of coolant and PCM in hybrid cooling systems is limited by the size and weight of the HBTMS at high charge/discharge rates. These limitations result in reduced convecti…

    Submitted 1 December, 2024; originally announced December 2024.

  19. arXiv:2411.17734  [pdf]

    eess.SP

    A Low-Cost Monopulse Receiver with Enhanced Estimation Accuracy Via Deep Neural Network

    Authors: Hanxiang Zhang, Saeed Zolfaghary Pour, Hao Yan, Powei Liu, Bayaner Arigong

    Abstract: In this paper, a low-cost monopulse receiver with an enhanced direction of arrival (DoA) estimation accuracy via deep neural network (DNN) is proposed. The entire system is composed of a 4-element patch array, a fully planar symmetrical monopulse comparator network, and a down conversion link. Unlike the conventional design topology, the proposed monopulse comparator network is configured by four…

    Submitted 22 November, 2024; originally announced November 2024.

  20. arXiv:2411.12448  [pdf, other]

    cs.CV eess.IV

    Large Language Models for Lossless Image Compression: Next-Pixel Prediction in Language Space is All You Need

    Authors: Kecheng Chen, Pingping Zhang, Hui Liu, Jie Liu, Yibing Liu, Jiaxin Huang, Shiqi Wang, Hong Yan, Haoliang Li

    Abstract: We have recently witnessed that "Intelligence" and "Compression" are the two sides of the same coin, where the large language model (LLM) with unprecedented intelligence is a general-purpose lossless compressor for various data modalities. This attribute particularly appeals to the lossless image compression community, given the increasing need to compress high-resolution images in the current…

    Submitted 21 November, 2024; v1 submitted 19 November, 2024; originally announced November 2024.

  21. arXiv:2409.13285  [pdf, other]

    eess.AS cs.SD eess.SP

    LiSenNet: Lightweight Sub-band and Dual-Path Modeling for Real-Time Speech Enhancement

    Authors: Haoyin Yan, Jie Zhang, Cunhang Fan, Yeping Zhou, Peiqi Liu

    Abstract: Speech enhancement (SE) aims to extract the clean waveform from noise-contaminated measurements to improve the speech quality and intelligibility. Although learning-based methods can perform much better than traditional counterparts, the large computational complexity and model size heavily limit the deployment on latency-sensitive and low-resource edge devices. In this work, we propose a lightwei…

    Submitted 20 September, 2024; originally announced September 2024.

    Comments: 5 pages, submitted to 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2025)

  22. arXiv:2409.05113  [pdf, other]

    eess.SY

    Nonlinear Cooperative Output Regulation with Input Delay Compensation

    Authors: Shiqi Zheng, Choon Ki Ahn, Xiaowei Jiang, Huaicheng Yan, Peng Shi

    Abstract: This paper investigates the cooperative output regulation (COR) of nonlinear multi-agent systems (MASs) with long input delay based on periodic event-triggered mechanism. Compared with other mechanisms, periodic event-triggered control can automatically guarantee a Zeno-free behavior and avoid the continuous monitoring of triggered conditions. First, a new periodic event-triggered distributed obse…

    Submitted 8 September, 2024; originally announced September 2024.

    Comments: Accepted by IEEE Trans. Automatic Control

  23. STANet: A Novel Spatio-Temporal Aggregation Network for Depression Classification with Small and Unbalanced FMRI Data

    Authors: Wei Zhang, Weiming Zeng, Hongyu Chen, Jie Liu, Hongjie Yan, Kaile Zhang, Ran Tao, Wai Ting Siok, Nizhuan Wang

    Abstract: Accurate diagnosis of depression is crucial for timely implementation of optimal treatments, preventing complications and reducing the risk of suicide. Traditional methods rely on self-report questionnaires and clinical assessment, lacking objective biomarkers. Combining fMRI with artificial intelligence can enhance depression diagnosis by integrating neuroimaging indicators. However, the specific…

    Submitted 28 November, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

    Comments: This paper is published in Tomography

    Journal ref: Tomography, 2024

  24. arXiv:2407.13211  [pdf]

    cs.CV eess.IV

    Research on Image Super-Resolution Reconstruction Mechanism based on Convolutional Neural Network

    Authors: Hao Yan, Zixiang Wang, Zhengjia Xu, Zhuoyue Wang, Zhizhong Wu, Ranran Lyu

    Abstract: Super-resolution reconstruction techniques entail the utilization of software algorithms to transform one or more sets of low-resolution images captured from the same scene into high-resolution images. In recent years, considerable advancement has been observed in the domain of single-image super-resolution algorithms, particularly those based on deep learning techniques. Nevertheless, the extract…

    Submitted 31 July, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

  25. Two-Path GMM-ResNet and GMM-SENet for ASV Spoofing Detection

    Authors: Zhenchun Lei, Hui Yan, Changhong Liu, Minglei Ma, Yingen Yang

    Abstract: The automatic speaker verification system is sometimes vulnerable to various spoofing attacks. The 2-class Gaussian Mixture Model classifier for genuine and spoofed speech is usually used as the baseline for spoofing detection. However, the GMM classifier does not separately consider the scores of feature frames on each Gaussian component. In addition, the GMM accumulates the scores on all frames…

    Submitted 8 July, 2024; originally announced July 2024.

  26. arXiv:2407.03135  [pdf, other]

    cs.SD cs.AI cs.HC eess.AS

    GMM-ResNext: Combining Generative and Discriminative Models for Speaker Verification

    Authors: Hui Yan, Zhenchun Lei, Changhong Liu, Yong Zhou

    Abstract: With the development of deep learning, many different network architectures have been explored in speaker verification. However, most network architectures rely on a single deep learning architecture, and hybrid networks combining different architectures have been little studied in ASV tasks. In this paper, we propose the GMM-ResNext model for speaker verification. Conventional GMM does not consid…

    Submitted 3 July, 2024; originally announced July 2024.

  27. GMM-ResNet2: Ensemble of Group ResNet Networks for Synthetic Speech Detection

    Authors: Zhenchun Lei, Hui Yan, Changhong Liu, Yong Zhou, Minglei Ma

    Abstract: Deep learning models are widely used for speaker recognition and spoofing speech detection. We propose the GMM-ResNet2 for synthetic speech detection. Compared with the previous GMM-ResNet model, GMM-ResNet2 has four improvements. Firstly, the different order GMMs have different capabilities to form smooth approximations to the feature distribution, and multiple GMMs are used to extract multi-scal…

    Submitted 2 July, 2024; originally announced July 2024.

  28. arXiv:2407.02052  [pdf, other]

    eess.AS cs.SD

    The USTC-NERCSLIP Systems for The ICMC-ASR Challenge

    Authors: Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, Jintao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang

    Abstract: This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case. We implement the front-end speaker diarization using the self-supervised learning representation based multi-speaker embedding and beamforming using the speaker position,…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted at ICASSP 2024

  29. arXiv:2406.12707  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction

    Authors: Haoqiu Yan, Yongxin Zhu, Kai Zheng, Bing Liu, Haoyu Cao, Deqiang Jiang, Linli Xu

    Abstract: Large Language Model (LLM)-enhanced agents become increasingly prevalent in Human-AI communication, offering vast potential from entertainment to professional domains. However, current multi-modal dialogue systems overlook the acoustic information present in speech, which is crucial for understanding human communication nuances. This oversight can lead to misinterpretations of speakers' intentions…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 9 pages, 3 figures, ACL24 accepted

  30. arXiv:2406.11799  [pdf, other]

    eess.IV cs.CV cs.LG

    Mix-Domain Contrastive Learning for Unpaired H&E-to-IHC Stain Translation

    Authors: Song Wang, Zhong Zhang, Huan Yan, Ming Xu, Guanghui Wang

    Abstract: H&E-to-IHC stain translation techniques offer a promising solution for precise cancer diagnosis, especially in low-resource regions where there is a shortage of health professionals and limited access to expensive equipment. Considering the pixel-level misalignment of H&E-IHC image pairs, current research explores the pathological consistency between patches from the same positions of the image pa…

    Submitted 30 August, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  31. arXiv:2405.11352  [pdf, other]

    cs.NI eess.SP

    Hierarchical Reinforcement Learning Empowered Task Offloading in V2I Networks

    Authors: Xinyu You, Haojie Yan, Yuedong Xu, Lifeng Wang, Liangui Dai

    Abstract: Edge computing plays an essential role in vehicle-to-infrastructure (V2I) networks, where vehicles offload their intensive computation tasks to the road-side units to save energy and reduce latency. This paper designs the optimal task offloading policy to address the concerns involving processing delay, energy consumption and edge computing cost. Each computation task consisting of some…

    Submitted 18 May, 2024; originally announced May 2024.

  32. arXiv:2403.06460  [pdf, other]

    eess.SP

    RIS-Enabled Joint Near-Field 3D Localization and Synchronization in SISO Multipath Environments

    Authors: Han Yan, Hua Chen, Wei Liu, Songjie Yang, Gang Wang, Chau Yuen

    Abstract: Reconfigurable Intelligent Surfaces (RIS) show great promise in the realm of 6th generation (6G) wireless systems, particularly in the areas of localization and communication. Their cost-effectiveness and energy efficiency enable the integration of numerous passive and reflective elements, enabling near-field propagation. In this paper, we tackle the challenges of RIS-aided 3D localization and syn…

    Submitted 11 March, 2024; originally announced March 2024.

  33. arXiv:2403.04145   

    eess.SY

    A Crosstalk-Aware Timing Prediction Method in Routing

    Authors: Leilei Jin, Jiajie Xu, Wenjie Fu, Hao Yan, Longxing Shi

    Abstract: With shrinking interconnect spacing in advanced technology nodes, existing timing predictions become less precise due to the challenging quantification of crosstalk-induced delay. During the routing, the crosstalk effect is typically modeled by predicting coupling capacitance with congestion information. However, the timing estimation tends to be overly pessimistic, as the crosstalk-induced delay…

    Submitted 9 April, 2025; v1 submitted 6 March, 2024; originally announced March 2024.

    Comments: I would like to withdraw my submission because the work is incomplete. In particular, the calculations related to crosstalk need to be reconsidered. I plan to revise and improve the manuscript further

    ACM Class: I.6.4; B.7.3

  34. arXiv:2401.11675  [pdf, ps, other]

    eess.IV

    ATFusion: An Alternate Cross-Attention Transformer Network for Infrared and Visible Image Fusion

    Authors: Han Yan, Songlei Xiong, Long Wang, Lihua Jian, Gemine Vivone

    Abstract: The fusion of infrared and visible images is essential in remote sensing applications, as it combines the thermal information of infrared images with the detailed texture of visible images for more accurate analysis in tasks like environmental monitoring, target detection, and disaster management. The current fusion methods based on Transformer techniques for infrared and visible (IV) images have…

    Submitted 17 June, 2025; v1 submitted 21 January, 2024; originally announced January 2024.

    Comments: v2 Update: Enhanced version with improved model stability (new augmentation/regularization), added benchmarks (SwinFuse/AEFusion/LRRNet, FMI/SSIM), and RGB-NIR dataset. Updated authorship (H.Yan first, L.Jian corresponding). Under review at Infrared Physics & Technology. Original: arXiv:2401.11675v1

  35. arXiv:2310.09937  [pdf, other]

    eess.IV eess.SP

    Joint Sparse Representations and Coupled Dictionary Learning in Multi-Source Heterogeneous Image Pseudo-color Fusion

    Authors: Long Bai, Shilong Yao, Kun Gao, Yanjun Huang, Ruijie Tang, Hong Yan, Max Q. -H. Meng, Hongliang Ren

    Abstract: Considering that Coupled Dictionary Learning (CDL) method can obtain a reasonable linear mathematical relationship between resource images, we propose a novel CDL-based Synthetic Aperture Radar (SAR) and multispectral pseudo-color fusion method. Firstly, the traditional Brovey transform is employed as a pre-processing method on the paired SAR and multispectral images. Then, CDL is used to capture…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: To appear in IEEE Sensors Journal

  36. arXiv:2306.10275  [pdf, other]

    eess.SY cs.AI cs.LG

    Multi-Scale Simulation of Complex Systems: A Perspective of Integrating Knowledge and Data

    Authors: Huandong Wang, Huan Yan, Can Rong, Yuan Yuan, Fenyu Jiang, Zhenyu Han, Hongjie Sui, Depeng Jin, Yong Li

    Abstract: Complex system simulation has been playing an irreplaceable role in understanding, predicting, and controlling diverse complex systems. In the past few decades, the multi-scale simulation technique has drawn increasing attention for its remarkable ability to overcome the challenges of complex system simulation with unknown mechanisms and expensive computational costs. In this survey, we will syste…

    Submitted 23 July, 2024; v1 submitted 17 June, 2023; originally announced June 2023.

    ACM Class: I.2; I.6; J.2; J.4

  37. arXiv:2303.06543  [pdf, other]

    cs.CV eess.IV

    MetaUE: Model-based Meta-learning for Underwater Image Enhancement

    Authors: Zhenwei Zhang, Haorui Yan, Ke Tang, Yuping Duan

    Abstract: The challenges in recovering underwater images are the presence of diverse degradation factors and the lack of ground truth images. Although synthetic underwater image pairs can be used to overcome the problem of inadequately observing data, it may result in over-fitting and enhancement degradation. This paper proposes a model-based deep learning method for restoring clean images under various und…

    Submitted 11 March, 2023; originally announced March 2023.

  38. arXiv:2302.13251  [pdf, other]

    eess.IV cs.CV cs.LG

    Unsupervised Domain Adaptation for Low-dose CT Reconstruction via Bayesian Uncertainty Alignment

    Authors: Kecheng Chen, Jie Liu, Renjie Wan, Victor Ho-Fun Lee, Varut Vardhanabhuti, Hong Yan, Haoliang Li

    Abstract: Low-dose computed tomography (LDCT) image reconstruction techniques can reduce patient radiation exposure while maintaining acceptable imaging quality. Deep learning is widely used in this problem, but the performance of testing data (a.k.a. target domain) is often degraded in clinical scenarios due to the variations that were not encountered in training data (a.k.a. source domain). Unsupervised d…

    Submitted 2 June, 2024; v1 submitted 26 February, 2023; originally announced February 2023.

    Comments: Accepted by IEEE Transactions on Neural Networks and Learning Systems

  39. arXiv:2211.02419  [pdf, other]

    eess.IV cs.CV cs.LG

    High-Resolution Boundary Detection for Medical Image Segmentation with Piece-Wise Two-Sample T-Test Augmented Loss

    Authors: Yucong Lin, Jinhua Su, Yuhang Li, Yuhao Wei, Hanchao Yan, Saining Zhang, Jiaan Luo, Danni Ai, Hong Song, Jingfan Fan, Tianyu Fu, Deqiang Xiao, Feifei Wang, Jue Hou, Jian Yang

    Abstract: Deep learning methods have contributed substantially to the rapid advancement of medical image segmentation, the quality of which relies on the suitable design of loss functions. Popular loss functions, including the cross-entropy and dice losses, often fall short of boundary detection, thereby limiting high-resolution downstream applications such as automated diagnoses and procedures. We develope…

    Submitted 4 November, 2022; originally announced November 2022.

  40. arXiv:2210.03883  [pdf, other]

    cs.CV eess.IV

    Rethinking the Detection Head Configuration for Traffic Object Detection

    Authors: Yi Shi, Jiang Wu, Shixuan Zhao, Gangyao Gao, Tao Deng, Hongmei Yan

    Abstract: Multi-scale detection plays an important role in object detection models. However, researchers are often unsure how to reasonably configure detection heads combining multi-scale features at different input resolutions. We find that there are different matching relationships between the object distribution and the detection head at different input resolutions. Based on the instructive findings…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: 26 pages, 4 figures, 7 tables

  41. arXiv:2209.12764  [pdf, other]

    eess.IV

    Graph Neural Network and Superpixel Based Brain Tissue Segmentation (Corrected Version)

    Authors: Chong Wu, Zhenan Feng, Houwang Zhang, Hong Yan

    Abstract: Convolutional neural networks (CNNs) are usually used as a backbone to design methods in biomedical image segmentation. However, the limitation of receptive field and large number of parameters limit the performance of these methods. In this paper, we propose a graph neural network (GNN) based method named GNN-SEG for the segmentation of brain tissues. Different to conventional CNN based methods,…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: The original version of this paper was accepted and presented at 2022 International Joint Conference on Neural Networks (IJCNN). This version corrects the mistakes in Figs. 7 and 8

  42. arXiv:2208.08855  [pdf, other]

    eess.SP stat.AP stat.ME

    Adaptive Partially-Observed Sequential Change Detection and Isolation

    Authors: Xinyu Zhao, Jiuyun Hu, Yajun Mei, Hao Yan

    Abstract: High-dimensional data has become popular due to the easy accessibility of sensors in modern industrial applications. However, one specific challenge is that it is often not easy to obtain complete measurements due to limited sensing power and resource constraints. Furthermore, distinct failure patterns may exist in the systems, and it is necessary to identify the true failure pattern. This work f…

    Submitted 25 August, 2022; v1 submitted 9 August, 2022; originally announced August 2022.

    Comments: Accepted in Technometrics

  43. arXiv:2205.03599  [pdf, other]

    eess.IV cs.CV

    GAN-Based Multi-View Video Coding with Spatio-Temporal EPI Reconstruction

    Authors: Chengdong Lan, Hao Yan, Cheng Luo, Tiesong Zhao

    Abstract: The introduction of multiple viewpoints in video scenes inevitably increases the bitrates required for storage and transmission. To reduce bitrates, researchers have developed methods to skip intermediate viewpoints during compression and delivery, and ultimately reconstruct them using Side Information (SI). Typically, depth maps are used to construct SI. However, these methods suffer from inaccur…

    Submitted 5 May, 2023; v1 submitted 7 May, 2022; originally announced May 2022.

  44. arXiv:2201.04397  [pdf, other]

    eess.IV cs.CV cs.LG

    Towards Adversarially Robust Deep Image Denoising

    Authors: Hanshu Yan, Jingfeng Zhang, Jiashi Feng, Masashi Sugiyama, Vincent Y. F. Tan

    Abstract: This work systematically investigates the adversarial robustness of deep image denoisers (DIDs), i.e., how well DIDs can recover the ground truth from noisy observations degraded by adversarial perturbations. Firstly, to evaluate DIDs' robustness, we propose a novel adversarial attack, namely Observation-based Zero-mean Attack (ObsAtk), to craft adversarial zero-mean perturbations on given no…

    Submitted 13 January, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

  45. arXiv:2201.02727  [pdf, other]

    eess.SP eess.SY

    Multi-Mode Spatial Signal Processor with Rainbow-like Fast Beam Training and Wideband Communications using True-Time-Delay Arrays

    Authors: Chung-Ching Lin, Chase Puglisi, Veljko Boljanovic, Han Yan, Erfan Ghaderi, Jayce Gaddis, Qiuyan Xu, Sreeni Poolakkal, Danijela Cabric, Subhanshu Gupta

    Abstract: Initial access in millimeter-wave (mmW) wireless is critical to the successful realization of fifth-generation (5G) wireless networks and beyond. Limited bandwidth in existing standards and use of phase-shifters in analog/hybrid phased-antenna arrays (PAA) are not suited for these emerging standards demanding low-latency direction finding. This work proposes a reconfigurable true-time-delay (T…

    Submitted 7 January, 2022; originally announced January 2022.

  46. arXiv:2111.15191  [pdf, other]

    eess.SP eess.SY

    Wideband Beamforming with Rainbow Beam Training using Reconfigurable True-Time-Delay Arrays for Millimeter-Wave Wireless

    Authors: Chung-Ching Lin, Veljko Boljanovic, Han Yan, Erfan Ghaderi, Mohammad Ali Mokri, Jayce Jeron Gaddis, Aditya Wadaskar, Chase Puglisi, Soumen Mohapatra, Qiuyan Xu, Sreeni Poolakkal, Deukhyoun Heo, Subhanshu Gupta, Danijela Cabric

    Abstract: Decade-long research in integrated true-time-delay arrays has seen organic growth, enabling the realization of wideband beamformers for large arrays with wide aperture widths. This article introduces highly reconfigurable delay elements implementable at analog or digital baseband that enable multiple SSP functions including wideband beamforming, wideband interference cancellation, and fast beam train…

    Submitted 30 November, 2021; originally announced November 2021.

  47. arXiv:2109.09317  [pdf, other]

    cs.LG cs.AI eess.SP

    Deep Spatio-temporal Sparse Decomposition for Trend Prediction and Anomaly Detection in Cardiac Electrical Conduction

    Authors: Xinyu Zhao, Hao Yan, Zhiyong Hu, Dongping Du

    Abstract: Electrical conduction in cardiac tissue is commonly modeled with partial differential equations, i.e., the reaction-diffusion equation, where the reaction term describes cellular stimulation and the diffusion term describes electrical propagation. Detecting and identifying cardiac cells that produce abnormal electrical impulses in such nonlinear dynamic systems is important for efficient treatment…

    Submitted 20 September, 2021; originally announced September 2021.

  48. Rainbow-link: Beam-Alignment-Free and Grant-Free mmW Multiple Access using True-Time-Delay Array

    Authors: Ruifu Li, Han Yan, Danijela Cabric

    Abstract: Millimeter-wave (mmW) communication is a key enabling technology in 5G for providing ultra-high throughput. Current mmW technologies rely on analog phased arrays to realize beamforming gain and overcome high path loss. However, due to the limited number of simultaneous beams that can be created with analog/hybrid phased antenna arrays, the overheads of beam training and beam scheduling become a bo…

    Submitted 15 January, 2022; v1 submitted 21 August, 2021; originally announced September 2021.

    Comments: Accepted for JSAC issue on Next Generation Multiple Access (NGMA). Available on IEEE Xplore JSAC Early Access

  49. arXiv:2106.01255  [pdf, other]

    eess.SP eess.SY

    A 4-Element 800MHz-BW 29mW True-Time-Delay Spatial Signal Processor Enabling Fast Beam-Training with Data Communications

    Authors: Chung-Ching Lin, Chase Puglisi, Veljko Boljanovic, Soumen Mohapatra, Han Yan, Erfan Ghaderi, Deukhyoun Heo, Danijela Cabric, Subhanshu Gupta

    Abstract: Spatial signal processors (SSP) for emerging millimeter-wave wireless networks are critically dependent on link discovery. To avoid loss in communication, mobile devices need to locate narrow directional beams with millisecond latency. In this work, we demonstrate a true-time-delay (TTD) array with digitally reconfigurable delay elements enabling both fast beam-training at the receiver with wideba…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: to be presented at the IEEE European Solid-State Circuits Conference in September 2021

  50. arXiv:2105.15096  [pdf, ps, other]

    cs.NI cs.ET eess.SP

    Small-Scale Spatial-Temporal Correlation and Degrees of Freedom for Reconfigurable Intelligent Surfaces

    Authors: Shu Sun, Hangsong Yan

    Abstract: The reconfigurable intelligent surface (RIS) is an emerging and promising candidate technology for future wireless networks, where the element spacing is usually sub-wavelength. Only limited knowledge, however, has been gained about the spatial-temporal correlation behavior among the elements in an RIS. In this paper, we investigate the spatial-temporal correlation for an RIS-enabled wireless commu…

    Submitted 13 September, 2021; v1 submitted 1 April, 2021; originally announced May 2021.

    Comments: 5 pages, 7 figures, IEEE Wireless Communications Letters, DOI:10.1109/LWC.2021.3112781