Search | arXiv e-print repository

doi 10.1002/aisy.202500425

Real-Time Guidewire Tip Tracking Using a Siamese Network for Image-Guided Endovascular Procedures

Authors: Tianliang Yao, Zhiqiang Pei, Yong Li, Yixuan Yuan, Peng Qi

Abstract: An ever-growing incorporation of AI solutions into clinical practices enhances the efficiency and effectiveness of healthcare services. This paper focuses on guidewire tip tracking tasks during image-guided therapy for cardiovascular diseases, aiding physicians in improving diagnostic and therapeutic quality. A novel tracking framework based on a Siamese network with dual attention mechanisms comb… ▽ More An ever-growing incorporation of AI solutions into clinical practices enhances the efficiency and effectiveness of healthcare services. This paper focuses on guidewire tip tracking tasks during image-guided therapy for cardiovascular diseases, aiding physicians in improving diagnostic and therapeutic quality. A novel tracking framework based on a Siamese network with dual attention mechanisms combines self- and cross-attention strategies for robust guidewire tip tracking. This design handles visual ambiguities, tissue deformations, and imaging artifacts through enhanced spatial-temporal feature learning. Validation occurred on 3 randomly selected clinical digital subtraction angiography (DSA) sequences from a dataset of 15 sequences, covering multiple interventional scenarios. The results indicate a mean localization error of 0.421 $\pm$ 0.138 mm, with a maximum error of 1.736 mm, and a mean Intersection over Union (IoU) of 0.782. The framework maintains an average processing speed of 57.2 frames per second, meeting the temporal demands of endovascular imaging. Further validations with robotic platforms for automating diagnostics and therapies in clinical routines yielded tracking errors of 0.708 $\pm$ 0.695 mm and 0.148 $\pm$ 0.057 mm in two distinct experimental scenarios. △ Less

Submitted 24 June, 2025; originally announced July 2025.

Comments: This paper has been accepted by Advanced Intelligent Systems

arXiv:2505.24351 [pdf, ps, other]

A Novel Coronary Artery Registration Method Based on Super-pixel Particle Swarm Optimization

Authors: Peng Qi, Wenxi Qu, Tianliang Yao, Haonan Ma, Dylan Wintle, Yinyi Lai, Giorgos Papanastasiou, Chengjia Wang

Abstract: Percutaneous Coronary Intervention (PCI) is a minimally invasive procedure that improves coronary blood flow and treats coronary artery disease. Although PCI typically requires 2D X-ray angiography (XRA) to guide catheter placement at real-time, computed tomography angiography (CTA) may substantially improve PCI by providing precise information of 3D vascular anatomy and status. To leverage real-t… ▽ More Percutaneous Coronary Intervention (PCI) is a minimally invasive procedure that improves coronary blood flow and treats coronary artery disease. Although PCI typically requires 2D X-ray angiography (XRA) to guide catheter placement at real-time, computed tomography angiography (CTA) may substantially improve PCI by providing precise information of 3D vascular anatomy and status. To leverage real-time XRA and detailed 3D CTA anatomy for PCI, accurate multimodal image registration of XRA and CTA is required, to guide the procedure and avoid complications. This is a challenging process as it requires registration of images from different geometrical modalities (2D -> 3D and vice versa), with variations in contrast and noise levels. In this paper, we propose a novel multimodal coronary artery image registration method based on a swarm optimization algorithm, which effectively addresses challenges such as large deformations, low contrast, and noise across these imaging modalities. Our algorithm consists of two main modules: 1) preprocessing of XRA and CTA images separately, and 2) a registration module based on feature extraction using the Steger and Superpixel Particle Swarm Optimization algorithms. Our technique was evaluated on a pilot dataset of 28 pairs of XRA and CTA images from 10 patients who underwent PCI. The algorithm was compared with four state-of-the-art (SOTA) methods in terms of registration accuracy, robustness, and efficiency. Our method outperformed the selected SOTA baselines in all aspects. Experimental results demonstrate the significant effectiveness of our algorithm, surpassing the previous benchmarks and proposes a novel clinical approach that can potentially have merit for improving patient outcomes in coronary artery disease. △ Less

Submitted 30 May, 2025; originally announced May 2025.

arXiv:2504.07399 [pdf, other]

WK-Pnet: FM-Based Positioning via Wavelet Packet Decomposition and Knowledge Distillation

Authors: Shilian Zheng, Quan Lin, Peihan Qi, Luxin Zhang, Xinjiang Qiu, Zhijin Zhao, Xiaoniu Yang

Abstract: Accurate and efficient positioning in complex environments is critical for applications where traditional satellite-based systems face limitations, such as indoors or urban canyons. This paper introduces WK-Pnet, an FM-based indoor positioning framework that combines wavelet packet decomposition (WPD) and knowledge distillation. WK-Pnet leverages WPD to extract rich time-frequency features from FM… ▽ More Accurate and efficient positioning in complex environments is critical for applications where traditional satellite-based systems face limitations, such as indoors or urban canyons. This paper introduces WK-Pnet, an FM-based indoor positioning framework that combines wavelet packet decomposition (WPD) and knowledge distillation. WK-Pnet leverages WPD to extract rich time-frequency features from FM signals, which are then processed by a deep learning model for precise position estimation. To address computational demands, we employ knowledge distillation, transferring insights from a high-capacity model to a streamlined student model, achieving substantial reductions in complexity without sacrificing accuracy. Experimental results across diverse environments validate WK-Pnet's superior positioning accuracy and lower computational requirements, making it a viable solution for positioning in real-time resource-constraint applications. △ Less

Submitted 9 April, 2025; originally announced April 2025.

arXiv:2503.13480 [pdf, other]

WVEmbs with its Masking: A Method For Radar Signal Sorting

Authors: Xianan Hu, Fu Li, Kairui Niu, Peihan Qi, Zhiyong Liang

Abstract: Our study proposes a novel embedding method, Wide-Value-Embeddings (WVEmbs), for processing Pulse Descriptor Words (PDWs) as normalized inputs to neural networks. This method adapts to the distribution of interleaved radar signals, ranking original signal features from trivial to useful and stabilizing the learning process. To address the imbalance in radar signal interleaving, we introduce a valu… ▽ More Our study proposes a novel embedding method, Wide-Value-Embeddings (WVEmbs), for processing Pulse Descriptor Words (PDWs) as normalized inputs to neural networks. This method adapts to the distribution of interleaved radar signals, ranking original signal features from trivial to useful and stabilizing the learning process. To address the imbalance in radar signal interleaving, we introduce a value dimension masking method on WVEmbs, which automatically and efficiently generates challenging samples, and constructs interleaving scenarios, thereby compelling the model to learn robust features. Experimental results demonstrate that our method is an efficient end-to-end approach, achieving high-granularity, sample-level pulse sorting for high-density interleaved radar pulse sequences in complex and non-ideal environments. △ Less

Submitted 5 March, 2025; originally announced March 2025.

arXiv:2503.06816 [pdf, other]

Semi-Supervised Medical Image Segmentation via Knowledge Mining from Large Models

Authors: Yuchen Mao, Hongwei Li, Yinyi Lai, Giorgos Papanastasiou, Peng Qi, Yunjie Yang, Chengjia Wang

Abstract: Large-scale vision models like SAM have extensive visual knowledge, yet their general nature and computational demands limit their use in specialized tasks like medical image segmentation. In contrast, task-specific models such as U-Net++ often underperform due to sparse labeled data. This study introduces a strategic knowledge mining method that leverages SAM's broad understanding to boost the pe… ▽ More Large-scale vision models like SAM have extensive visual knowledge, yet their general nature and computational demands limit their use in specialized tasks like medical image segmentation. In contrast, task-specific models such as U-Net++ often underperform due to sparse labeled data. This study introduces a strategic knowledge mining method that leverages SAM's broad understanding to boost the performance of small, locally hosted deep learning models. In our approach, we trained a U-Net++ model on a limited labeled dataset and extend its capabilities by converting SAM's output infered on unlabeled images into prompts. This process not only harnesses SAM's generalized visual knowledge but also iteratively improves SAM's prediction to cater specialized medical segmentation tasks via U-Net++. The mined knowledge, serving as "pseudo labels", enriches the training dataset, enabling the fine-tuning of the local network. Applied to the Kvasir SEG and COVID-QU-Ex datasets which consist of gastrointestinal polyp and lung X-ray images respectively, our proposed method consistently enhanced the segmentation performance on Dice by 3% and 1% respectively over the baseline U-Net++ model, when the same amount of labelled data were used during training (75% and 50% of labelled data). Remarkably, our proposed method surpassed the baseline U-Net++ model even when the latter was trained exclusively on labeled data (100% of labelled data). These results underscore the potential of knowledge mining to overcome data limitations in specialized models by leveraging the broad, albeit general, knowledge of large-scale models like SAM, all while maintaining operational efficiency essential for clinical applications. △ Less

Submitted 9 March, 2025; originally announced March 2025.

Comments: 18 pages, 2 figures

arXiv:2303.03093 [pdf, other]

A Miniaturised Camera-based Multi-Modal Tactile Sensor

Authors: Kaspar Althoefer, Yonggen Ling, Wanlin Li, Xinyuan Qian, Wang Wei Lee, Peng Qi

Abstract: In conjunction with huge recent progress in camera and computer vision technology, camera-based sensors have increasingly shown considerable promise in relation to tactile sensing. In comparison to competing technologies (be they resistive, capacitive or magnetic based), they offer super-high-resolution, while suffering from fewer wiring problems. The human tactile system is composed of various ty… ▽ More In conjunction with huge recent progress in camera and computer vision technology, camera-based sensors have increasingly shown considerable promise in relation to tactile sensing. In comparison to competing technologies (be they resistive, capacitive or magnetic based), they offer super-high-resolution, while suffering from fewer wiring problems. The human tactile system is composed of various types of mechanoreceptors, each able to perceive and process distinct information such as force, pressure, texture, etc. Camera-based tactile sensors such as GelSight mainly focus on high-resolution geometric sensing on a flat surface, and their force measurement capabilities are limited by the hysteresis and non-linearity of the silicone material. In this paper, we present a miniaturised dome-shaped camera-based tactile sensor that allows accurate force and tactile sensing in a single coherent system. The key novelty of the sensor design is as follows. First, we demonstrate how to build a smooth silicone hemispheric sensing medium with uniform markers on its curved surface. Second, we enhance the illumination of the rounded silicone with diffused LEDs. Third, we construct a force-sensitive mechanical structure in a compact form factor with usage of springs to accurately perceive forces. Our multi-modal sensor is able to acquire tactile information from multi-axis forces, local force distribution, and contact geometry, all in real-time. We apply an end-to-end deep learning method to process all the information. △ Less

Submitted 6 March, 2023; originally announced March 2023.

arXiv:2108.02317 [pdf]

Efficient Fourier single-pixel imaging with Gaussian random sampling

Authors: Ziheng Qiu, Xinyi Guo, Tianao Lu, Pan Qi, Zibang Zhang, Jingang Zhong

Abstract: Fourier single-pixel imaging (FSI) is a branch of single-pixel imaging techniques. It uses Fourier basis patterns as structured patterns for spatial information acquisition in the Fourier domain. However, the spatial resolution of the image reconstructed by FSI mainly depends on the number of Fourier coefficients sampled. The reconstruction of a high-resolution image typically requires a number of… ▽ More Fourier single-pixel imaging (FSI) is a branch of single-pixel imaging techniques. It uses Fourier basis patterns as structured patterns for spatial information acquisition in the Fourier domain. However, the spatial resolution of the image reconstructed by FSI mainly depends on the number of Fourier coefficients sampled. The reconstruction of a high-resolution image typically requires a number of Fourier coefficients to be sampled, and therefore takes a long data acquisition time. Here we propose a new sampling strategy for FSI. It allows FSI to reconstruct a clear and sharp image with a reduced number of measurements. The core of the proposed sampling strategy is to perform a variable density sampling in the Fourier space and, more importantly, the density with respect to the importance of Fourier coefficients is subject to a one-dimensional Gaussian function. Combined with compressive sensing, the proposed sampling strategy enables better reconstruction quality than conventional sampling strategies, especially when the sampling ratio is low. We experimentally demonstrate compressive FSI combined with the proposed sampling strategy is able to reconstruct a sharp and clear image of 256-by-256 pixels with a sampling ratio of 10%. The proposed method enables fast single-pixel imaging and provides a new approach for efficient spatial information acquisition. △ Less

Submitted 28 June, 2021; originally announced August 2021.

arXiv:2106.10401 [pdf]

Parallel frequency function-deep neural network for efficient complex broadband signal approximation

Authors: Zhi Zeng, Pengpeng Shi, Fulei Ma, Peihan Qi

Abstract: A neural network is essentially a high-dimensional complex mapping model by adjusting network weights for feature fitting. However, the spectral bias in network training leads to unbearable training epochs for fitting the high-frequency components in broadband signals. To improve the fitting efficiency of high-frequency components, the PhaseDNN was proposed recently by combining complex frequency… ▽ More A neural network is essentially a high-dimensional complex mapping model by adjusting network weights for feature fitting. However, the spectral bias in network training leads to unbearable training epochs for fitting the high-frequency components in broadband signals. To improve the fitting efficiency of high-frequency components, the PhaseDNN was proposed recently by combining complex frequency band extraction and frequency shift techniques [Cai et al. SIAM J. SCI. COMPUT. 42, A3285 (2020)]. Our paper is devoted to an alternative candidate for fitting complex signals with high-frequency components. Here, a parallel frequency function-deep neural network (PFF-DNN) is proposed to suppress computational overhead while ensuring fitting accuracy by utilizing fast Fourier analysis of broadband signals and the spectral bias nature of neural networks. The effectiveness and efficiency of the proposed PFF-DNN method are verified based on detailed numerical experiments for six typical broadband signals. △ Less

Submitted 18 June, 2021; originally announced June 2021.

arXiv:2105.12951 [pdf, other]

VeniBot: Towards Autonomous Venipuncture with Automatic Puncture Area and Angle Regression from NIR Images

Authors: Xu Cao, Zijie Chen, Bolin Lai, Yuxuan Wang, Yu Chen, Zhengqing Cao, Zhilin Yang, Nanyang Ye, Junbo Zhao, Xiao-Yun Zhou, Peng Qi

Abstract: Venipucture is a common step in clinical scenarios, and is with highly practical value to be automated with robotics. Nowadays, only a few on-shelf robotic systems are developed, however, they can not fulfill practical usage due to varied reasons. In this paper, we develop a compact venipucture robot -- VeniBot, with four parts, six motors and two imaging devices. For the automation, we focus on t… ▽ More Venipucture is a common step in clinical scenarios, and is with highly practical value to be automated with robotics. Nowadays, only a few on-shelf robotic systems are developed, however, they can not fulfill practical usage due to varied reasons. In this paper, we develop a compact venipucture robot -- VeniBot, with four parts, six motors and two imaging devices. For the automation, we focus on the positioning part and propose a Dual-In-Dual-Out network based on two-step learning and two-task learning, which can achieve fully automatic regression of the suitable puncture area and angle from near-infrared(NIR) images. The regressed suitable puncture area and angle can further navigate the positioning part of VeniBot, which is an important step towards a fully autonomous venipucture robot. Validation on 30 VeniBot-collected volunteers shows a high mean dice coefficient(DSC) of 0.7634 and a low angle error of 15.58° on suitable puncture area and angle regression respectively, indicating its potentially wide and practical application in the future. △ Less

Submitted 27 May, 2021; originally announced May 2021.

arXiv:2105.12945 [pdf, other]

VeniBot: Towards Autonomous Venipuncture with Semi-supervised Vein Segmentation from Ultrasound Images

Authors: Yu Chen, Yuxuan Wang, Bolin Lai, Zijie Chen, Xu Cao, Nanyang Ye, Zhongyuan Ren, Junbo Zhao, Xiao-Yun Zhou, Peng Qi

Abstract: In the modern medical care, venipuncture is an indispensable procedure for both diagnosis and treatment. In this paper, unlike existing solutions that fully or partially rely on professional assistance, we propose VeniBot -- a compact robotic system solution integrating both novel hardware and software developments. For the hardware, we design a set of units to facilitate the supporting, positioni… ▽ More In the modern medical care, venipuncture is an indispensable procedure for both diagnosis and treatment. In this paper, unlike existing solutions that fully or partially rely on professional assistance, we propose VeniBot -- a compact robotic system solution integrating both novel hardware and software developments. For the hardware, we design a set of units to facilitate the supporting, positioning, puncturing and imaging functionalities. For the software, to move towards a full automation, we propose a novel deep learning framework -- semi-ResNeXt-Unet for semi-supervised vein segmentation from ultrasound images. From which, the depth information of vein is calculated and used to enable automated navigation for the puncturing unit. VeniBot is validated on 40 volunteers, where ultrasound images can be collected successfully. For the vein segmentation validation, the proposed semi-ResNeXt-Unet improves the dice similarity coefficient (DSC) by 5.36%, decreases the centroid error by 1.38 pixels and decreases the failure rate by 5.60%, compared to fully-supervised ResNeXt-Unet. △ Less

Submitted 27 May, 2021; originally announced May 2021.

arXiv:2011.11337 [pdf, other]

DemodNet: Learning Soft Demodulation from Hard Information Using Convolutional Neural Network

Authors: Shilian Zheng, Xiaoyu Zhou, Shichuan Chen, Peihan Qi, Xiaoniu Yang

Abstract: Soft demodulation is a basic module of traditional communication receivers. It converts received symbols into soft bits, that is, log likelihood ratios (LLRs). However, in the nonideal additive white Gaussian noise (AWGN) channel, it is difficult to accurately calculate the LLR. In this letter, we propose a demodulator, DemodNet, based on a fully convolutional neural network with variable input an… ▽ More Soft demodulation is a basic module of traditional communication receivers. It converts received symbols into soft bits, that is, log likelihood ratios (LLRs). However, in the nonideal additive white Gaussian noise (AWGN) channel, it is difficult to accurately calculate the LLR. In this letter, we propose a demodulator, DemodNet, based on a fully convolutional neural network with variable input and output length. We use hard bit information to train the DemodNet, and we propose log probability ratio (LPR) based on the output layer of the trained DemodNet to realize soft demodulation. The simulation results show that under the AWGN channel, the performance of both hard demodulation and soft demodulation of DemodNet is very close to the traditional methods. In three non-ideal channel scenarios, i.e., the presence of frequency deviation, additive generalized Gaussian noise (AGGN) channel, and Rayleigh fading channel, the performance of channel decoding using the soft information LPR obtained by DemodNet is better than the performance of decoding using the exact LLR calculated under the ideal AWGN assumption. △ Less

Submitted 23 November, 2020; originally announced November 2020.

Comments: 5 pages, 6 figures

arXiv:1909.06020 [pdf]

Spectrum Sensing Based on Deep Learning Classification for Cognitive Radios

Authors: Shilian Zheng, Shichuan Chen, Peihan Qi, Huaji Zhou, Xiaoniu Yang

Abstract: Spectrum sensing is a key technology for cognitive radios. We present spectrum sensing as a classification problem and propose a sensing method based on deep learning classification. We normalize the received signal power to overcome the effects of noise power uncertainty. We train the model with as many types of signals as possible as well as noise data to enable the trained network model to adap… ▽ More Spectrum sensing is a key technology for cognitive radios. We present spectrum sensing as a classification problem and propose a sensing method based on deep learning classification. We normalize the received signal power to overcome the effects of noise power uncertainty. We train the model with as many types of signals as possible as well as noise data to enable the trained network model to adapt to untrained new signals. We also use transfer learning strategies to improve the performance for real-world signals. Extensive experiments are conducted to evaluate the performance of this method. The simulation results show that the proposed method performs better than two traditional spectrum sensing methods, i.e., maximum-minimum eigenvalue ratio-based method and frequency domain entropy-based method. In addition, the experimental results of the new untrained signal types show that our method can adapt to the detection of these new signals. Furthermore, the real-world signal detection experiment results show that the detection performance can be further improved by transfer learning. Finally, experiments under colored noise show that our proposed method has superior detection performance under colored noise, while the traditional methods have a significant performance degradation, which further validate the superiority of our method. △ Less

Submitted 12 September, 2019; originally announced September 2019.

Comments: Submitted to China Communications

Showing 1–12 of 12 results for author: Qi, P