Showing 1–16 of 16 results for author: Xiong, S

Searching in archive eess.
  1. RepSNet: A Nucleus Instance Segmentation model based on Boundary Regression and Structural Re-parameterization

    Authors: Shengchun Xiong, Xiangru Li, Yunpeng Zhong, Wanfen Peng

    Abstract: Pathological diagnosis is the gold standard for tumor diagnosis, and nucleus instance segmentation is a key step in digital pathology analysis and pathological diagnosis. However, the computational efficiency of the model and the treatment of overlapping targets are the major challenges in the studies of this problem. To this end, a neural network model RepSNet was designed based on a nucleus boun…

    Submitted 8 May, 2025; originally announced May 2025.

    Comments: 25 pages, 7 figures, 5 tables

    Journal ref: Int J Comput Vis (2025)

  2. arXiv:2504.08365  [pdf, other]

    cs.SD eess.AS

    Location-Oriented Sound Event Localization and Detection with Spatial Mapping and Regression Localization

    Authors: Xueping Zhang, Yaxiong Chen, Ruilin Yao, Yunfei Zi, Shengwu Xiong

    Abstract: Sound Event Localization and Detection (SELD) combines the Sound Event Detection (SED) with the corresponding Direction Of Arrival (DOA). Recently, adopted event oriented multi-track methods affect the generality in polyphonic environments due to the limitation of the number of tracks. To enhance the generality in polyphonic environments, we propose Spatial Mapping and Regression Localization for…

    Submitted 22 April, 2025; v1 submitted 11 April, 2025; originally announced April 2025.

  3. arXiv:2503.19649  [pdf, other]

    eess.SP cs.AI

    Recover from Horcrux: A Spectrogram Augmentation Method for Cardiac Feature Monitoring from Radar Signal Components

    Authors: Yuanyuan Zhang, Sijie Xiong, Rui Yang, EngGee Lim, Yutao Yue

    Abstract: Radar-based wellness monitoring is becoming an effective measurement to provide accurate vital signs in a contactless manner, but data scarcity retards the related research on deep-learning-based methods. Data augmentation is commonly used to enrich the dataset by modifying the existing data, but most augmentation techniques can only couple with classification tasks. To enable the augmentation for…

    Submitted 25 March, 2025; originally announced March 2025.

  4. arXiv:2503.13987  [pdf, other]

    eess.IV cs.CV

    Striving for Simplicity: Simple Yet Effective Prior-Aware Pseudo-Labeling for Semi-Supervised Ultrasound Image Segmentation

    Authors: Yaxiong Chen, Yujie Wang, Zixuan Zheng, Jingliang Hu, Yilei Shi, Shengwu Xiong, Xiao Xiang Zhu, Lichao Mou

    Abstract: Medical ultrasound imaging is ubiquitous, but manual analysis struggles to keep pace. Automated segmentation can help but requires large labeled datasets, which are scarce. Semi-supervised learning leveraging both unlabeled and limited labeled data is a promising approach. State-of-the-art methods use consistency regularization or pseudo-labeling but grow increasingly complex. Without sufficient l…

    Submitted 18 March, 2025; originally announced March 2025.

    Comments: MICCAI 2024

  5. arXiv:2501.15206  [pdf, other]

    physics.app-ph cond-mat.dis-nn eess.SY

    Engineering-Oriented Design of Drift-Resilient MTJ Random Number Generator via Hybrid Control Strategies

    Authors: Ran Zhang, Caihua Wan, Yingqian Xu, Xiaohan Li, Raik Hoffmann, Meike Hindenberg, Shiqiang Liu, Dehao Kong, Shilong Xiong, Shikun He, Alptekin Vardar, Qiang Dai, Junlu Gong, Yihui Sun, Zejie Zheng, Thomas Kämpfe, Guoqiang Yu, Xiufeng Han

    Abstract: Magnetic Tunnel Junctions (MTJs) have shown great promise as hardware sources for true random number generation (TRNG) due to their intrinsic stochastic switching behavior. However, practical deployment remains challenged by drift in switching probability caused by thermal fluctuations, device aging, and environmental instability. This work presents an engineering-oriented, drift-resilient MTJ-bas…

    Submitted 19 April, 2025; v1 submitted 25 January, 2025; originally announced January 2025.

    Comments: 16 pages, 9 figures, data shared at https://doi.org/10.6084/m9.figshare.28680899.v1

  6. arXiv:2409.17603  [pdf, other]

    cs.CL cs.SD eess.AS

    Deep CLAS: Deep Contextual Listen, Attend and Spell

    Authors: Mengzhi Wang, Shifu Xiong, Genshun Wan, Hang Chen, Jianqing Gao, Lirong Dai

    Abstract: Contextual-LAS (CLAS) has been shown effective in improving Automatic Speech Recognition (ASR) of rare words. It relies on phrase-level contextual modeling and attention-based relevance scoring without explicit contextual constraint which lead to insufficient use of contextual information. In this work, we propose deep CLAS to use contextual information better. We introduce bias loss forcing model…

    Submitted 19 December, 2024; v1 submitted 26 September, 2024; originally announced September 2024.

    Comments: Submitted to JUSTC

  7. arXiv:2409.15168  [pdf, ps, other]

    cs.SD eess.AS

    Adaptive Learning via a Negative Selection Strategy for Few-Shot Bioacoustic Event Detection

    Authors: Yaxiong Chen, Xueping Zhang, Yunfei Zi, Shengwu Xiong

    Abstract: Although the Prototypical Network (ProtoNet) has demonstrated effectiveness in few-shot biological event detection, two persistent issues remain. Firstly, there is difficulty in constructing a representative negative prototype due to the absence of explicitly annotated negative samples. Secondly, the durations of the target biological vocalisations vary across tasks, making it challenging for the…

    Submitted 23 September, 2024; originally announced September 2024.

  8. arXiv:2409.13286  [pdf, ps, other]

    cs.IT eess.SP

    Generative Learning Powered Probing Beam Optimization for Cell-Free Hybrid Beamforming

    Authors: Cheng Zhang, Shuangbo Xiong, Mengqing He, Lan Wei, Yongming Huang, Wei Zhang

    Abstract: Probing beam measurement (PBM)-based hybrid beamforming provides a feasible solution for cell-free MIMO. In this letter, we propose a novel probing beam optimization framework where three collaborative modules respectively realize PBM augmentation, sum-rate prediction and probing beam optimization. Specifically, the PBM augmentation model integrates the conditional variational auto-encoder (CVAE)…

    Submitted 20 September, 2024; originally announced September 2024.

  9. arXiv:2401.11675  [pdf, other]

    eess.IV

    Rethinking Cross-Attention for Infrared and Visible Image Fusion

    Authors: Lihua Jian, Songlei Xiong, Han Yan, Xiaoguang Niu, Shaowu Wu, Di Zhang

    Abstract: The salient information of an infrared image and the abundant texture of a visible image can be fused to obtain a comprehensive image. As can be known, the current fusion methods based on Transformer techniques for infrared and visible (IV) images have exhibited promising performance. However, the attention mechanism of the previous Transformer-based methods was prone to extract common information…

    Submitted 21 January, 2024; originally announced January 2024.

  10. VisionFM: a Multi-Modal Multi-Task Vision Foundation Model for Generalist Ophthalmic Artificial Intelligence

    Authors: Jianing Qiu, Jian Wu, Hao Wei, Peilun Shi, Minqing Zhang, Yunyun Sun, Lin Li, Hanruo Liu, Hongyi Liu, Simeng Hou, Yuyang Zhao, Xuehui Shi, Junfang Xian, Xiaoxia Qu, Sirui Zhu, Lijie Pan, Xiaoniao Chen, Xiaojia Zhang, Shuai Jiang, Kebing Wang, Chenlong Yang, Mingqiang Chen, Sujie Fan, Jianhua Hu, Aiguo Lv, et al. (17 additional authors not shown)

    Abstract: We present VisionFM, a foundation model pre-trained with 3.4 million ophthalmic images from 560,457 individuals, covering a broad range of ophthalmic diseases, modalities, imaging devices, and demography. After pre-training, VisionFM provides a foundation to foster multiple ophthalmic artificial intelligence (AI) applications, such as disease screening and diagnosis, disease prognosis, subclassifi…

    Submitted 7 October, 2023; originally announced October 2023.

    Journal ref: The latest VisionFM work has been published in NEJM AI, 2024

  11. arXiv:2202.08509  [pdf, other]

    cs.SD cs.AI cs.CV cs.LG eess.AS

    A Study of Designing Compact Audio-Visual Wake Word Spotting System Based on Iterative Fine-Tuning in Neural Network Pruning

    Authors: Hengshun Zhou, Jun Du, Chao-Han Huck Yang, Shifu Xiong, Chin-Hui Lee

    Abstract: Audio-only-based wake word spotting (WWS) is challenging under noisy conditions due to environmental interference in signal transmission. In this paper, we investigate on designing a compact audio-visual WWS system by utilizing visual information to alleviate the degradation. Specifically, in order to use visual information, we first encode the detected lips to fixed-size vectors with MobileNet an…

    Submitted 17 February, 2022; originally announced February 2022.

    Comments: Accepted to ICASSP 2022. H. Zhou et al

  12. arXiv:2111.14992  [pdf, other]

    eess.SP cs.CR

    Network Traffic Shaping for Enhancing Privacy in IoT Systems

    Authors: Sijie Xiong, Anand D. Sarwate, Narayan B. Mandayam

    Abstract: Motivated by privacy issues caused by inference attacks on user activities in the packet sizes and timing information of Internet of Things (IoT) network traffic, we establish a rigorous event-level differential privacy (DP) model on infinite packet streams. We propose a memoryless traffic shaping mechanism satisfying a first-come-first-served queuing discipline that outputs traffic dependent on t…

    Submitted 29 November, 2021; originally announced November 2021.

    Comments: 18 pages, 10 figures, submitted to IEEE Transactions on Networking

  13. arXiv:2107.02412  [pdf, ps, other]

    cs.IT eess.SP

    GBLinks: GNN-Based Beam Selection and Link Activation for Ultra-dense D2D mmWave Networks

    Authors: S. He, S. Xiong, W. Zhang, Y. Yang, J. Ren, Y. Huang

    Abstract: In this paper, we consider the problem of joint beam selection and link activation across a set of communication pairs to effectively control the interference between communication pairs via inactivating part communication pairs in ultra-dense device-to-device (D2D) mmWave communication networks. The resulting optimization problem is formulated as an integer programming problem that is nonconvex a…

    Submitted 29 December, 2021; v1 submitted 6 July, 2021; originally announced July 2021.

    Comments: 31 pages, 9 figures, submitted to IEEE Trans. on Commun., July 2021, major revised in Dec. 2021

  14. arXiv:2008.08523  [pdf]

    cs.CV cs.LG eess.IV

    Scene Text Detection with Selected Anchor

    Authors: Anna Zhu, Hang Du, Shengwu Xiong

    Abstract: Object proposal technique with dense anchoring scheme for scene text detection were applied frequently to achieve high recall. It results in the significant improvement in accuracy but waste of computational searching, regression and classification. In this paper, we propose an anchor selection-based region proposal network (AS-RPN) using effective selected anchors instead of dense anchors to extr…

    Submitted 19 August, 2020; originally announced August 2020.

    Comments: 8 pages

  15. arXiv:1910.04919  [pdf]

    cs.CV cs.LG eess.IV

    From Species to Cultivar: Soybean Cultivar Recognition using Multiscale Sliding Chord Matching of Leaf Images

    Authors: Bin Wang, Yongsheng Gao, Xiaohan Yu, Xiaohui Yuan, Shengwu Xiong, Xianzhong Feng

    Abstract: Leaf image recognition techniques have been actively researched for plant species identification. However it remains unclear whether leaf patterns can provide sufficient information for cultivar recognition. This paper reports the first attempt on soybean cultivar recognition from plant leaves which is not only a challenging research problem but also important for soybean cultivar evaluation, sele…

    Submitted 10 October, 2019; originally announced October 2019.

    Comments: 33 pages, 8 figures

  16. arXiv:1812.02455  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    The USTC-NEL Speech Translation system at IWSLT 2018

    Authors: Dan Liu, Junhua Liu, Wu Guo, Shifu Xiong, Zhiqiang Ma, Rui Song, Chongliang Wu, Quan Liu

    Abstract: This paper describes the USTC-NEL system to the speech translation task of the IWSLT Evaluation 2018. The system is a conventional pipeline system which contains 3 modules: speech recognition, post-processing and machine translation. We train a group of hybrid-HMM models for our speech recognition, and for machine translation we train transformer based neural machine translation models with speech…

    Submitted 6 December, 2018; originally announced December 2018.

    Comments: 5 pages, 8 tables