Showing 1–12 of 12 results for author: Tong, R

Searching in archive eess.
  1. arXiv:2210.14645  [pdf, other]

    eess.IV cs.CV

    Super-Resolution Based Patch-Free 3D Image Segmentation with High-Frequency Guidance

    Authors: Hongyi Wang, Lanfen Lin, Hongjie Hu, Qingqing Chen, Yinhao Li, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong

    Abstract: High resolution (HR) 3D images are widely used nowadays, such as medical images like Magnetic Resonance Imaging (MRI) and Computed Tomography (CT). However, segmentation of these 3D images remains a challenge due to their high spatial resolution and dimensionality in contrast to currently limited GPU memory. Therefore, most existing 3D image segmentation methods use patch-based models, which have…

    Submitted 10 July, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Version #2 uploaded in Jul 10, 2023

  2. Pronunciation Modeling of Foreign Words for Mandarin ASR by Considering the Effect of Language Transfer

    Authors: Lei Wang, Rong Tong

    Abstract: One of the challenges in automatic speech recognition is foreign words recognition. It is observed that a speaker's pronunciation of a foreign word is influenced by his native language knowledge, and such phenomenon is known as the effect of language transfer. This paper focuses on examining the phonetic effect of language transfer in automatic speech recognition. A set of lexical rules is propose…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: Published by INTERSPEECH 2014

    ACM Class: I.2.7

  3. Cloud-based Automatic Speech Recognition Systems for Southeast Asian Languages

    Authors: Lei Wang, Rong Tong, Cheung Chi Leung, Sunil Sivadas, Chongjia Ni, Bin Ma

    Abstract: This paper provides an overall introduction of our Automatic Speech Recognition (ASR) systems for Southeast Asian languages. As not much existing work has been carried out on such regional languages, a few difficulties should be addressed before building the systems: limitation on speech and text resources, lack of linguistic knowledge, etc. This work takes Bahasa Indonesia and Thai as examples to…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: Published by the 2017 IEEE International Conference on Orange Technologies (ICOT 2017)

    ACM Class: I.2.7

  4. arXiv:2203.03951  [pdf]

    eess.IV cs.CV

    Efficient and Accurate Hyperspectral Pansharpening Using 3D VolumeNet and 2.5D Texture Transfer

    Authors: Yinao Li, Yutaro Iwamoto, Ryousuke Nakamura, Lanfen Lin, Ruofeng Tong, Yen-Wei Chen

    Abstract: Recently, convolutional neural networks (CNN) have obtained promising results in single-image SR for hyperspectral pansharpening. However, enhancing CNNs' representation ability with fewer parameters and a shorter prediction time is a challenging and critical task. In this paper, we propose a novel multi-spectral image fusion method using a combination of the previously proposed 3D CNN model Volum…

    Submitted 8 March, 2022; originally announced March 2022.

  5. arXiv:2111.04734  [pdf, other]

    eess.IV cs.AI cs.CV

    Mixed Transformer U-Net For Medical Image Segmentation

    Authors: Hongyi Wang, Shiao Xie, Lanfen Lin, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong

    Abstract: Though U-Net has achieved tremendous success in medical image segmentation tasks, it lacks the ability to explicitly model long-range dependencies. Therefore, Vision Transformers have emerged as alternative segmentation structures recently, for their innate ability of capturing long-range correlations through Self-Attention (SA). However, Transformers usually rely on large-scale pre-training and h…

    Submitted 11 November, 2021; v1 submitted 8 November, 2021; originally announced November 2021.

  6. arXiv:2109.13930  [pdf, other]

    eess.IV cs.CV

    All-Around Real Label Supervision: Cyclic Prototype Consistency Learning for Semi-supervised Medical Image Segmentation

    Authors: Zhe Xu, Yixin Wang, Donghuan Lu, Lequan Yu, Jiangpeng Yan, Jie Luo, Kai Ma, Yefeng Zheng, Raymond Kai-yu Tong

    Abstract: Semi-supervised learning has substantially advanced medical image segmentation since it alleviates the heavy burden of acquiring the costly expert-examined annotations. Especially, the consistency-based approaches have attracted more attention for their superior performance, wherein the real labels are only utilized to supervise their paired images via supervised loss while the unlabeled images ar…

    Submitted 15 March, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

    Comments: 11 pages

  7. arXiv:2108.00911  [pdf, ps, other]

    eess.IV cs.CV

    Multi-phase Liver Tumor Segmentation with Spatial Aggregation and Uncertain Region Inpainting

    Authors: Yue Zhang, Chengtao Peng, Liying Peng, Huimin Huang, Ruofeng Tong, Lanfen Lin, Jingsong Li, Yen-Wei Chen, Qingqing Chen, Hongjie Hu, Zhiyi Peng

    Abstract: Multi-phase computed tomography (CT) images provide crucial complementary information for accurate liver tumor segmentation (LiTS). State-of-the-art multi-phase LiTS methods usually fused cross-phase features through phase-weighted summation or channel-attention based concatenation. However, these methods ignored the spatial (pixel-wise) relationships between different phases, hence leading to ins…

    Submitted 5 August, 2021; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: To appear in MICCAI 2021

  8. arXiv:2107.02433  [pdf, other]

    cs.CV eess.IV

    Double-Uncertainty Guided Spatial and Temporal Consistency Regularization Weighting for Learning-based Abdominal Registration

    Authors: Zhe Xu, Jie Luo, Donghuan Lu, Jiangpeng Yan, Sarah Frisken, Jayender Jagadeesan, William Wells III, Xiu Li, Yefeng Zheng, Raymond Tong

    Abstract: In order to tackle the difficulty associated with the ill-posed nature of the image registration problem, regularization is often used to constrain the solution space. For most learning-based registration approaches, the regularization usually has a fixed weight and only constrains the spatial transformation. Such convention has two limitations: (i) Besides the laborious grid search for the optima…

    Submitted 2 March, 2022; v1 submitted 6 July, 2021; originally announced July 2021.

    Comments: 11 pages

  9. arXiv:2103.04235  [pdf]

    eess.IV cs.CV

    Graph-based Pyramid Global Context Reasoning with a Saliency-aware Projection for COVID-19 Lung Infections Segmentation

    Authors: Huimin Huang, Ming Cai, Lanfen Lin, Jing Zheng, Xiongwei Mao, Xiaohan Qian, Zhiyi Peng, Jianying Zhou, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong

    Abstract: Coronavirus Disease 2019 (COVID-19) has rapidly spread in 2020, emerging a mass of studies for lung infection segmentation from CT images. Though many methods have been proposed for this issue, it is a challenging task because of infections of various size appearing in different lobe zones. To tackle these issues, we propose a Graph-based Pyramid Global Context Reasoning (Graph-PGCR) module, which…

    Submitted 6 March, 2021; originally announced March 2021.

  10. arXiv:2103.00274  [pdf]

    eess.IV cs.CV

    PA-ResSeg: A Phase Attention Residual Network for Liver Tumor Segmentation from Multi-phase CT Images

    Authors: Yingying Xu, Ming Cai, Lanfen Lin, Yue Zhang, Hongjie Hu, Zhiyi Peng, Qiaowei Zhang, Qingqing Chen, Xiongwei Mao, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong

    Abstract: In this paper, we propose a phase attention residual network (PA-ResSeg) to model multi-phase features for accurate liver tumor segmentation, in which a phase attention (PA) is newly proposed to additionally exploit the images of arterial (ART) phase to facilitate the segmentation of portal venous (PV) phase. The PA block consists of an intra-phase attention (Intra-PA) module and an inter-phase at…

    Submitted 27 February, 2021; originally announced March 2021.

    Comments: A self-archive version to be published in Medical Physics, awaiting minor revision

  11. arXiv:2010.11657  [pdf, other]

    cs.SD cs.CL eess.AS

    The HUAWEI Speaker Diarisation System for the VoxCeleb Speaker Diarisation Challenge

    Authors: Renyu Wang, Ruilin Tong, Yu Ting Yeung, Xiao Chen

    Abstract: This paper describes system setup of our submission to speaker diarisation track (Track 4) of VoxCeleb Speaker Recognition Challenge 2020. Our diarisation system consists of a well-trained neural network based speech enhancement model as pre-processing front-end of input speech signals. We replace conventional energy-based voice activity detection (VAD) with a neural network based VAD. The neural…

    Submitted 23 October, 2020; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: 5 pages, 2 figures, A report about our diarisation system for VoxCeleb Challenge, Interspeech conference workshop

  12. arXiv:2004.08790  [pdf]

    eess.IV cs.CV cs.LG

    UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation

    Authors: Huimin Huang, Lanfen Lin, Ruofeng Tong, Hongjie Hu, Qiaowei Zhang, Yutaro Iwamoto, Xianhua Han, Yen-Wei Chen, Jian Wu

    Abstract: Recently, a growing interest has been seen in deep learning-based semantic segmentation. UNet, which is one of deep learning networks with an encoder-decoder architecture, is widely used in medical image segmentation. Combining multi-scale features is one of important factors for accurate segmentation. UNet++ was developed as a modified Unet by designing an architecture with nested and dense skip…

    Submitted 19 April, 2020; originally announced April 2020.