-
T-Mamba: A unified framework with Long-Range Dependency in dual-domain for 2D & 3D Tooth Segmentation
Authors:
Jing Hao,
Yonghui Zhu,
Lei He,
Moyun Liu,
James Kit Hon Tsoi,
Kuo Feng Hung
Abstract:
Tooth segmentation is a pivotal step in modern digital dentistry, essential for applications across orthodontic diagnosis and treatment planning. Despite its importance, this process is fraught with challenges due to the high noise and low contrast inherent in 2D and 3D tooth data. Both Convolutional Neural Networks (CNNs) and Transformers have shown promise in medical image segmentation, yet each method has limitations in handling long-range dependencies and computational complexity. To address this issue, this paper introduces T-Mamba, which integrates frequency-based features and shared bi-positional encoding into vision mamba to overcome the limitations in efficient global feature modeling. In addition, we design a gate selection unit that adaptively integrates two features in the spatial domain and one feature in the frequency domain. T-Mamba is the first work to introduce frequency-based features into vision mamba, and its flexibility allows it to process both 2D and 3D tooth data without the need for separate modules. This paper also presents TED3, a large-scale public 2D dental X-ray dataset. Extensive experiments demonstrate that T-Mamba achieves new SOTA results on a public tooth CBCT dataset and outperforms previous SOTA methods on the TED3 dataset. The code and models are publicly available at: https://github.com/isbrycee/T-Mamba.
Submitted 1 August, 2024; v1 submitted 1 April, 2024;
originally announced April 2024.
-
GEM: Boost Simple Network for Glass Surface Segmentation via Segment Anything Model and Data Synthesis
Authors:
Jing Hao,
Moyun Liu,
Kuo Feng Hung
Abstract:
Detecting glass regions is a challenging task due to the ambiguity of their transparency and reflection properties. These transparent glasses share the visual appearance of both transmitted arbitrary background scenes and reflected objects, thus having no fixed patterns. Recent visual foundation models, which are trained on vast amounts of data, have demonstrated stunning performance in image perception and image generation. To segment glass surfaces with higher accuracy, we make full use of two visual foundation models: Segment Anything (SAM) and Stable Diffusion. Specifically, we devise a simple glass surface segmentor named GEM, which consists only of a SAM backbone, a simple feature pyramid, a discerning query selection module, and a mask decoder. The discerning query selection module adaptively identifies glass surface features and assigns them as initialized queries in the mask decoder. We also propose a Synthetic but photorealistic large-scale Glass Surface Detection dataset, dubbed S-GSD, generated via a diffusion model at four different scales: 1x, 5x, 10x, and 20x the size of the original real data. This dataset is a feasible source for transfer learning. A larger scale of synthetic data has a positive impact on transfer learning, although the improvement gradually saturates as the amount of data increases. Extensive experiments demonstrate that GEM achieves a new state-of-the-art on the GSD-S validation set (IoU +2.1%). Codes and datasets are available at: https://github.com/isbrycee/GEM-Glass-Segmentor.
Submitted 26 January, 2024;
originally announced January 2024.
-
GEM: Boost Simple Network for Glass Surface Segmentation via Vision Foundation Models
Authors:
Jing Hao,
Moyun Liu,
Jinrong Yang,
Kuo Feng Hung
Abstract:
Detecting glass regions is a challenging task due to the inherent ambiguity in their transparency and reflective characteristics. Current solutions in this field remain rooted in conventional deep learning paradigms, requiring the construction of annotated datasets and the design of network architectures. However, the evident drawback of these mainstream solutions lies in the time-consuming and labor-intensive process of curating datasets, alongside the increasing complexity of model structures. In this paper, we propose to address these issues by fully harnessing the capabilities of two existing vision foundation models (VFMs): Stable Diffusion and Segment Anything Model (SAM). First, we construct a Synthetic but photorealistic large-scale Glass Surface Detection dataset, dubbed S-GSD, without any labor cost via Stable Diffusion. This dataset comes in four different scales, totaling 168k images with precise masks. In addition, based on the powerful segmentation ability of SAM, we devise a simple Glass surface sEgMentor named GEM, which follows a simple query-based encoder-decoder architecture. Comprehensive experiments are conducted on the large-scale glass segmentation dataset GSD-S. With the help of these two VFMs, our GEM establishes a new state-of-the-art performance, surpassing the best-reported method, GlassSemNet, with an IoU improvement of 2.1%. Additionally, extensive experiments demonstrate that our synthetic dataset S-GSD exhibits remarkable performance in zero-shot and transfer learning settings. Codes, datasets and models are publicly available at: https://github.com/isbrycee/GEM
Submitted 21 May, 2024; v1 submitted 22 July, 2023;
originally announced July 2023.