E2E-Swin-Unet++: An Enhanced End-to-End Swin-Unet Architecture With Dual Decoders For PTMC Segmentation

Dialameh, Maryam; Rajabzadeh, Hossein; Sadeghi-Goughari, Moslem; Sim, Jung Suk; Kwon, Hyock Ju

Electrical Engineering and Systems Science > Image and Video Processing

arXiv:2410.18239 (eess)

[Submitted on 23 Oct 2024 (v1), last revised 8 Jun 2025 (this version, v2)]

Title:E2E-Swin-Unet++: An Enhanced End-to-End Swin-Unet Architecture With Dual Decoders For PTMC Segmentation

Authors:Maryam Dialameh, Hossein Rajabzadeh, Moslem Sadeghi-Goughari, Jung Suk Sim, Hyock Ju Kwon

View PDF HTML (experimental)

Abstract:Precise segmentation of papillary thyroid microcarcinoma (PTMC) during ultrasound-guided radiofrequency ablation (RFA) is critical for effective treatment but remains challenging due to acoustic artifacts, small lesion size, and anatomical variability. In this study, we propose \textbf{DualSwinUnet++}, a dual-decoder transformer-based architecture designed to enhance PTMC segmentation by incorporating thyroid gland context. DualSwinUnet++ employs independent linear projection heads for each decoder and a residual information flow mechanism that passes intermediate features from the first (thyroid) decoder to the second (PTMC) decoder via concatenation and transformation. These design choices allow the model to condition tumor prediction explicitly on gland morphology without shared gradient interference. Trained on a clinical ultrasound dataset with 691 annotated RFA images and evaluated against state-of-the-art models, DualSwinUnet++ achieves superior Dice and Jaccard scores while maintaining sub-200ms inference latency. The results demonstrate the model's suitability for near real-time surgical assistance and its effectiveness in improving segmentation accuracy in challenging PTMC cases.

Subjects:	Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2410.18239 [eess.IV]
	(or arXiv:2410.18239v2 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2410.18239

Submission history

From: Hossein Rajabzadeh [view email]
[v1] Wed, 23 Oct 2024 19:33:33 UTC (2,537 KB)
[v2] Sun, 8 Jun 2025 03:11:31 UTC (3,286 KB)

Electrical Engineering and Systems Science > Image and Video Processing

Title:E2E-Swin-Unet++: An Enhanced End-to-End Swin-Unet Architecture With Dual Decoders For PTMC Segmentation

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Electrical Engineering and Systems Science > Image and Video Processing

Title:E2E-Swin-Unet++: An Enhanced End-to-End Swin-Unet Architecture With Dual Decoders For PTMC Segmentation

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators