-
Towards user-centered interactive medical image segmentation in VR with an assistive AI agent
Authors:
Pascal Spiegler,
Arash Harirpoush,
Yiming Xiao
Abstract:
Crucial in disease analysis and surgical planning, manual segmentation of volumetric medical scans (e.g., MRI, CT) is laborious, error-prone, and challenging to master, while fully automatic algorithms can benefit from user feedback. Therefore, with the complementary power of the latest radiological AI foundation models and virtual reality (VR)'s intuitive data interaction, we propose SAMIRA, a novel conversational AI agent that assists users with localizing, segmenting, and visualizing 3D medical concepts in VR. Through speech-based interaction, the agent helps users understand radiological features, locate clinical targets, and generate segmentation masks that can be refined with just a few point prompts. The system also supports true-to-scale 3D visualization of segmented pathology to enhance patient-specific anatomical understanding. Furthermore, to determine the optimal interaction paradigm under near-far attention-switching for refining segmentation masks in an immersive, human-in-the-loop workflow, we compare VR controller pointing, head pointing, and eye tracking as input modes. In a user study, evaluations demonstrated a high usability score (SUS=90.0 $\pm$ 9.0), low overall task load, and strong support for the proposed VR system's guidance, training potential, and integration of AI in radiological segmentation tasks.
Submitted 15 May, 2025; v1 submitted 11 May, 2025;
originally announced May 2025.
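For readers unfamiliar with the usability score reported in the abstract above, the System Usability Scale (SUS) is conventionally scored from ten 1-to-5 Likert responses per participant. The sketch below shows that standard scoring rule only; the function name `sus_score` and the example responses are illustrative assumptions and are not taken from the paper or its data.

```python
def sus_score(responses):
    """Standard SUS scoring for one participant.

    responses: ten integers in 1..5, in questionnaire order.
    Odd-numbered items (1st, 3rd, ...) contribute (response - 1);
    even-numbered items contribute (5 - response).
    The sum of contributions is scaled by 2.5 to a 0-100 range.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")
    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:   # items 1, 3, 5, 7, 9
            total += response - 1
        else:                # items 2, 4, 6, 8, 10
            total += 5 - response
    return total * 2.5


# Hypothetical participant giving the most favorable answer to every item.
best_case = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])
# Hypothetical participant answering neutrally throughout.
neutral = sus_score([3] * 10)
```

A study-level figure such as the one quoted in the abstract would then be the mean and standard deviation of these per-participant scores.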
-
CABLD: Contrast-Agnostic Brain Landmark Detection with Consistency-Based Regularization
Authors:
Soorena Salari,
Arash Harirpoush,
Hassan Rivaz,
Yiming Xiao
Abstract:
Anatomical landmark detection in medical images is essential for various clinical and research applications, including disease diagnosis and surgical planning. However, manual landmark annotation is time-consuming and requires significant expertise. Existing deep learning (DL) methods often require large amounts of well-annotated data, which are costly to acquire. In this paper, we introduce CABLD, a novel self-supervised DL framework for 3D brain landmark detection in unlabeled scans with varying contrasts by using only a single reference example. To achieve this, we employ an inter-subject landmark consistency loss together with an image registration loss, and introduce a 3D convolution-based contrast augmentation strategy to promote model generalization to new contrasts. Additionally, we utilize an adaptive mixed loss function to schedule the contributions of different sub-tasks for optimal outcomes. We demonstrate the proposed method with the intricate task of MRI-based 3D brain landmark detection. With comprehensive experiments on four diverse clinical and public datasets, including both T1w and T2w MRI scans at different MRI field strengths, we demonstrate that CABLD outperforms the state-of-the-art methods in terms of mean radial errors (MREs) and success detection rates (SDRs). Our framework provides a robust and accurate solution for anatomical landmark detection, reducing the need for extensively annotated datasets and generalizing well across different imaging contrasts. Our code will be publicly available at: https://github.com/HealthX-Lab/CABLD.
Submitted 21 March, 2025; v1 submitted 26 November, 2024;
originally announced November 2024.
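The two evaluation metrics named in the abstract above, mean radial error (MRE) and success detection rate (SDR), can be illustrated with a minimal sketch. It assumes the common definitions: MRE is the average Euclidean distance between predicted and reference landmarks, and SDR is the fraction of landmarks whose error falls within a chosen threshold. The function names, the coordinates, and the 2 mm threshold are hypothetical and do not come from the CABLD paper or repository.

```python
import math


def radial_errors(predicted, reference):
    """Euclidean distance per landmark pair (same units as the inputs)."""
    if len(predicted) != len(reference):
        raise ValueError("landmark lists must have equal length")
    return [math.dist(p, r) for p, r in zip(predicted, reference)]


def mean_radial_error(predicted, reference):
    """Average of the per-landmark radial errors."""
    errors = radial_errors(predicted, reference)
    return sum(errors) / len(errors)


def success_detection_rate(predicted, reference, threshold):
    """Fraction of landmarks with radial error at or below the threshold."""
    errors = radial_errors(predicted, reference)
    return sum(1 for e in errors if e <= threshold) / len(errors)


# Hypothetical 3D landmark coordinates in millimetres.
predicted = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (10.0, 10.0, 10.0)]
reference = [(0.0, 0.0, 1.0), (0.0, 0.0, 0.0), (10.0, 10.0, 10.0)]

mre = mean_radial_error(predicted, reference)            # errors are 1, 5, 0
sdr = success_detection_rate(predicted, reference, 2.0)  # 2 of 3 within 2 mm
```

Lower MRE and higher SDR both indicate better landmark localization; SDR is usually reported at several thresholds.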
-
Virtual Reality-Based Preoperative Planning for Optimized Trocar Placement in Thoracic Surgery: A Preliminary Study
Authors:
Arash Harirpoush,
George Rakovich,
Marta Kersten-Oertel,
Yiming Xiao
Abstract:
Video-assisted thoracic surgery (VATS) is a minimally invasive approach for treating early-stage non-small-cell lung cancer. Optimal trocar placement during VATS ensures comprehensive access to the thoracic cavity, provides a panoramic endoscopic view, and prevents instrument crowding. While established principles such as the Baseball Diamond Principle (BDP) and Triangle Target Principle (TTP) exist, surgeons mainly rely on experience and patient-specific anatomy for trocar placement, potentially leading to sub-optimal surgical plans that increase operative time and fatigue. To address this, we present the first virtual reality (VR)-based pre-operative planning tool with tailored data visualization and interaction designs for efficient and optimal VATS trocar placement, following the established surgical principles and consultation with an experienced surgeon. In our preliminary study, we demonstrate the system's application in right upper lung lobectomy, a common thoracic procedure typically using three trocars. A preliminary user study of our system indicates that it is efficient, robust, and user-friendly for planning optimal trocar placement, showing great promise for clinical application while offering potentially valuable insights for the development of other surgical VR systems.
Submitted 6 September, 2024;
originally announced September 2024.
-
Architecture Analysis and Benchmarking of 3D U-shaped Deep Learning Models for Thoracic Anatomical Segmentation
Authors:
Arash Harirpoush,
Amirhossein Rasoulian,
Marta Kersten-Oertel,
Yiming Xiao
Abstract:
Rising interest in patient-specific thoracic surgical planning and simulation requires efficient and robust creation of digital anatomical models from automatic medical image segmentation algorithms. Deep learning (DL) is now state-of-the-art in various radiological tasks, and U-shaped DL models have particularly excelled in medical image segmentation since the inception of the 2D UNet. To date, many variants of U-shaped models have been proposed by integrating different attention mechanisms and network configurations. Systematic benchmark studies that analyze the architecture of these models by leveraging recently developed multi-label databases can provide valuable insights for clinical deployment and future model designs, but such studies are still rare. We conduct the first systematic benchmark study for variants of 3D U-shaped models (3DUNet, STUNet, AttentionUNet, SwinUNETR, FocalSegNet, and a novel 3D SwinUnet with four variants) with a focus on CT-based anatomical segmentation for thoracic surgery. Our study systematically examines the impact of different attention mechanisms, the number of resolution stages, and network configurations on segmentation accuracy and computational complexity. To allow cross-reference with other recent benchmarking studies, we also include a performance assessment on BTCV abdominal structural segmentation. With the STUNet ranking at the top, our study demonstrates the value of CNN-based U-shaped models for the investigated tasks and the benefit of residual blocks in network configuration designs to boost segmentation performance.
Submitted 14 March, 2024; v1 submitted 5 February, 2024;
originally announced February 2024.
-
Weakly supervised segmentation of intracranial aneurysms using a novel 3D focal modulation UNet
Authors:
Amirhossein Rasoulian,
Arash Harirpoush,
Soorena Salari,
Yiming Xiao
Abstract:
Accurate identification and quantification of unruptured intracranial aneurysms (UIAs) are crucial for the risk assessment and treatment of this cerebrovascular disorder. Current 2D manual assessment on 3D magnetic resonance angiography (MRA) is suboptimal and time-consuming. In addition, one major issue in medical image segmentation is the need for large, well-annotated datasets, which can be expensive to obtain. Techniques that mitigate this requirement, such as weakly supervised learning with coarse labels, are highly desirable. In this paper, we propose FocalSegNet, a novel 3D focal modulation UNet, to detect an aneurysm and offer an initial, coarse segmentation of it from time-of-flight MRA image patches, which is further refined with a dense conditional random field (CRF) post-processing layer to produce a final segmentation map. We trained and evaluated our model on a public dataset, and in terms of UIA detection, our model showed a low false-positive rate of 0.21 and a high sensitivity of 0.80. For voxel-wise aneurysm segmentation, we achieved a Dice score of 0.68 and a 95% Hausdorff distance of ~0.95 mm, demonstrating its strong performance. We evaluated our algorithms against the state-of-the-art 3D Residual-UNet and Swin-UNETR, and illustrated the superior performance of our proposed FocalSegNet, highlighting the advantages of employing focal modulation for this task.
Submitted 20 March, 2024; v1 submitted 5 August, 2023;
originally announced August 2023.
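The Dice score quoted in the abstract above measures voxel-wise overlap between a predicted mask and a reference mask. The following is a minimal sketch of the standard definition, 2|A∩B| / (|A| + |B|), on flattened binary masks. The function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the FocalSegNet paper.

```python
def dice_score(predicted, reference):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    predicted = list(predicted)
    reference = list(reference)
    if len(predicted) != len(reference):
        raise ValueError("masks must contain the same number of voxels")
    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, r in zip(predicted, reference) if p and r)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical flattened masks of five voxels each.
predicted_mask = [1, 1, 0, 0, 1]
reference_mask = [1, 0, 0, 1, 1]
overlap = dice_score(predicted_mask, reference_mask)  # 2 * 2 / (3 + 3)
```

Dice ranges from 0 (no overlap) to 1 (identical masks), so it is typically reported alongside a boundary metric such as the 95% Hausdorff distance, which is not sketched here.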