-
Attention on the Wires (AttWire): A Foundation Model for Detecting Devices and Catheters in X-ray Fluoroscopic Images
Authors:
YingLiang Ma,
Sandra Howell,
Aldo Rinaldi,
Tarv Dhanjal,
Kawal S. Rhode
Abstract:
Objective: Interventional devices, catheters and insertable imaging devices such as transesophageal echo (TOE) probes are routinely used in minimally invasive cardiovascular procedures. Detecting their positions and orientations in X-ray fluoroscopic images is important for many clinical applications. Method: In this paper, a novel attention mechanism was designed to guide a convolutional neural network (CNN) model to the areas of wires in X-ray images, as nearly all interventional devices and catheters used in cardiovascular procedures contain wires. The attention mechanism includes multi-scale Gaussian derivative filters and a dot-product-based attention layer. By utilizing the proposed attention mechanism, a lightweight foundation model can be created to detect multiple objects simultaneously with higher precision and real-time speed. Results: The proposed model was trained and tested on a total of 12,438 X-ray images. An accuracy of 0.88 was achieved for detecting an echo probe and 0.87 for detecting an artificial valve at 58 FPS. The accuracy was measured by intersection-over-union (IoU). We also achieved a 99.8% success rate in detecting a 10-electrode catheter and a 97.8% success rate in detecting an ablation catheter. Conclusion: Our detection foundation model can simultaneously detect and identify both interventional devices and flexible catheters in real-time X-ray fluoroscopic images. Significance: The proposed model employs a novel attention mechanism to achieve high-performance object detection, making it suitable for various clinical applications and robotic-assisted surgeries. Code is available at https://github.com/YingLiangMa/AttWire.
Submitted 8 March, 2025;
originally announced March 2025.
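The AttWire abstract above reports accuracy as intersection-over-union (IoU). As a minimal sketch of that metric, assuming axis-aligned bounding boxes given as `(x_min, y_min, x_max, y_max)` (the abstract does not state the box representation, and the function name `box_iou` is hypothetical), IoU could be computed as:

```python
def box_iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max). This box format is an
    assumption for illustration, not taken from the AttWire paper.
    """
    # Corners of the overlapping rectangle (if any).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0
```

An IoU of 1.0 means the predicted and reference boxes coincide, and 0.0 means they do not overlap at all.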
-
Catheter Detection and Segmentation in X-ray Images via Multi-task Learning
Authors:
Lin Xi,
Yingliang Ma,
Ethan Koland,
Sandra Howell,
Aldo Rinaldi,
Kawal S. Rhode
Abstract:
Automated detection and segmentation of surgical devices, such as catheters or wires, in X-ray fluoroscopic images have the potential to enhance image guidance in minimally invasive heart surgeries. In this paper, we present a convolutional neural network model that integrates a ResNet architecture with multiple prediction heads to achieve real-time, accurate localization of electrodes on catheters and catheter segmentation in an end-to-end deep learning framework. We also propose a multi-task learning strategy in which our model is trained to perform both accurate electrode detection and catheter segmentation simultaneously. A key challenge with this approach is achieving optimal performance for both tasks. To address this, we introduce a novel multi-level dynamic resource prioritization method. This method dynamically adjusts sample and task weights during training to effectively prioritize more challenging tasks, where task difficulty is inversely proportional to performance and evolves throughout the training process. Experiments on both public and private datasets have demonstrated that the accuracy of our method surpasses existing state-of-the-art methods on both the single segmentation task and the combined detection and segmentation task. Our approach achieves a good trade-off between accuracy and efficiency, making it well-suited for real-time surgical guidance applications.
Submitted 4 March, 2025;
originally announced March 2025.
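The abstract above describes task weights that favour harder tasks, with difficulty inversely proportional to performance. The sketch below illustrates only that general idea and is not the authors' multi-level method: the function name `task_weights`, the use of a single score per task in (0, 1], and the normalization to a sum of 1 are all assumptions made for illustration.

```python
def task_weights(performance, eps=1e-6):
    """Turn per-task performance scores into normalized loss weights.

    `performance` maps a task name to a score where higher is better.
    Difficulty is taken as 1 / score, so weaker tasks get larger weights.
    Hypothetical helper; not the method from the paper.
    """
    # eps avoids division by zero for a task with a score of 0.
    difficulty = {name: 1.0 / max(score, eps) for name, score in performance.items()}
    total = sum(difficulty.values())
    # Normalize so the weights sum to 1.
    return {name: d / total for name, d in difficulty.items()}


# Recomputed at each epoch from the latest validation scores, the weights
# shift toward whichever task currently performs worse.
weights = task_weights({"detection": 0.5, "segmentation": 0.25})
```

With these example scores, segmentation is twice as "difficult" as detection and therefore receives twice the weight.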
-
A generalizable 3D framework and model for self-supervised learning in medical imaging
Authors:
Tony Xu,
Sepehr Hosseini,
Chris Anderson,
Anthony Rinaldi,
Rahul G. Krishnan,
Anne L. Martel,
Maged Goubran
Abstract:
Current self-supervised learning methods for 3D medical imaging rely on simple pretext formulations and organ- or modality-specific datasets, limiting their generalizability and scalability. We present 3DINO, a cutting-edge SSL method adapted to 3D datasets, and use it to pretrain 3DINO-ViT: a general-purpose medical imaging model, on an exceptionally large, multimodal, and multi-organ dataset of ~100,000 3D medical imaging scans from over 10 organs. We validate 3DINO-ViT using extensive experiments on numerous medical imaging segmentation and classification tasks. Our results demonstrate that 3DINO-ViT generalizes across modalities and organs, including out-of-distribution tasks and datasets, outperforming state-of-the-art methods on the majority of evaluation metrics and labeled dataset sizes. Our 3DINO framework and 3DINO-ViT will be made available to enable research on 3D foundation models or further finetuning for a wide range of medical imaging applications.
Submitted 20 January, 2025;
originally announced January 2025.
-
Exploiting Domain-Specific Parallel Data on Multilingual Language Models for Low-resource Language Translation
Authors:
Surangika Ranathunga,
Shravan Nayak,
Shih-Ting Cindy Huang,
Yanke Mao,
Tong Su,
Yun-Hsiang Ray Chan,
Songchen Yuan,
Anthony Rinaldi,
Annie En-Shiun Lee
Abstract:
Neural Machine Translation (NMT) systems built on multilingual sequence-to-sequence Language Models (msLMs) fail to deliver expected results when both the amount of parallel data for a language and the language's representation in the model are limited. This restricts the capabilities of domain-specific NMT systems for low-resource languages (LRLs). As a solution, parallel data from auxiliary domains can be used either to fine-tune or to further pre-train the msLM. We present an evaluation of the effectiveness of these two techniques in the context of domain-specific LRL-NMT. We also explore the impact of domain divergence on NMT model performance. We recommend several strategies for utilizing auxiliary parallel data in building domain-specific NMT models for LRLs.
Submitted 27 December, 2024;
originally announced December 2024.
-
NuLite -- Lightweight and Fast Model for Nuclei Instance Segmentation and Classification
Authors:
Cristian Tommasino,
Cristiano Russo,
Antonio Maria Rinaldi
Abstract:
In pathology, accurate and efficient analysis of Hematoxylin and Eosin (H&E) slides is crucial for timely and effective cancer diagnosis. Although many deep learning solutions for nuclei instance segmentation and classification exist in the literature, they often entail high computational costs and resource requirements, thus limiting their practical usage in medical applications. To address this issue, we introduce a novel convolutional neural network, NuLite, a U-Net-like architecture designed explicitly on Fast-ViT, a state-of-the-art (SOTA) lightweight CNN. We obtained three versions of our model, NuLite-S, NuLite-M, and NuLite-H, trained on the PanNuke dataset. The experimental results prove that our models equal CellViT (SOTA) in terms of panoptic quality and detection. However, our lightest model, NuLite-S, is 40 times smaller in terms of parameters and about 8 times smaller in terms of GFLOPs, while our heaviest model is 17 times smaller in terms of parameters and about 7 times smaller in terms of GFLOPs. Moreover, our model is up to about 8 times faster than CellViT. Lastly, to prove the effectiveness of our solution, we provide a robust comparison on external datasets, namely CoNSeP, MoNuSeg, and GlySAC. Our model is publicly available at https://github.com/CosmoIknosLab/NuLite
Submitted 9 August, 2024; v1 submitted 3 August, 2024;
originally announced August 2024.
-
HoVer-UNet: Accelerating HoVerNet with UNet-based multi-class nuclei segmentation via knowledge distillation
Authors:
Cristian Tommasino,
Cristiano Russo,
Antonio Maria Rinaldi,
Francesco Ciompi
Abstract:
We present HoVer-UNet, an approach to distill the knowledge of the multi-branch HoVerNet framework for nuclei instance segmentation and classification in histopathology. We propose a compact, streamlined single UNet network with a Mix Vision Transformer backbone, and equip it with a custom loss function to optimally encode the distilled knowledge of HoVerNet, reducing computational requirements without compromising performance. We show that our model achieved results comparable to HoVerNet on the public PanNuke and CoNSeP datasets with a three-fold reduction in inference time. We make the code of our model publicly available at https://github.com/DIAGNijmegen/HoVer-UNet.
Submitted 4 December, 2023; v1 submitted 21 November, 2023;
originally announced November 2023.
-
Uncertainty Aware Training to Improve Deep Learning Model Calibration for Classification of Cardiac MR Images
Authors:
Tareen Dawood,
Chen Chen,
Baldeep S. Sidhu,
Bram Ruijsink,
Justin Gould,
Bradley Porter,
Mark K. Elliott,
Vishal Mehta,
Christopher A. Rinaldi,
Esther Puyol-Antón,
Reza Razavi,
Andrew P. King
Abstract:
Quantifying uncertainty of predictions has been identified as one way to develop more trustworthy artificial intelligence (AI) models beyond conventional reporting of performance metrics. When considering their role in a clinical decision support setting, AI classification models should ideally avoid confident wrong predictions and maximise the confidence of correct predictions. Models that do this are said to be well-calibrated with regard to confidence. However, relatively little attention has been paid to how to improve calibration when training these models, i.e., to make the training strategy uncertainty-aware. In this work we evaluate three novel uncertainty-aware training strategies and compare them against two state-of-the-art approaches. We analyse performance on two different clinical applications: cardiac resynchronisation therapy (CRT) response prediction and coronary artery disease (CAD) diagnosis from cardiac magnetic resonance (CMR) images. The best-performing model in terms of both classification accuracy and the most common calibration measure, expected calibration error (ECE), was the Confidence Weight method, a novel approach that weights the loss of samples to explicitly penalise confident incorrect predictions. The method reduced the ECE by 17% for CRT response prediction and by 22% for CAD diagnosis when compared to a baseline classifier in which no uncertainty-aware strategy was included. In both applications, as well as reducing the ECE, there was a slight increase in accuracy, from 69% to 70% and from 70% to 72% for CRT response prediction and CAD diagnosis respectively. However, our analysis showed a lack of consistency in terms of optimal models when using different calibration measures. This indicates the need for careful consideration of performance metrics when training and selecting models for complex high-risk applications in healthcare.
Submitted 29 August, 2023;
originally announced August 2023.
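The abstract above relies on expected calibration error (ECE) as its main calibration measure. A minimal sketch of the commonly used binned form of ECE follows; the equal-width binning, the default of 10 bins, and the function name are assumptions for illustration, and the paper may use a different variant.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted mean of |accuracy - mean confidence| per bin.

    `confidences` are predicted probabilities in [0, 1] for the chosen class;
    `correct` are booleans saying whether each prediction was right.
    Equal-width bins are an assumption; other binning schemes exist.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes in proportion to the samples it holds.
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece
```

A model that is 90% confident but right only 75% of the time contributes a gap of 0.15 in that bin, while a perfectly calibrated model has an ECE of 0.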
-
Leveraging Auxiliary Domain Parallel Data in Intermediate Task Fine-tuning for Low-resource Translation
Authors:
Shravan Nayak,
Surangika Ranathunga,
Sarubi Thillainathan,
Rikki Hung,
Anthony Rinaldi,
Yining Wang,
Jonah Mackey,
Andrew Ho,
En-Shiun Annie Lee
Abstract:
NMT systems trained on Pre-trained Multilingual Sequence-Sequence (PMSS) models flounder when sufficient amounts of parallel data are not available for fine-tuning. This specifically holds for languages missing/under-represented in these models. The problem gets aggravated when the data comes from different domains. In this paper, we show that intermediate-task fine-tuning (ITFT) of PMSS models is extremely beneficial for domain-specific NMT, especially when target domain data is limited/unavailable and the considered languages are missing or under-represented in the PMSS model. We quantify the variation in domain-specific results using a domain-divergence test, and show that ITFT can mitigate the impact of domain divergence to some extent.
Submitted 23 September, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
-
AI-enabled Assessment of Cardiac Systolic and Diastolic Function from Echocardiography
Authors:
Esther Puyol-Antón,
Bram Ruijsink,
Baldeep S. Sidhu,
Justin Gould,
Bradley Porter,
Mark K. Elliott,
Vishal Mehta,
Haotian Gu,
Miguel Xochicale,
Alberto Gomez,
Christopher A. Rinaldi,
Martin Cowie,
Phil Chowienczyk,
Reza Razavi,
Andrew P. King
Abstract:
Left ventricular (LV) function is an important factor in terms of patient management, outcome, and long-term survival of patients with heart disease. The most recently published clinical guidelines for heart failure recognise that over-reliance on only one measure of cardiac function (LV ejection fraction) as a diagnostic and treatment stratification biomarker is suboptimal. Recent advances in AI-based echocardiography analysis have shown excellent results on automated estimation of LV volumes and LV ejection fraction. However, from time-varying 2-D echocardiography acquisition, a richer description of cardiac function can be obtained by estimating functional biomarkers from the complete cardiac cycle. In this work we propose for the first time an AI approach for deriving advanced biomarkers of systolic and diastolic LV function from 2-D echocardiography based on segmentations of the full cardiac cycle. These biomarkers will allow clinicians to obtain a much richer picture of the heart in health and disease. The AI model is based on the 'nn-Unet' framework and was trained and tested using four different databases. Results show excellent agreement between manual and automated analysis and showcase the potential of the advanced systolic and diastolic biomarkers for patient stratification. Finally, for a subset of 50 cases, we perform a correlation analysis between clinical biomarkers derived from echocardiography and CMR, and we show excellent agreement between the two modalities.
Submitted 21 July, 2022; v1 submitted 21 March, 2022;
originally announced March 2022.
-
Entanglement Swapping in Quantum Switches: Protocol Design and Stability Analysis
Authors:
Wenhan Dai,
Anthony Rinaldi,
Don Towsley
Abstract:
Quantum switches are critical components in quantum networks, distributing maximally entangled pairs among end nodes by entanglement swapping. In this work, we design protocols that schedule entanglement swapping operations in quantum switches. Entanglement requests randomly arrive at the switch, and the goal of an entanglement swapping protocol is to stabilize the quantum switch so that the number of unfinished entanglement requests is bounded with a high probability. We determine the capacity region for the rates of entanglement requests and develop entanglement swapping protocols to stabilize the switch. Among these protocols, the on-demand protocols are not only computationally efficient, but also achieve high fidelity and low latency, as demonstrated by results obtained using a quantum network discrete event simulator.
Submitted 21 May, 2023; v1 submitted 8 October, 2021;
originally announced October 2021.
-
Uncertainty-Aware Training for Cardiac Resynchronisation Therapy Response Prediction
Authors:
Tareen Dawood,
Chen Chen,
Robin Andlauer,
Baldeep S. Sidhu,
Bram Ruijsink,
Justin Gould,
Bradley Porter,
Mark Elliott,
Vishal Mehta,
C. Aldo Rinaldi,
Esther Puyol-Antón,
Reza Razavi,
Andrew P. King
Abstract:
Evaluation of predictive deep learning (DL) models beyond conventional performance metrics has become increasingly important for applications in sensitive environments like healthcare. Such models might have the capability to encode and analyse large sets of data but they often lack comprehensive interpretability methods, preventing clinical trust in predictive outcomes. Quantifying uncertainty of a prediction is one way to provide such interpretability and promote trust. However, relatively little attention has been paid to how to include such requirements into the training of the model. In this paper we: (i) quantify the data (aleatoric) and model (epistemic) uncertainty of a DL model for Cardiac Resynchronisation Therapy response prediction from cardiac magnetic resonance images, and (ii) propose and perform a preliminary investigation of an uncertainty-aware loss function that can be used to retrain an existing DL image-based classification model to encourage confidence in correct predictions and reduce confidence in incorrect predictions. Our initial results are promising, showing a significant increase in the (epistemic) confidence of true positive predictions, with some evidence of a reduction in false negative confidence.
Submitted 22 September, 2021;
originally announced September 2021.
-
A Serious Game Approach for the Electro-Mobility Sector
Authors:
Bartolomeo Silvestri,
Alessandro Rinaldi,
Antonella Berardi,
Michele Roccotelli,
Simone Acquaviva,
Maria Pia Fanti
Abstract:
Serious Games (SGs) represent a new approach to improve learning processes more effectively and economically than traditional methods. This paper presents an SG approach for the electro-mobility context that encourages the use of electric light vehicles. The design of the SG is based on the typical elements of a classic "game", with real gameplay serving different purposes. In this work, the proposed SG aims to actively involve users and to raise awareness of environmental issues caused by mobility, of improved livability in the city, and of the real savings from using alternatives to traditional vehicles. The objective of the designed tool is to offer fun and entertainment to tourists and users of electric vehicles in cities, while giving useful information about the benefits of such vehicles and helping users discover touristic and interesting places in the city. In this way, the user is stimulated to explore the artistic and historical aspects of the city through an effective learning process: he/she is encouraged to search for the origins and peculiarities of the monuments.
Submitted 1 December, 2020;
originally announced December 2020.
-
Interpretable Deep Models for Cardiac Resynchronisation Therapy Response Prediction
Authors:
Esther Puyol-Antón,
Chen Chen,
James R. Clough,
Bram Ruijsink,
Baldeep S. Sidhu,
Justin Gould,
Bradley Porter,
Mark Elliott,
Vishal Mehta,
Daniel Rueckert,
Christopher A. Rinaldi,
Andrew P. King
Abstract:
Advances in deep learning (DL) have resulted in impressive accuracy in some medical image classification tasks, but often deep models lack interpretability. The ability of these models to explain their decisions is important for fostering clinical trust and facilitating clinical translation. Furthermore, for many problems in medicine there is a wealth of existing clinical knowledge to draw upon, which may be useful in generating explanations, but it is not obvious how this knowledge can be encoded into DL models - most models are learnt either from scratch or using transfer learning from a different domain. In this paper we address both of these issues. We propose a novel DL framework for image-based classification based on a variational autoencoder (VAE). The framework allows prediction of the output of interest from the latent space of the autoencoder, as well as visualisation (in the image domain) of the effects of crossing the decision boundary, thus enhancing the interpretability of the classifier. Our key contribution is that the VAE disentangles the latent space based on 'explanations' drawn from existing clinical knowledge. The framework can predict outputs as well as explanations for these outputs, and also raises the possibility of discovering new biomarkers that are separate (or disentangled) from the existing knowledge. We demonstrate our framework on the problem of predicting response of patients with cardiomyopathy to cardiac resynchronization therapy (CRT) from cine cardiac magnetic resonance images. The sensitivity and specificity of the proposed model on the task of CRT response prediction are 88.43% and 84.39% respectively, and we showcase the potential of our model in enhancing understanding of the factors contributing to CRT response.
Submitted 9 July, 2020; v1 submitted 24 June, 2020;
originally announced June 2020.