-
A computationally frugal open-source foundation model for thoracic disease detection in lung cancer screening programs
Authors:
Niccolò McConnell,
Pardeep Vasudev,
Daisuke Yamada,
Daryl Cheng,
Mehran Azimbagirad,
John McCabe,
Shahab Aslani,
Ahmed H. Shahin,
Yukun Zhou,
The SUMMIT Consortium,
Andre Altmann,
Yipeng Hu,
Paul Taylor,
Sam M. Janes,
Daniel C. Alexander,
Joseph Jacob
Abstract:
Low-dose computed tomography (LDCT) imaging employed in lung cancer screening (LCS) programs is increasing in uptake worldwide. LCS programs herald a generational opportunity to simultaneously detect cancer and non-cancer-related early-stage lung disease. Yet these efforts are hampered by a shortage of radiologists to interpret scans at scale. Here, we present TANGERINE, a computationally frugal,…
▽ More
Low-dose computed tomography (LDCT) imaging employed in lung cancer screening (LCS) programs is increasing in uptake worldwide. LCS programs herald a generational opportunity to simultaneously detect cancer and non-cancer-related early-stage lung disease. Yet these efforts are hampered by a shortage of radiologists to interpret scans at scale. Here, we present TANGERINE, a computationally frugal, open-source vision foundation model for volumetric LDCT analysis. Designed for broad accessibility and rapid adaptation, TANGERINE can be fine-tuned off the shelf for a wide range of disease-specific tasks with limited computational resources and training data. Relative to models trained from scratch, TANGERINE demonstrates fast convergence during fine-tuning, thereby requiring significantly fewer GPU hours, and displays strong label efficiency, achieving comparable or superior performance with a fraction of fine-tuning data. Pretrained using self-supervised learning on over 98,000 thoracic LDCTs, including the UK's largest LCS initiative to date and 27 public datasets, TANGERINE achieves state-of-the-art performance across 14 disease classification tasks, including lung cancer and multiple respiratory diseases, while generalising robustly across diverse clinical centres. By extending a masked autoencoder framework to 3D imaging, TANGERINE offers a scalable solution for LDCT analysis, departing from recent closed, resource-intensive models by combining architectural simplicity, public availability, and modest computational requirements. Its accessible, open-source lightweight design lays the foundation for rapid integration into next-generation medical imaging tools that could transform LCS initiatives, allowing them to pivot from a singular focus on lung cancer detection to comprehensive respiratory disease management in high-risk populations.
△ Less
Submitted 2 July, 2025;
originally announced July 2025.
-
Reasoning in machine vision: learning to think fast and slow
Authors:
Shaheer U. Saeed,
Yipei Wang,
Veeru Kasivisvanathan,
Brian R. Davidson,
Matthew J. Clarkson,
Yipeng Hu,
Daniel C. Alexander
Abstract:
Reasoning is a hallmark of human intelligence, enabling adaptive decision-making in complex and unfamiliar scenarios. In contrast, machine intelligence remains bound to training data, lacking the ability to dynamically refine solutions at inference time. While some recent advances have explored reasoning in machines, these efforts are largely limited to verbal domains such as mathematical problem-…
▽ More
Reasoning is a hallmark of human intelligence, enabling adaptive decision-making in complex and unfamiliar scenarios. In contrast, machine intelligence remains bound to training data, lacking the ability to dynamically refine solutions at inference time. While some recent advances have explored reasoning in machines, these efforts are largely limited to verbal domains such as mathematical problem-solving, where explicit rules govern step-by-step reasoning. Other critical real-world tasks - including visual perception, spatial reasoning, and radiological diagnosis - require non-verbal reasoning, which remains an open challenge. Here we present a novel learning paradigm that enables machine reasoning in vision by allowing performance improvement with increasing thinking time (inference-time compute), even under conditions where labelled data is very limited. Inspired by dual-process theories of human cognition in psychology, our approach integrates a fast-thinking System I module for familiar tasks, with a slow-thinking System II module that iteratively refines solutions using self-play reinforcement learning. This paradigm mimics human reasoning by proposing, competing over, and refining solutions in data-scarce scenarios. We demonstrate superior performance through extended thinking time, compared not only to large-scale supervised learning but also foundation models and even human experts, in real-world vision tasks. These tasks include computer-vision benchmarks and cancer localisation on medical images across five organs, showcasing transformative potential for non-verbal machine reasoning.
△ Less
Submitted 27 June, 2025;
originally announced June 2025.
-
Tackling Hallucination from Conditional Models for Medical Image Reconstruction with DynamicDPS
Authors:
Seunghoi Kim,
Henry F. J. Tregidgo,
Matteo Figini,
Chen Jin,
Sarang Joshi,
Daniel C. Alexander
Abstract:
Hallucinations are spurious structures not present in the ground truth, posing a critical challenge in medical image reconstruction, especially for data-driven conditional models. We hypothesize that combining an unconditional diffusion model with data consistency, trained on a diverse dataset, can reduce these hallucinations. Based on this, we propose DynamicDPS, a diffusion-based framework that…
▽ More
Hallucinations are spurious structures not present in the ground truth, posing a critical challenge in medical image reconstruction, especially for data-driven conditional models. We hypothesize that combining an unconditional diffusion model with data consistency, trained on a diverse dataset, can reduce these hallucinations. Based on this, we propose DynamicDPS, a diffusion-based framework that integrates conditional and unconditional diffusion models to enhance low-quality medical images while systematically reducing hallucinations. Our approach first generates an initial reconstruction using a conditional model, then refines it with an adaptive diffusion-based inverse problem solver. DynamicDPS skips early stage in the reverse process by selecting an optimal starting time point per sample and applies Wolfe's line search for adaptive step sizes, improving both efficiency and image fidelity. Using diffusion priors and data consistency, our method effectively reduces hallucinations from any conditional model output. We validate its effectiveness in Image Quality Transfer for low-field MRI enhancement. Extensive evaluations on synthetic and real MR scans, including a downstream task for tissue volume estimation, show that DynamicDPS reduces hallucinations, improving relative volume estimation by over 15% for critical tissues while using only 5% of the sampling steps required by baseline diffusion models. As a model-agnostic and fine-tuning-free approach, DynamicDPS offers a robust solution for hallucination reduction in medical imaging. The code will be made publicly available upon publication.
△ Less
Submitted 2 March, 2025;
originally announced March 2025.
-
Brain Latent Progression: Individual-based Spatiotemporal Disease Progression on 3D Brain MRIs via Latent Diffusion
Authors:
Lemuel Puglisi,
Daniel C. Alexander,
Daniele Ravì
Abstract:
The growing availability of longitudinal Magnetic Resonance Imaging (MRI) datasets has facilitated Artificial Intelligence (AI)-driven modeling of disease progression, making it possible to predict future medical scans for individual patients. However, despite significant advancements in AI, current methods continue to face challenges including achieving patient-specific individualization, ensurin…
▽ More
The growing availability of longitudinal Magnetic Resonance Imaging (MRI) datasets has facilitated Artificial Intelligence (AI)-driven modeling of disease progression, making it possible to predict future medical scans for individual patients. However, despite significant advancements in AI, current methods continue to face challenges including achieving patient-specific individualization, ensuring spatiotemporal consistency, efficiently utilizing longitudinal data, and managing the substantial memory demands of 3D scans. To address these challenges, we propose Brain Latent Progression (BrLP), a novel spatiotemporal model designed to predict individual-level disease progression in 3D brain MRIs. The key contributions in BrLP are fourfold: (i) it operates in a small latent space, mitigating the computational challenges posed by high-dimensional imaging data; (ii) it explicitly integrates subject metadata to enhance the individualization of predictions; (iii) it incorporates prior knowledge of disease dynamics through an auxiliary model, facilitating the integration of longitudinal data; and (iv) it introduces the Latent Average Stabilization (LAS) algorithm, which (a) enforces spatiotemporal consistency in the predicted progression at inference time and (b) allows us to derive a measure of the uncertainty for the prediction. We train and evaluate BrLP on 11,730 T1-weighted (T1w) brain MRIs from 2,805 subjects and validate its generalizability on an external test set comprising 2,257 MRIs from 962 subjects. Our experiments compare BrLP-generated MRI scans with real follow-up MRIs, demonstrating state-of-the-art accuracy compared to existing methods. The code is publicly available at: https://github.com/LemuelPuglisi/BrLP.
△ Less
Submitted 12 February, 2025;
originally announced February 2025.
-
4D VQ-GAN: Synthesising Medical Scans at Any Time Point for Personalised Disease Progression Modelling of Idiopathic Pulmonary Fibrosis
Authors:
An Zhao,
Moucheng Xu,
Ahmed H. Shahin,
Wim Wuyts,
Mark G. Jones,
Joseph Jacob,
Daniel C. Alexander
Abstract:
Understanding the progression trajectories of diseases is crucial for early diagnosis and effective treatment planning. This is especially vital for life-threatening conditions such as Idiopathic Pulmonary Fibrosis (IPF), a chronic, progressive lung disease with a prognosis comparable to many cancers. Computed tomography (CT) imaging has been established as a reliable diagnostic tool for IPF. Accu…
▽ More
Understanding the progression trajectories of diseases is crucial for early diagnosis and effective treatment planning. This is especially vital for life-threatening conditions such as Idiopathic Pulmonary Fibrosis (IPF), a chronic, progressive lung disease with a prognosis comparable to many cancers. Computed tomography (CT) imaging has been established as a reliable diagnostic tool for IPF. Accurately predicting future CT scans of early-stage IPF patients can aid in developing better treatment strategies, thereby improving survival outcomes. In this paper, we propose 4D Vector Quantised Generative Adversarial Networks (4D-VQ-GAN), a model capable of generating realistic CT volumes of IPF patients at any time point. The model is trained using a two-stage approach. In the first stage, a 3D-VQ-GAN is trained to reconstruct CT volumes. In the second stage, a Neural Ordinary Differential Equation (ODE) based temporal model is trained to capture the temporal dynamics of the quantised embeddings generated by the encoder in the first stage. We evaluate different configurations of our model for generating longitudinal CT scans and compare the results against ground truth data, both quantitatively and qualitatively. For validation, we conduct survival analysis using imaging biomarkers derived from generated CT scans and achieve a C-index comparable to that of biomarkers derived from the real CT scans. The survival analysis results demonstrate the potential clinical utility inherent to generated longitudinal CT scans, showing that they can reliably predict survival outcomes.
△ Less
Submitted 8 February, 2025;
originally announced February 2025.
-
MRI Parameter Mapping via Gaussian Mixture VAE: Breaking the Assumption of Independent Pixels
Authors:
Moucheng Xu,
Yukun Zhou,
Tobias Goodwin-Allcock,
Kimia Firoozabadi,
Joseph Jacob,
Daniel C. Alexander,
Paddy J. Slator
Abstract:
We introduce and demonstrate a new paradigm for quantitative parameter mapping in MRI. Parameter mapping techniques, such as diffusion MRI and quantitative MRI, have the potential to robustly and repeatably measure biologically-relevant tissue maps that strongly relate to underlying microstructure. Quantitative maps are calculated by fitting a model to multiple images, e.g. with least-squares or m…
▽ More
We introduce and demonstrate a new paradigm for quantitative parameter mapping in MRI. Parameter mapping techniques, such as diffusion MRI and quantitative MRI, have the potential to robustly and repeatably measure biologically-relevant tissue maps that strongly relate to underlying microstructure. Quantitative maps are calculated by fitting a model to multiple images, e.g. with least-squares or machine learning. However, the overwhelming majority of model fitting techniques assume that each voxel is independent, ignoring any co-dependencies in the data. This makes model fitting sensitive to voxelwise measurement noise, hampering reliability and repeatability. We propose a self-supervised deep variational approach that breaks the assumption of independent pixels, leveraging redundancies in the data to effectively perform data-driven regularisation of quantitative maps. We demonstrate that our approach outperforms current model fitting techniques in dMRI simulations and real data. Especially with a Gaussian mixture prior, our model enables sharper quantitative maps, revealing finer anatomical details that are not presented in the baselines. Our approach can hence support the clinical adoption of parameter mapping methods such as dMRI and qMRI.
△ Less
Submitted 16 November, 2024;
originally announced November 2024.
-
Alternative Learning Paradigms for Image Quality Transfer
Authors:
Ahmed Karam Eldaly,
Matteo Figini,
Daniel C. Alexander
Abstract:
Image Quality Transfer (IQT) aims to enhance the contrast and resolution of low-quality medical images, e.g. obtained from low-power devices, with rich information learned from higher quality images. In contrast to existing IQT methods which adopt supervised learning frameworks, in this work, we propose two novel formulations of the IQT problem. The first approach uses an unsupervised learning fra…
▽ More
Image Quality Transfer (IQT) aims to enhance the contrast and resolution of low-quality medical images, e.g. obtained from low-power devices, with rich information learned from higher quality images. In contrast to existing IQT methods which adopt supervised learning frameworks, in this work, we propose two novel formulations of the IQT problem. The first approach uses an unsupervised learning framework, whereas the second is a combination of both supervised and unsupervised learning. The unsupervised learning approach considers a sparse representation (SRep) and dictionary learning model, which we call IQT-SRep, whereas the combination of supervised and unsupervised learning approach is based on deep dictionary learning (DDL), which we call IQT-DDL. The IQT-SRep approach trains two dictionaries using a SRep model using pairs of low- and high-quality volumes. Subsequently, the SRep of a low-quality block, in terms of the low-quality dictionary, can be directly used to recover the corresponding high-quality block using the high-quality dictionary. On the other hand, the IQT-DDL approach explicitly learns a high-resolution dictionary to upscale the input volume, while the entire network, including high dictionary generator, is simultaneously optimised to take full advantage of deep learning methods. The two models are evaluated using a low-field magnetic resonance imaging (MRI) application aiming to recover high-quality images akin to those obtained from high-field scanners. Experiments comparing the proposed approaches against state-of-the-art supervised deep learning IQT method (IQT-DL) identify that the two novel formulations of the IQT problem can avoid bias associated with supervised methods when tested using out-of-distribution data that differs from the distribution of the data the model was trained on. This highlights the potential benefit of these novel paradigms for IQT.
△ Less
Submitted 8 November, 2024;
originally announced November 2024.
-
Unscrambling disease progression at scale: fast inference of event permutations with optimal transport
Authors:
Peter A. Wijeratne,
Daniel C. Alexander
Abstract:
Disease progression models infer group-level temporal trajectories of change in patients' features as a chronic degenerative condition plays out. They provide unique insight into disease biology and staging systems with individual-level clinical utility. Discrete models consider disease progression as a latent permutation of events, where each event corresponds to a feature becoming measurably abn…
▽ More
Disease progression models infer group-level temporal trajectories of change in patients' features as a chronic degenerative condition plays out. They provide unique insight into disease biology and staging systems with individual-level clinical utility. Discrete models consider disease progression as a latent permutation of events, where each event corresponds to a feature becoming measurably abnormal. However, permutation inference using traditional maximum likelihood approaches becomes prohibitive due to combinatoric explosion, severely limiting model dimensionality and utility. Here we leverage ideas from optimal transport to model disease progression as a latent permutation matrix of events belonging to the Birkhoff polytope, facilitating fast inference via optimisation of the variational lower bound. This enables a factor of 1000 times faster inference than the current state of the art and, correspondingly, supports models with several orders of magnitude more features than the current state of the art can consider. Experiments demonstrate the increase in speed, accuracy and robustness to noise in simulation. Further experiments with real-world imaging data from two separate datasets, one from Alzheimer's disease patients, the other age-related macular degeneration, showcase, for the first time, pixel-level disease progression events in the brain and eye, respectively. Our method is low compute, interpretable and applicable to any progressive condition and data modality, giving it broad potential clinical utility.
△ Less
Submitted 24 June, 2025; v1 submitted 18 October, 2024;
originally announced October 2024.
-
An X-Ray Is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation
Authors:
Ahmed Abdulaal,
Hugo Fry,
Nina Montaña-Brown,
Ayodeji Ijishakin,
Jack Gao,
Stephanie Hyland,
Daniel C. Alexander,
Daniel C. Castro
Abstract:
Radiological services are experiencing unprecedented demand, leading to increased interest in automating radiology report generation. Existing Vision-Language Models (VLMs) suffer from hallucinations, lack interpretability, and require expensive fine-tuning. We introduce SAE-Rad, which uses sparse autoencoders (SAEs) to decompose latent representations from a pre-trained vision transformer into hu…
▽ More
Radiological services are experiencing unprecedented demand, leading to increased interest in automating radiology report generation. Existing Vision-Language Models (VLMs) suffer from hallucinations, lack interpretability, and require expensive fine-tuning. We introduce SAE-Rad, which uses sparse autoencoders (SAEs) to decompose latent representations from a pre-trained vision transformer into human-interpretable features. Our hybrid architecture combines state-of-the-art SAE advancements, achieving accurate latent reconstructions while maintaining sparsity. Using an off-the-shelf language model, we distil ground-truth reports into radiological descriptions for each SAE feature, which we then compile into a full report for each image, eliminating the need for fine-tuning large models for this task. To the best of our knowledge, SAE-Rad represents the first instance of using mechanistic interpretability techniques explicitly for a downstream multi-modal reasoning task. On the MIMIC-CXR dataset, SAE-Rad achieves competitive radiology-specific metrics compared to state-of-the-art models while using significantly fewer computational resources for training. Qualitative analysis reveals that SAE-Rad learns meaningful visual concepts and generates reports aligning closely with expert interpretations. Our results suggest that SAEs can enhance multimodal reasoning in healthcare, providing a more interpretable alternative to existing VLMs.
△ Less
Submitted 4 October, 2024;
originally announced October 2024.
-
DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model
Authors:
Mona Sheikh Zeinoddin,
Chiara Lena,
Jiongqi Qu,
Luca Carlini,
Mattia Magro,
Seunghoi Kim,
Elena De Momi,
Sophia Bano,
Matthew Grech-Sollars,
Evangelos Mazomenos,
Daniel C. Alexander,
Danail Stoyanov,
Matthew J. Clarkson,
Mobarakol Islam
Abstract:
Robotic-assisted surgery (RAS) relies on accurate depth estimation for 3D reconstruction and visualization. While foundation models like Depth Anything Models (DAM) show promise, directly applying them to surgery often yields suboptimal results. Fully fine-tuning on limited surgical data can cause overfitting and catastrophic forgetting, compromising model robustness and generalization. Although L…
▽ More
Robotic-assisted surgery (RAS) relies on accurate depth estimation for 3D reconstruction and visualization. While foundation models like Depth Anything Models (DAM) show promise, directly applying them to surgery often yields suboptimal results. Fully fine-tuning on limited surgical data can cause overfitting and catastrophic forgetting, compromising model robustness and generalization. Although Low-Rank Adaptation (LoRA) addresses some adaptation issues, its uniform parameter distribution neglects the inherent feature hierarchy, where earlier layers, learning more general features, require more parameters than later ones. To tackle this issue, we introduce Depth Anything in Robotic Endoscopic Surgery (DARES), a novel approach that employs a new adaptation technique, Vector Low-Rank Adaptation (Vector-LoRA) on the DAM V2 to perform self-supervised monocular depth estimation in RAS scenes. To enhance learning efficiency, we introduce Vector-LoRA by integrating more parameters in earlier layers and gradually decreasing parameters in later layers. We also design a reprojection loss based on the multi-scale SSIM error to enhance depth perception by better tailoring the foundation model to the specific requirements of the surgical environment. The proposed method is validated on the SCARED dataset and demonstrates superior performance over recent state-of-the-art self-supervised monocular depth estimation techniques, achieving an improvement of 13.3% in the absolute relative error metric. The code and pre-trained weights are available at https://github.com/mobarakol/DARES.
△ Less
Submitted 21 October, 2024; v1 submitted 30 August, 2024;
originally announced August 2024.
-
SCREENER: A general framework for task-specific experiment design in quantitative MRI
Authors:
Tianshu Zheng,
Zican Wang,
Timothy Bray,
Daniel C. Alexander,
Dan Wu,
Hui Zhang
Abstract:
Quantitative magnetic resonance imaging (qMRI) is increasingly investigated for use in a variety of clinical tasks from diagnosis, through staging, to treatment monitoring. However, experiment design in qMRI, the identification of the optimal acquisition protocols, has been focused on obtaining the most precise parameter estimations, with no regard for the specific requirements of downstream tasks…
▽ More
Quantitative magnetic resonance imaging (qMRI) is increasingly investigated for use in a variety of clinical tasks from diagnosis, through staging, to treatment monitoring. However, experiment design in qMRI, the identification of the optimal acquisition protocols, has been focused on obtaining the most precise parameter estimations, with no regard for the specific requirements of downstream tasks. Here we propose SCREENER: A general framework for task-specific experiment design in quantitative MRI. SCREENER incorporates a task-specific objective and seeks the optimal protocol with a deep-reinforcement-learning (DRL) based optimization strategy. To illustrate this framework, we employ a task of classifying the inflammation status of bone marrow using diffusion MRI data with intravoxel incoherent motion (IVIM) modelling. Results demonstrate SCREENER outperforms previous ad hoc and optimized protocols under clinical signal-to-noise ratio (SNR) conditions, achieving significant improvement, both in binary classification tasks, e.g. from 67% to 89%, and in a multi-class classification task, from 46% to 59%. Additionally, we show this improvement is robust to the SNR. Lastly, we demonstrate the advantage of DRL-based optimization strategy, enabling zero-shot discovery of near-optimal protocols for a range of SNRs not used in training. In conclusion, SCREENER has the potential to enable wider uptake of qMRI in the clinic.
△ Less
Submitted 6 August, 2024;
originally announced August 2024.
-
Enhancing Spatiotemporal Disease Progression Models via Latent Diffusion and Prior Knowledge
Authors:
Lemuel Puglisi,
Daniel C. Alexander,
Daniele Ravì
Abstract:
In this work, we introduce Brain Latent Progression (BrLP), a novel spatiotemporal disease progression model based on latent diffusion. BrLP is designed to predict the evolution of diseases at the individual level on 3D brain MRIs. Existing deep generative models developed for this task are primarily data-driven and face challenges in learning disease progressions. BrLP addresses these challenges…
▽ More
In this work, we introduce Brain Latent Progression (BrLP), a novel spatiotemporal disease progression model based on latent diffusion. BrLP is designed to predict the evolution of diseases at the individual level on 3D brain MRIs. Existing deep generative models developed for this task are primarily data-driven and face challenges in learning disease progressions. BrLP addresses these challenges by incorporating prior knowledge from disease models to enhance the accuracy of predictions. To implement this, we propose to integrate an auxiliary model that infers volumetric changes in various brain regions. Additionally, we introduce Latent Average Stabilization (LAS), a novel technique to improve spatiotemporal consistency of the predicted progression. BrLP is trained and evaluated on a large dataset comprising 11,730 T1-weighted brain MRIs from 2,805 subjects, collected from three publicly available, longitudinal Alzheimer's Disease (AD) studies. In our experiments, we compare the MRI scans generated by BrLP with the actual follow-up MRIs available from the subjects, in both cross-sectional and longitudinal settings. BrLP demonstrates significant improvements over existing methods, with an increase of 22% in volumetric accuracy across AD-related brain regions and 43% in image similarity to the ground-truth scans. The ability of BrLP to generate conditioned 3D scans at the subject level, along with the novelty of integrating prior knowledge to enhance accuracy, represents a significant advancement in disease progression modeling, opening new avenues for precision medicine. The code of BrLP is available at the following link: https://github.com/LemuelPuglisi/BrLP.
△ Less
Submitted 6 May, 2024;
originally announced May 2024.
-
Tackling Structural Hallucination in Image Translation with Local Diffusion
Authors:
Seunghoi Kim,
Chen Jin,
Tom Diethe,
Matteo Figini,
Henry F. J. Tregidgo,
Asher Mullokandov,
Philip Teare,
Daniel C. Alexander
Abstract:
Recent developments in diffusion models have advanced conditioned image generation, yet they struggle with reconstructing out-of-distribution (OOD) images, such as unseen tumors in medical images, causing "image hallucination" and risking misdiagnosis. We hypothesize such hallucinations result from local OOD regions in the conditional images. We verify that partitioning the OOD region and conducti…
▽ More
Recent developments in diffusion models have advanced conditioned image generation, yet they struggle with reconstructing out-of-distribution (OOD) images, such as unseen tumors in medical images, causing "image hallucination" and risking misdiagnosis. We hypothesize such hallucinations result from local OOD regions in the conditional images. We verify that partitioning the OOD region and conducting separate image generations alleviates hallucinations in several applications. From this, we propose a training-free diffusion framework that reduces hallucination with multiple Local Diffusion processes. Our approach involves OOD estimation followed by two modules: a "branching" module generates locally both within and outside OOD regions, and a "fusion" module integrates these predictions into one. Our evaluation shows our method mitigates hallucination over baseline models quantitatively and qualitatively, reducing misdiagnosis by 40% and 25% in the real-world medical and natural image datasets, respectively. It also demonstrates compatibility with various pre-trained diffusion models.
△ Less
Submitted 17 July, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
-
Brain-ID: Learning Contrast-agnostic Anatomical Representations for Brain Imaging
Authors:
Peirong Liu,
Oula Puonti,
Xiaoling Hu,
Daniel C. Alexander,
Juan E. Iglesias
Abstract:
Recent learning-based approaches have made astonishing advances in calibrated medical imaging like computerized tomography (CT), yet they struggle to generalize in uncalibrated modalities -- notably magnetic resonance (MR) imaging, where performance is highly sensitive to the differences in MR contrast, resolution, and orientation. This prevents broad applicability to diverse real-world clinical p…
▽ More
Recent learning-based approaches have made astonishing advances in calibrated medical imaging like computerized tomography (CT), yet they struggle to generalize in uncalibrated modalities -- notably magnetic resonance (MR) imaging, where performance is highly sensitive to the differences in MR contrast, resolution, and orientation. This prevents broad applicability to diverse real-world clinical protocols. We introduce Brain-ID, an anatomical representation learning model for brain imaging. With the proposed "mild-to-severe" intra-subject generation, Brain-ID is robust to the subject-specific brain anatomy regardless of the appearance of acquired images (e.g., contrast, deformation, resolution, artifacts). Trained entirely on synthetic data, Brain-ID readily adapts to various downstream tasks through only one layer. We present new metrics to validate the intra- and inter-subject robustness of Brain-ID features, and evaluate their performance on four downstream applications, covering contrast-independent (anatomy reconstruction/contrast synthesis, brain segmentation), and contrast-dependent (super-resolution, bias field estimation) tasks. Extensive experiments on six public datasets demonstrate that Brain-ID achieves state-of-the-art performance in all tasks on different MRI modalities and CT, and more importantly, preserves its performance on low-resolution and small datasets. Code is available at https://github.com/peirong26/Brain-ID.
△ Less
Submitted 10 March, 2024; v1 submitted 28 November, 2023;
originally announced November 2023.
-
A 3D Conditional Diffusion Model for Image Quality Transfer -- An Application to Low-Field MRI
Authors:
Seunghoi Kim,
Henry F. J. Tregidgo,
Ahmed K. Eldaly,
Matteo Figini,
Daniel C. Alexander
Abstract:
Low-field (LF) MRI scanners (<1T) are still prevalent in settings with limited resources or unreliable power supply. However, they often yield images with lower spatial resolution and contrast than high-field (HF) scanners. This quality disparity can result in inaccurate clinician interpretations. Image Quality Transfer (IQT) has been developed to enhance the quality of images by learning a mappin…
▽ More
Low-field (LF) MRI scanners (<1T) are still prevalent in settings with limited resources or unreliable power supply. However, they often yield images with lower spatial resolution and contrast than high-field (HF) scanners. This quality disparity can result in inaccurate clinician interpretations. Image Quality Transfer (IQT) has been developed to enhance the quality of images by learning a mapping function between low and high-quality images. Existing IQT models often fail to restore high-frequency features, leading to blurry output. In this paper, we propose a 3D conditional diffusion model to improve 3D volumetric data, specifically LF MR images. Additionally, we incorporate a cross-batch mechanism into the self-attention and padding of our network, ensuring broader contextual awareness even under small 3D patches. Experiments on the publicly available Human Connectome Project (HCP) dataset for IQT and brain parcellation demonstrate that our model outperforms existing methods both quantitatively and qualitatively. The code is publicly available at \url{https://github.com/edshkim98/DiffusionIQT}.
△ Less
Submitted 11 November, 2023;
originally announced November 2023.
-
CenTime: Event-Conditional Modelling of Censoring in Survival Analysis
Authors:
Ahmed H. Shahin,
An Zhao,
Alexander C. Whitehead,
Daniel C. Alexander,
Joseph Jacob,
David Barber
Abstract:
Survival analysis is a valuable tool for estimating the time until specific events, such as death or cancer recurrence, based on baseline observations. This is particularly useful in healthcare to prognostically predict clinically important events based on patient data. However, existing approaches often have limitations; some focus only on ranking patients by survivability, neglecting to estimate…
▽ More
Survival analysis is a valuable tool for estimating the time until specific events, such as death or cancer recurrence, based on baseline observations. This is particularly useful in healthcare to prognostically predict clinically important events based on patient data. However, existing approaches often have limitations; some focus only on ranking patients by survivability, neglecting to estimate the actual event time, while others treat the problem as a classification task, ignoring the inherent time-ordered structure of the events. Furthermore, the effective utilization of censored samples - training data points where the exact event time is unknown - is essential for improving the predictive accuracy of the model. In this paper, we introduce CenTime, a novel approach to survival analysis that directly estimates the time to event. Our method features an innovative event-conditional censoring mechanism that performs robustly even when uncensored data is scarce. We demonstrate that our approach forms a consistent estimator for the event model parameters, even in the absence of uncensored data. Furthermore, CenTime is easily integrated with deep learning models with no restrictions on batch size or the number of uncensored samples. We compare our approach with standard survival analysis methods, including the Cox proportional-hazard model and DeepHit. Our results indicate that CenTime offers state-of-the-art performance in predicting time-to-death while maintaining comparable ranking performance. Our implementation is publicly available at https://github.com/ahmedhshahin/CenTime.
△ Less
Submitted 10 January, 2024; v1 submitted 7 September, 2023;
originally announced September 2023.
-
A coupled-mechanisms modelling framework for neurodegeneration
Authors:
Tiantian He,
Elinor Thompson,
Anna Schroder,
Neil P. Oxtoby,
Ahmed Abdulaal,
Frederik Barkhof,
Daniel C. Alexander
Abstract:
Computational models of neurodegeneration aim to emulate the evolving pattern of pathology in the brain during neurodegenerative disease, such as Alzheimer's disease. Previous studies have made specific choices on the mechanisms of pathology production and diffusion, or assume that all the subjects lie on the same disease progression trajectory. However, the complexity and heterogeneity of neurode…
▽ More
Computational models of neurodegeneration aim to emulate the evolving pattern of pathology in the brain during neurodegenerative disease, such as Alzheimer's disease. Previous studies have made specific choices on the mechanisms of pathology production and diffusion, or assume that all the subjects lie on the same disease progression trajectory. However, the complexity and heterogeneity of neurodegenerative pathology suggests that multiple mechanisms may contribute synergistically with complex interactions, meanwhile the degree of contribution of each mechanism may vary among individuals. We thus put forward a coupled-mechanisms modelling framework which non-linearly combines the network-topology-informed pathology appearance with the process of pathology spreading within a dynamic modelling system. We account for the heterogeneity of disease by fitting the model at the individual level, allowing the epicenters and rate of progression to vary among subjects. We construct a Bayesian model selection framework to account for feature importance and parameter uncertainty. This provides a combination of mechanisms that best explains the observations for each individual from the ADNI dataset. With the obtained distribution of mechanism importance for each subject, we are able to identify subgroups of patients sharing similar combinations of apparent mechanisms.
△ Less
Submitted 10 August, 2023;
originally announced August 2023.
-
A 3D deep learning classifier and its explainability when assessing coronary artery disease
Authors:
Wing Keung Cheung,
Jeremy Kalindjian,
Robert Bell,
Arjun Nair,
Leon J. Menezes,
Riyaz Patel,
Simon Wan,
Kacy Chou,
Jiahang Chen,
Ryo Torii,
Rhodri H. Davies,
James C. Moon,
Daniel C. Alexander,
Joseph Jacob
Abstract:
Early detection and diagnosis of coronary artery disease (CAD) could save lives and reduce healthcare costs. The current clinical practice is to perform CAD diagnosis through analysing medical images from computed tomography coronary angiography (CTCA). Most current approaches utilise deep learning methods but require centerline extraction and multi-planar reconstruction. These indirect methods ar…
▽ More
Early detection and diagnosis of coronary artery disease (CAD) could save lives and reduce healthcare costs. The current clinical practice is to perform CAD diagnosis through analysing medical images from computed tomography coronary angiography (CTCA). Most current approaches utilise deep learning methods but require centerline extraction and multi-planar reconstruction. These indirect methods are not designed in a clinician-friendly manner, and they complicate the interventional procedure. Furthermore, the current deep learning methods do not provide exact explainability and limit the usefulness of these methods to be deployed in clinical settings. In this study, we first propose a 3D Resnet-50 deep learning model to directly classify normal subjects and CAD patients on CTCA images, then we demonstrate a 2D modified U-Net model can be subsequently employed to segment the coronary arteries. Our proposed approach outperforms the state-of-the-art models by 21.43% in terms of classification accuracy. The classification model with focal loss provides a better and more focused heat map, and the segmentation model provides better explainability than the classification-only model. The proposed holistic approach not only provides a simpler and clinician-friendly solution but also good classification accuracy and exact explainability for CAD diagnosis.
△ Less
Submitted 26 November, 2024; v1 submitted 29 July, 2023;
originally announced August 2023.
-
Interpolation-Split: a data-centric deep learning approach with big interpolated data to boost airway segmentation performance
Authors:
Wing Keung Cheung,
Ashkan Pakzad,
Nesrin Mogulkoc,
Sarah Needleman,
Bojidar Rangelov,
Eyjolfur Gudmundsson,
An Zhao,
Mariam Abbas,
Davina McLaverty,
Dimitrios Asimakopoulos,
Robert Chapman,
Recep Savas,
Sam M Janes,
Yipeng Hu,
Daniel C. Alexander,
John R Hurst,
Joseph Jacob
Abstract:
The morphology and distribution of airway tree abnormalities enables diagnosis and disease characterisation across a variety of chronic respiratory conditions. In this regard, airway segmentation plays a critical role in the production of the outline of the entire airway tree to enable estimation of disease extent and severity. In this study, we propose a data-centric deep learning technique to se…
▽ More
The morphology and distribution of airway tree abnormalities enables diagnosis and disease characterisation across a variety of chronic respiratory conditions. In this regard, airway segmentation plays a critical role in the production of the outline of the entire airway tree to enable estimation of disease extent and severity. In this study, we propose a data-centric deep learning technique to segment the airway tree. The proposed technique utilises interpolation and image split to improve data usefulness and quality. Then, an ensemble learning strategy is implemented to aggregate the segmented airway trees at different scales. In terms of segmentation performance (dice similarity coefficient), our method outperforms the baseline model by 2.5% on average when a combined loss is used. Further, our proposed technique has a low GPU usage and high flexibility enabling it to be deployed on any 2D deep learning model.
△ Less
Submitted 23 July, 2024; v1 submitted 29 July, 2023;
originally announced August 2023.
-
Rician likelihood loss for quantitative MRI using self-supervised deep learning
Authors:
Christopher S. Parker,
Anna Schroder,
Sean C. Epstein,
James Cole,
Daniel C. Alexander,
Hui Zhang
Abstract:
Purpose: Previous quantitative MR imaging studies using self-supervised deep learning have reported biased parameter estimates at low SNR. Such systematic errors arise from the choice of Mean Squared Error (MSE) loss function for network training, which is incompatible with Rician-distributed MR magnitude signals. To address this issue, we introduce the negative log Rician likelihood (NLR) loss. M…
▽ More
Purpose: Previous quantitative MR imaging studies using self-supervised deep learning have reported biased parameter estimates at low SNR. Such systematic errors arise from the choice of Mean Squared Error (MSE) loss function for network training, which is incompatible with Rician-distributed MR magnitude signals. To address this issue, we introduce the negative log Rician likelihood (NLR) loss. Methods: A numerically stable and accurate implementation of the NLR loss was developed to estimate quantitative parameters of the apparent diffusion coefficient (ADC) model and intra-voxel incoherent motion (IVIM) model. Parameter estimation accuracy, precision and overall error were evaluated in terms of bias, variance and root mean squared error and compared against the MSE loss over a range of SNRs (5 - 30). Results: Networks trained with NLR loss show higher estimation accuracy than MSE for the ADC and IVIM diffusion coefficients as SNR decreases, with minimal loss of precision or total error. At high effective SNR (high SNR and small diffusion coefficients), both losses show comparable accuracy and precision for all parameters of both models. Conclusion: The proposed NLR loss is numerically stable and accurate across the full range of tested SNRs and improves parameter estimation accuracy of diffusion coefficients using self-supervised deep learning. We expect the development to benefit quantitative MR imaging techniques broadly, enabling more accurate parameter estimation from noisy data.
△ Less
Submitted 13 July, 2023;
originally announced July 2023.
-
CAMIL: Context-Aware Multiple Instance Learning for Cancer Detection and Subtyping in Whole Slide Images
Authors:
Olga Fourkioti,
Matt De Vries,
Chen Jin,
Daniel C. Alexander,
Chris Bakal
Abstract:
The visual examination of tissue biopsy sections is fundamental for cancer diagnosis, with pathologists analyzing sections at multiple magnifications to discern tumor cells and their subtypes. However, existing attention-based multiple instance learning (MIL) models used for analyzing Whole Slide Images (WSIs) in cancer diagnostics often overlook the contextual information of tumor and neighboring…
▽ More
The visual examination of tissue biopsy sections is fundamental for cancer diagnosis, with pathologists analyzing sections at multiple magnifications to discern tumor cells and their subtypes. However, existing attention-based multiple instance learning (MIL) models used for analyzing Whole Slide Images (WSIs) in cancer diagnostics often overlook the contextual information of tumor and neighboring tiles, leading to misclassifications. To address this, we propose the Context-Aware Multiple Instance Learning (CAMIL) architecture. CAMIL incorporates neighbor-constrained attention to consider dependencies among tiles within a WSI and integrates contextual constraints as prior knowledge into the MIL model. We evaluated CAMIL on subtyping non-small cell lung cancer (TCGA-NSCLC) and detecting lymph node (CAMELYON16 and CAMELYON17) metastasis, achieving test AUCs of 97.5\%, 95.9\%, and 88.1\%, respectively, outperforming other state-of-the-art methods. Additionally, CAMIL enhances model interpretability by identifying regions of high diagnostic value.
△ Less
Submitted 10 October, 2024; v1 submitted 9 May, 2023;
originally announced May 2023.
-
Domain-agnostic segmentation of thalamic nuclei from joint structural and diffusion MRI
Authors:
Henry F. J. Tregidgo,
Sonja Soskic,
Mark D. Olchanyi,
Juri Althonayan,
Benjamin Billot,
Chiara Maffei,
Polina Golland,
Anastasia Yendiki,
Daniel C. Alexander,
Martina Bocchetta,
Jonathan D. Rohrer,
Juan Eugenio Iglesias
Abstract:
The human thalamus is a highly connected subcortical grey-matter structure within the brain. It comprises dozens of nuclei with different function and connectivity, which are affected differently by disease. For this reason, there is growing interest in studying the thalamic nuclei in vivo with MRI. Tools are available to segment the thalamus from 1 mm T1 scans, but the contrast of the lateral and…
▽ More
The human thalamus is a highly connected subcortical grey-matter structure within the brain. It comprises dozens of nuclei with different function and connectivity, which are affected differently by disease. For this reason, there is growing interest in studying the thalamic nuclei in vivo with MRI. Tools are available to segment the thalamus from 1 mm T1 scans, but the contrast of the lateral and internal boundaries is too faint to produce reliable segmentations. Some tools have attempted to incorporate information from diffusion MRI in the segmentation to refine these boundaries, but do not generalise well across diffusion MRI acquisitions. Here we present the first CNN that can segment thalamic nuclei from T1 and diffusion data of any resolution without retraining or fine tuning. Our method builds on a public histological atlas of the thalamic nuclei and silver standard segmentations on high-quality diffusion data obtained with a recent Bayesian adaptive segmentation tool. We combine these with an approximate degradation model for fast domain randomisation during training. Our CNN produces a segmentation at 0.7 mm isotropic resolution, irrespective of the resolution of the input. Moreover, it uses a parsimonious model of the diffusion signal at each voxel (fractional anisotropy and principal eigenvector) that is compatible with virtually any set of directions and b-values, including huge amounts of legacy data. We show results of our proposed method on three heterogeneous datasets acquired on dozens of different scanners. An implementation of the method is publicly available at https://freesurfer.net/fswiki/ThalamicNucleiDTI.
△ Less
Submitted 5 May, 2023;
originally announced May 2023.
-
Expectation Maximization Pseudo Labels
Authors:
Moucheng Xu,
Yukun Zhou,
Chen Jin,
Marius de Groot,
Daniel C. Alexander,
Neil P. Oxtoby,
Yipeng Hu,
Joseph Jacob
Abstract:
In this paper, we study pseudo-labelling. Pseudo-labelling employs raw inferences on unlabelled data as pseudo-labels for self-training. We elucidate the empirical successes of pseudo-labelling by establishing a link between this technique and the Expectation Maximisation algorithm. Through this, we realise that the original pseudo-labelling serves as an empirical estimation of its more comprehens…
▽ More
In this paper, we study pseudo-labelling. Pseudo-labelling employs raw inferences on unlabelled data as pseudo-labels for self-training. We elucidate the empirical successes of pseudo-labelling by establishing a link between this technique and the Expectation Maximisation algorithm. Through this, we realise that the original pseudo-labelling serves as an empirical estimation of its more comprehensive underlying formulation. Following this insight, we present a full generalisation of pseudo-labels under Bayes' theorem, termed Bayesian Pseudo Labels. Subsequently, we introduce a variational approach to generate these Bayesian Pseudo Labels, involving the learning of a threshold to automatically select high-quality pseudo labels. In the remainder of the paper, we showcase the applications of pseudo-labelling and its generalised form, Bayesian Pseudo-Labelling, in the semi-supervised segmentation of medical images. Specifically, we focus on: 1) 3D binary segmentation of lung vessels from CT volumes; 2) 2D multi-class segmentation of brain tumours from MRI volumes; 3) 3D binary segmentation of whole brain tumours from MRI volumes; and 4) 3D binary segmentation of prostate from MRI volumes. We further demonstrate that pseudo-labels can enhance the robustness of the learned representations. The code is released in the following GitHub repository: https://github.com/moucheng2017/EMSSL
△ Less
Submitted 26 January, 2024; v1 submitted 2 May, 2023;
originally announced May 2023.
-
Low-field magnetic resonance image enhancement via stochastic image quality transfer
Authors:
Hongxiang Lin,
Matteo Figini,
Felice D'Arco,
Godwin Ogbole,
Ryutaro Tanno,
Stefano B. Blumberg,
Lisa Ronan,
Biobele J. Brown,
David W. Carmichael,
Ikeoluwa Lagunju,
Judith Helen Cross,
Delmiro Fernandez-Reyes,
Daniel C. Alexander
Abstract:
Low-field (<1T) magnetic resonance imaging (MRI) scanners remain in widespread use in low- and middle-income countries (LMICs) and are commonly used for some applications in higher income countries e.g. for small child patients with obesity, claustrophobia, implants, or tattoos. However, low-field MR images commonly have lower resolution and poorer contrast than images from high field (1.5T, 3T, a…
▽ More
Low-field (<1T) magnetic resonance imaging (MRI) scanners remain in widespread use in low- and middle-income countries (LMICs) and are commonly used for some applications in higher income countries e.g. for small child patients with obesity, claustrophobia, implants, or tattoos. However, low-field MR images commonly have lower resolution and poorer contrast than images from high field (1.5T, 3T, and above). Here, we present Image Quality Transfer (IQT) to enhance low-field structural MRI by estimating from a low-field image the image we would have obtained from the same subject at high field. Our approach uses (i) a stochastic low-field image simulator as the forward model to capture uncertainty and variation in the contrast of low-field images corresponding to a particular high-field image, and (ii) an anisotropic U-Net variant specifically designed for the IQT inverse problem. We evaluate the proposed algorithm both in simulation and using multi-contrast (T1-weighted, T2-weighted, and fluid attenuated inversion recovery (FLAIR)) clinical low-field MRI data from an LMIC hospital. We show the efficacy of IQT in improving contrast and resolution of low-field MR images. We demonstrate that IQT-enhanced images have potential for enhancing visualisation of anatomical structures and pathological lesions of clinical relevance from the perspective of radiologists. IQT is proved to have capability of boosting the diagnostic value of low-field MRI, especially in low-resource settings.
△ Less
Submitted 26 April, 2023;
originally announced April 2023.
-
A hybrid CNN-RNN approach for survival analysis in a Lung Cancer Screening study
Authors:
Yaozhi Lu,
Shahab Aslani,
An Zhao,
Ahmed Shahin,
David Barber,
Mark Emberton,
Daniel C. Alexander,
Joseph Jacob
Abstract:
In this study, we present a hybrid CNN-RNN approach to investigate long-term survival of subjects in a lung cancer screening study. Subjects who died of cardiovascular and respiratory causes were identified whereby the CNN model was used to capture imaging features in the CT scans and the RNN model was used to investigate time series and thus global information. The models were trained on subjects…
▽ More
In this study, we present a hybrid CNN-RNN approach to investigate long-term survival of subjects in a lung cancer screening study. Subjects who died of cardiovascular and respiratory causes were identified whereby the CNN model was used to capture imaging features in the CT scans and the RNN model was used to investigate time series and thus global information. The models were trained on subjects who underwent cardiovascular and respiratory deaths and a control cohort matched to participant age, gender, and smoking history. The combined model can achieve an AUC of 0.76 which outperforms humans at cardiovascular mortality prediction. The corresponding F1 and Matthews Correlation Coefficient are 0.63 and 0.42 respectively. The generalisability of the model is further validated on an 'external' cohort. The same models were applied to survival analysis with the Cox Proportional Hazard model. It was demonstrated that incorporating the follow-up history can lead to improvement in survival prediction. The Cox neural network can achieve an IPCW C-index of 0.75 on the internal dataset and 0.69 on an external dataset. Delineating imaging features associated with long-term survival can help focus preventative interventions appropriately, particularly for under-recognised pathologies thereby potentially reducing patient morbidity.
△ Less
Submitted 19 March, 2023;
originally announced March 2023.
-
DeepBrainPrint: A Novel Contrastive Framework for Brain MRI Re-Identification
Authors:
Lemuel Puglisi,
Frederik Barkhof,
Daniel C. Alexander,
Geoffrey JM Parker,
Arman Eshaghi,
Daniele Ravì
Abstract:
Recent advances in MRI have led to the creation of large datasets. With the increase in data volume, it has become difficult to locate previous scans of the same patient within these datasets (a process known as re-identification). To address this issue, we propose an AI-powered medical imaging retrieval framework called DeepBrainPrint, which is designed to retrieve brain MRI scans of the same pat…
▽ More
Recent advances in MRI have led to the creation of large datasets. With the increase in data volume, it has become difficult to locate previous scans of the same patient within these datasets (a process known as re-identification). To address this issue, we propose an AI-powered medical imaging retrieval framework called DeepBrainPrint, which is designed to retrieve brain MRI scans of the same patient. Our framework is a semi-self-supervised contrastive deep learning approach with three main innovations. First, we use a combination of self-supervised and supervised paradigms to create an effective brain fingerprint from MRI scans that can be used for real-time image retrieval. Second, we use a special weighting function to guide the training and improve model convergence. Third, we introduce new imaging transformations to improve retrieval robustness in the presence of intensity variations (i.e. different scan contrasts), and to account for age and disease progression in patients. We tested DeepBrainPrint on a large dataset of T1-weighted brain MRIs from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and on a synthetic dataset designed to evaluate retrieval performance with different image modalities. Our results show that DeepBrainPrint outperforms previous methods, including simple similarity metrics and more advanced contrastive deep learning frameworks.
△ Less
Submitted 24 September, 2023; v1 submitted 25 February, 2023;
originally announced February 2023.
-
Deformably-Scaled Transposed Convolution
Authors:
Stefano B. Blumberg,
Daniele Raví,
Mou-Cheng Xu,
Matteo Figini,
Iasonas Kokkinos,
Daniel C. Alexander
Abstract:
Transposed convolution is crucial for generating high-resolution outputs, yet has received little attention compared to convolution layers. In this work we revisit transposed convolution and introduce a novel layer that allows us to place information in the image selectively and choose the `stroke breadth' at which the image is synthesized, whilst incurring a small additional parameter cost. For t…
▽ More
Transposed convolution is crucial for generating high-resolution outputs, yet has received little attention compared to convolution layers. In this work we revisit transposed convolution and introduce a novel layer that allows us to place information in the image selectively and choose the `stroke breadth' at which the image is synthesized, whilst incurring a small additional parameter cost. For this we introduce three ideas: firstly, we regress offsets to the positions where the transpose convolution results are placed; secondly we broadcast the offset weight locations over a learnable neighborhood; and thirdly we use a compact parametrization to share weights and restrict offsets. We show that simply substituting upsampling operators with our novel layer produces substantial improvements across tasks as diverse as instance segmentation, object detection, semantic segmentation, generative image modeling, and 3D magnetic resonance image enhancement, while outperforming all existing variants of transposed convolutions. Our novel layer can be used as a drop-in replacement for 2D and 3D upsampling operators and the code will be publicly available.
△ Less
Submitted 17 October, 2022;
originally announced October 2022.
-
Experimental Design for Multi-Channel Imaging via Task-Driven Feature Selection
Authors:
Stefano B. Blumberg,
Paddy J. Slator,
Daniel C. Alexander
Abstract:
This paper presents a data-driven, task-specific paradigm for experimental design, to shorten acquisition time, reduce costs, and accelerate the deployment of imaging devices. Current approaches in experimental design focus on model-parameter estimation and require specification of a particular model, whereas in imaging, other tasks may drive the design. Furthermore, such approaches often lead to…
▽ More
This paper presents a data-driven, task-specific paradigm for experimental design, to shorten acquisition time, reduce costs, and accelerate the deployment of imaging devices. Current approaches in experimental design focus on model-parameter estimation and require specification of a particular model, whereas in imaging, other tasks may drive the design. Furthermore, such approaches often lead to intractable optimization problems in real-world imaging applications. Here we present a new paradigm for experimental design that simultaneously optimizes the design (set of image channels) and trains a machine-learning model to execute a user-specified image-analysis task. The approach obtains data densely-sampled over the measurement space (many image channels) for a small number of acquisitions, then identifies a subset of channels of prespecified size that best supports the task. We propose a method: TADRED for TAsk-DRiven Experimental Design in imaging, to identify the most informative channel-subset whilst simultaneously training a network to execute the task given the subset. Experiments demonstrate the potential of TADRED in diverse imaging applications: several clinically-relevant tasks in magnetic resonance imaging; and remote sensing and physiological applications of hyperspectral imaging. Results show substantial improvement over classical experimental design, two recent application-specific methods within the new paradigm, and state-of-the-art approaches in supervised feature selection. We anticipate further applications of our approach. Code is available: https://github.com/sbb-gh/experimental-design-multichannel
△ Less
Submitted 17 March, 2024; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Fitting a Directional Microstructure Model to Diffusion-Relaxation MRI Data with Self-Supervised Machine Learning
Authors:
Jason P. Lim,
Stefano B. Blumberg,
Neil Narayan,
Sean C. Epstein,
Daniel C. Alexander,
Marco Palombo,
Paddy J. Slator
Abstract:
Machine learning is a powerful approach for fitting microstructural models to diffusion MRI data. Early machine learning microstructure imaging implementations trained regressors to estimate model parameters in a supervised way, using synthetic training data with known ground truth. However, a drawback of this approach is that the choice of training data impacts fitted parameter values. Self-super…
▽ More
Machine learning is a powerful approach for fitting microstructural models to diffusion MRI data. Early machine learning microstructure imaging implementations trained regressors to estimate model parameters in a supervised way, using synthetic training data with known ground truth. However, a drawback of this approach is that the choice of training data impacts fitted parameter values. Self-supervised learning is emerging as an attractive alternative to supervised learning in this context. Thus far, both supervised and self-supervised learning have typically been applied to isotropic models, such as intravoxel incoherent motion (IVIM), as opposed to models where the directionality of anisotropic structures is also estimated. In this paper, we demonstrate self-supervised machine learning model fitting for a directional microstructural model. In particular, we fit a combined T1-ball-stick model to the multidimensional diffusion (MUDI) challenge diffusion-relaxation dataset. Our self-supervised approach shows clear improvements in parameter estimation and computational time, for both simulated and in-vivo brain data, compared to standard non-linear least squares fitting. Code for the artificial neural net constructed for this study is available for public use from the following GitHub repository: https://github.com/jplte/deep-T1-ball-stick
△ Less
Submitted 5 October, 2022;
originally announced October 2022.
-
Optimising Chest X-Rays for Image Analysis by Identifying and Removing Confounding Factors
Authors:
Shahab Aslani,
Watjana Lilaonitkul,
Vaishnavi Gnanananthan,
Divya Raj,
Bojidar Rangelov,
Alexandra L Young,
Yipeng Hu,
Paul Taylor,
Daniel C Alexander,
Joseph Jacob
Abstract:
During the COVID-19 pandemic, the sheer volume of imaging performed in an emergency setting for COVID-19 diagnosis has resulted in a wide variability of clinical CXR acquisitions. This variation is seen in the CXR projections used, image annotations added and in the inspiratory effort and degree of rotation of clinical images. The image analysis community has attempted to ease the burden on overst…
▽ More
During the COVID-19 pandemic, the sheer volume of imaging performed in an emergency setting for COVID-19 diagnosis has resulted in a wide variability of clinical CXR acquisitions. This variation is seen in the CXR projections used, image annotations added and in the inspiratory effort and degree of rotation of clinical images. The image analysis community has attempted to ease the burden on overstretched radiology departments during the pandemic by developing automated COVID-19 diagnostic algorithms, the input for which has been CXR imaging. Large publicly available CXR datasets have been leveraged to improve deep learning algorithms for COVID-19 diagnosis. Yet the variable quality of clinically-acquired CXRs within publicly available datasets could have a profound effect on algorithm performance. COVID-19 diagnosis may be inferred by an algorithm from non-anatomical features on an image such as image labels. These imaging shortcuts may be dataset-specific and limit the generalisability of AI systems. Understanding and correcting key potential biases in CXR images is therefore an essential first step prior to CXR image analysis. In this study, we propose a simple and effective step-wise approach to pre-processing a COVID-19 chest X-ray dataset to remove undesired biases. We perform ablation studies to show the impact of each individual step. The results suggest that using our proposed pipeline could increase accuracy of the baseline COVID-19 detection algorithm by up to 13%.
△ Less
Submitted 22 August, 2022;
originally announced August 2022.
-
Bayesian Pseudo Labels: Expectation Maximization for Robust and Efficient Semi-Supervised Segmentation
Authors:
Mou-Cheng Xu,
Yukun Zhou,
Chen Jin,
Marius de Groot,
Daniel C. Alexander,
Neil P. Oxtoby,
Yipeng Hu,
Joseph Jacob
Abstract:
This paper concerns pseudo labelling in segmentation. Our contribution is fourfold. Firstly, we present a new formulation of pseudo-labelling as an Expectation-Maximization (EM) algorithm for clear statistical interpretation. Secondly, we propose a semi-supervised medical image segmentation method purely based on the original pseudo labelling, namely SegPL. We demonstrate SegPL is a competitive ap…
▽ More
This paper concerns pseudo labelling in segmentation. Our contribution is fourfold. Firstly, we present a new formulation of pseudo-labelling as an Expectation-Maximization (EM) algorithm for clear statistical interpretation. Secondly, we propose a semi-supervised medical image segmentation method purely based on the original pseudo labelling, namely SegPL. We demonstrate SegPL is a competitive approach against state-of-the-art consistency regularisation based methods on semi-supervised segmentation on a 2D multi-class MRI brain tumour segmentation task and a 3D binary CT lung vessel segmentation task. The simplicity of SegPL allows less computational cost comparing to prior methods. Thirdly, we demonstrate that the effectiveness of SegPL may originate from its robustness against out-of-distribution noises and adversarial attacks. Lastly, under the EM framework, we introduce a probabilistic generalisation of SegPL via variational inference, which learns a dynamic threshold for pseudo labelling during the training. We show that SegPL with variational inference can perform uncertainty estimation on par with the gold-standard method Deep Ensemble.
△ Less
Submitted 13 September, 2022; v1 submitted 8 August, 2022;
originally announced August 2022.
-
An efficient semi-supervised quality control system trained using physics-based MRI-artefact generators and adversarial training
Authors:
Daniele Ravi,
Frederik Barkhof,
Daniel C. Alexander,
Lemuel Puglisi,
Geoffrey JM Parker,
Arman Eshaghi
Abstract:
Large medical imaging data sets are becoming increasingly available, but ensuring sample quality without significant artefacts is challenging. Existing methods for identifying imperfections in medical imaging rely on data-intensive approaches, compounded by a scarcity of artefact-rich scans for training machine learning models in clinical research. To tackle this problem, we propose a framework wi…
▽ More
Large medical imaging data sets are becoming increasingly available, but ensuring sample quality without significant artefacts is challenging. Existing methods for identifying imperfections in medical imaging rely on data-intensive approaches, compounded by a scarcity of artefact-rich scans for training machine learning models in clinical research. To tackle this problem, we propose a framework with four main components: 1) artefact generators inspired by magnetic resonance physics to corrupt brain MRI scans and augment a training dataset, 2) abstract and engineered features to represent images compactly, 3) a feature selection process depending on the artefact class to improve classification, and 4) SVM classifiers to identify artefacts. Our contributions are threefold: first, physics-based artefact generators produce synthetic brain MRI scans with controlled artefacts for data augmentation. This will avoid the labour-intensive collection and labelling process of scans with rare artefacts. Second, we propose a pool of abstract and engineered image features to identify 9 different artefacts for structural MRI. Finally, we use an artefact-based feature selection block that, for each class of artefacts, finds the set of features providing the best classification performance. We performed validation experiments on a large data set of scans with artificially-generated artefacts, and in a multiple sclerosis clinical trial where real artefacts were identified by experts, showing that the proposed pipeline outperforms traditional methods. In particular, our data augmentation increases performance by up to 12.5 percentage points on accuracy, precision, and recall. The computational efficiency of our pipeline enables potential real-time deployment, promising high-throughput clinical applications through automated image-processing pipelines driven by quality control systems.
△ Less
Submitted 14 November, 2023; v1 submitted 7 June, 2022;
originally announced June 2022.
-
Enhancing Cancer Prediction in Challenging Screen-Detected Incident Lung Nodules Using Time-Series Deep Learning
Authors:
Shahab Aslani,
Pavan Alluri,
Eyjolfur Gudmundsson,
Edward Chandy,
John McCabe,
Anand Devaraj,
Carolyn Horst,
Sam M Janes,
Rahul Chakkara,
Arjun Nair,
Daniel C Alexander,
SUMMIT consortium,
Joseph Jacob
Abstract:
Lung cancer is the leading cause of cancer-related mortality worldwide. Lung cancer screening (LCS) using annual low-dose computed tomography (CT) scanning has been proven to significantly reduce lung cancer mortality by detecting cancerous lung nodules at an earlier stage. Improving risk stratification of malignancy risk in lung nodules can be enhanced using machine/deep learning algorithms. Howe…
▽ More
Lung cancer is the leading cause of cancer-related mortality worldwide. Lung cancer screening (LCS) using annual low-dose computed tomography (CT) scanning has been proven to significantly reduce lung cancer mortality by detecting cancerous lung nodules at an earlier stage. Improving risk stratification of malignancy risk in lung nodules can be enhanced using machine/deep learning algorithms. However most existing algorithms: a) have primarily assessed single time-point CT data alone thereby failing to utilize the inherent advantages contained within longitudinal imaging datasets; b) have not integrated into computer models pertinent clinical data that might inform risk prediction; c) have not assessed algorithm performance on the spectrum of nodules that are most challenging for radiologists to interpret and where assistance from analytic tools would be most beneficial.
Here we show the performance of our time-series deep learning model (DeepCAD-NLM-L) which integrates multi-model information across three longitudinal data domains: nodule-specific, lung-specific, and clinical demographic data. We compared our time-series deep learning model to a) radiologist performance on CTs from the National Lung Screening Trial enriched with the most challenging nodules for diagnosis; b) a nodule management algorithm from a North London LCS study (SUMMIT). Our model demonstrated comparable and complementary performance to radiologists when interpreting challenging lung nodules and showed improved performance (AUC=88\%) against models utilizing single time-point data only. The results emphasise the importance of time-series, multi-modal analysis when interpreting malignancy risk in LCS.
△ Less
Submitted 30 March, 2022;
originally announced March 2022.
-
Survival Analysis for Idiopathic Pulmonary Fibrosis using CT Images and Incomplete Clinical Data
Authors:
Ahmed H. Shahin,
Joseph Jacob,
Daniel C. Alexander,
David Barber
Abstract:
Idiopathic Pulmonary Fibrosis (IPF) is an inexorably progressive fibrotic lung disease with a variable and unpredictable rate of progression. CT scans of the lungs inform clinical assessment of IPF patients and contain pertinent information related to disease progression. In this work, we propose a multi-modal method that uses neural networks and memory banks to predict the survival of IPF patient…
▽ More
Idiopathic Pulmonary Fibrosis (IPF) is an inexorably progressive fibrotic lung disease with a variable and unpredictable rate of progression. CT scans of the lungs inform clinical assessment of IPF patients and contain pertinent information related to disease progression. In this work, we propose a multi-modal method that uses neural networks and memory banks to predict the survival of IPF patients using clinical and imaging data. The majority of clinical IPF patient records have missing data (e.g. missing lung function tests). To this end, we propose a probabilistic model that captures the dependencies between the observed clinical variables and imputes missing ones. This principled approach to missing data imputation can be naturally combined with a deep survival analysis model. We show that the proposed framework yields significantly better survival analysis results than baselines in terms of concordance index and integrated Brier score. Our work also provides insights into novel image-based biomarkers that are linked to mortality.
△ Less
Submitted 21 March, 2022;
originally announced March 2022.
-
Learning Morphological Feature Perturbations for Calibrated Semi-Supervised Segmentation
Authors:
Mou-Cheng Xu,
Yu-Kun Zhou,
Chen Jin,
Stefano B Blumberg,
Frederick J Wilson,
Marius deGroot,
Daniel C. Alexander,
Neil P. Oxtoby,
Joseph Jacob
Abstract:
We propose MisMatch, a novel consistency-driven semi-supervised segmentation framework which produces predictions that are invariant to learnt feature perturbations. MisMatch consists of an encoder and a two-head decoders. One decoder learns positive attention to the foreground regions of interest (RoI) on unlabelled images thereby generating dilated features. The other decoder learns negative att…
▽ More
We propose MisMatch, a novel consistency-driven semi-supervised segmentation framework which produces predictions that are invariant to learnt feature perturbations. MisMatch consists of an encoder and a two-head decoders. One decoder learns positive attention to the foreground regions of interest (RoI) on unlabelled images thereby generating dilated features. The other decoder learns negative attention to the foreground on the same unlabelled images thereby generating eroded features. We then apply a consistency regularisation on the paired predictions. MisMatch outperforms state-of-the-art semi-supervised methods on a CT-based pulmonary vessel segmentation task and a MRI-based brain tumour segmentation task. In addition, we show that the effectiveness of MisMatch comes from better model calibration than its supervised learning counterpart.
△ Less
Submitted 1 April, 2022; v1 submitted 18 March, 2022;
originally announced March 2022.
-
Progressive Subsampling for Oversampled Data - Application to Quantitative MRI
Authors:
Stefano B. Blumberg,
Hongxiang Lin,
Francesco Grussu,
Yukun Zhou,
Matteo Figini,
Daniel C. Alexander
Abstract:
We present PROSUB: PROgressive SUBsampling, a deep learning based, automated methodology that subsamples an oversampled data set (e.g. multi-channeled 3D images) with minimal loss of information. We build upon a recent dual-network approach that won the MICCAI MUlti-DIffusion (MUDI) quantitative MRI measurement sampling-reconstruction challenge, but suffers from deep learning training instability,…
▽ More
We present PROSUB: PROgressive SUBsampling, a deep learning based, automated methodology that subsamples an oversampled data set (e.g. multi-channeled 3D images) with minimal loss of information. We build upon a recent dual-network approach that won the MICCAI MUlti-DIffusion (MUDI) quantitative MRI measurement sampling-reconstruction challenge, but suffers from deep learning training instability, by subsampling with a hard decision boundary. PROSUB uses the paradigm of recursive feature elimination (RFE) and progressively subsamples measurements during deep learning training, improving optimization stability. PROSUB also integrates a neural architecture search (NAS) paradigm, allowing the network architecture hyperparameters to respond to the subsampling process. We show PROSUB outperforms the winner of the MUDI MICCAI challenge, producing large improvements >18% MSE on the MUDI challenge sub-tasks and qualitative improvements on downstream processes useful for clinical applications. We also show the benefits of incorporating NAS and analyze the effect of PROSUB's components. As our method generalizes to other problems beyond MRI measurement selection-reconstruction, our code is https://github.com/sbb-gh/PROSUB
△ Less
Submitted 11 October, 2022; v1 submitted 17 March, 2022;
originally announced March 2022.
-
VAFO-Loss: VAscular Feature Optimised Loss Function for Retinal Artery/Vein Segmentation
Authors:
Yukun Zhou,
Moucheng Xu,
Yipeng Hu,
Stefano B. Blumberg,
An Zhao,
Siegfried K. Wagner,
Pearse A. Keane,
Daniel C. Alexander
Abstract:
Estimating clinically-relevant vascular features following vessel segmentation is a standard pipeline for retinal vessel analysis, which provides potential ocular biomarkers for both ophthalmic disease and systemic disease. In this work, we integrate these clinical features into a novel vascular feature optimised loss function (VAFO-Loss), in order to regularise networks to produce segmentation ma…
▽ More
Estimating clinically-relevant vascular features following vessel segmentation is a standard pipeline for retinal vessel analysis, which provides potential ocular biomarkers for both ophthalmic disease and systemic disease. In this work, we integrate these clinical features into a novel vascular feature optimised loss function (VAFO-Loss), in order to regularise networks to produce segmentation maps, with which more accurate vascular features can be derived. Two common vascular features, vessel density and fractal dimension, are identified to be sensitive to intra-segment misclassification, which is a well-recognised problem in multi-class artery/vein segmentation particularly hindering the estimation of these vascular features. Thus we encode these two features into VAFO-Loss. We first show that incorporating our end-to-end VAFO-Loss in standard segmentation networks indeed improves vascular feature estimation, yielding quantitative improvement in stroke incidence prediction, a clinical downstream task. We also report a technically interesting finding that the trained segmentation network, albeit biased by the feature optimised loss VAFO-Loss, shows statistically significant improvement in segmentation metrics, compared to those trained with other state-of-the-art segmentation losses.
△ Less
Submitted 12 March, 2022;
originally announced March 2022.
-
Ten years of image analysis and machine learning competitions in dementia
Authors:
Esther E. Bron,
Stefan Klein,
Annika Reinke,
Janne M. Papma,
Lena Maier-Hein,
Daniel C. Alexander,
Neil P. Oxtoby
Abstract:
Machine learning methods exploiting multi-parametric biomarkers, especially based on neuroimaging, have huge potential to improve early diagnosis of dementia and to predict which individuals are at-risk of developing dementia. To benchmark algorithms in the field of machine learning and neuroimaging in dementia and assess their potential for use in clinical practice and clinical trials, seven gran…
▽ More
Machine learning methods exploiting multi-parametric biomarkers, especially based on neuroimaging, have huge potential to improve early diagnosis of dementia and to predict which individuals are at-risk of developing dementia. To benchmark algorithms in the field of machine learning and neuroimaging in dementia and assess their potential for use in clinical practice and clinical trials, seven grand challenges have been organized in the last decade.
The seven grand challenges addressed questions related to screening, clinical status estimation, prediction and monitoring in (pre-clinical) dementia. There was little overlap in clinical questions, tasks and performance metrics. Whereas this aids providing insight on a broad range of questions, it also limits the validation of results across challenges. The validation process itself was mostly comparable between challenges, using similar methods for ensuring objective comparison, uncertainty estimation and statistical testing. In general, winning algorithms performed rigorous data preprocessing and combined a wide range of input features.
Despite high state-of-the-art performances, most of the methods evaluated by the challenges are not clinically used. To increase impact, future challenges could pay more attention to statistical analysis of which factors relate to higher performance, to clinical questions beyond Alzheimer's disease, and to using testing data beyond the Alzheimer's Disease Neuroimaging Initiative. Grand challenges would be an ideal venue for assessing the generalizability of algorithm performance to unseen data of other cohorts. Key for increasing impact in this way are larger testing data sizes, which could be reached by sharing algorithms rather than data to exploit data that cannot be shared.
△ Less
Submitted 18 February, 2022; v1 submitted 15 December, 2021;
originally announced December 2021.
-
Evaluation of automated airway morphological quantification for assessing fibrosing lung disease
Authors:
Ashkan Pakzad,
Wing Keung Cheung,
Kin Quan,
Nesrin Mogulkoc,
Coline H. M. Van Moorsel,
Brian J. Bartholmai,
Hendrik W. Van Es,
Alper Ezircan,
Frouke Van Beek,
Marcel Veltkamp,
Ronald Karwoski,
Tobias Peikert,
Ryan D. Clay,
Finbar Foley,
Cassandra Braun,
Recep Savas,
Carole Sudre,
Tom Doel,
Daniel C. Alexander,
Peter Wijeratne,
David Hawkes,
Yipeng Hu,
John R Hurst,
Joseph Jacob
Abstract:
Abnormal airway dilatation, termed traction bronchiectasis, is a typical feature of idiopathic pulmonary fibrosis (IPF). Volumetric computed tomography (CT) imaging captures the loss of normal airway tapering in IPF. We postulated that automated quantification of airway abnormalities could provide estimates of IPF disease extent and severity. We propose AirQuant, an automated computational pipelin…
▽ More
Abnormal airway dilatation, termed traction bronchiectasis, is a typical feature of idiopathic pulmonary fibrosis (IPF). Volumetric computed tomography (CT) imaging captures the loss of normal airway tapering in IPF. We postulated that automated quantification of airway abnormalities could provide estimates of IPF disease extent and severity. We propose AirQuant, an automated computational pipeline that systematically parcellates the airway tree into its lobes and generational branches from a deep learning based airway segmentation, deriving airway structural measures from chest CT. Importantly, AirQuant prevents the occurrence of spurious airway branches by thick wave propagation and removes loops in the airway-tree by graph search, overcoming limitations of existing airway skeletonisation algorithms. Tapering between airway segments (intertapering) and airway tortuosity computed by AirQuant were compared between 14 healthy participants and 14 IPF patients. Airway intertapering was significantly reduced in IPF patients, and airway tortuosity was significantly increased when compared to healthy controls. Differences were most marked in the lower lobes, conforming to the typical distribution of IPF-related damage. AirQuant is an open-source pipeline that avoids limitations of existing airway quantification algorithms and has clinical interpretability. Automated airway measurements may have potential as novel imaging biomarkers of IPF severity and disease extent.
△ Less
Submitted 19 November, 2021;
originally announced November 2021.
-
MisMatch: Calibrated Segmentation via Consistency on Differential Morphological Feature Perturbations with Limited Labels
Authors:
Mou-Cheng Xu,
Yukun Zhou,
Chen Jin,
Marius De Groot,
Neil P. Oxtoby,
Daniel C. Alexander,
Joseph Jacob
Abstract:
Semi-supervised learning (SSL) is a promising machine learning paradigm to address the issue of label scarcity in medical imaging. SSL methods were originally developed in image classification. The state-of-the-art SSL methods in image classification utilise consistency regularisation to learn unlabelled predictions which are invariant to input level perturbations. However, image level perturbatio…
▽ More
Semi-supervised learning (SSL) is a promising machine learning paradigm to address the issue of label scarcity in medical imaging. SSL methods were originally developed in image classification. The state-of-the-art SSL methods in image classification utilise consistency regularisation to learn unlabelled predictions which are invariant to input level perturbations. However, image level perturbations violate the cluster assumption in the setting of segmentation. Moreover, existing image level perturbations are hand-crafted which could be sub-optimal. Therefore, it is a not trivial to straightforwardly adapt existing SSL image classification methods in segmentation. In this paper, we propose MisMatch, a semi-supervised segmentation framework based on the consistency between paired predictions which are derived from two differently learnt morphological feature perturbations. MisMatch consists of an encoder and two decoders. One decoder learns positive attention for foreground on unlabelled data thereby generating dilated features of foreground. The other decoder learns negative attention for foreground on the same unlabelled data thereby generating eroded features of foreground. We first develop a 2D U-net based MisMatch framework and perform extensive cross-validation on a CT-based pulmonary vessel segmentation task and show that MisMatch statistically outperforms state-of-the-art semi-supervised methods when only 6.25\% of the total labels are used. In a second experiment, we show that U-net based MisMatch outperforms state-of-the-art methods on an MRI-based brain tumour segmentation task. In a third experiment, we show that a 3D MisMatch outperforms a previous method using input level augmentations, on a left atrium segmentation task. Lastly, we find that the performance improvement of MisMatch over the baseline might originate from its better calibration.
△ Less
Submitted 2 May, 2023; v1 submitted 23 October, 2021;
originally announced October 2021.
-
Learning to Downsample for Segmentation of Ultra-High Resolution Images
Authors:
Chen Jin,
Ryutaro Tanno,
Thomy Mertzanidou,
Eleftheria Panagiotaki,
Daniel C. Alexander
Abstract:
Many computer vision systems require low-cost segmentation algorithms based on deep learning, either because of the enormous size of input images or limited computational budget. Common solutions uniformly downsample the input images to meet memory constraints, assuming all pixels are equally informative. In this work, we demonstrate that this assumption can harm the segmentation performance becau…
▽ More
Many computer vision systems require low-cost segmentation algorithms based on deep learning, either because of the enormous size of input images or limited computational budget. Common solutions uniformly downsample the input images to meet memory constraints, assuming all pixels are equally informative. In this work, we demonstrate that this assumption can harm the segmentation performance because the segmentation difficulty varies spatially. We combat this problem by introducing a learnable downsampling module, which can be optimised together with the given segmentation model in an end-to-end fashion. We formulate the problem of training such downsampling module as optimisation of sampling density distributions over the input images given their low-resolution views. To defend against degenerate solutions (e.g. over-sampling trivial regions like the backgrounds), we propose a regularisation term that encourages the sampling locations to concentrate around the object boundaries. We find the downsampling module learns to sample more densely at difficult locations, thereby improving the segmentation performance. Our experiments on benchmarks of high-resolution street view, aerial and medical images demonstrate substantial improvements in terms of efficiency-and-accuracy trade-off compared to both uniform downsampling and two recent advanced downsampling techniques.
△ Less
Submitted 15 March, 2022; v1 submitted 22 September, 2021;
originally announced September 2021.
-
Learning to Address Intra-segment Misclassification in Retinal Imaging
Authors:
Yukun Zhou,
Moucheng Xu,
Yipeng Hu,
Hongxiang Lin,
Joseph Jacob,
Pearse A. Keane,
Daniel C. Alexander
Abstract:
Accurate multi-class segmentation is a long-standing challenge in medical imaging, especially in scenarios where classes share strong similarity. Segmenting retinal blood vessels in retinal photographs is one such scenario, in which arteries and veins need to be identified and differentiated from each other and from the background. Intra-segment misclassification, i.e. veins classified as arteries…
▽ More
Accurate multi-class segmentation is a long-standing challenge in medical imaging, especially in scenarios where classes share strong similarity. Segmenting retinal blood vessels in retinal photographs is one such scenario, in which arteries and veins need to be identified and differentiated from each other and from the background. Intra-segment misclassification, i.e. veins classified as arteries or vice versa, frequently occurs when arteries and veins intersect, whereas in binary retinal vessel segmentation, error rates are much lower. We thus propose a new approach that decomposes multi-class segmentation into multiple binary, followed by a binary-to-multi-class fusion network. The network merges representations of artery, vein, and multi-class feature maps, each of which are supervised by expert vessel annotation in adversarial training. A skip-connection based merging process explicitly maintains class-specific gradients to avoid gradient vanishing in deep layers, to favor the discriminative features. The results show that, our model respectively improves F1-score by 4.4\%, 5.1\%, and 4.2\% compared with three state-of-the-art deep learning based methods on DRIVE-AV, LES-AV, and HRF-AV data sets. Code: https://github.com/rmaphoh/Learning-AVSegmentation
△ Less
Submitted 8 December, 2021; v1 submitted 25 April, 2021;
originally announced April 2021.
-
Uncertainty-Aware Annotation Protocol to Evaluate Deformable Registration Algorithms
Authors:
Loic Peter,
Daniel C. Alexander,
Caroline Magnain,
Juan Eugenio Iglesias
Abstract:
Landmark correspondences are a widely used type of gold standard in image registration. However, the manual placement of corresponding points is subject to high inter-user variability in the chosen annotated locations and in the interpretation of visual ambiguities. In this paper, we introduce a principled strategy for the construction of a gold standard in deformable registration. Our framework:…
▽ More
Landmark correspondences are a widely used type of gold standard in image registration. However, the manual placement of corresponding points is subject to high inter-user variability in the chosen annotated locations and in the interpretation of visual ambiguities. In this paper, we introduce a principled strategy for the construction of a gold standard in deformable registration. Our framework: (i) iteratively suggests the most informative location to annotate next, taking into account its redundancy with previous annotations; (ii) extends traditional pointwise annotations by accounting for the spatial uncertainty of each annotation, which can either be directly specified by the user, or aggregated from pointwise annotations from multiple experts; and (iii) naturally provides a new strategy for the evaluation of deformable registration algorithms. Our approach is validated on four different registration tasks. The experimental results show the efficacy of suggesting annotations according to their informativeness, and an improved capacity to assess the quality of the outputs of registration algorithms. In addition, our approach yields, from sparse annotations only, a dense visualization of the errors made by a registration method. The source code of our approach supporting both 2D and 3D data is publicly available at https://github.com/LoicPeter/evaluation-deformable-registration.
△ Less
Submitted 2 April, 2021;
originally announced April 2021.
-
Joint super-resolution and synthesis of 1 mm isotropic MP-RAGE volumes from clinical MRI exams with scans of different orientation, resolution and contrast
Authors:
Juan Eugenio Iglesias,
Benjamin Billot,
Yael Balbastre,
Azadeh Tabari,
John Conklin,
Daniel C. Alexander,
Polina Golland,
Brian L. Edlow,
Bruce Fischl
Abstract:
Most existing algorithms for automatic 3D morphometry of human brain MRI scans are designed for data with near-isotropic voxels at approximately 1 mm resolution, and frequently have contrast constraints as well - typically requiring T1 scans (e.g., MP-RAGE). This limitation prevents the analysis of millions of MRI scans acquired with large inter-slice spacing ("thick slice") in clinical settings e…
▽ More
Most existing algorithms for automatic 3D morphometry of human brain MRI scans are designed for data with near-isotropic voxels at approximately 1 mm resolution, and frequently have contrast constraints as well - typically requiring T1 scans (e.g., MP-RAGE). This limitation prevents the analysis of millions of MRI scans acquired with large inter-slice spacing ("thick slice") in clinical settings every year. The inability to quantitatively analyze these scans hinders the adoption of quantitative neuroimaging in healthcare, and precludes research studies that could attain huge sample sizes and hence greatly improve our understanding of the human brain. Recent advances in CNNs are producing outstanding results in super-resolution and contrast synthesis of MRI. However, these approaches are very sensitive to the contrast, resolution and orientation of the input images, and thus do not generalize to diverse clinical acquisition protocols - even within sites. Here we present SynthSR, a method to train a CNN that receives one or more thick-slice scans with different contrast, resolution and orientation, and produces an isotropic scan of canonical contrast (typically a 1 mm MP-RAGE). The presented method does not require any preprocessing, e.g., skull stripping or bias field correction. Crucially, SynthSR trains on synthetic input images generated from 3D segmentations, and can thus be used to train CNNs for any combination of contrasts, resolutions and orientations without high-resolution training data. We test the images generated with SynthSR in an array of common downstream analyses, and show that they can be reliably used for subcortical segmentation and volumetry, image registration (e.g., for tensor-based morphometry), and, if some image quality requirements are met, even cortical thickness morphometry. The source code is publicly available at github.com/BBillot/SynthSR.
△ Less
Submitted 24 December, 2020;
originally announced December 2020.
-
DeepReg: a deep learning toolkit for medical image registration
Authors:
Yunguan Fu,
Nina Montaña Brown,
Shaheer U. Saeed,
Adrià Casamitjana,
Zachary M. C. Baum,
Rémi Delaunay,
Qianye Yang,
Alexander Grimwood,
Zhe Min,
Stefano B. Blumberg,
Juan Eugenio Iglesias,
Dean C. Barratt,
Ester Bonmati,
Daniel C. Alexander,
Matthew J. Clarkson,
Tom Vercauteren,
Yipeng Hu
Abstract:
DeepReg (https://github.com/DeepRegNet/DeepReg) is a community-supported open-source toolkit for research and education in medical image registration using deep learning.
DeepReg (https://github.com/DeepRegNet/DeepReg) is a community-supported open-source toolkit for research and education in medical image registration using deep learning.
△ Less
Submitted 4 November, 2020;
originally announced November 2020.
-
Learning transition times in event sequences: the Event-Based Hidden Markov Model of disease progression
Authors:
Peter A. Wijeratne,
Daniel C. Alexander
Abstract:
Progressive diseases worsen over time and are characterised by monotonic change in features that track disease progression. Here we connect ideas from two formerly separate methodologies -- event-based and hidden Markov modelling -- to derive a new generative model of disease progression. Our model can uniquely infer the most likely group-level sequence and timing of events (natural history) from…
▽ More
Progressive diseases worsen over time and are characterised by monotonic change in features that track disease progression. Here we connect ideas from two formerly separate methodologies -- event-based and hidden Markov modelling -- to derive a new generative model of disease progression. Our model can uniquely infer the most likely group-level sequence and timing of events (natural history) from limited datasets. Moreover, it can infer and predict individual-level trajectories (prognosis) even when data are missing, giving it high clinical utility. Here we derive the model and provide an inference scheme based on the expectation maximisation algorithm. We use clinical, imaging and biofluid data from the Alzheimer's Disease Neuroimaging Initiative to demonstrate the validity and utility of our model. First, we train our model to uncover a new group-level sequence of feature changes in Alzheimer's disease over a period of ${\sim}17.3$ years. Next, we demonstrate that our model provides improved utility over a continuous time hidden Markov model by area under the receiver operator characteristic curve ${\sim}0.23$. Finally, we demonstrate that our model maintains predictive accuracy with up to $50\%$ missing data. These results support the clinical validity of our model and its broader utility in resource-limited medical applications.
△ Less
Submitted 4 June, 2021; v1 submitted 2 November, 2020;
originally announced November 2020.
-
Disentangling Human Error from the Ground Truth in Segmentation of Medical Images
Authors:
Le Zhang,
Ryutaro Tanno,
Mou-Cheng Xu,
Chen Jin,
Joseph Jacob,
Olga Ciccarelli,
Frederik Barkhof,
Daniel C. Alexander
Abstract:
Recent years have seen increasing use of supervised learning methods for segmentation tasks. However, the predictive performance of these algorithms depends on the quality of labels. This problem is particularly pertinent in the medical image domain, where both the annotation cost and inter-observer variability are high. In a typical label acquisition process, different human experts provide their…
▽ More
Recent years have seen increasing use of supervised learning methods for segmentation tasks. However, the predictive performance of these algorithms depends on the quality of labels. This problem is particularly pertinent in the medical image domain, where both the annotation cost and inter-observer variability are high. In a typical label acquisition process, different human experts provide their estimates of the "true" segmentation labels under the influence of their own biases and competence levels. Treating these noisy labels blindly as the ground truth limits the performance that automatic segmentation algorithms can achieve. In this work, we present a method for jointly learning, from purely noisy observations alone, the reliability of individual annotators and the true segmentation label distributions, using two coupled CNNs. The separation of the two is achieved by encouraging the estimated annotators to be maximally unreliable while achieving high fidelity with the noisy training data. We first define a toy segmentation dataset based on MNIST and study the properties of the proposed algorithm. We then demonstrate the utility of the method on three public medical imaging segmentation datasets with simulated (when necessary) and real diverse annotations: 1) MSLSC (multiple-sclerosis lesions); 2) BraTS (brain tumours); 3) LIDC-IDRI (lung abnormalities). In all cases, our method outperforms competing methods and relevant baselines particularly in cases where the number of annotations is small and the amount of disagreement is large. The experiments also show strong ability to capture the complex spatial characteristics of annotators' mistakes.
△ Less
Submitted 23 October, 2020; v1 submitted 31 July, 2020;
originally announced July 2020.
-
Learning To Pay Attention To Mistakes
Authors:
Mou-Cheng Xu,
Neil P. Oxtoby,
Daniel C. Alexander,
Joseph Jacob
Abstract:
In convolutional neural network based medical image segmentation, the periphery of foreground regions representing malignant tissues may be disproportionately assigned as belonging to the background class of healthy tissues \cite{attenUnet}\cite{AttenUnet2018}\cite{InterSeg}\cite{UnetFrontNeuro}\cite{LearnActiveContour}. This leads to high false negative detection rates. In this paper, we propose…
▽ More
In convolutional neural network based medical image segmentation, the periphery of foreground regions representing malignant tissues may be disproportionately assigned as belonging to the background class of healthy tissues \cite{attenUnet}\cite{AttenUnet2018}\cite{InterSeg}\cite{UnetFrontNeuro}\cite{LearnActiveContour}. This leads to high false negative detection rates. In this paper, we propose a novel attention mechanism to directly address such high false negative rates, called Paying Attention to Mistakes. Our attention mechanism steers the models towards false positive identification, which counters the existing bias towards false negatives. The proposed mechanism has two complementary implementations: (a) "explicit" steering of the model to attend to a larger Effective Receptive Field on the foreground areas; (b) "implicit" steering towards false positives, by attending to a smaller Effective Receptive Field on the background areas. We validated our methods on three tasks: 1) binary dense prediction between vehicles and the background using CityScapes; 2) Enhanced Tumour Core segmentation with multi-modal MRI scans in BRATS2018; 3) segmenting stroke lesions using ultrasound images in ISLES2018. We compared our methods with state-of-the-art attention mechanisms in medical imaging, including self-attention, spatial-attention and spatial-channel mixed attention. Across all of the three different tasks, our models consistently outperform the baseline models in Intersection over Union (IoU) and/or Hausdorff Distance (HD). For instance, in the second task, the "explicit" implementation of our mechanism reduces the HD of the best baseline by more than $26\%$, whilst improving the IoU by more than $3\%$. We believe our proposed attention mechanism can benefit a wide range of medical and computer vision tasks, which suffer from over-detection of background.
△ Less
Submitted 7 August, 2020; v1 submitted 29 July, 2020;
originally announced July 2020.
-
Foveation for Segmentation of Ultra-High Resolution Images
Authors:
Chen Jin,
Ryutaro Tanno,
Moucheng Xu,
Thomy Mertzanidou,
Daniel C. Alexander
Abstract:
Segmentation of ultra-high resolution images is challenging because of their enormous size, consisting of millions or even billions of pixels. Typical solutions include dividing input images into patches of fixed size and/or down-sampling to meet memory constraints. Such operations incur information loss in the field-of-view (FoV) i.e., spatial coverage and the image resolution. The impact on segm…
▽ More
Segmentation of ultra-high resolution images is challenging because of their enormous size, consisting of millions or even billions of pixels. Typical solutions include dividing input images into patches of fixed size and/or down-sampling to meet memory constraints. Such operations incur information loss in the field-of-view (FoV) i.e., spatial coverage and the image resolution. The impact on segmentation performance is, however, as yet understudied. In this work, we start with a motivational experiment which demonstrates that the trade-off between FoV and resolution affects the segmentation performance on ultra-high resolution images---and furthermore, its influence also varies spatially according to the local patterns in different areas. We then introduce foveation module, a learnable "dataloader" which, for a given ultra-high resolution image, adaptively chooses the appropriate configuration (FoV/resolution trade-off) of the input patch to feed to the downstream segmentation model at each spatial location of the image. The foveation module is jointly trained with the segmentation network to maximise the task performance. We demonstrate on three publicly available high-resolution image datasets that the foveation module consistently improves segmentation performance over the cases trained with patches of fixed FoV/resolution trade-off. Our approach achieves the SoTA performance on the DeepGlobe aerial image dataset. On the Gleason2019 histopathology dataset, our model achieves better segmentation accuracy for the two most clinically important and ambiguous classes (Gleason Grade 3 and 4) than the top performers in the challenge by 13.1% and 7.5%, and improves on the average performance of 6 human experts by 6.5% and 7.5%. Our code and trained models are available at $\text{https://github.com/lxasqjc/Foveation-Segmentation}$.
△ Less
Submitted 31 July, 2020; v1 submitted 29 July, 2020;
originally announced July 2020.
-
Image Quality Transfer Enhances Contrast and Resolution of Low-Field Brain MRI in African Paediatric Epilepsy Patients
Authors:
Matteo Figini,
Hongxiang Lin,
Godwin Ogbole,
Felice D Arco,
Stefano B. Blumberg,
David W. Carmichael,
Ryutaro Tanno,
Enrico Kaden,
Biobele J. Brown,
Ikeoluwa Lagunju,
Helen J. Cross,
Delmiro Fernandez-Reyes,
Daniel C. Alexander
Abstract:
1.5T or 3T scanners are the current standard for clinical MRI, but low-field (<1T) scanners are still common in many lower- and middle-income countries for reasons of cost and robustness to power failures. Compared to modern high-field scanners, low-field scanners provide images with lower signal-to-noise ratio at equivalent resolution, leaving practitioners to compensate by using large slice thic…
▽ More
1.5T or 3T scanners are the current standard for clinical MRI, but low-field (<1T) scanners are still common in many lower- and middle-income countries for reasons of cost and robustness to power failures. Compared to modern high-field scanners, low-field scanners provide images with lower signal-to-noise ratio at equivalent resolution, leaving practitioners to compensate by using large slice thickness and incomplete spatial coverage. Furthermore, the contrast between different types of brain tissue may be substantially reduced even at equal signal-to-noise ratio, which limits diagnostic value. Recently the paradigm of Image Quality Transfer has been applied to enhance 0.36T structural images aiming to approximate the resolution, spatial coverage, and contrast of typical 1.5T or 3T images. A variant of the neural network U-Net was trained using low-field images simulated from the publicly available 3T Human Connectome Project dataset. Here we present qualitative results from real and simulated clinical low-field brain images showing the potential value of IQT to enhance the clinical utility of readily accessible low-field MRIs in the management of epilepsy.
△ Less
Submitted 18 March, 2020; v1 submitted 16 March, 2020;
originally announced March 2020.