Search | arXiv e-print repository

arXiv:2506.20683 [pdf, ps, other]

Global and Local Contrastive Learning for Joint Representations from Cardiac MRI and ECG

Authors: Alexander Selivanov, Philip Müller, Özgün Turgut, Nil Stolt-Ansó, Daniel Rückert

Abstract: An electrocardiogram (ECG) is a widely used, cost-effective tool for detecting electrical abnormalities in the heart. However, it cannot directly measure functional parameters, such as ventricular volumes and ejection fraction, which are crucial for assessing cardiac function. Cardiac magnetic resonance (CMR) is the gold standard for these measurements, providing detailed structural and functional… ▽ More An electrocardiogram (ECG) is a widely used, cost-effective tool for detecting electrical abnormalities in the heart. However, it cannot directly measure functional parameters, such as ventricular volumes and ejection fraction, which are crucial for assessing cardiac function. Cardiac magnetic resonance (CMR) is the gold standard for these measurements, providing detailed structural and functional insights, but is expensive and less accessible. To bridge this gap, we propose PTACL (Patient and Temporal Alignment Contrastive Learning), a multimodal contrastive learning framework that enhances ECG representations by integrating spatio-temporal information from CMR. PTACL uses global patient-level contrastive loss and local temporal-level contrastive loss. The global loss aligns patient-level representations by pulling ECG and CMR embeddings from the same patient closer together, while pushing apart embeddings from different patients. Local loss enforces fine-grained temporal alignment within each patient by contrasting encoded ECG segments with corresponding encoded CMR frames. This approach enriches ECG representations with diagnostic information beyond electrical activity and transfers more insights between modalities than global alignment alone, all without introducing new learnable weights. We evaluate PTACL on paired ECG-CMR data from 27,951 subjects in the UK Biobank. Compared to baseline approaches, PTACL achieves better performance in two clinically relevant tasks: (1) retrieving patients with similar cardiac phenotypes and (2) predicting CMR-derived cardiac function parameters, such as ventricular volumes and ejection fraction. Our results highlight the potential of PTACL to enhance non-invasive cardiac diagnostics using ECG. The code is available at: https://github.com/alsalivan/ecgcmr △ Less

Submitted 24 June, 2025; originally announced June 2025.

Comments: accepted to MICCAI 2025 (Springer LNCS)

arXiv:2505.23408 [pdf, ps, other]

doi 10.1109/TMI.2025.3570226

Self-supervised feature learning for cardiac Cine MR image reconstruction

Authors: Siying Xu, Marcel Früh, Kerstin Hammernik, Andreas Lingg, Jens Kübler, Patrick Krumm, Daniel Rueckert, Sergios Gatidis, Thomas Küstner

Abstract: We propose a self-supervised feature learning assisted reconstruction (SSFL-Recon) framework for MRI reconstruction to address the limitation of existing supervised learning methods. Although recent deep learning-based methods have shown promising performance in MRI reconstruction, most require fully-sampled images for supervised learning, which is challenging in practice considering long acquisit… ▽ More We propose a self-supervised feature learning assisted reconstruction (SSFL-Recon) framework for MRI reconstruction to address the limitation of existing supervised learning methods. Although recent deep learning-based methods have shown promising performance in MRI reconstruction, most require fully-sampled images for supervised learning, which is challenging in practice considering long acquisition times under respiratory or organ motion. Moreover, nearly all fully-sampled datasets are obtained from conventional reconstruction of mildly accelerated datasets, thus potentially biasing the achievable performance. The numerous undersampled datasets with different accelerations in clinical practice, hence, remain underutilized. To address these issues, we first train a self-supervised feature extractor on undersampled images to learn sampling-insensitive features. The pre-learned features are subsequently embedded in the self-supervised reconstruction network to assist in removing artifacts. Experiments were conducted retrospectively on an in-house 2D cardiac Cine dataset, including 91 cardiovascular patients and 38 healthy subjects. The results demonstrate that the proposed SSFL-Recon framework outperforms existing self-supervised MRI reconstruction methods and even exhibits comparable or better performance to supervised learning up to $16\times$ retrospective undersampling. The feature learning strategy can effectively extract global representations, which have proven beneficial in removing artifacts and increasing generalization ability during reconstruction. △ Less

Submitted 29 May, 2025; originally announced May 2025.

Comments: Accepted to IEEE Transactions on Medical Imaging (TMI), 2025

arXiv:2505.09565 [pdf, other]

Meta-learning Slice-to-Volume Reconstruction in Fetal Brain MRI using Implicit Neural Representations

Authors: Maik Dannecker, Thomas Sanchez, Meritxell Bach Cuadra, Özgün Turgut, Anthony N. Price, Lucilio Cordero-Grande, Vanessa Kyriakopoulou, Joseph V. Hajnal, Daniel Rueckert

Abstract: High-resolution slice-to-volume reconstruction (SVR) from multiple motion-corrupted low-resolution 2D slices constitutes a critical step in image-based diagnostics of moving subjects, such as fetal brain Magnetic Resonance Imaging (MRI). Existing solutions struggle with image artifacts and severe subject motion or require slice pre-alignment to achieve satisfying reconstruction performance. We pro… ▽ More High-resolution slice-to-volume reconstruction (SVR) from multiple motion-corrupted low-resolution 2D slices constitutes a critical step in image-based diagnostics of moving subjects, such as fetal brain Magnetic Resonance Imaging (MRI). Existing solutions struggle with image artifacts and severe subject motion or require slice pre-alignment to achieve satisfying reconstruction performance. We propose a novel SVR method to enable fast and accurate MRI reconstruction even in cases of severe image and motion corruption. Our approach performs motion correction, outlier handling, and super-resolution reconstruction with all operations being entirely based on implicit neural representations. The model can be initialized with task-specific priors through fully self-supervised meta-learning on either simulated or real-world data. In extensive experiments including over 480 reconstructions of simulated and clinical MRI brain data from different centers, we prove the utility of our method in cases of severe subject motion and image artifacts. Our results demonstrate improvements in reconstruction quality, especially in the presence of severe motion, compared to state-of-the-art methods, and up to 50% reduction in reconstruction time. △ Less

Submitted 14 May, 2025; originally announced May 2025.

Comments: 10 pages, 6 figures

arXiv:2504.13037 [pdf, ps, other]

Towards Cardiac MRI Foundation Models: Comprehensive Visual-Tabular Representations for Whole-Heart Assessment and Beyond

Authors: Yundi Zhang, Paul Hager, Che Liu, Suprosanna Shit, Chen Chen, Daniel Rueckert, Jiazhen Pan

Abstract: Cardiac magnetic resonance imaging is the gold standard for non-invasive cardiac assessment, offering rich spatio-temporal views of the cardiac anatomy and physiology. Patient-level health factors, such as demographics, metabolic, and lifestyle, are known to substantially influence cardiovascular health and disease risk, yet remain uncaptured by CMR alone. To holistically understand cardiac health… ▽ More Cardiac magnetic resonance imaging is the gold standard for non-invasive cardiac assessment, offering rich spatio-temporal views of the cardiac anatomy and physiology. Patient-level health factors, such as demographics, metabolic, and lifestyle, are known to substantially influence cardiovascular health and disease risk, yet remain uncaptured by CMR alone. To holistically understand cardiac health and to enable the best possible interpretation of an individual's disease risk, CMR and patient-level factors must be jointly exploited within an integrated framework. Recent multi-modal approaches have begun to bridge this gap, yet they often rely on limited spatio-temporal data and focus on isolated clinical tasks, thereby hindering the development of a comprehensive representation for cardiac health evaluation. To overcome these limitations, we introduce ViTa, a step toward foundation models that delivers a comprehensive representation of the heart and a precise interpretation of individual disease risk. Leveraging data from 42,000 UK Biobank participants, ViTa integrates 3D+T cine stacks from short-axis and long-axis views, enabling a complete capture of the cardiac cycle. These imaging data are then fused with detailed tabular patient-level factors, enabling context-aware insights. This multi-modal paradigm supports a wide spectrum of downstream tasks, including cardiac phenotype and physiological feature prediction, segmentation, and classification of cardiac and metabolic diseases within a single unified framework. By learning a shared latent representation that bridges rich imaging features and patient context, ViTa moves beyond traditional, task-specific models toward a universal, patient-specific understanding of cardiac health, highlighting its potential to advance clinical utility and scalability in cardiac analysis. △ Less

Submitted 2 July, 2025; v1 submitted 17 April, 2025; originally announced April 2025.

arXiv:2503.11846 [pdf, other]

From Pixels to Histopathology: A Graph-Based Framework for Interpretable Whole Slide Image Analysis

Authors: Alexander Weers, Alexander H. Berger, Laurin Lux, Peter Schüffler, Daniel Rueckert, Johannes C. Paetzold

Abstract: The histopathological classification of whole-slide images (WSIs) is a fundamental task in digital pathology; yet it requires extensive time and expertise from specialists. While deep learning methods show promising results, they typically process WSIs by dividing them into artificial patches, which inherently prevents a network from learning from the entire image context, disregards natural tissu… ▽ More The histopathological classification of whole-slide images (WSIs) is a fundamental task in digital pathology; yet it requires extensive time and expertise from specialists. While deep learning methods show promising results, they typically process WSIs by dividing them into artificial patches, which inherently prevents a network from learning from the entire image context, disregards natural tissue structures and compromises interpretability. Our method overcomes this limitation through a novel graph-based framework that constructs WSI graph representations. The WSI-graph efficiently captures essential histopathological information in a compact form. We build tissue representations (nodes) that follow biological boundaries rather than arbitrary patches all while providing interpretable features for explainability. Through adaptive graph coarsening guided by learned embeddings, we progressively merge regions while maintaining discriminative local features and enabling efficient global information exchange. In our method's final step, we solve the diagnostic task through a graph attention network. We empirically demonstrate strong performance on multiple challenging tasks such as cancer stage classification and survival prediction, while also identifying predictive factors using Integrated Gradients. Our implementation is publicly available at https://github.com/HistoGraph31/pix2pathology △ Less

Submitted 14 March, 2025; originally announced March 2025.

Comments: 11 pages, 2 figures

arXiv:2503.00230 [pdf, other]

Physics-Informed Implicit Neural Representations for Joint B0 Estimation and Echo Planar Imaging

Authors: Wenqi Huang, Nan Wang, Congyu Liao, Yimeng Lin, Mengze Gao, Daniel Rueckert, Kawin Setsompop

Abstract: Echo Planar Imaging (EPI) is widely used for its rapid acquisition but suffers from severe geometric distortions due to B0 inhomogeneities, particularly along the phase encoding direction. Existing methods follow a two-step process: reconstructing blip-up/down EPI images, then estimating B0, which can introduce error accumulation and reduce correction accuracy. This is especially problematic in hi… ▽ More Echo Planar Imaging (EPI) is widely used for its rapid acquisition but suffers from severe geometric distortions due to B0 inhomogeneities, particularly along the phase encoding direction. Existing methods follow a two-step process: reconstructing blip-up/down EPI images, then estimating B0, which can introduce error accumulation and reduce correction accuracy. This is especially problematic in high B0 regions, where distortions align along the same axis, making them harder to disentangle. In this work, we propose a novel approach that integrates Implicit Neural Representations (INRs) with a physics-informed correction model to jointly estimate B0 inhomogeneities and reconstruct distortion-free images from rotated-view EPI acquisitions. INRs offer a flexible, continuous representation that inherently captures complex spatial variations without requiring predefined grid-based field maps. By leveraging this property, our method dynamically adapts to subject-specific B0 variations and improves robustness across different imaging conditions. Experimental results on 180 slices of brain images from three subjects demonstrate that our approach outperforms traditional methods in terms of reconstruction quality and field estimation accuracy. △ Less

Submitted 27 March, 2025; v1 submitted 28 February, 2025; originally announced March 2025.

arXiv:2501.16471 [pdf, other]

SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments

Authors: Simon Dahan, Gabriel Bénédict, Logan Z. J. Williams, Yourong Guo, Daniel Rueckert, Robert Leech, Emma C. Robinson

Abstract: Current AI frameworks for brain decoding and encoding, typically train and test models within the same datasets. This limits their utility for brain computer interfaces (BCI) or neurofeedback, for which it would be useful to pool experiences across individuals to better simulate stimuli not sampled during training. A key obstacle to model generalisation is the degree of variability of inter-subjec… ▽ More Current AI frameworks for brain decoding and encoding, typically train and test models within the same datasets. This limits their utility for brain computer interfaces (BCI) or neurofeedback, for which it would be useful to pool experiences across individuals to better simulate stimuli not sampled during training. A key obstacle to model generalisation is the degree of variability of inter-subject cortical organisation, which makes it difficult to align or compare cortical signals across participants. In this paper we address this through the use of surface vision transformers, which build a generalisable model of cortical functional dynamics, through encoding the topography of cortical networks and their interactions as a moving image across a surface. This is then combined with tri-modal self-supervised contrastive (CLIP) alignment of audio, video, and fMRI modalities to enable the retrieval of visual and auditory stimuli from patterns of cortical activity (and vice-versa). We validate our approach on 7T task-fMRI data from 174 healthy participants engaged in the movie-watching experiment from the Human Connectome Project (HCP). Results show that it is possible to detect which movie clips an individual is watching purely from their brain activity, even for individuals and movies not seen during training. Further analysis of attention maps reveals that our model captures individual patterns of brain activity that reflect semantic and visual systems. This opens the door to future personalised simulations of brain function. Code & pre-trained models will be made available at https://github.com/metrics-lab/sim, processed data for training will be available upon request at https://gin.g-node.org/Sdahan30/sim. △ Less

Submitted 27 January, 2025; originally announced January 2025.

Comments: 27 pages, accepted to ICLR 2025

arXiv:2501.10757 [pdf, other]

Deformable Image Registration of Dark-Field Chest Radiographs for Local Lung Signal Change Assessment

Authors: Fabian Drexel, Vasiliki Sideri-Lampretsa, Henriette Bast, Alexander W. Marka, Thomas Koehler, Florian T. Gassert, Daniela Pfeiffer, Daniel Rueckert, Franz Pfeiffer

Abstract: Dark-field radiography of the human chest has been demonstrated to have promising potential for the analysis of the lung microstructure and the diagnosis of respiratory diseases. However, previous studies of dark-field chest radiographs evaluated the lung signal only in the inspiratory breathing state. Our work aims to add a new perspective to these previous assessments by locally comparing dark-f… ▽ More Dark-field radiography of the human chest has been demonstrated to have promising potential for the analysis of the lung microstructure and the diagnosis of respiratory diseases. However, previous studies of dark-field chest radiographs evaluated the lung signal only in the inspiratory breathing state. Our work aims to add a new perspective to these previous assessments by locally comparing dark-field lung information between different respiratory states. To this end, we discuss suitable image registration methods for dark-field chest radiographs to enable consistent spatial alignment of the lung in distinct breathing states. Utilizing full inspiration and expiration scans from a clinical chronic obstructive pulmonary disease study, we assess the performance of the proposed registration framework and outline applicable evaluation approaches. Our regional characterization of lung dark-field signal changes between the breathing states provides a proof-of-principle that dynamic radiography-based lung function assessment approaches may benefit from considering registered dark-field images in addition to standard plain chest radiographs. △ Less

Submitted 18 January, 2025; originally announced January 2025.

Comments: 10 pages, 6 figures

arXiv:2501.09403 [pdf, other]

PISCO: Self-Supervised k-Space Regularization for Improved Neural Implicit k-Space Representations of Dynamic MRI

Authors: Veronika Spieker, Hannah Eichhorn, Wenqi Huang, Jonathan K. Stelter, Tabita Catalan, Rickmer F. Braren, Daniel Rueckert, Francisco Sahli Costabal, Kerstin Hammernik, Dimitrios C. Karampinos, Claudia Prieto, Julia A. Schnabel

Abstract: Neural implicit k-space representations (NIK) have shown promising results for dynamic magnetic resonance imaging (MRI) at high temporal resolutions. Yet, reducing acquisition time, and thereby available training data, results in severe performance drops due to overfitting. To address this, we introduce a novel self-supervised k-space loss function $\mathcal{L}_\mathrm{PISCO}$, applicable for regu… ▽ More Neural implicit k-space representations (NIK) have shown promising results for dynamic magnetic resonance imaging (MRI) at high temporal resolutions. Yet, reducing acquisition time, and thereby available training data, results in severe performance drops due to overfitting. To address this, we introduce a novel self-supervised k-space loss function $\mathcal{L}_\mathrm{PISCO}$, applicable for regularization of NIK-based reconstructions. The proposed loss function is based on the concept of parallel imaging-inspired self-consistency (PISCO), enforcing a consistent global k-space neighborhood relationship without requiring additional data. Quantitative and qualitative evaluations on static and dynamic MR reconstructions show that integrating PISCO significantly improves NIK representations. Particularly for high acceleration factors (R$\geq$54), NIK with PISCO achieves superior spatio-temporal reconstruction quality compared to state-of-the-art methods. Furthermore, an extensive analysis of the loss assumptions and stability shows PISCO's potential as versatile self-supervised k-space loss function for further applications and architectures. Code is available at: https://github.com/compai-lab/2025-pisco-spieker △ Less

Submitted 16 January, 2025; originally announced January 2025.

arXiv:2501.08667 [pdf, ps, other]

TimeFlow: Longitudinal Brain Image Registration and Aging Progression Analysis

Authors: Bailiang Jian, Jiazhen Pan, Yitong Li, Fabian Bongratz, Ruochen Li, Daniel Rueckert, Benedikt Wiestler, Christian Wachinger

Abstract: Predicting future brain states is crucial for understanding healthy aging and neurodegenerative diseases. Longitudinal brain MRI registration, a cornerstone for such analyses, has long been limited by its inability to forecast future developments, reliance on extensive dense longitudinal data, and the need to balance registration accuracy with temporal smoothness. In this work, we present \emph{Ti… ▽ More Predicting future brain states is crucial for understanding healthy aging and neurodegenerative diseases. Longitudinal brain MRI registration, a cornerstone for such analyses, has long been limited by its inability to forecast future developments, reliance on extensive dense longitudinal data, and the need to balance registration accuracy with temporal smoothness. In this work, we present \emph{TimeFlow}, a novel framework for longitudinal brain MRI registration that overcomes all these challenges. TimeFlow leverages a U-Net architecture with temporal conditioning inspired by diffusion models, enabling accurate registration using only two images as input and facilitating prospective analyses through future image prediction. Unlike traditional methods, TimeFlow eliminates the demand for explicit smoothness regularizers and dense sequential data while maintaining temporal consistency and continuity. Experimental results highlight its superior performance in both future timepoint prediction and registration accuracy compared to state-of-the-art methods. Additionally, TimeFlow supports novel biological brain aging analyses, effectively differentiating neurodegenerative conditions from healthy aging, all without requiring segmentation, thus avoiding non-trivial annotation and inconsistent segmentation flaws. This framework paves the way for accurate, data-efficient, and annotation-free prospective analyses of brain aging and chronic diseases. △ Less

Submitted 7 July, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

arXiv:2412.15740 [pdf, other]

From Model Based to Learned Regularization in Medical Image Registration: A Comprehensive Review

Authors: Anna Reithmeir, Veronika Spieker, Vasiliki Sideri-Lampretsa, Daniel Rueckert, Julia A. Schnabel, Veronika A. Zimmer

Abstract: Image registration is fundamental in medical imaging applications, such as disease progression analysis or radiation therapy planning. The primary objective of image registration is to precisely capture the deformation between two or more images, typically achieved by minimizing an optimization problem. Due to its inherent ill-posedness, regularization is a key component in driving the solution to… ▽ More Image registration is fundamental in medical imaging applications, such as disease progression analysis or radiation therapy planning. The primary objective of image registration is to precisely capture the deformation between two or more images, typically achieved by minimizing an optimization problem. Due to its inherent ill-posedness, regularization is a key component in driving the solution toward anatomically meaningful deformations. A wide range of regularization methods has been proposed for both conventional and deep learning-based registration. However, the appropriate application of regularization techniques often depends on the specific registration problem, and no one-fits-all method exists. Despite its importance, regularization is often overlooked or addressed with default approaches, assuming existing methods are sufficient. A comprehensive and structured review remains missing. This review addresses this gap by introducing a novel taxonomy that systematically categorizes the diverse range of proposed regularization methods. It highlights the emerging field of learned regularization, which leverages data-driven techniques to automatically derive deformation properties from the data. Moreover, this review examines the transfer of regularization methods from conventional to learning-based registration, identifies open challenges, and outlines future research directions. By emphasizing the critical role of regularization in image registration, we hope to inspire the research community to reconsider regularization strategies in modern registration algorithms and to explore this rapidly evolving field further. △ Less

Submitted 20 December, 2024; originally announced December 2024.

Comments: Submitted to Medical Image Analysis

arXiv:2412.15342 [pdf, other]

DCRA-Net: Attention-Enabled Reconstruction Model for Dynamic Fetal Cardiac MRI

Authors: Denis Prokopenko, David F. A. Lloyd, Amedeo Chiribiri, Daniel Rueckert, Joseph V. Hajnal

Abstract: Dynamic fetal heart magnetic resonance imaging (MRI) presents unique challenges due to the fast heart rate of the fetus compared to adult subjects and uncontrolled fetal motion. This requires high temporal and spatial resolutions over a large field of view, in order to encompass surrounding maternal anatomy. In this work, we introduce Dynamic Cardiac Reconstruction Attention Network (DCRA-Net) - a… ▽ More Dynamic fetal heart magnetic resonance imaging (MRI) presents unique challenges due to the fast heart rate of the fetus compared to adult subjects and uncontrolled fetal motion. This requires high temporal and spatial resolutions over a large field of view, in order to encompass surrounding maternal anatomy. In this work, we introduce Dynamic Cardiac Reconstruction Attention Network (DCRA-Net) - a novel deep learning model that employs attention mechanisms in spatial and temporal domains and temporal frequency representation of data to reconstruct the dynamics of the fetal heart from highly accelerated free-running (non-gated) MRI acquisitions. DCRA-Net was trained on retrospectively undersampled complex-valued cardiac MRIs from 42 fetal subjects and separately from 153 adult subjects, and evaluated on data from 14 fetal and 39 adult subjects respectively. Its performance was compared to L+S and k-GIN methods in both fetal and adult cases for an undersampling factor of 8x. The proposed network performed better than the comparators for both fetal and adult data, for both regular lattice and centrally weighted random undersampling. Aliased signals due to the undersampling were comprehensively resolved, and both the spatial details of the heart and its temporal dynamics were recovered with high fidelity. The highest performance was achieved when using lattice undersampling, data consistency and temporal frequency representation, yielding PSNR of 38 for fetal and 35 for adult cases. Our method is publicly available at https://github.com/denproc/DCRA-Net. △ Less

Submitted 19 December, 2024; originally announced December 2024.

arXiv:2412.12742 [pdf, other]

Subspace Implicit Neural Representations for Real-Time Cardiac Cine MR Imaging

Authors: Wenqi Huang, Veronika Spieker, Siying Xu, Gastao Cruz, Claudia Prieto, Julia Schnabel, Kerstin Hammernik, Thomas Kuestner, Daniel Rueckert

Abstract: Conventional cardiac cine MRI methods rely on retrospective gating, which limits temporal resolution and the ability to capture continuous cardiac dynamics, particularly in patients with arrhythmias and beat-to-beat variations. To address these challenges, we propose a reconstruction framework based on subspace implicit neural representations for real-time cardiac cine MRI of continuously sampled… ▽ More Conventional cardiac cine MRI methods rely on retrospective gating, which limits temporal resolution and the ability to capture continuous cardiac dynamics, particularly in patients with arrhythmias and beat-to-beat variations. To address these challenges, we propose a reconstruction framework based on subspace implicit neural representations for real-time cardiac cine MRI of continuously sampled radial data. This approach employs two multilayer perceptrons to learn spatial and temporal subspace bases, leveraging the low-rank properties of cardiac cine MRI. Initialized with low-resolution reconstructions, the networks are fine-tuned using spoke-specific loss functions to recover spatial details and temporal fidelity. Our method directly utilizes the continuously sampled radial k-space spokes during training, thereby eliminating the need for binning and non-uniform FFT. This approach achieves superior spatial and temporal image quality compared to conventional binned methods at the acceleration rate of 10 and 20, demonstrating potential for high-resolution imaging of dynamic cardiac events and enhancing diagnostic capability. △ Less

Submitted 17 December, 2024; originally announced December 2024.

arXiv:2410.18834 [pdf, ps, other]

Highly efficient non-rigid registration in k-space with application to cardiac Magnetic Resonance Imaging

Authors: Aya Ghoul, Kerstin Hammernik, Andreas Lingg, Patrick Krumm, Daniel Rueckert, Sergios Gatidis, Thomas Küstner

Abstract: In Magnetic Resonance Imaging (MRI), high temporal-resolved motion can be useful for image acquisition and reconstruction, MR-guided radiotherapy, dynamic contrast-enhancement, flow and perfusion imaging, and functional assessment of motion patterns in cardiovascular, abdominal, peristaltic, fetal, or musculoskeletal imaging. Conventionally, these motion estimates are derived through image-based r… ▽ More In Magnetic Resonance Imaging (MRI), high temporal-resolved motion can be useful for image acquisition and reconstruction, MR-guided radiotherapy, dynamic contrast-enhancement, flow and perfusion imaging, and functional assessment of motion patterns in cardiovascular, abdominal, peristaltic, fetal, or musculoskeletal imaging. Conventionally, these motion estimates are derived through image-based registration, a particularly challenging task for complex motion patterns and high dynamic resolution. The accelerated scans in such applications result in imaging artifacts that compromise the motion estimation. In this work, we propose a novel self-supervised deep learning-based framework, dubbed the Local-All Pass Attention Network (LAPANet), for non-rigid motion estimation directly from the acquired accelerated Fourier space, i.e. k-space. The proposed approach models non-rigid motion as the cumulative sum of local translational displacements, following the Local All-Pass (LAP) registration technique. LAPANet was evaluated on cardiac motion estimation across various sampling trajectories and acceleration rates. Our results demonstrate superior accuracy compared to prior conventional and deep learning-based registration methods, accommodating as few as 2 lines/frame in a Cartesian trajectory and 3 spokes/frame in a non-Cartesian trajectory. The achieved high temporal resolution (less than 5 ms) for non-rigid motion opens new avenues for motion detection, tracking and correction in dynamic and real-time MRI applications. △ Less

Submitted 24 October, 2024; originally announced October 2024.

arXiv:2410.01665 [pdf]

Towards a vision foundation model for comprehensive assessment of Cardiac MRI

Authors: Athira J Jacob, Indraneel Borgohain, Teodora Chitiboi, Puneet Sharma, Dorin Comaniciu, Daniel Rueckert

Abstract: Cardiac magnetic resonance imaging (CMR), considered the gold standard for noninvasive cardiac assessment, is a diverse and complex modality requiring a wide variety of image processing tasks for comprehensive assessment of cardiac morphology and function. Advances in deep learning have enabled the development of state-of-the-art (SoTA) models for these tasks. However, model training is challengin… ▽ More Cardiac magnetic resonance imaging (CMR), considered the gold standard for noninvasive cardiac assessment, is a diverse and complex modality requiring a wide variety of image processing tasks for comprehensive assessment of cardiac morphology and function. Advances in deep learning have enabled the development of state-of-the-art (SoTA) models for these tasks. However, model training is challenging due to data and label scarcity, especially in the less common imaging sequences. Moreover, each model is often trained for a specific task, with no connection between related tasks. In this work, we introduce a vision foundation model trained for CMR assessment, that is trained in a self-supervised fashion on 36 million CMR images. We then finetune the model in supervised way for 9 clinical tasks typical to a CMR workflow, across classification, segmentation, landmark localization, and pathology detection. We demonstrate improved accuracy and robustness across all tasks, over a range of available labeled dataset sizes. We also demonstrate improved few-shot learning with fewer labeled samples, a common challenge in medical image analyses. We achieve an out-of-box performance comparable to SoTA for most clinical tasks. The proposed method thus presents a resource-efficient, unified framework for CMR assessment, with the potential to accelerate the development of deep learning-based solutions for image analysis tasks, even with few annotated data available. △ Less

Submitted 6 October, 2024; v1 submitted 2 October, 2024; originally announced October 2024.

Comments: 11 pages, 3 figures, 4 tables

arXiv:2409.09796 [pdf, other]

Universal Topology Refinement for Medical Image Segmentation with Polynomial Feature Synthesis

Authors: Liu Li, Hanchun Wang, Matthew Baugh, Qiang Ma, Weitong Zhang, Cheng Ouyang, Daniel Rueckert, Bernhard Kainz

Abstract: Although existing medical image segmentation methods provide impressive pixel-wise accuracy, they often neglect topological correctness, making their segmentations unusable for many downstream tasks. One option is to retrain such models whilst including a topology-driven loss component. However, this is computationally expensive and often impractical. A better solution would be to have a versatile… ▽ More Although existing medical image segmentation methods provide impressive pixel-wise accuracy, they often neglect topological correctness, making their segmentations unusable for many downstream tasks. One option is to retrain such models whilst including a topology-driven loss component. However, this is computationally expensive and often impractical. A better solution would be to have a versatile plug-and-play topology refinement method that is compatible with any domain-specific segmentation pipeline. Directly training a post-processing model to mitigate topological errors often fails as such models tend to be biased towards the topological errors of a target segmentation network. The diversity of these errors is confined to the information provided by a labelled training set, which is especially problematic for small datasets. Our method solves this problem by training a model-agnostic topology refinement network with synthetic segmentations that cover a wide variety of topological errors. Inspired by the Stone-Weierstrass theorem, we synthesize topology-perturbation masks with randomly sampled coefficients of orthogonal polynomial bases, which ensures a complete and unbiased representation. Practically, we verified the efficiency and effectiveness of our methods as being compatible with multiple families of polynomial bases, and show evidence that our universal plug-and-play topology refinement network outperforms both existing topology-driven learning-based and post-processing methods. We also show that combining our method with learning-based models provides an effortless add-on, which can further improve the performance of existing approaches. △ Less

Submitted 15 September, 2024; originally announced September 2024.

Comments: Accepted by the 27th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2024)

arXiv:2409.09387 [pdf, other]

Estimating Neural Orientation Distribution Fields on High Resolution Diffusion MRI Scans

Authors: Mohammed Munzer Dwedari, William Consagra, Philip Müller, Özgün Turgut, Daniel Rueckert, Yogesh Rathi

Abstract: The Orientation Distribution Function (ODF) characterizes key brain microstructural properties and plays an important role in understanding brain structural connectivity. Recent works introduced Implicit Neural Representation (INR) based approaches to form a spatially aware continuous estimate of the ODF field and demonstrated promising results in key tasks of interest when compared to conventiona… ▽ More The Orientation Distribution Function (ODF) characterizes key brain microstructural properties and plays an important role in understanding brain structural connectivity. Recent works introduced Implicit Neural Representation (INR) based approaches to form a spatially aware continuous estimate of the ODF field and demonstrated promising results in key tasks of interest when compared to conventional discrete approaches. However, traditional INR methods face difficulties when scaling to large-scale images, such as modern ultra-high-resolution MRI scans, posing challenges in learning fine structures as well as inefficiencies in training and inference speed. In this work, we propose HashEnc, a grid-hash-encoding-based estimation of the ODF field and demonstrate its effectiveness in retaining structural and textural features. We show that HashEnc achieves a 10% enhancement in image quality while requiring 3x less computational resources than current methods. Our code can be found at https://github.com/MunzerDw/NODF-HashEnc. △ Less

Submitted 14 September, 2024; originally announced September 2024.

Comments: 16 pages, 8 figures, conference: Medical Image Computing and Computer-Assisted Intervention (MICCAI)

arXiv:2407.20108 [pdf, other]

Classification, Regression and Segmentation directly from k-Space in Cardiac MRI

Authors: Ruochen Li, Jiazhen Pan, Youxiang Zhu, Juncheng Ni, Daniel Rueckert

Abstract: Cardiac Magnetic Resonance Imaging (CMR) is the gold standard for diagnosing cardiovascular diseases. Clinical diagnoses predominantly rely on magnitude-only Digital Imaging and Communications in Medicine (DICOM) images, omitting crucial phase information that might provide additional diagnostic benefits. In contrast, k-space is complex-valued and encompasses both magnitude and phase information,… ▽ More Cardiac Magnetic Resonance Imaging (CMR) is the gold standard for diagnosing cardiovascular diseases. Clinical diagnoses predominantly rely on magnitude-only Digital Imaging and Communications in Medicine (DICOM) images, omitting crucial phase information that might provide additional diagnostic benefits. In contrast, k-space is complex-valued and encompasses both magnitude and phase information, while humans cannot directly perceive. In this work, we propose KMAE, a Transformer-based model specifically designed to process k-space data directly, eliminating conventional intermediary conversion steps to the image domain. KMAE can handle critical cardiac disease classification, relevant phenotype regression, and cardiac morphology segmentation tasks. We utilize this model to investigate the potential of k-space-based diagnosis in cardiac MRI. Notably, this model achieves competitive classification and regression performance compared to image-domain methods e.g. Masked Autoencoders (MAEs) and delivers satisfactory segmentation performance with a myocardium dice score of 0.884. Last but not least, our model exhibits robust performance with consistent results even when the k-space is 8* undersampled. We encourage the MR community to explore the untapped potential of k-space and pursue end-to-end, automated diagnosis with reduced human intervention. △ Less

Submitted 29 July, 2024; originally announced July 2024.

arXiv:2407.19274 [pdf, other]

doi 10.1007/978-3-031-73480-9_7

Mamba? Catch The Hype Or Rethink What Really Helps for Image Registration

Authors: Bailiang Jian, Jiazhen Pan, Morteza Ghahremani, Daniel Rueckert, Christian Wachinger, Benedikt Wiestler

Abstract: Our findings indicate that adopting "advanced" computational elements fails to significantly improve registration accuracy. Instead, well-established registration-specific designs offer fair improvements, enhancing results by a marginal 1.5\% over the baseline. Our findings emphasize the importance of rigorous, unbiased evaluation and contribution disentanglement of all low- and high-level registr… ▽ More Our findings indicate that adopting "advanced" computational elements fails to significantly improve registration accuracy. Instead, well-established registration-specific designs offer fair improvements, enhancing results by a marginal 1.5\% over the baseline. Our findings emphasize the importance of rigorous, unbiased evaluation and contribution disentanglement of all low- and high-level registration components, rather than simply following the computer vision trends with "more advanced" computational blocks. We advocate for simpler yet effective solutions and novel evaluation metrics that go beyond conventional registration accuracy, warranting further research across diverse organs and modalities. The code is available at \url{https://github.com/BailiangJ/rethink-reg}. △ Less

Submitted 27 July, 2024; originally announced July 2024.

Comments: WBIR 2024 Workshop on Biomedical Imaging Registration

arXiv:2407.03863 [pdf, other]

Unsupervised Analysis of Alzheimer's Disease Signatures using 3D Deformable Autoencoders

Authors: Mehmet Yigit Avci, Emily Chan, Veronika Zimmer, Daniel Rueckert, Benedikt Wiestler, Julia A. Schnabel, Cosmin I. Bercea

Abstract: With the increasing incidence of neurodegenerative diseases such as Alzheimer's Disease (AD), there is a need for further research that enhances detection and monitoring of the diseases. We present MORPHADE (Morphological Autoencoders for Alzheimer's Disease Detection), a novel unsupervised learning approach which uses deformations to allow the analysis of 3D T1-weighted brain images. To the best… ▽ More With the increasing incidence of neurodegenerative diseases such as Alzheimer's Disease (AD), there is a need for further research that enhances detection and monitoring of the diseases. We present MORPHADE (Morphological Autoencoders for Alzheimer's Disease Detection), a novel unsupervised learning approach which uses deformations to allow the analysis of 3D T1-weighted brain images. To the best of our knowledge, this is the first use of deformations with deep unsupervised learning to not only detect, but also localize and assess the severity of structural changes in the brain due to AD. We obtain markedly higher anomaly scores in clinically important areas of the brain in subjects with AD compared to healthy controls, showcasing that our method is able to effectively locate AD-related atrophy. We additionally observe a visual correlation between the severity of atrophy highlighted in our anomaly maps and medial temporal lobe atrophy scores evaluated by a clinical expert. Finally, our method achieves an AUROC of 0.80 in detecting AD, out-performing several supervised and unsupervised baselines. We believe our framework shows promise as a tool towards improved understanding, monitoring and detection of AD. To support further research and application, we have made our code publicly available at github.com/ci-ber/MORPHADE. △ Less

Submitted 4 July, 2024; originally announced July 2024.

Comments: 11 pages, 5 figures

arXiv:2407.03034 [pdf, ps, other]

Attention Incorporated Network for Sharing Low-rank, Image and K-space Information during MR Image Reconstruction to Achieve Single Breath-hold Cardiac Cine Imaging

Authors: Siying Xu, Kerstin Hammernik, Andreas Lingg, Jens Kuebler, Patrick Krumm, Daniel Rueckert, Sergios Gatidis, Thomas Kuestner

Abstract: Cardiac Cine Magnetic Resonance Imaging (MRI) provides an accurate assessment of heart morphology and function in clinical practice. However, MRI requires long acquisition times, with recent deep learning-based methods showing great promise to accelerate imaging and enhance reconstruction quality. Existing networks exhibit some common limitations that constrain further acceleration possibilities,… ▽ More Cardiac Cine Magnetic Resonance Imaging (MRI) provides an accurate assessment of heart morphology and function in clinical practice. However, MRI requires long acquisition times, with recent deep learning-based methods showing great promise to accelerate imaging and enhance reconstruction quality. Existing networks exhibit some common limitations that constrain further acceleration possibilities, including single-domain learning, reliance on a single regularization term, and equal feature contribution. To address these limitations, we propose to embed information from multiple domains, including low-rank, image, and k-space, in a novel deep learning network for MRI reconstruction, which we denote as A-LIKNet. A-LIKNet adopts a parallel-branch structure, enabling independent learning in the k-space and image domain. Coupled information sharing layers realize the information exchange between domains. Furthermore, we introduce attention mechanisms into the network to assign greater weights to more critical coils or important temporal frames. Training and testing were conducted on an in-house dataset, including 91 cardiovascular patients and 38 healthy subjects scanned with 2D cardiac Cine using retrospective undersampling. Additionally, we evaluated A-LIKNet on the real-time 8x prospectively undersampled data from the OCMR dataset. The results demonstrate that our proposed A-LIKNet outperforms existing methods and provides high-quality reconstructions. The network can effectively reconstruct highly retrospectively undersampled dynamic MR images up to 24x accelerations, indicating its potential for single breath-hold imaging. △ Less

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.00186 [pdf]

DCSM 2.0: Deep Conditional Shape Models for Data Efficient Segmentation

Authors: Athira J Jacob, Puneet Sharma, Daniel Rueckert

Abstract: Segmentation is often the first step in many medical image analyses workflows. Deep learning approaches, while giving state-of-the-art accuracies, are data intensive and do not scale well to low data regimes. We introduce Deep Conditional Shape Models 2.0, which uses an edge detector, along with an implicit shape function conditioned on edge maps, to leverage cross-modality shape information. The… ▽ More Segmentation is often the first step in many medical image analyses workflows. Deep learning approaches, while giving state-of-the-art accuracies, are data intensive and do not scale well to low data regimes. We introduce Deep Conditional Shape Models 2.0, which uses an edge detector, along with an implicit shape function conditioned on edge maps, to leverage cross-modality shape information. The shape function is trained exclusively on a source domain (contrasted CT) and applied to the target domain of interest (3D echocardiography). We demonstrate data efficiency in the target domain by varying the amounts of training data used in the edge detection stage. We observe that DCSM 2.0 outperforms the baseline at all data levels in terms of Hausdorff distances, and while using 50% or less of the training data in terms of average mesh distance, and at 10% or less of the data with the dice coefficient. The method scales well to low data regimes, with gains of up to 5% in dice coefficient, 2.58 mm in average surface distance and 21.02 mm in Hausdorff distance when using just 2% (22 volumes) of the training data. △ Less

Submitted 28 June, 2024; originally announced July 2024.

Comments: Best oral paper award at ISBI 2024

arXiv:2406.12650 [pdf, other]

Weakly Supervised Learning of Cortical Surface Reconstruction from Segmentations

Authors: Qiang Ma, Liu Li, Emma C. Robinson, Bernhard Kainz, Daniel Rueckert

Abstract: Existing learning-based cortical surface reconstruction approaches heavily rely on the supervision of pseudo ground truth (pGT) cortical surfaces for training. Such pGT surfaces are generated by traditional neuroimage processing pipelines, which are time consuming and difficult to generalize well to low-resolution brain MRI, e.g., from fetuses and neonates. In this work, we present CoSeg, a learni… ▽ More Existing learning-based cortical surface reconstruction approaches heavily rely on the supervision of pseudo ground truth (pGT) cortical surfaces for training. Such pGT surfaces are generated by traditional neuroimage processing pipelines, which are time consuming and difficult to generalize well to low-resolution brain MRI, e.g., from fetuses and neonates. In this work, we present CoSeg, a learning-based cortical surface reconstruction framework weakly supervised by brain segmentations without the need for pGT surfaces. CoSeg introduces temporal attention networks to learn time-varying velocity fields from brain MRI for diffeomorphic surface deformations, which fit an initial surface to target cortical surfaces within only 0.11 seconds for each brain hemisphere. A weakly supervised loss is designed to reconstruct pial surfaces by inflating the white surface along the normal direction towards the boundary of the cortical gray matter segmentation. This alleviates partial volume effects and encourages the pial surface to deform into deep and challenging cortical sulci. We evaluate CoSeg on 1,113 adult brain MRI at 1mm and 2mm resolution. CoSeg achieves superior geometric and morphological accuracy compared to existing learning-based approaches. We also verify that CoSeg can extract high-quality cortical surfaces from fetal brain MRI on which traditional pipelines fail to produce acceptable results. △ Less

Submitted 18 June, 2024; originally announced June 2024.

Comments: Accepted by the 27th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2024)

arXiv:2406.06537 [pdf, other]

Interactive Generation of Laparoscopic Videos with Diffusion Models

Authors: Ivan Iliash, Simeon Allmendinger, Felix Meissen, Niklas Kühl, Daniel Rückert

Abstract: Generative AI, in general, and synthetic visual data generation, in specific, hold much promise for benefiting surgical training by providing photorealism to simulation environments. Current training methods primarily rely on reading materials and observing live surgeries, which can be time-consuming and impractical. In this work, we take a significant step towards improving the training process.… ▽ More Generative AI, in general, and synthetic visual data generation, in specific, hold much promise for benefiting surgical training by providing photorealism to simulation environments. Current training methods primarily rely on reading materials and observing live surgeries, which can be time-consuming and impractical. In this work, we take a significant step towards improving the training process. Specifically, we use diffusion models in combination with a zero-shot video diffusion method to interactively generate realistic laparoscopic images and videos by specifying a surgical action through text and guiding the generation with tool positions through segmentation masks. We demonstrate the performance of our approach using the publicly available Cholec dataset family and evaluate the fidelity and factual correctness of our generated images using a surgical action recognition model as well as the pixel-wise F1-score for the spatial control of tool generation. We achieve an FID of 38.097 and an F1-score of 0.71. △ Less

Submitted 23 April, 2024; originally announced June 2024.

Comments: 7 pages, 4 figures

arXiv:2406.04769 [pdf, other]

Diffusion-based Generative Image Outpainting for Recovery of FOV-Truncated CT Images

Authors: Michelle Espranita Liman, Daniel Rueckert, Florian J. Fintelmann, Philip Müller

Abstract: Field-of-view (FOV) recovery of truncated chest CT scans is crucial for accurate body composition analysis, which involves quantifying skeletal muscle and subcutaneous adipose tissue (SAT) on CT slices. This, in turn, enables disease prognostication. Here, we present a method for recovering truncated CT slices using generative image outpainting. We train a diffusion model and apply it to truncated… ▽ More Field-of-view (FOV) recovery of truncated chest CT scans is crucial for accurate body composition analysis, which involves quantifying skeletal muscle and subcutaneous adipose tissue (SAT) on CT slices. This, in turn, enables disease prognostication. Here, we present a method for recovering truncated CT slices using generative image outpainting. We train a diffusion model and apply it to truncated CT slices generated by simulating a small FOV. Our model reliably recovers the truncated anatomy and outperforms the previous state-of-the-art despite being trained on 87% less data. △ Less

Submitted 26 September, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

Comments: Shared last authorship: Florian J. Fintelmann and Philip Müller

arXiv:2406.00329 [pdf, other]

doi 10.1007/978-3-031-72378-0_34

Whole Heart 3D+T Representation Learning Through Sparse 2D Cardiac MR Images

Authors: Yundi Zhang, Chen Chen, Suprosanna Shit, Sophie Starck, Daniel Rueckert, Jiazhen Pan

Abstract: Cardiac Magnetic Resonance (CMR) imaging serves as the gold-standard for evaluating cardiac morphology and function. Typically, a multi-view CMR stack, covering short-axis (SA) and 2/3/4-chamber long-axis (LA) views, is acquired for a thorough cardiac assessment. However, efficiently streamlining the complex, high-dimensional 3D+T CMR data and distilling compact, coherent representation remains a… ▽ More Cardiac Magnetic Resonance (CMR) imaging serves as the gold-standard for evaluating cardiac morphology and function. Typically, a multi-view CMR stack, covering short-axis (SA) and 2/3/4-chamber long-axis (LA) views, is acquired for a thorough cardiac assessment. However, efficiently streamlining the complex, high-dimensional 3D+T CMR data and distilling compact, coherent representation remains a challenge. In this work, we introduce a whole-heart self-supervised learning framework that utilizes masked imaging modeling to automatically uncover the correlations between spatial and temporal patches throughout the cardiac stacks. This process facilitates the generation of meaningful and well-clustered heart representations without relying on the traditionally required, and often costly, labeled data. The learned heart representation can be directly used for various downstream tasks. Furthermore, our method demonstrates remarkable robustness, ensuring consistent representations even when certain CMR planes are missing/flawed. We train our model on 14,000 unlabeled CMR data from UK BioBank and evaluate it on 1,000 annotated data. The proposed method demonstrates superior performance to baselines in tasks that demand comprehensive 3D+T cardiac information, e.g. cardiac phenotype (ejection fraction and ventricle volume) prediction and multi-plane/multi-frame CMR segmentation, highlighting its effectiveness in extracting comprehensive cardiac features that are both anatomically and pathologically relevant. △ Less

Submitted 6 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

arXiv:2406.00192 [pdf, other]

doi 10.1109/ISBI56570.2024.10635282

Direct Cardiac Segmentation from Undersampled K-space Using Transformers

Authors: Yundi Zhang, Nil Stolt-Ansó, Jiazhen Pan, Wenqi Huang, Kerstin Hammernik, Daniel Rueckert

Abstract: The prevailing deep learning-based methods of predicting cardiac segmentation involve reconstructed magnetic resonance (MR) images. The heavy dependency of segmentation approaches on image quality significantly limits the acceleration rate in fast MR reconstruction. Moreover, the practice of treating reconstruction and segmentation as separate sequential processes leads to artifact generation and… ▽ More The prevailing deep learning-based methods of predicting cardiac segmentation involve reconstructed magnetic resonance (MR) images. The heavy dependency of segmentation approaches on image quality significantly limits the acceleration rate in fast MR reconstruction. Moreover, the practice of treating reconstruction and segmentation as separate sequential processes leads to artifact generation and information loss in the intermediate stage. These issues pose a great risk to achieving high-quality outcomes. To leverage the redundant k-space information overlooked in this dual-step pipeline, we introduce a novel approach to directly deriving segmentations from sparse k-space samples using a transformer (DiSK). DiSK operates by globally extracting latent features from 2D+time k-space data with attention blocks and subsequently predicting the segmentation label of query points. We evaluate our model under various acceleration factors (ranging from 4 to 64) and compare against two image-based segmentation baselines. Our model consistently outperforms the baselines in Dice and Hausdorff distances across foreground classes for all presented sampling rates. △ Less

Submitted 31 May, 2024; originally announced June 2024.

arXiv:2406.00125 [pdf]

TotalVibeSegmentator: Full Body MRI Segmentation for the NAKO and UK Biobank

Authors: Robert Graf, Paul-Sören Platzek, Evamaria Olga Riedel, Constanze Ramschütz, Sophie Starck, Hendrik Kristian Möller, Matan Atad, Henry Völzke, Robin Bülow, Carsten Oliver Schmidt, Julia Rüdebusch, Matthias Jung, Marco Reisert, Jakob Weiss, Maximilian Löffler, Fabian Bamberg, Bene Wiestler, Johannes C. Paetzold, Daniel Rueckert, Jan Stefan Kirschke

Abstract: Objectives: To present a publicly available torso segmentation network for large epidemiology datasets on volumetric interpolated breath-hold examination (VIBE) images. Materials & Methods: We extracted preliminary segmentations from TotalSegmentator, spine, and body composition networks for VIBE images, then improved them iteratively and retrained a nnUNet network. Using subsets of NAKO (85 subje… ▽ More Objectives: To present a publicly available torso segmentation network for large epidemiology datasets on volumetric interpolated breath-hold examination (VIBE) images. Materials & Methods: We extracted preliminary segmentations from TotalSegmentator, spine, and body composition networks for VIBE images, then improved them iteratively and retrained a nnUNet network. Using subsets of NAKO (85 subjects) and UK Biobank (16 subjects), we evaluated with Dice-score on a holdout set (12 subjects) and existing organ segmentation approach (1000 subjects), generating 71 semantic segmentation types for VIBE images. We provide an additional network for the vertebra segments 22 individual vertebra types. Results: We achieved an average Dice score of 0.89 +- 0.07 overall 71 segmentation labels. We scored > 0.90 Dice-score on the abdominal organs except for the pancreas with a Dice of 0.70. Conclusion: Our work offers a detailed and refined publicly available full torso segmentation on VIBE images. △ Less

Submitted 18 October, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

Comments: https://github.com/robert-graf/TotalVibeSegmentator

arXiv:2405.09549 [pdf, other]

Deep-learning-based clustering of OCT images for biomarker discovery in age-related macular degeneration (Pinnacle study report 4)

Authors: Robbie Holland, Rebecca Kaye, Ahmed M. Hagag, Oliver Leingang, Thomas R. P. Taylor, Hrvoje Bogunović, Ursula Schmidt-Erfurth, Hendrik P. N. Scholl, Daniel Rueckert, Andrew J. Lotery, Sobha Sivaprasad, Martin J. Menten

Abstract: Diseases are currently managed by grading systems, where patients are stratified by grading systems into stages that indicate patient risk and guide clinical management. However, these broad categories typically lack prognostic value, and proposals for new biomarkers are currently limited to anecdotal observations. In this work, we introduce a deep-learning-based biomarker proposal system for the… ▽ More Diseases are currently managed by grading systems, where patients are stratified by grading systems into stages that indicate patient risk and guide clinical management. However, these broad categories typically lack prognostic value, and proposals for new biomarkers are currently limited to anecdotal observations. In this work, we introduce a deep-learning-based biomarker proposal system for the purpose of accelerating biomarker discovery in age-related macular degeneration (AMD). It works by first training a neural network using self-supervised contrastive learning to discover, without any clinical annotations, features relating to both known and unknown AMD biomarkers present in 46,496 retinal optical coherence tomography (OCT) images. To interpret the discovered biomarkers, we partition the images into 30 subsets, termed clusters, that contain similar features. We then conduct two parallel 1.5-hour semi-structured interviews with two independent teams of retinal specialists that describe each cluster in clinical language. Overall, both teams independently identified clearly distinct characteristics in 27 of 30 clusters, of which 23 were related to AMD. Seven were recognised as known biomarkers already used in established grading systems and 16 depicted biomarker combinations or subtypes that are either not yet used in grading systems, were only recently proposed, or were unknown. Clusters separated incomplete from complete retinal atrophy, intraretinal from subretinal fluid and thick from thin choroids, and in simulation outperformed clinically-used grading systems in prognostic value. Overall, contrastive learning enabled the automatic proposal of AMD biomarkers that go beyond the set used by clinically established grading systems. Ultimately, we envision that equipping clinicians with discovery-oriented deep-learning tools can accelerate discovery of novel prognostic biomarkers. △ Less

Submitted 12 March, 2024; originally announced May 2024.

arXiv:2405.08783 [pdf, other]

The Developing Human Connectome Project: A Fast Deep Learning-based Pipeline for Neonatal Cortical Surface Reconstruction

Authors: Qiang Ma, Kaili Liang, Liu Li, Saga Masui, Yourong Guo, Chiara Nosarti, Emma C. Robinson, Bernhard Kainz, Daniel Rueckert

Abstract: The Developing Human Connectome Project (dHCP) aims to explore developmental patterns of the human brain during the perinatal period. An automated processing pipeline has been developed to extract high-quality cortical surfaces from structural brain magnetic resonance (MR) images for the dHCP neonatal dataset. However, the current implementation of the pipeline requires more than 6.5 hours to proc… ▽ More The Developing Human Connectome Project (dHCP) aims to explore developmental patterns of the human brain during the perinatal period. An automated processing pipeline has been developed to extract high-quality cortical surfaces from structural brain magnetic resonance (MR) images for the dHCP neonatal dataset. However, the current implementation of the pipeline requires more than 6.5 hours to process a single MRI scan, making it expensive for large-scale neuroimaging studies. In this paper, we propose a fast deep learning (DL) based pipeline for dHCP neonatal cortical surface reconstruction, incorporating DL-based brain extraction, cortical surface reconstruction and spherical projection, as well as GPU-accelerated cortical surface inflation and cortical feature estimation. We introduce a multiscale deformation network to learn diffeomorphic cortical surface reconstruction end-to-end from T2-weighted brain MRI. A fast unsupervised spherical mapping approach is integrated to minimize metric distortions between cortical surfaces and projected spheres. The entire workflow of our DL-based dHCP pipeline completes within only 24 seconds on a modern GPU, which is nearly 1000 times faster than the original dHCP pipeline. The qualitative assessment demonstrates that for 82.5% of the test samples, the cortical surfaces reconstructed by our DL-based pipeline achieve superior (54.2%) or equal (28.3%) surface quality compared to the original dHCP pipeline. △ Less

Submitted 15 November, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: Accepted by Medical Image Analysis

arXiv:2405.06463 [pdf, other]

MRSegmentator: Multi-Modality Segmentation of 40 Classes in MRI and CT

Authors: Hartmut Häntze, Lina Xu, Christian J. Mertens, Felix J. Dorfner, Leonhard Donle, Felix Busch, Avan Kader, Sebastian Ziegelmayer, Nadine Bayerl, Nassir Navab, Daniel Rueckert, Julia Schnabel, Hugo JWL Aerts, Daniel Truhn, Fabian Bamberg, Jakob Weiß, Christopher L. Schlett, Steffen Ringhof, Thoralf Niendorf, Tobias Pischon, Hans-Ulrich Kauczor, Tobias Nonnenmacher, Thomas Kröncke, Henry Völzke, Jeanette Schulz-Menger , et al. (7 additional authors not shown)

Abstract: Purpose: To develop and evaluate a deep learning model for multi-organ segmentation of MRI scans. Materials and Methods: The model was trained on 1,200 manually annotated 3D axial MRI scans from the UK Biobank, 221 in-house MRI scans, and 1228 CT scans from the TotalSegmentator dataset. A human-in-the-loop annotation workflow was employed, leveraging cross-modality transfer learning from an exis… ▽ More Purpose: To develop and evaluate a deep learning model for multi-organ segmentation of MRI scans. Materials and Methods: The model was trained on 1,200 manually annotated 3D axial MRI scans from the UK Biobank, 221 in-house MRI scans, and 1228 CT scans from the TotalSegmentator dataset. A human-in-the-loop annotation workflow was employed, leveraging cross-modality transfer learning from an existing CT segmentation model to segment 40 anatomical structures. The annotation process began with a model based on transfer learning between CT and MR, which was iteratively refined based on manual corrections to predicted segmentations. The model's performance was evaluated on MRI examinations obtained from the German National Cohort (NAKO) study (n=900) from the AMOS22 dataset (n=60) and from the TotalSegmentator-MRI test data (n=29). The Dice Similarity Coefficient (DSC) and Hausdorff Distance (HD) were used to assess segmentation quality, stratified by organ and scan type. The model and its weights will be open-sourced. Results: MRSegmentator demonstrated high accuracy for well-defined organs (lungs: DSC 0.96, heart: DSC 0.94) and organs with anatomic variability (liver: DSC 0.96, kidneys: DSC 0.95). Smaller structures showed lower accuracy (portal/splenic veins: DSC 0.64, adrenal glands: DSC 0.69). On external validation using NAKO data, mean DSC ranged from 0.85 $\pm$ 0.08 for T2-HASTE to 0.91 $\pm$ 0.05 for in-phase sequences. The model generalized well to CT, achieving mean DSC of 0.84 $\pm$ 0.11 on AMOS CT data. Conclusion: MRSegmentator accurately segments 40 anatomical structures in MRI across diverse datasets and imaging protocols, with additional generalizability to CT images. This open-source model will provide a valuable tool for automated multi-organ segmentation in medical imaging research. It can be downloaded from https://github.com/hhaentze/MRSegmentator. △ Less

Submitted 14 November, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

Comments: 17 pages, 6 figures; updated data; completed co-author info

ACM Class: J.3

arXiv:2404.17621 [pdf, other]

doi 10.1109/TMI.2024.3385024

Attention-aware non-rigid image registration for accelerated MR imaging

Authors: Aya Ghoul, Jiazhen Pan, Andreas Lingg, Jens Kübler, Patrick Krumm, Kerstin Hammernik, Daniel Rueckert, Sergios Gatidis, Thomas Küstner

Abstract: Accurate motion estimation at high acceleration factors enables rapid motion-compensated reconstruction in Magnetic Resonance Imaging (MRI) without compromising the diagnostic image quality. In this work, we introduce an attention-aware deep learning-based framework that can perform non-rigid pairwise registration for fully sampled and accelerated MRI. We extract local visual representations to bu… ▽ More Accurate motion estimation at high acceleration factors enables rapid motion-compensated reconstruction in Magnetic Resonance Imaging (MRI) without compromising the diagnostic image quality. In this work, we introduce an attention-aware deep learning-based framework that can perform non-rigid pairwise registration for fully sampled and accelerated MRI. We extract local visual representations to build similarity maps between the registered image pairs at multiple resolution levels and additionally leverage long-range contextual information using a transformer-based module to alleviate ambiguities in the presence of artifacts caused by undersampling. We combine local and global dependencies to perform simultaneous coarse and fine motion estimation. The proposed method was evaluated on in-house acquired fully sampled and accelerated data of 101 patients and 62 healthy subjects undergoing cardiac and thoracic MRI. The impact of motion estimation accuracy on the downstream task of motion-compensated reconstruction was analyzed. We demonstrate that our model derives reliable and consistent motion fields across different sampling trajectories (Cartesian and radial) and acceleration factors of up to 16x for cardiac motion and 30x for respiratory motion and achieves superior image quality in motion-compensated reconstruction qualitatively and quantitatively compared to conventional and recent deep learning-based approaches. The code is publicly available at https://github.com/lab-midas/GMARAFT. △ Less

Submitted 26 April, 2024; originally announced April 2024.

Comments: 14 pages, 7 figures

arXiv:2404.08350 [pdf, other]

Self-Supervised k-Space Regularization for Motion-Resolved Abdominal MRI Using Neural Implicit k-Space Representation

Authors: Veronika Spieker, Hannah Eichhorn, Jonathan K. Stelter, Wenqi Huang, Rickmer F. Braren, Daniel Rückert, Francisco Sahli Costabal, Kerstin Hammernik, Claudia Prieto, Dimitrios C. Karampinos, Julia A. Schnabel

Abstract: Neural implicit k-space representations have shown promising results for dynamic MRI at high temporal resolutions. Yet, their exclusive training in k-space limits the application of common image regularization methods to improve the final reconstruction. In this work, we introduce the concept of parallel imaging-inspired self-consistency (PISCO), which we incorporate as novel self-supervised k-spa… ▽ More Neural implicit k-space representations have shown promising results for dynamic MRI at high temporal resolutions. Yet, their exclusive training in k-space limits the application of common image regularization methods to improve the final reconstruction. In this work, we introduce the concept of parallel imaging-inspired self-consistency (PISCO), which we incorporate as novel self-supervised k-space regularization enforcing a consistent neighborhood relationship. At no additional data cost, the proposed regularization significantly improves neural implicit k-space reconstructions on simulated data. Abdominal in-vivo reconstructions using PISCO result in enhanced spatio-temporal image quality compared to state-of-the-art methods. Code is available at https://github.com/vjspi/PISCO-NIK. △ Less

Submitted 12 April, 2024; originally announced April 2024.

Comments: Under Review

arXiv:2404.00767 [pdf, other]

Intensity-based 3D motion correction for cardiac MR images

Authors: Nil Stolt-Ansó, Vasiliki Sideri-Lampretsa, Maik Dannecker, Daniel Rueckert

Abstract: Cardiac magnetic resonance (CMR) image acquisition requires subjects to hold their breath while 2D cine images are acquired. This process assumes that the heart remains in the same position across all slices. However, differences in breathhold positions or patient motion introduce 3D slice misalignments. In this work, we propose an algorithm that simultaneously aligns all SA and LA slices by maxim… ▽ More Cardiac magnetic resonance (CMR) image acquisition requires subjects to hold their breath while 2D cine images are acquired. This process assumes that the heart remains in the same position across all slices. However, differences in breathhold positions or patient motion introduce 3D slice misalignments. In this work, we propose an algorithm that simultaneously aligns all SA and LA slices by maximizing the pair-wise intensity agreement between their intersections. Unlike previous works, our approach is formulated as a subject-specific optimization problem and requires no prior knowledge of the underlying anatomy. We quantitatively demonstrate that the proposed method is robust against a large range of rotations and translations by synthetically misaligning 10 motion-free datasets and aligning them back using the proposed method. △ Less

Submitted 31 March, 2024; originally announced April 2024.

arXiv:2403.16776 [pdf, ps, other]

Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases

Authors: Sophie Starck, Vasiliki Sideri-Lampretsa, Bernhard Kainz, Martin J. Menten, Tamara T. Mueller, Daniel Rueckert

Abstract: Anatomical atlases are widely used for population studies and analysis. Conditional atlases target a specific sub-population defined via certain conditions, such as demographics or pathologies, and allow for the investigation of fine-grained anatomical differences like morphological changes associated with ageing or disease. Existing approaches use either registration-based methods that are often… ▽ More Anatomical atlases are widely used for population studies and analysis. Conditional atlases target a specific sub-population defined via certain conditions, such as demographics or pathologies, and allow for the investigation of fine-grained anatomical differences like morphological changes associated with ageing or disease. Existing approaches use either registration-based methods that are often unable to handle large anatomical variations or generative adversarial models, which are challenging to train since they can suffer from training instabilities. Instead of generating atlases directly in as intensities, we propose using latent diffusion models to generate deformation fields, which transform a general population atlas into one representing a specific sub-population. Our approach ensures structural integrity, enhances interpretability and avoids hallucinations that may arise during direct image synthesis by generating this deformation field and regularising it using a neighbourhood of images. We compare our method to several state-of-the-art atlas generation methods using brain MR images from the UK Biobank. Our method generates highly realistic atlases with smooth transformations and high anatomical fidelity, outperforming existing baselines. We demonstrate the quality of these atlases through comprehensive evaluations, including quantitative metrics for anatomical accuracy, perceptual similarity, and qualitative analyses displaying the consistency and realism of the generated atlases. △ Less

Submitted 24 June, 2025; v1 submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.11001 [pdf, other]

doi 10.1007/978-3-031-72111-3_68

Topologically Faithful Multi-class Segmentation in Medical Images

Authors: Alexander H. Berger, Nico Stucki, Laurin Lux, Vincent Buergin, Suprosanna Shit, Anna Banaszak, Daniel Rueckert, Ulrich Bauer, Johannes C. Paetzold

Abstract: Topological accuracy in medical image segmentation is a highly important property for downstream applications such as network analysis and flow modeling in vessels or cell counting. Recently, significant methodological advancements have brought well-founded concepts from algebraic topology to binary segmentation. However, these approaches have been underexplored in multi-class segmentation scenari… ▽ More Topological accuracy in medical image segmentation is a highly important property for downstream applications such as network analysis and flow modeling in vessels or cell counting. Recently, significant methodological advancements have brought well-founded concepts from algebraic topology to binary segmentation. However, these approaches have been underexplored in multi-class segmentation scenarios, where topological errors are common. We propose a general loss function for topologically faithful multi-class segmentation extending the recent Betti matching concept, which is based on induced matchings of persistence barcodes. We project the N-class segmentation problem to N single-class segmentation tasks, which allows us to use 1-parameter persistent homology, making training of neural networks computationally feasible. We validate our method on a comprehensive set of four medical datasets with highly variant topological characteristics. Our loss formulation significantly enhances topological correctness in cardiac, cell, artery-vein, and Circle of Willis segmentation. △ Less

Submitted 9 October, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

Journal ref: MICCAI 2024, Lecture Notes in Computer Science, vol. 15008, pp. 721-731, 2024

arXiv:2403.08464 [pdf, other]

Diffusion Models with Implicit Guidance for Medical Anomaly Detection

Authors: Cosmin I. Bercea, Benedikt Wiestler, Daniel Rueckert, Julia A. Schnabel

Abstract: Diffusion models have advanced unsupervised anomaly detection by improving the transformation of pathological images into pseudo-healthy equivalents. Nonetheless, standard approaches may compromise critical information during pathology removal, leading to restorations that do not align with unaffected regions in the original scans. Such discrepancies can inadvertently increase false positive rates… ▽ More Diffusion models have advanced unsupervised anomaly detection by improving the transformation of pathological images into pseudo-healthy equivalents. Nonetheless, standard approaches may compromise critical information during pathology removal, leading to restorations that do not align with unaffected regions in the original scans. Such discrepancies can inadvertently increase false positive rates and reduce specificity, complicating radiological evaluations. This paper introduces Temporal Harmonization for Optimal Restoration (THOR), which refines the de-noising process by integrating implicit guidance through temporal anomaly maps. THOR aims to preserve the integrity of healthy tissue in areas unaffected by pathology. Comparative evaluations show that THOR surpasses existing diffusion-based methods in detecting and segmenting anomalies in brain MRIs and wrist X-rays. Code: https://github.com/ci-ber/THOR_DDPM. △ Less

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2402.16368 [pdf, other]

doi 10.1007/s00330-024-11155-y

SPINEPS -- Automatic Whole Spine Segmentation of T2-weighted MR images using a Two-Phase Approach to Multi-class Semantic and Instance Segmentation

Authors: Hendrik Möller, Robert Graf, Joachim Schmitt, Benjamin Keinert, Matan Atad, Anjany Sekuboyina, Felix Streckenbach, Hanna Schön, Florian Kofler, Thomas Kroencke, Stefanie Bette, Stefan Willich, Thomas Keil, Thoralf Niendorf, Tobias Pischon, Beate Endemann, Bjoern Menze, Daniel Rueckert, Jan S. Kirschke

Abstract: Purpose. To present SPINEPS, an open-source deep learning approach for semantic and instance segmentation of 14 spinal structures (ten vertebra substructures, intervertebral discs, spinal cord, spinal canal, and sacrum) in whole body T2w MRI. Methods. During this HIPPA-compliant, retrospective study, we utilized the public SPIDER dataset (218 subjects, 63% female) and a subset of the German Nati… ▽ More Purpose. To present SPINEPS, an open-source deep learning approach for semantic and instance segmentation of 14 spinal structures (ten vertebra substructures, intervertebral discs, spinal cord, spinal canal, and sacrum) in whole body T2w MRI. Methods. During this HIPPA-compliant, retrospective study, we utilized the public SPIDER dataset (218 subjects, 63% female) and a subset of the German National Cohort (1423 subjects, mean age 53, 49% female) for training and evaluation. We combined CT and T2w segmentations to train models that segment 14 spinal structures in T2w sagittal scans both semantically and instance-wise. Performance evaluation metrics included Dice similarity coefficient, average symmetrical surface distance, panoptic quality, segmentation quality, and recognition quality. Statistical significance was assessed using the Wilcoxon signed-rank test. An in-house dataset was used to qualitatively evaluate out-of-distribution samples. Results. On the public dataset, our approach outperformed the baseline (instance-wise vertebra dice score 0.929 vs. 0.907, p-value<0.001). Training on auto-generated annotations and evaluating on manually corrected test data from the GNC yielded global dice scores of 0.900 for vertebrae, 0.960 for intervertebral discs, and 0.947 for the spinal canal. Incorporating the SPIDER dataset during training increased these scores to 0.920, 0.967, 0.958, respectively. Conclusions. The proposed segmentation approach offers robust segmentation of 14 spinal structures in T2w sagittal images, including the spinal cord, spinal canal, intervertebral discs, endplate, sacrum, and vertebrae. The approach yields both a semantic and instance mask as output, thus being easy to utilize. This marks the first publicly available algorithm for whole spine segmentation in sagittal T2w MR imaging. △ Less

Submitted 22 April, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: https://github.com/Hendrik-code/spineps

arXiv:2401.16564 [pdf]

doi 10.1109/RBME.2024.3485022

Data and Physics driven Deep Learning Models for Fast MRI Reconstruction: Fundamentals and Methodologies

Authors: Jiahao Huang, Yinzhe Wu, Fanwen Wang, Yingying Fang, Yang Nan, Cagan Alkan, Daniel Abraham, Congyu Liao, Lei Xu, Zhifan Gao, Weiwen Wu, Lei Zhu, Zhaolin Chen, Peter Lally, Neal Bangerter, Kawin Setsompop, Yike Guo, Daniel Rueckert, Ge Wang, Guang Yang

Abstract: Magnetic Resonance Imaging (MRI) is a pivotal clinical diagnostic tool, yet its extended scanning times often compromise patient comfort and image quality, especially in volumetric, temporal and quantitative scans. This review elucidates recent advances in MRI acceleration via data and physics-driven models, leveraging techniques from algorithm unrolling models, enhancement-based methods, and plug… ▽ More Magnetic Resonance Imaging (MRI) is a pivotal clinical diagnostic tool, yet its extended scanning times often compromise patient comfort and image quality, especially in volumetric, temporal and quantitative scans. This review elucidates recent advances in MRI acceleration via data and physics-driven models, leveraging techniques from algorithm unrolling models, enhancement-based methods, and plug-and-play models to the emerging full spectrum of generative model-based methods. We also explore the synergistic integration of data models with physics-based insights, encompassing the advancements in multi-coil hardware accelerations like parallel imaging and simultaneous multi-slice imaging, and the optimization of sampling patterns. We then focus on domain-specific challenges and opportunities, including image redundancy exploitation, image integrity, evaluation metrics, data heterogeneity, and model generalization. This work also discusses potential solutions and future research directions, with an emphasis on the role of data harmonization and federated learning for further improving the general applicability and performance of these methods in MRI reconstruction. △ Less

Submitted 21 October, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: Accepted by IEEE Reviews in Biomedical Engineering (RBME)

arXiv:2401.10637 [pdf, other]

Towards Universal Unsupervised Anomaly Detection in Medical Imaging

Authors: Cosmin I. Bercea, Benedikt Wiestler, Daniel Rueckert, Julia A. Schnabel

Abstract: The increasing complexity of medical imaging data underscores the need for advanced anomaly detection methods to automatically identify diverse pathologies. Current methods face challenges in capturing the broad spectrum of anomalies, often limiting their use to specific lesion types in brain scans. To address this challenge, we introduce a novel unsupervised approach, termed \textit{Reversed Auto… ▽ More The increasing complexity of medical imaging data underscores the need for advanced anomaly detection methods to automatically identify diverse pathologies. Current methods face challenges in capturing the broad spectrum of anomalies, often limiting their use to specific lesion types in brain scans. To address this challenge, we introduce a novel unsupervised approach, termed \textit{Reversed Auto-Encoders (RA)}, designed to create realistic pseudo-healthy reconstructions that enable the detection of a wider range of pathologies. We evaluate the proposed method across various imaging modalities, including magnetic resonance imaging (MRI) of the brain, pediatric wrist X-ray, and chest X-ray, and demonstrate superior performance in detecting anomalies compared to existing state-of-the-art methods. Our unsupervised anomaly detection approach may enhance diagnostic accuracy in medical imaging by identifying a broader range of unknown pathologies. Our code is publicly available at: \url{https://github.com/ci-ber/RA}. △ Less

Submitted 19 January, 2024; originally announced January 2024.

arXiv:2312.02608 [pdf, other]

Panoptica -- instance-wise evaluation of 3D semantic and instance segmentation maps

Authors: Florian Kofler, Hendrik Möller, Josef A. Buchner, Ezequiel de la Rosa, Ivan Ezhov, Marcel Rosier, Isra Mekki, Suprosanna Shit, Moritz Negwer, Rami Al-Maskari, Ali Ertürk, Shankeeth Vinayahalingam, Fabian Isensee, Sarthak Pati, Daniel Rueckert, Jan S. Kirschke, Stefan K. Ehrlich, Annika Reinke, Bjoern Menze, Benedikt Wiestler, Marie Piraud

Abstract: This paper introduces panoptica, a versatile and performance-optimized package designed for computing instance-wise segmentation quality metrics from 2D and 3D segmentation maps. panoptica addresses the limitations of existing metrics and provides a modular framework that complements the original intersection over union-based panoptic quality with other metrics, such as the distance metric Average… ▽ More This paper introduces panoptica, a versatile and performance-optimized package designed for computing instance-wise segmentation quality metrics from 2D and 3D segmentation maps. panoptica addresses the limitations of existing metrics and provides a modular framework that complements the original intersection over union-based panoptic quality with other metrics, such as the distance metric Average Symmetric Surface Distance. The package is open-source, implemented in Python, and accompanied by comprehensive documentation and tutorials. panoptica employs a three-step metrics computation process to cover diverse use cases. The efficacy of panoptica is demonstrated on various real-world biomedical datasets, where an instance-wise evaluation is instrumental for an accurate representation of the underlying clinical task. Overall, we envision panoptica as a valuable tool facilitating in-depth evaluation of segmentation methods. △ Less

Submitted 5 December, 2023; originally announced December 2023.

Comments: 15 pages, 6 figures, 3 tables

arXiv:2310.10756 [pdf]

Deep Conditional Shape Models for 3D cardiac image segmentation

Authors: Athira J Jacob, Puneet Sharma, Daniel Ruckert

Abstract: Delineation of anatomical structures is often the first step of many medical image analysis workflows. While convolutional neural networks achieve high performance, these do not incorporate anatomical shape information. We introduce a novel segmentation algorithm that uses Deep Conditional Shape models (DCSMs) as a core component. Using deep implicit shape representations, the algorithm learns a m… ▽ More Delineation of anatomical structures is often the first step of many medical image analysis workflows. While convolutional neural networks achieve high performance, these do not incorporate anatomical shape information. We introduce a novel segmentation algorithm that uses Deep Conditional Shape models (DCSMs) as a core component. Using deep implicit shape representations, the algorithm learns a modality-agnostic shape model that can generate the signed distance functions for any anatomy of interest. To fit the generated shape to the image, the shape model is conditioned on anatomic landmarks that can be automatically detected or provided by the user. Finally, we add a modality-dependent, lightweight refinement network to capture any fine details not represented by the implicit function. The proposed DCSM framework is evaluated on the problem of cardiac left ventricle (LV) segmentation from multiple 3D modalities (contrast-enhanced CT, non-contrasted CT, 3D echocardiography-3DE). We demonstrate that the automatic DCSM outperforms the baseline for non-contrasted CT without the local refinement, and with the refinement for contrasted CT and 3DE, especially with significant improvement in the Hausdorff distance. The semi-automatic DCSM with user-input landmarks, while only trained on contrasted CT, achieves greater than 92% Dice for all modalities. Both automatic DCSM with refinement and semi-automatic DCSM achieve equivalent or better performance compared to inter-user variability for these modalities. △ Less

Submitted 16 October, 2023; originally announced October 2023.

Comments: Accepted and presented as oral presentation at Statistical Atlases and Computational Modeling of the Heart (STACOM) workshop at MICCAI 2023

arXiv:2310.02829 [pdf, other]

All Sizes Matter: Improving Volumetric Brain Segmentation on Small Lesions

Authors: Ayhan Can Erdur, Daniel Scholz, Josef A. Buchner, Stephanie E. Combs, Daniel Rueckert, Jan C. Peeken

Abstract: Brain metastases (BMs) are the most frequently occurring brain tumors. The treatment of patients having multiple BMs with stereo tactic radiosurgery necessitates accurate localization of the metastases. Neural networks can assist in this time-consuming and costly task that is typically performed by human experts. Particularly challenging is the detection of small lesions since they are often under… ▽ More Brain metastases (BMs) are the most frequently occurring brain tumors. The treatment of patients having multiple BMs with stereo tactic radiosurgery necessitates accurate localization of the metastases. Neural networks can assist in this time-consuming and costly task that is typically performed by human experts. Particularly challenging is the detection of small lesions since they are often underrepresented in exist ing approaches. Yet, lesion detection is equally important for all sizes. In this work, we develop an ensemble of neural networks explicitly fo cused on detecting and segmenting small BMs. To accomplish this task, we trained several neural networks focusing on individual aspects of the BM segmentation problem: We use blob loss that specifically addresses the imbalance of lesion instances in terms of size and texture and is, therefore, not biased towards larger lesions. In addition, a model using a subtraction sequence between the T1 and T1 contrast-enhanced sequence focuses on low-contrast lesions. Furthermore, we train additional models only on small lesions. Our experiments demonstrate the utility of the ad ditional blob loss and the subtraction sequence. However, including the specialized small lesion models in the ensemble deteriorates segmentation results. We also find domain-knowledge-inspired postprocessing steps to drastically increase our performance in most experiments. Our approach enables us to submit a competitive challenge entry to the ASNR-MICCAI BraTS Brain Metastasis Challenge 2023. △ Less

Submitted 4 October, 2023; originally announced October 2023.

arXiv:2309.16853 [pdf, other]

T1/T2 relaxation temporal modelling from accelerated acquisitions using a Latent Transformer

Authors: Fanwen Wang, Michael Tanzer, Mengyun Qiao, Wenjia Bai, Daniel Rueckert, Guang Yang, Sonia Nielles-Vallespin

Abstract: Quantitative cardiac magnetic resonance T1 and T2 mapping enable myocardial tissue characterisation but the lengthy scan times restrict their widespread clinical application. We propose a deep learning method that incorporates a time dependency Latent Transformer module to model relationships between parameterised time frames for improved reconstruction from undersampled data. The module, implemen… ▽ More Quantitative cardiac magnetic resonance T1 and T2 mapping enable myocardial tissue characterisation but the lengthy scan times restrict their widespread clinical application. We propose a deep learning method that incorporates a time dependency Latent Transformer module to model relationships between parameterised time frames for improved reconstruction from undersampled data. The module, implemented as a multi-resolution sequence-to-sequence transformer, is integrated into an encoder-decoder architecture to leverage the inherent temporal correlations in relaxation processes. The presented results for accelerated T1 and T2 mapping show the model recovers maps with higher fidelity by explicit incorporation of time dynamics. This work demonstrates the importance of temporal modelling for artifact-free reconstruction in quantitative MRI. △ Less

Submitted 28 September, 2023; originally announced September 2023.

arXiv:2309.16735 [pdf, other]

Learnable real-time inference of molecular composition from diffuse spectroscopy of brain tissue

Authors: Ivan Ezhov, Kevin Scibilia, Luca Giannoni, Florian Kofler, Ivan Iliash, Felix Hsieh, Suprosanna Shit, Charly Caredda, Fred Lange, Ilias Tachtsidis, Daniel Rueckert

Abstract: Diffuse optical modalities such as broadband near-infrared spectroscopy (bNIRS) and hyperspectral imaging (HSI) represent a promising alternative for low-cost, non-invasive, and fast monitoring of functional and structural properties of living tissue. Particularly, the possibility of extracting the molecular composition of the tissue from the optical spectra in real-time deems the spectroscopy tec… ▽ More Diffuse optical modalities such as broadband near-infrared spectroscopy (bNIRS) and hyperspectral imaging (HSI) represent a promising alternative for low-cost, non-invasive, and fast monitoring of functional and structural properties of living tissue. Particularly, the possibility of extracting the molecular composition of the tissue from the optical spectra in real-time deems the spectroscopy techniques as a unique diagnostic tool. However, no established method exists to streamline the inference of the biochemical composition from the optical spectrum for real-time applications such as surgical monitoring. In this paper, we analyse a machine learning technique for fast and accurate inference of changes in the molecular composition of brain tissue. We reconsider and propose modifications to the existing learnable methodology based on the Beer-Lambert law, which analytically connects the spectra with concentrations. We evaluate the method's applicability to linear and non-linear formulations of the Beer-Lambert law. The approach is tested on real data obtained from the bNIRS- and HSI-based optical monitoring of brain tissue. The results demonstrate that the proposed method enables real-time molecular composition inference while maintaining the accuracy of traditional linear and non-linear optimization solvers. Preliminary findings show that Beer-Lambert law-based spectral unmixing allows to contrast brain anatomy semantics such as the vessel tree and tumor area. △ Less

Submitted 15 August, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.14306 [pdf, other]

DeepMesh: Mesh-based Cardiac Motion Tracking using Deep Learning

Authors: Qingjie Meng, Wenjia Bai, Declan P O'Regan, and Daniel Rueckert

Abstract: 3D motion estimation from cine cardiac magnetic resonance (CMR) images is important for the assessment of cardiac function and the diagnosis of cardiovascular diseases. Current state-of-the art methods focus on estimating dense pixel-/voxel-wise motion fields in image space, which ignores the fact that motion estimation is only relevant and useful within the anatomical objects of interest, e.g., t… ▽ More 3D motion estimation from cine cardiac magnetic resonance (CMR) images is important for the assessment of cardiac function and the diagnosis of cardiovascular diseases. Current state-of-the art methods focus on estimating dense pixel-/voxel-wise motion fields in image space, which ignores the fact that motion estimation is only relevant and useful within the anatomical objects of interest, e.g., the heart. In this work, we model the heart as a 3D mesh consisting of epi- and endocardial surfaces. We propose a novel learning framework, DeepMesh, which propagates a template heart mesh to a subject space and estimates the 3D motion of the heart mesh from CMR images for individual subjects. In DeepMesh, the heart mesh of the end-diastolic frame of an individual subject is first reconstructed from the template mesh. Mesh-based 3D motion fields with respect to the end-diastolic frame are then estimated from 2D short- and long-axis CMR images. By developing a differentiable mesh-to-image rasterizer, DeepMesh is able to leverage 2D shape information from multiple anatomical views for 3D mesh reconstruction and mesh motion estimation. The proposed method estimates vertex-wise displacement and thus maintains vertex correspondences between time frames, which is important for the quantitative assessment of cardiac function across different subjects and populations. We evaluate DeepMesh on CMR images acquired from the UK Biobank. We focus on 3D motion estimation of the left ventricle in this work. Experimental results show that the proposed method quantitatively and qualitatively outperforms other image-based and mesh-based cardiac motion tracking methods. △ Less

Submitted 25 September, 2023; originally announced September 2023.

arXiv:2309.14198 [pdf, other]

(Predictable) Performance Bias in Unsupervised Anomaly Detection

Authors: Felix Meissen, Svenja Breuer, Moritz Knolle, Alena Buyx, Ruth Müller, Georgios Kaissis, Benedikt Wiestler, Daniel Rückert

Abstract: Background: With the ever-increasing amount of medical imaging data, the demand for algorithms to assist clinicians has amplified. Unsupervised anomaly detection (UAD) models promise to aid in the crucial first step of disease detection. While previous studies have thoroughly explored fairness in supervised models in healthcare, for UAD, this has so far been unexplored. Methods: In this study, w… ▽ More Background: With the ever-increasing amount of medical imaging data, the demand for algorithms to assist clinicians has amplified. Unsupervised anomaly detection (UAD) models promise to aid in the crucial first step of disease detection. While previous studies have thoroughly explored fairness in supervised models in healthcare, for UAD, this has so far been unexplored. Methods: In this study, we evaluated how dataset composition regarding subgroups manifests in disparate performance of UAD models along multiple protected variables on three large-scale publicly available chest X-ray datasets. Our experiments were validated using two state-of-the-art UAD models for medical images. Finally, we introduced a novel subgroup-AUROC (sAUROC) metric, which aids in quantifying fairness in machine learning. Findings: Our experiments revealed empirical "fairness laws" (similar to "scaling laws" for Transformers) for training-dataset composition: Linear relationships between anomaly detection performance within a subpopulation and its representation in the training data. Our study further revealed performance disparities, even in the case of balanced training data, and compound effects that exacerbate the drop in performance for subjects associated with multiple adversely affected groups. Interpretation: Our study quantified the disparate performance of UAD models against certain demographic subgroups. Importantly, we showed that this unfairness cannot be mitigated by balanced representation alone. Instead, the representation of some subgroups seems harder to learn by UAD models than that of others. The empirical fairness laws discovered in our study make disparate performance in UAD models easier to estimate and aid in determining the most desirable dataset composition. △ Less

Submitted 25 September, 2023; originally announced September 2023.

Comments: 11 pages, 5 Figures, 1 panel

arXiv:2309.08643 [pdf, other]

NISF: Neural Implicit Segmentation Functions

Authors: Nil Stolt-Ansó, Julian McGinnis, Jiazhen Pan, Kerstin Hammernik, Daniel Rueckert

Abstract: Segmentation of anatomical shapes from medical images has taken an important role in the automation of clinical measurements. While typical deep-learning segmentation approaches are performed on discrete voxels, the underlying objects being analysed exist in a real-valued continuous space. Approaches that rely on convolutional neural networks (CNNs) are limited to grid-like inputs and not easily a… ▽ More Segmentation of anatomical shapes from medical images has taken an important role in the automation of clinical measurements. While typical deep-learning segmentation approaches are performed on discrete voxels, the underlying objects being analysed exist in a real-valued continuous space. Approaches that rely on convolutional neural networks (CNNs) are limited to grid-like inputs and not easily applicable to sparse or partial measurements. We propose a novel family of image segmentation models that tackle many of CNNs' shortcomings: Neural Implicit Segmentation Functions (NISF). Our framework takes inspiration from the field of neural implicit functions where a network learns a mapping from a real-valued coordinate-space to a shape representation. NISFs have the ability to segment anatomical shapes in high-dimensional continuous spaces. Training is not limited to voxelized grids, and covers applications with sparse and partial data. Interpolation between observations is learnt naturally in the training procedure and requires no post-processing. Furthermore, NISFs allow the leveraging of learnt shape priors to make predictions for regions outside of the original image plane. We go on to show the framework achieves dice scores of 0.87 $\pm$ 0.045 on a (3D+t) short-axis cardiac segmentation task using the UK Biobank dataset. We also provide a qualitative analysis on our frameworks ability to perform segmentation and image interpolation on unseen regions of an image volume at arbitrary resolutions. △ Less

Submitted 14 September, 2023; originally announced September 2023.

arXiv:2308.16863 [pdf, other]

Self-pruning Graph Neural Network for Predicting Inflammatory Disease Activity in Multiple Sclerosis from Brain MR Images

Authors: Chinmay Prabhakar, Hongwei Bran Li, Johannes C. Paetzold, Timo Loehr, Chen Niu, Mark Mühlau, Daniel Rueckert, Benedikt Wiestler, Bjoern Menze

Abstract: Multiple Sclerosis (MS) is a severe neurological disease characterized by inflammatory lesions in the central nervous system. Hence, predicting inflammatory disease activity is crucial for disease assessment and treatment. However, MS lesions can occur throughout the brain and vary in shape, size and total count among patients. The high variance in lesion load and locations makes it challenging fo… ▽ More Multiple Sclerosis (MS) is a severe neurological disease characterized by inflammatory lesions in the central nervous system. Hence, predicting inflammatory disease activity is crucial for disease assessment and treatment. However, MS lesions can occur throughout the brain and vary in shape, size and total count among patients. The high variance in lesion load and locations makes it challenging for machine learning methods to learn a globally effective representation of whole-brain MRI scans to assess and predict disease. Technically it is non-trivial to incorporate essential biomarkers such as lesion load or spatial proximity. Our work represents the first attempt to utilize graph neural networks (GNN) to aggregate these biomarkers for a novel global representation. We propose a two-stage MS inflammatory disease activity prediction approach. First, a 3D segmentation network detects lesions, and a self-supervised algorithm extracts their image features. Second, the detected lesions are used to build a patient graph. The lesions act as nodes in the graph and are initialized with image features extracted in the first stage. Finally, the lesions are connected based on their spatial proximity and the inflammatory disease activity prediction is formulated as a graph classification task. Furthermore, we propose a self-pruning strategy to auto-select the most critical lesions for prediction. Our proposed method outperforms the existing baseline by a large margin (AUCs of 0.67 vs. 0.61 and 0.66 vs. 0.60 for one-year and two-year inflammatory disease activity, respectively). Finally, our proposed method enjoys inherent explainability by assigning an importance score to each lesion for the overall prediction. Code is available at https://github.com/chinmay5/ms_ida.git △ Less

Submitted 31 August, 2023; originally announced August 2023.

arXiv:2308.14365 [pdf]

doi 10.1038/s43856-024-00670-0

Using UK Biobank data to establish population-specific atlases from whole body MRI

Authors: Sophie Starck, Vasiliki Sideri-Lampretsa, Jessica J. M. Ritter, Veronika A. Zimmer, Rickmer Braren, Tamara T. Mueller, Daniel Rueckert

Abstract: Reliable reference data in medical imaging is largely unavailable. Developing tools that allow for the comparison of individual patient data to reference data has a high potential to improve diagnostic imaging. Population atlases are a commonly used tool in medical imaging to facilitate this. Constructing such atlases becomes particularly challenging when working with highly heterogeneous datasets… ▽ More Reliable reference data in medical imaging is largely unavailable. Developing tools that allow for the comparison of individual patient data to reference data has a high potential to improve diagnostic imaging. Population atlases are a commonly used tool in medical imaging to facilitate this. Constructing such atlases becomes particularly challenging when working with highly heterogeneous datasets, such as whole-body images, which contain significant anatomical variations. In this work, we propose a pipeline for generating a standardised whole-body atlas for a highly heterogeneous population by partitioning the population into anatomically meaningful subgroups. Using magnetic resonance images from the UK Biobank dataset, we create six whole-body atlases representing a healthy population average. We furthermore unbias them, and this way obtain a realistic representation of the population. In addition to the anatomical atlases, we generate probabilistic atlases that capture the distributions of abdominal fat (visceral and subcutaneous) and five abdominal organs across the population (liver, spleen, pancreas, left and right kidneys). Our pipeline effectively generates high-quality, realistic whole-body atlases with clinical applicability. The probabilistic atlases show differences in fat distribution between subjects with medical conditions such as diabetes and cardiovascular diseases and healthy subjects in the atlas space. With this work, we make the constructed anatomical and label atlases publically available, with the expectation that they will support medical research involving whole-body MR images. △ Less

Submitted 28 November, 2024; v1 submitted 28 August, 2023; originally announced August 2023.

Comments: Published in Communications Medicine. Nature Portfolio

Journal ref: Commun Med 4(1), 237 (2024)

Showing 1–50 of 155 results for author: Rueckert, D