-
TotalRegistrator: Towards a Lightweight Foundation Model for CT Image Registration
Authors:
Xuan Loc Pham,
Gwendolyn Vuurberg,
Marjan Doppen,
Joey Roosen,
Tip Stille,
Thi Quynh Ha,
Thuy Duong Quach,
Quoc Vu Dang,
Manh Ha Luu,
Ewoud J. Smit,
Hong Son Mai,
Mattias Heinrich,
Bram van Ginneken,
Mathias Prokop,
Alessa Hering
Abstract:
Image registration is a fundamental technique in the analysis of longitudinal and multi-phase CT images within clinical practice. However, most existing methods are tailored for single-organ applications, limiting their generalizability to other anatomical regions. This work presents TotalRegistrator, an image registration framework capable of aligning multiple anatomical regions simultaneously using a standard UNet architecture and a novel field decomposition strategy. The model is lightweight, requiring only 11GB of GPU memory for training. To train and evaluate our method, we constructed a large-scale longitudinal dataset comprising 695 whole-body (thorax-abdomen-pelvic) paired CT scans from individual patients acquired at different time points. We benchmarked TotalRegistrator against a generic classical iterative algorithm and a recent foundation model for image registration. To further assess robustness and generalizability, we evaluated our model on three external datasets: the public thoracic and abdominal datasets from the Learn2Reg challenge, and a private multiphase abdominal dataset from a collaborating hospital. Experimental results on the in-house dataset show that the proposed approach generally surpasses baseline methods in multi-organ abdominal registration, with a slight drop in lung alignment performance. On out-of-distribution datasets, it achieved competitive results compared to leading single-organ models, despite not being fine-tuned for those tasks, demonstrating strong generalizability. The source code will be publicly available at: https://github.com/DIAGNijmegen/oncology_image_registration.git.
Submitted 6 August, 2025;
originally announced August 2025.
-
Leveraging Open-Source Large Language Models for Clinical Information Extraction in Resource-Constrained Settings
Authors:
Luc Builtjes,
Joeran Bosma,
Mathias Prokop,
Bram van Ginneken,
Alessa Hering
Abstract:
Medical reports contain rich clinical information but are often unstructured and written in domain-specific language, posing challenges for information extraction. While proprietary large language models (LLMs) have shown promise in clinical natural language processing, their lack of transparency and data privacy concerns limit their utility in healthcare. This study therefore evaluates nine open-source generative LLMs on the DRAGON benchmark, which includes 28 clinical information extraction tasks in Dutch. We developed \texttt{llm\_extractinator}, a publicly available framework for information extraction using open-source generative LLMs, and used it to assess model performance in a zero-shot setting. Several 14 billion parameter models, Phi-4-14B, Qwen-2.5-14B, and DeepSeek-R1-14B, achieved competitive results, while the bigger Llama-3.3-70B model achieved slightly higher performance at greater computational cost. Translation to English prior to inference consistently degraded performance, highlighting the need for native-language processing. These findings demonstrate that open-source LLMs, when used with our framework, offer effective, scalable, and privacy-conscious solutions for clinical information extraction in low-resource settings.
Submitted 28 July, 2025;
originally announced July 2025.
-
Unstable Prompts, Unreliable Segmentations: A Challenge for Longitudinal Lesion Analysis
Authors:
Niels Rocholl,
Ewoud Smit,
Mathias Prokop,
Alessa Hering
Abstract:
Longitudinal lesion analysis is crucial for oncological care, yet automated tools often struggle with temporal consistency. While universal lesion segmentation models have advanced, they are typically designed for single time points. This paper investigates the performance of the ULS23 segmentation model in a longitudinal context. Using a public clinical dataset of baseline and follow-up CT scans, we evaluated the model's ability to segment and track lesions over time. We identified two critical, interconnected failure modes: a sharp degradation in segmentation quality in follow-up cases due to inter-scan registration errors, and a subsequent breakdown of the lesion correspondence process. To systematically probe this vulnerability, we conducted a controlled experiment where we artificially displaced the input volume relative to the true lesion center. Our results demonstrate that the model's performance is highly dependent on its assumption of a centered lesion; segmentation accuracy collapses when the lesion is sufficiently displaced. These findings reveal a fundamental limitation of applying single-timepoint models to longitudinal data. We conclude that robust oncological tracking requires a paradigm shift away from cascading single-purpose tools towards integrated, end-to-end models inherently designed for temporal analysis.
Submitted 25 July, 2025;
originally announced July 2025.
-
Robust Kidney Abnormality Segmentation: A Validation Study of an AI-Based Framework
Authors:
Sarah de Boer,
Hartmut Häntze,
Kiran Vaidhya Venkadesh,
Myrthe A. D. Buser,
Gabriel E. Humpire Mamani,
Lina Xu,
Lisa C. Adams,
Jawed Nawabi,
Keno K. Bressem,
Bram van Ginneken,
Mathias Prokop,
Alessa Hering
Abstract:
Kidney abnormality segmentation has important potential to enhance the clinical workflow, especially in settings requiring quantitative assessments. Kidney volume could serve as an important biomarker for renal diseases, with changes in volume correlating directly with kidney function. Currently, clinical practice often relies on subjective visual assessment for evaluating kidney size and abnormalities, including tumors and cysts, which are typically staged based on diameter, volume, and anatomical location. To support a more objective and reproducible approach, this research aims to develop a robust, thoroughly validated kidney abnormality segmentation algorithm, made publicly available for clinical and research use. We employ publicly available training datasets and leverage the state-of-the-art medical image segmentation framework nnU-Net. Validation is conducted using both proprietary and public test datasets, with segmentation performance quantified by Dice coefficient and the 95th percentile Hausdorff distance. Furthermore, we analyze robustness across subgroups based on patient sex, age, CT contrast phases, and tumor histologic subtypes. Our findings demonstrate that our segmentation algorithm, trained exclusively on publicly available data, generalizes effectively to external test sets and outperforms existing state-of-the-art models across all tested datasets. Subgroup analyses reveal consistent high performance, indicating strong robustness and reliability. The developed algorithm and associated code are publicly accessible at https://github.com/DIAGNijmegen/oncology-kidney-abnormality-segmentation.
Submitted 12 May, 2025;
originally announced May 2025.
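The abstract above reports segmentation quality with the Dice coefficient. As a minimal sketch of how that metric is defined (not the authors' evaluation code), the snippet below computes Dice on two flattened binary masks. The function name, the toy masks, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical toy masks (flattened); real masks would be 3D volumes.
pred_mask = [1, 1, 0, 0, 1]
true_mask = [1, 0, 0, 0, 1]
print(dice_coefficient(pred_mask, true_mask))  # 2*2 / (3+2) = 0.8
```

A score of 1.0 means perfect overlap and 0.0 means none; small structures tend to score lower because a few misclassified voxels change the ratio substantially.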
-
Divide to Conquer: A Field Decomposition Approach for Multi-Organ Whole-Body CT Image Registration
Authors:
Xuan Loc Pham,
Mathias Prokop,
Bram van Ginneken,
Alessa Hering
Abstract:
Image registration is an essential technique for the analysis of Computed Tomography (CT) images in clinical practice. However, existing methodologies are predominantly tailored to a specific organ of interest and often exhibit lower performance on other organs, thus limiting their generalizability and applicability. Multi-organ registration addresses these limitations, but the simultaneous alignment of multiple organs with diverse shapes, sizes and locations requires a highly complex deformation field with a multi-layer composition of individual deformations. This study introduces a novel field decomposition approach to address the high complexity of deformations in multi-organ whole-body CT image registration. The proposed method is trained and evaluated on a longitudinal dataset of 691 patients, each with two CT images obtained at distinct time points. These scans fully encompass the thoracic, abdominal, and pelvic regions. Two baseline registration methods are selected for this study: one based on optimization techniques and another based on deep learning. Experimental results demonstrate that the proposed approach outperforms baseline methods in handling complex deformations in multi-organ whole-body CT image registration.
Submitted 28 March, 2025;
originally announced March 2025.
-
SemML: Enhancing Automata-Theoretic LTL Synthesis with Machine Learning
Authors:
Jan Kretinsky,
Tobias Meggendorfer,
Maximilian Prokop,
Ashkan Zarkhah
Abstract:
Synthesizing a reactive system from specifications given in linear temporal logic (LTL) is a classical problem, finding its applications in safety-critical systems design. We present our tool SemML, which won this year's LTL realizability tracks of SYNTCOMP, after years of domination by Strix. While both tools are based on the automata-theoretic approach, ours relies heavily on (i) Semantic labelling, additional information of logical nature, coming from recent LTL-to-automata translations and decorating the resulting parity game, and (ii) Machine Learning approaches turning this information into a guidance oracle for on-the-fly exploration of the parity game (whence the name SemML). Our tool fills the gaps in previous suggestions to use such an oracle and provides an efficient implementation with additional algorithmic improvements. We evaluate SemML both on the entire set of SYNTCOMP as well as a synthetic data set, compare it to Strix, and analyze the advantages and limitations. As SemML solves more instances on SYNTCOMP and does so significantly faster on larger instances, this demonstrates for the first time that machine-learning-aided approaches can out-perform state-of-the-art tools in real LTL synthesis.
Submitted 17 April, 2025; v1 submitted 29 January, 2025;
originally announced January 2025.
-
The ULS23 Challenge: a Baseline Model and Benchmark Dataset for 3D Universal Lesion Segmentation in Computed Tomography
Authors:
M. J. J. de Grauw,
E. Th. Scholten,
E. J. Smit,
M. J. C. M. Rutten,
M. Prokop,
B. van Ginneken,
A. Hering
Abstract:
Size measurements of tumor manifestations on follow-up CT examinations are crucial for evaluating treatment outcomes in cancer patients. Efficient lesion segmentation can speed up these radiological workflows. While numerous benchmarks and challenges address lesion segmentation in specific organs like the liver, kidneys, and lungs, the larger variety of lesion types encountered in clinical practice demands a more universal approach. To address this gap, we introduced the ULS23 benchmark for 3D universal lesion segmentation in chest-abdomen-pelvis CT examinations. The ULS23 training dataset contains 38,693 lesions across this region, including challenging pancreatic, colon and bone lesions. For evaluation purposes, we curated a dataset comprising 775 lesions from 284 patients. Each of these lesions was identified as a target lesion in a clinical context, ensuring diversity and clinical relevance within this dataset. The ULS23 benchmark is publicly accessible via uls23.grand-challenge.org, enabling researchers worldwide to assess the performance of their segmentation methods. Furthermore, we have developed and publicly released our baseline semi-supervised 3D lesion segmentation model. This model achieved an average Dice coefficient of 0.703 $\pm$ 0.240 on the challenge test set. We invite ongoing submissions to advance the development of future ULS models.
Submitted 21 June, 2024; v1 submitted 7 June, 2024;
originally announced June 2024.
-
MRSegmentator: Multi-Modality Segmentation of 40 Classes in MRI and CT
Authors:
Hartmut Häntze,
Lina Xu,
Christian J. Mertens,
Felix J. Dorfner,
Leonhard Donle,
Felix Busch,
Avan Kader,
Sebastian Ziegelmayer,
Nadine Bayerl,
Nassir Navab,
Daniel Rueckert,
Julia Schnabel,
Hugo JWL Aerts,
Daniel Truhn,
Fabian Bamberg,
Jakob Weiß,
Christopher L. Schlett,
Steffen Ringhof,
Thoralf Niendorf,
Tobias Pischon,
Hans-Ulrich Kauczor,
Tobias Nonnenmacher,
Thomas Kröncke,
Henry Völzke,
Jeanette Schulz-Menger,
et al. (7 additional authors not shown)
Abstract:
Purpose: To develop and evaluate a deep learning model for multi-organ segmentation of MRI scans.
Materials and Methods: The model was trained on 1,200 manually annotated 3D axial MRI scans from the UK Biobank, 221 in-house MRI scans, and 1,228 CT scans from the TotalSegmentator dataset. A human-in-the-loop annotation workflow was employed, leveraging cross-modality transfer learning from an existing CT segmentation model to segment 40 anatomical structures. The annotation process began with a model based on transfer learning between CT and MR, which was iteratively refined based on manual corrections to predicted segmentations. The model's performance was evaluated on MRI examinations obtained from the German National Cohort (NAKO) study (n=900), from the AMOS22 dataset (n=60), and from the TotalSegmentator-MRI test data (n=29). The Dice Similarity Coefficient (DSC) and Hausdorff Distance (HD) were used to assess segmentation quality, stratified by organ and scan type. The model and its weights will be open-sourced.
Results: MRSegmentator demonstrated high accuracy for well-defined organs (lungs: DSC 0.96, heart: DSC 0.94) and organs with anatomic variability (liver: DSC 0.96, kidneys: DSC 0.95). Smaller structures showed lower accuracy (portal/splenic veins: DSC 0.64, adrenal glands: DSC 0.69). On external validation using NAKO data, mean DSC ranged from 0.85 $\pm$ 0.08 for T2-HASTE to 0.91 $\pm$ 0.05 for in-phase sequences. The model generalized well to CT, achieving mean DSC of 0.84 $\pm$ 0.11 on AMOS CT data.
Conclusion: MRSegmentator accurately segments 40 anatomical structures in MRI across diverse datasets and imaging protocols, with additional generalizability to CT images. This open-source model will provide a valuable tool for automated multi-organ segmentation in medical imaging research. It can be downloaded from https://github.com/hhaentze/MRSegmentator.
Submitted 14 November, 2024; v1 submitted 10 May, 2024;
originally announced May 2024.
-
Grover's oracle for the Shortest Vector Problem and its application in hybrid classical-quantum solvers
Authors:
Milos Prokop,
Petros Wallden,
David Joseph
Abstract:
Finding the shortest vector in a lattice is a problem that is believed to be hard both for classical and quantum computers. Many major post-quantum secure cryptosystems base their security on the hardness of the Shortest Vector Problem (SVP). Finding the best classical, quantum or hybrid classical-quantum algorithms for SVP is necessary to select cryptosystem parameters that offer a sufficient level of security. Grover's search quantum algorithm provides a generic quadratic speed-up, given access to an oracle implementing some function which describes when a solution is found. In this paper we provide a concrete implementation of such an oracle for the SVP. We define the circuit, and evaluate costs in terms of number of qubits, number of gates, depth and T-quantum cost. We then analyze how to combine Grover's quantum search for small SVP instances with state-of-the-art classical solvers that use well-known algorithms, such as BKZ, where the former is used as a subroutine. This could enable solving larger instances of SVP with higher probability than classical state-of-the-art records, but still very far from posing any threat to cryptosystems being considered for standardization. Depending on the technology available, there is a spectrum of trade-offs in creating this combination.
Submitted 21 February, 2024;
originally announced February 2024.
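The abstract above relies on the generic quadratic speed-up of Grover's search. As a rough illustration of what "quadratic" means here (a textbook estimate, not the oracle circuit or cost model from the paper), the sketch below computes the standard near-optimal number of Grover iterations, floor((π/4)·sqrt(N/M)), for a search space of size N containing M marked items. The function name and the example sizes are hypothetical.

```python
import math


def grover_iterations(search_space_size: int, num_solutions: int = 1) -> int:
    """Textbook near-optimal Grover iteration count: floor(pi/4 * sqrt(N/M))."""
    if not 0 < num_solutions <= search_space_size:
        raise ValueError("need 0 < num_solutions <= search_space_size")
    ratio = search_space_size / num_solutions
    return math.floor((math.pi / 4) * math.sqrt(ratio))


# Compare against the roughly N/2 oracle queries an unstructured
# classical search needs on average to find a single marked item.
for n_bits in (10, 20, 30):
    n = 2 ** n_bits
    print(n_bits, grover_iterations(n), n // 2)
```

The iteration count grows with the square root of the search space, which is why the cost of each oracle call (qubits, gates, depth) determines whether the speed-up is useful in practice.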
-
Transfer learning from a sparsely annotated dataset of 3D medical images
Authors:
Gabriel Efrain Humpire-Mamani,
Colin Jacobs,
Mathias Prokop,
Bram van Ginneken,
Nikolas Lessmann
Abstract:
Transfer learning leverages pre-trained model features from a large dataset to save time and resources when training new models for various tasks, potentially enhancing performance. Due to the lack of large datasets in the medical imaging domain, transfer learning from one medical imaging model to other medical imaging models has not been widely explored. This study explores the use of transfer learning to improve the performance of deep convolutional neural networks for organ segmentation in medical imaging. A base segmentation model (3D U-Net) was trained on a large and sparsely annotated dataset; its weights were used for transfer learning on four new down-stream segmentation tasks for which a fully annotated dataset was available. We analyzed the training set size's influence to simulate scarce data. The results showed that transfer learning from the base model was beneficial when small datasets were available, providing significant performance improvements, and that fine-tuning the base model was more beneficial than updating all the network weights with vanilla transfer learning. Transfer learning with fine-tuning increased the Dice score by up to 0.129 (+28\%) compared with experiments trained from scratch, and by 0.029 on average across 23 experiments in the new segmentation tasks. The study also showed that cross-modality transfer learning using CT scans was beneficial. The findings of this study demonstrate the potential of transfer learning to improve the efficiency of annotation and increase the accessibility of accurate organ segmentation in medical imaging, ultimately leading to improved patient care. We made the network definition and weights publicly available to benefit other users and researchers.
Submitted 8 November, 2023;
originally announced November 2023.
-
Kidney abnormality segmentation in thorax-abdomen CT scans
Authors:
Gabriel Efrain Humpire Mamani,
Nikolas Lessmann,
Ernst Th. Scholten,
Mathias Prokop,
Colin Jacobs,
Bram van Ginneken
Abstract:
In this study, we introduce a deep learning approach for segmenting kidney parenchyma and kidney abnormalities to support clinicians in identifying and quantifying renal abnormalities such as cysts, lesions, masses, metastases, and primary tumors. Our end-to-end segmentation method was trained on 215 contrast-enhanced thoracic-abdominal CT scans, with half of these scans containing one or more abnormalities.
We began by implementing our own version of the original 3D U-Net network and incorporated four additional components: an end-to-end multi-resolution approach, a set of task-specific data augmentations, a modified loss function using top-$k$, and spatial dropout. Furthermore, we devised a tailored post-processing strategy. Ablation studies demonstrated that each of the four modifications enhanced kidney abnormality segmentation performance, while three out of four improved kidney parenchyma segmentation. Subsequently, we trained the nnUNet framework on our dataset. By ensembling the optimized 3D U-Net and the nnUNet with our specialized post-processing, we achieved marginally superior results.
Our best-performing model attained Dice scores of 0.965 and 0.947 for segmenting kidney parenchyma in two test sets (20 scans without abnormalities and 30 with abnormalities), outperforming an independent human observer who scored 0.944 and 0.925, respectively. In segmenting kidney abnormalities within the 30 test scans containing them, the top-performing method achieved a Dice score of 0.585, while an independent second human observer reached a score of 0.664, suggesting potential for further improvement in computerized methods.
All training data is available to the research community under a CC-BY 4.0 license on https://doi.org/10.5281/zenodo.8014289
Submitted 6 September, 2023;
originally announced September 2023.
-
Guessing Winning Policies in LTL Synthesis by Semantic Learning
Authors:
Jan Kretinsky,
Tobias Meggendorfer,
Maximilian Prokop,
Sabine Rieder
Abstract:
We provide a learning-based technique for guessing a winning strategy in a parity game originating from an LTL synthesis problem. A cheaply obtained guess can be useful in several applications. Not only can the guessed strategy be applied as best-effort in cases where the game's huge size prohibits rigorous approaches, but it can also increase the scalability of rigorous LTL synthesis in several ways. Firstly, checking whether a guessed strategy is winning is easier than constructing one. Secondly, even if the guess is wrong in some places, it can be fixed by strategy iteration faster than constructing one from scratch. Thirdly, the guess can be used in on-the-fly approaches to prioritize exploration in the most fruitful directions.
In contrast to previous works, we (i) reflect the highly structured logical information in the game's states, the so-called semantic labelling, coming from the recent LTL-to-automata translations, and (ii) learn to reflect it properly by learning from previously solved games, bringing the solving process closer to human-like reasoning.
Submitted 24 May, 2023;
originally announced May 2023.
-
Variational quantum solutions to the Shortest Vector Problem
Authors:
Martin R. Albrecht,
Miloš Prokop,
Yixin Shen,
Petros Wallden
Abstract:
A fundamental computational problem is to find a shortest non-zero vector in Euclidean lattices, a problem known as the Shortest Vector Problem (SVP). This problem is believed to be hard even on quantum computers and thus plays a pivotal role in post-quantum cryptography. In this work we explore how (efficiently) Noisy Intermediate Scale Quantum (NISQ) devices may be used to solve SVP. Specifically, we map the problem to that of finding the ground state of a suitable Hamiltonian. In particular, (i) we establish new bounds for lattice enumeration, which allows us to obtain new bounds (resp. estimates) for the number of qubits required per dimension for any lattices (resp. random q-ary lattices) to solve SVP; (ii) we exclude the zero vector from the optimization space by proposing (a) a different classical optimisation loop or alternatively (b) a new mapping to the Hamiltonian. These improvements allow us to solve SVP in dimension up to 28 in a quantum emulation, significantly more than what was previously achieved, even for special cases. Finally, we extrapolate the size of NISQ devices that is required to be able to solve instances of lattices that are hard even for the best classical algorithms and find that with approximately $10^3$ noisy qubits such instances can be tackled.
Submitted 23 February, 2023; v1 submitted 14 February, 2022;
originally announced February 2022.
-
Automated Estimation of Total Lung Volume using Chest Radiographs and Deep Learning
Authors:
Ecem Sogancioglu,
Keelin Murphy,
Ernst Th. Scholten,
Luuk H. Boulogne,
Mathias Prokop,
Bram van Ginneken
Abstract:
Total lung volume is an important quantitative biomarker and is used for the assessment of restrictive lung diseases. In this study, we investigate the performance of several deep-learning approaches for automated measurement of total lung volume from chest radiographs. 7621 posteroanterior and lateral view chest radiographs (CXR) were collected from patients with chest CT available. Similarly, 928 CXR studies were chosen from patients with pulmonary function test (PFT) results. The reference total lung volume was calculated from lung segmentation on CT or PFT data, respectively. This dataset was used to train deep-learning architectures to predict total lung volume from chest radiographs. The experiments were constructed in a step-wise fashion with increasing complexity to demonstrate the effect of training with CT-derived labels only and the sources of error. The optimal models were tested on 291 CXR studies with reference lung volume obtained from PFT. The optimal deep-learning regression model showed an MAE of 408 ml and a MAPE of 8.1\% and Pearson's r = 0.92 using both frontal and lateral chest radiographs as input. CT-derived labels were useful for pre-training but the optimal performance was obtained by fine-tuning the network with PFT-derived labels. We demonstrate, for the first time, that state-of-the-art deep learning solutions can accurately measure total lung volume from plain chest radiographs. The proposed model can be used to obtain total lung volume from routinely acquired chest radiographs at no additional cost and could be a useful tool to identify trends over time in patients referred regularly for chest x-rays.
Submitted 3 May, 2021;
originally announced May 2021.
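The abstract above summarises regression accuracy with the mean absolute error (MAE) and the mean absolute percentage error (MAPE). The following is a minimal sketch of both definitions, not the authors' evaluation pipeline. The function names and the lung-volume values in millilitres are hypothetical.

```python
from typing import Sequence


def mean_absolute_error(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |predicted - reference|, in the unit of the inputs."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for r, p in zip(reference, predicted)) / len(reference)


def mean_absolute_percentage_error(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of |predicted - reference| / |reference|, as a percentage."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(r == 0 for r in reference):
        raise ValueError("MAPE is undefined when a reference value is zero")
    ratios = [abs(p - r) / abs(r) for r, p in zip(reference, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical reference and predicted total lung volumes in ml.
reference_ml = [4000.0, 5000.0, 6000.0]
predicted_ml = [4400.0, 4750.0, 6000.0]
print(round(mean_absolute_error(reference_ml, predicted_ml), 1))
print(round(mean_absolute_percentage_error(reference_ml, predicted_ml), 1))
```

MAE keeps the physical unit (ml), while MAPE normalises each error by the reference value, so the same absolute error weighs more heavily for small lungs.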
-
Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the LUNA16 challenge
Authors:
Arnaud Arindra Adiyoso Setio,
Alberto Traverso,
Thomas de Bel,
Moira S. N. Berens,
Cas van den Bogaard,
Piergiorgio Cerello,
Hao Chen,
Qi Dou,
Maria Evelina Fantacci,
Bram Geurts,
Robbert van der Gugten,
Pheng Ann Heng,
Bart Jansen,
Michael M. J. de Kaste,
Valentin Kotov,
Jack Yu-Hung Lin,
Jeroen T. M. C. Manders,
Alexander Sónora-Mengana,
Juan Carlos García-Naranjo,
Evgenia Papavasileiou,
Mathias Prokop,
Marco Saletta,
Cornelia M Schaefer-Prokop,
Ernst T. Scholten,
Luuk Scholten,
et al. (7 additional authors not shown)
Abstract:
Automatic detection of pulmonary nodules in thoracic computed tomography (CT) scans has been an active area of research for the last two decades. However, there have been only a few studies that provide a comparative performance evaluation of different systems on a common database. We have therefore set up the LUNA16 challenge, an objective evaluation framework for automatic nodule detection algorithms using the largest publicly available reference database of chest CT scans, the LIDC-IDRI data set. In LUNA16, participants develop their algorithm and upload their predictions on 888 CT scans in one of the two tracks: 1) the complete nodule detection track where a complete CAD system should be developed, or 2) the false positive reduction track where a provided set of nodule candidates should be classified. This paper describes the setup of LUNA16 and presents the results of the challenge so far. Moreover, the impact of combining individual systems on the detection performance was also investigated. It was observed that the leading solutions employed convolutional networks and used the provided set of nodule candidates. The combination of these solutions achieved an excellent sensitivity of over 95% at fewer than 1.0 false positives per scan. This highlights the potential of combining algorithms to improve the detection performance. Our observer study with four expert readers has shown that the best system detects nodules that were missed by expert readers who originally annotated the LIDC-IDRI data. We released this set of additional nodules for further development of CAD systems.
Submitted 15 July, 2017; v1 submitted 23 December, 2016;
originally announced December 2016.
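The abstract above quotes detection performance as sensitivity at a given number of false positives per scan. As a small sketch of how such an operating point is computed from raw counts (not the official LUNA16 evaluation script), the snippet below derives both quantities. The function name and all counts are hypothetical, apart from the 888 scans mentioned in the abstract.

```python
from typing import Tuple


def detection_operating_point(
    true_positives: int,
    false_negatives: int,
    false_positives: int,
    num_scans: int,
) -> Tuple[float, float]:
    """Return (sensitivity, false positives per scan) for one threshold."""
    num_nodules = true_positives + false_negatives
    if num_nodules <= 0 or num_scans <= 0:
        raise ValueError("need at least one reference nodule and one scan")
    sensitivity = true_positives / num_nodules
    fps_per_scan = false_positives / num_scans
    return sensitivity, fps_per_scan


# Hypothetical counts at a single detection threshold on 888 scans.
sens, fps = detection_operating_point(
    true_positives=950, false_negatives=50, false_positives=444, num_scans=888
)
print(sens, fps)  # 0.95 0.5
```

Sweeping the detection threshold and recording one such pair per threshold traces out a free-response ROC (FROC) curve, which is the usual way these two numbers are reported together.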
-
Towards automatic pulmonary nodule management in lung cancer screening with deep learning
Authors:
Francesco Ciompi,
Kaman Chung,
Sarah J. van Riel,
Arnaud Arindra Adiyoso Setio,
Paul K. Gerke,
Colin Jacobs,
Ernst Th. Scholten,
Cornelia Schaefer-Prokop,
Mathilde M. W. Wille,
Alfonso Marchiano,
Ugo Pastorino,
Mathias Prokop,
Bram van Ginneken
Abstract:
The introduction of lung cancer screening programs will produce an unprecedented amount of chest CT scans in the near future, which radiologists will have to read in order to decide on a patient follow-up strategy. According to the current guidelines, the workup of screen-detected nodules strongly relies on nodule size and nodule type. In this paper, we present a deep learning system based on multi-stream multi-scale convolutional networks, which automatically classifies all nodule types relevant for nodule workup. The system processes raw CT data containing a nodule without the need for any additional information such as nodule segmentation or nodule size and learns a representation of 3D data by analyzing an arbitrary number of 2D views of a given nodule. The deep learning system was trained with data from the Italian MILD screening trial and validated on an independent set of data from the Danish DLCST screening trial. We analyze the advantage of processing nodules at multiple scales with a multi-stream convolutional network architecture, and we show that the proposed deep learning system achieves nodule-type classification performance that surpasses that of classical machine learning approaches and is within the inter-observer variability among four experienced human observers.
Submitted 23 May, 2017; v1 submitted 28 October, 2016;
originally announced October 2016.