-
When are Diffusion Priors Helpful in Sparse Reconstruction? A Study with Sparse-view CT
Authors:
Matt Y. Cheung,
Sophia Zorek,
Tucker J. Netherton,
Laurence E. Court,
Sadeer Al-Kindi,
Ashok Veeraraghavan,
Guha Balakrishnan
Abstract:
Diffusion models demonstrate state-of-the-art performance on image generation, and are gaining traction for sparse medical image reconstruction tasks. However, compared to classical reconstruction algorithms relying on simple analytical priors, diffusion models have the dangerous property of producing realistic-looking results even when incorrect, particularly with few observations. We investigate the utility of diffusion models as priors for image reconstruction by varying the number of observations and comparing their performance to classical priors (sparse and Tikhonov regularization) using pixel-based, structural, and downstream metrics. We make comparisons on low-dose chest wall computed tomography (CT) for fat mass quantification. First, we find that classical priors are superior to diffusion priors when the number of projections is "sufficient". Second, we find that diffusion priors can capture a large amount of detail with very few observations, significantly outperforming classical priors. However, they fall short of capturing all details, even with many observations. Finally, we find that the performance of diffusion priors plateaus after extremely few ($\approx$10-15) projections. Ultimately, our work highlights potential issues with diffusion-based sparse reconstruction and underscores the importance of further investigation, particularly in high-stakes clinical settings.
Submitted 4 February, 2025;
originally announced February 2025.
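For readers unfamiliar with the two classical priors named in the abstract above, the following is a minimal sketch of Tikhonov and sparse (L1) regularized reconstruction. It is an illustration only, not the authors' implementation: the random matrix `A` stands in for a CT forward operator, and the names `tikhonov`, `ista`, `l1_objective`, the value `lam=0.1`, and the use of ISTA as the L1 solver are all assumptions made for this example.

```python
import numpy as np


def tikhonov(A, y, lam):
    """Solve min_x ||A x - y||^2 + lam * ||x||^2 via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)


def ista(A, y, lam, n_iter=500):
    """Approximately solve min_x 0.5 * ||A x - y||^2 + lam * ||x||_1 with ISTA."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))  # gradient step on the data term
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x


def l1_objective(A, y, x, lam):
    """Objective value minimized by ista()."""
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))


# Hypothetical toy problem: 20 measurements of a 50-dimensional sparse signal.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 50))  # stand-in for a sparse-view forward operator
x_true = np.zeros(50)
x_true[[3, 17, 41]] = [1.0, -2.0, 1.5]
y = A @ x_true

x_tik = tikhonov(A, y, lam=0.1)
x_l1 = ista(A, y, lam=0.1)
```

Both solvers take the same measurements `y`; they differ only in the penalty placed on `x`, which is what "prior" refers to in this setting.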
-
Regression Conformal Prediction under Bias
Authors:
Matt Y. Cheung,
Tucker J. Netherton,
Laurence E. Court,
Ashok Veeraraghavan,
Guha Balakrishnan
Abstract:
Uncertainty quantification is crucial to account for the imperfect predictions of machine learning algorithms for high-impact applications. Conformal prediction (CP) is a powerful framework for uncertainty quantification that generates calibrated prediction intervals with valid coverage. In this work, we study how CP intervals are affected by bias - the systematic deviation of a prediction from ground truth values - a phenomenon prevalent in many real-world applications. We investigate the influence of bias on interval lengths of two different types of adjustments -- symmetric adjustments, the conventional method where both sides of the interval are adjusted equally, and asymmetric adjustments, a more flexible method where the interval can be adjusted unequally in positive or negative directions. We present theoretical and empirical analyses characterizing how symmetric and asymmetric adjustments impact the "tightness" of CP intervals for regression tasks. Specifically for absolute residual and quantile-based non-conformity scores, we prove: 1) the upper bound of symmetrically adjusted interval lengths increases by $2|b|$ where $b$ is a globally applied scalar value representing bias, 2) asymmetrically adjusted interval lengths are not affected by bias, and 3) conditions when asymmetrically adjusted interval lengths are guaranteed to be smaller than symmetric ones. Our analyses suggest that even if predictions exhibit significant drift from ground truth values, asymmetrically adjusted intervals are still able to maintain the same tightness and validity of intervals as if the drift had never happened, while symmetric ones significantly inflate the lengths. We demonstrate our theoretical results with two real-world prediction tasks: sparse-view computed tomography (CT) reconstruction and time-series weather forecasting. Our work paves the way for more bias-robust machine learning systems.
Submitted 7 October, 2024;
originally announced October 2024.
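The difference between symmetric and asymmetric adjustments described in the abstract above can be sketched with split conformal prediction on signed residuals. This is an illustrative sketch under stated assumptions, not the authors' code: the helper names, the synthetic Gaussian residuals, the miscoverage level `alpha = 0.1`, and the bias `b = 3.0` are all hypothetical, and only the absolute-residual score is shown.

```python
import math

import numpy as np


def conformal_quantile(scores, level):
    """Return the k-th smallest score with k = ceil((n + 1) * level), capped at n."""
    s = np.sort(np.asarray(scores, dtype=float))
    n = s.size
    k = min(n, math.ceil((n + 1) * level))
    return s[k - 1]


def symmetric_length(residuals, alpha):
    """Length of [pred - q, pred + q], with q calibrated on |residual|."""
    q = conformal_quantile(np.abs(residuals), 1 - alpha)
    return 2 * q


def asymmetric_length(residuals, alpha):
    """Length of [pred + lo, pred + hi], with each side calibrated separately."""
    hi = conformal_quantile(residuals, 1 - alpha / 2)
    lo = -conformal_quantile(-residuals, 1 - alpha / 2)
    return hi - lo


# Hypothetical calibration residuals (ground truth minus prediction).
rng = np.random.default_rng(0)
residuals = rng.normal(size=1000)
alpha = 0.1
b = 3.0  # assumed global bias added to every prediction
biased_residuals = residuals - b

sym_clean = symmetric_length(residuals, alpha)
sym_biased = symmetric_length(biased_residuals, alpha)
asym_clean = asymmetric_length(residuals, alpha)
asym_biased = asymmetric_length(biased_residuals, alpha)
```

In this toy setting the asymmetric length is unchanged by the shift, because both calibrated endpoints move by the same amount, while the symmetric length grows by at most `2 * abs(b)`, in line with the bounds stated in the abstract.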
-
Dimensionality Reduction and Nearest Neighbors for Improving Out-of-Distribution Detection in Medical Image Segmentation
Authors:
McKell Woodland,
Nihil Patel,
Austin Castelo,
Mais Al Taie,
Mohamed Eltaher,
Joshua P. Yung,
Tucker J. Netherton,
Tiffany L. Calderone,
Jessica I. Sanchez,
Darrel W. Cleere,
Ahmed Elsaiey,
Nakul Gupta,
David Victor,
Laura Beretta,
Ankit B. Patel,
Kristy K. Brock
Abstract:
Clinically deployed deep learning-based segmentation models are known to fail on data outside of their training distributions. While clinicians review the segmentations, these models tend to perform well in most instances, which could exacerbate automation bias. Therefore, detecting out-of-distribution images at inference is critical to warn the clinicians that the model likely failed. This work applied the Mahalanobis distance (MD) post hoc to the bottleneck features of four Swin UNETR and nnU-Net models that segmented the liver on T1-weighted magnetic resonance imaging and computed tomography. By reducing the dimensions of the bottleneck features with either principal component analysis or uniform manifold approximation and projection, images the models failed on were detected with high performance and minimal computational load. In addition, this work explored a non-parametric alternative to the MD, a k-th nearest neighbors distance (KNN). KNN drastically improved scalability and performance over MD when both were applied to raw and average-pooled bottleneck features.
Submitted 2 October, 2024; v1 submitted 5 August, 2024;
originally announced August 2024.
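The two out-of-distribution scores named in the abstract above (Mahalanobis distance after dimensionality reduction, and k-th nearest neighbor distance) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the synthetic "bottleneck features", the latent dimension of 4, the choice `k = 5`, and all function names are assumptions, and PCA is computed directly with an SVD rather than with any particular library.

```python
import numpy as np


def fit_pca(features, n_components):
    """Return the training mean and the top principal directions (rows)."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(features, mean, components):
    """Project features onto the fitted principal directions."""
    return (features - mean) @ components.T


def mahalanobis_scores(train, test):
    """Mahalanobis distance of each test row to the training distribution."""
    mu = train.mean(axis=0)
    inv_cov = np.linalg.pinv(np.atleast_2d(np.cov(train, rowvar=False)))
    diff = test - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))


def knn_scores(train, test, k):
    """Euclidean distance from each test row to its k-th nearest training row."""
    dists = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=2)
    return np.sort(dists, axis=1)[:, k - 1]


# Hypothetical features: 32-dimensional, driven by 4 latent factors plus noise.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(4, 32))


def make_features(n, shift):
    latent = rng.normal(size=(n, 4)) + shift
    return latent @ mixing + 0.05 * rng.normal(size=(n, 32))


train = make_features(200, shift=0.0)
test_in = make_features(50, shift=0.0)   # in-distribution stand-in
test_ood = make_features(50, shift=8.0)  # shifted, out-of-distribution stand-in

mean, components = fit_pca(train, n_components=4)
z_train = project(train, mean, components)
z_in = project(test_in, mean, components)
z_ood = project(test_ood, mean, components)

md_in, md_ood = mahalanobis_scores(z_train, z_in), mahalanobis_scores(z_train, z_ood)
knn_in, knn_ood = knn_scores(z_train, z_in, k=5), knn_scores(z_train, z_ood, k=5)
```

A higher score means the image is further from the training features under that measure; in practice a threshold on the score would be chosen on held-out data. The k-th nearest neighbor score needs no covariance estimate, which is one reason it can be applied directly to raw features.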
-
Metric-Guided Conformal Bounds for Probabilistic Image Reconstruction
Authors:
Matt Y Cheung,
Tucker J Netherton,
Laurence E Court,
Ashok Veeraraghavan,
Guha Balakrishnan
Abstract:
Modern deep learning reconstruction algorithms generate impressively realistic scans from sparse inputs, but can often produce significant inaccuracies. This makes it difficult to provide statistically guaranteed claims about the true state of a subject from scans reconstructed by these algorithms. In this study, we propose a framework for computing provably valid prediction bounds on claims derived from probabilistic black-box image reconstruction algorithms. The key insights behind our framework are to represent reconstructed scans with a derived clinical metric of interest, and to calibrate bounds on the ground truth metric with conformal prediction (CP) using a prior calibration dataset. These bounds convey interpretable feedback about the subject's state, and can also be used to retrieve nearest-neighbor reconstructed scans for visual inspection. We demonstrate the utility of this framework on sparse-view computed tomography (CT) for fat mass quantification and radiotherapy planning tasks. Results show that our framework produces bounds with better semantic interpretation than conventional pixel-based bounding approaches. Furthermore, we can flag dangerous outlier reconstructions that look plausible but have statistically unlikely metric values.
Submitted 3 March, 2025; v1 submitted 23 April, 2024;
originally announced April 2024.
-
Evolving Horizons in Radiotherapy Auto-Contouring: Distilling Insights, Embracing Data-Centric Frameworks, and Moving Beyond Geometric Quantification
Authors:
Kareem A. Wahid,
Carlos E. Cardenas,
Barbara Marquez,
Tucker J. Netherton,
Benjamin H. Kann,
Laurence E. Court,
Renjie He,
Mohamed A. Naser,
Amy C. Moreno,
Clifton D. Fuller,
David Fuentes
Abstract:
Deep learning has significantly advanced the potential for automated contouring in radiotherapy planning. In this manuscript, guided by contemporary literature, we underscore three key insights: (1) High-quality training data is essential for auto-contouring algorithms; (2) Auto-contouring models demonstrate commendable performance even with limited medical image data; (3) The quantitative performance of auto-contouring is reaching a plateau. Given these insights, we emphasize the need for the radiotherapy research community to embrace data-centric approaches to further foster clinical adoption of auto-contouring technologies.
Submitted 16 October, 2023;
originally announced October 2023.
-
Dimensionality Reduction for Improving Out-of-Distribution Detection in Medical Image Segmentation
Authors:
McKell Woodland,
Nihil Patel,
Mais Al Taie,
Joshua P. Yung,
Tucker J. Netherton,
Ankit B. Patel,
Kristy K. Brock
Abstract:
Clinically deployed segmentation models are known to fail on data outside of their training distribution. As these models perform well on most cases, it is imperative to detect out-of-distribution (OOD) images at inference to protect against automation bias. This work applies the Mahalanobis distance post hoc to the bottleneck features of a Swin UNETR model that segments the liver on T1-weighted magnetic resonance imaging. By reducing the dimensions of the bottleneck features with principal component analysis, OOD images were detected with high performance and minimal computational load.
Submitted 19 October, 2023; v1 submitted 7 August, 2023;
originally announced August 2023.
-
VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images
Authors:
Anjany Sekuboyina,
Malek E. Husseini,
Amirhossein Bayat,
Maximilian Löffler,
Hans Liebl,
Hongwei Li,
Giles Tetteh,
Jan Kukačka,
Christian Payer,
Darko Štern,
Martin Urschler,
Maodong Chen,
Dalong Cheng,
Nikolas Lessmann,
Yujin Hu,
Tianfu Wang,
Dong Yang,
Daguang Xu,
Felix Ambellan,
Tamaz Amiranashvili,
Moritz Ehlke,
Hans Lamecker,
Sebastian Lehnert,
Marilia Lirio,
Nicolás Pérez de Olaguer
, et al. (44 additional authors not shown)
Abstract:
Vertebral labelling and segmentation are two fundamental tasks in an automated spine processing pipeline. Reliable and accurate processing of spine images is expected to benefit clinical decision-support systems for diagnosis, surgery planning, and population-based analysis on spine and bone health. However, designing automated algorithms for spine processing is challenging predominantly due to considerable variations in anatomy and acquisition protocols and due to a severe shortage of publicly available data. Addressing these limitations, the Large Scale Vertebrae Segmentation Challenge (VerSe) was organised in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2019 and 2020, with a call for algorithms towards labelling and segmentation of vertebrae. Two datasets containing a total of 374 multi-detector CT scans from 355 patients were prepared and 4505 vertebrae have individually been annotated at voxel-level by a human-machine hybrid algorithm (https://osf.io/nqjyw/, https://osf.io/t98fz/). A total of 25 algorithms were benchmarked on these datasets. In this work, we present the results of this evaluation and further investigate the performance-variation at vertebra-level, scan-level, and at different fields-of-view. We also evaluate the generalisability of the approaches to an implicit domain shift in data by evaluating the top performing algorithms of one challenge iteration on data from the other iteration. The principal takeaway from VerSe: the performance of an algorithm in labelling and segmenting a spine scan hinges on its ability to correctly identify vertebrae in cases of rare anatomical variations. The content and code concerning VerSe can be accessed at: https://github.com/anjany/verse.
Submitted 5 April, 2022; v1 submitted 24 January, 2020;
originally announced January 2020.