-
Multipath cycleGAN for harmonization of paired and unpaired low-dose lung computed tomography reconstruction kernels
Authors:
Aravind R. Krishnan,
Thomas Z. Li,
Lucas W. Remedios,
Michael E. Kim,
Chenyu Gao,
Gaurav Rudravaram,
Elyssa M. McMaster,
Adam M. Saunders,
Shunxing Bao,
Kaiwen Xu,
Lianrui Zuo,
Kim L. Sandler,
Fabien Maldonado,
Yuankai Huo,
Bennett A. Landman
Abstract:
Reconstruction kernels in computed tomography (CT) affect spatial resolution and noise characteristics, introducing systematic variability in quantitative imaging measurements such as emphysema quantification. Choosing an appropriate kernel is therefore essential for consistent quantitative analysis. We propose a multipath cycleGAN model for CT kernel harmonization, trained on a mixture of paired and unpaired data from a low-dose lung cancer screening cohort. The model features domain-specific encoders and decoders with a shared latent space and uses discriminators tailored for each domain. We train the model on 42 kernel combinations using 100 scans each from seven representative kernels in the National Lung Screening Trial (NLST) dataset. To evaluate performance, 240 scans from each kernel are harmonized to a reference soft kernel, and emphysema is quantified before and after harmonization. A general linear model assesses the impact of age, sex, smoking status, and kernel on emphysema. We also evaluate harmonization from soft kernels to a reference hard kernel. To assess anatomical consistency, we compare segmentations of lung vessels, muscle, and subcutaneous adipose tissue generated by TotalSegmentator between harmonized and original images. Our model is benchmarked against traditional and switchable cycleGANs. For paired kernels, our approach reduces bias in emphysema scores, as seen in Bland-Altman plots (p<0.05). For unpaired kernels, harmonization eliminates confounding differences in emphysema (p>0.05). High Dice scores confirm preservation of muscle and fat anatomy, while lung vessel overlap remains reasonable. Overall, our shared latent space multipath cycleGAN enables robust harmonization across paired and unpaired CT kernels, improving emphysema quantification and preserving anatomical fidelity.
Submitted 28 May, 2025;
originally announced May 2025.
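Illustrative sketch (not taken from the paper): the abstract above evaluates harmonization through emphysema scores and the bias seen in Bland-Altman plots. The snippet below shows one common way to compute percent emphysema, as the share of lung voxels below an attenuation threshold, and the mean paired difference that a Bland-Altman plot is centered on. The -950 HU threshold, the function names, and the flat-list voxel input are assumptions made for illustration; the authors' emphysema pipeline may differ.

```python
from statistics import mean
from typing import Iterable, Sequence


def percent_emphysema(lung_hu: Iterable[float], threshold_hu: float = -950.0) -> float:
    """Percentage of lung voxels whose attenuation is below threshold_hu.

    Assumption: lung_hu holds Hounsfield units for voxels already restricted
    to a lung mask; -950 HU is a commonly used cutoff, not one stated above.
    """
    values = list(lung_hu)
    if not values:
        raise ValueError("lung_hu must contain at least one voxel")
    low = sum(1 for v in values if v < threshold_hu)
    return 100.0 * low / len(values)


def bland_altman_bias(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """Mean paired difference (a - b), the center line of a Bland-Altman plot."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(a - b for a, b in zip(scores_a, scores_b))


if __name__ == "__main__":
    # Toy voxel values: two of four fall below -950 HU.
    print(percent_emphysema([-1000.0, -960.0, -900.0, -800.0]))
    # Toy per-scan scores for a hard kernel versus a soft reference kernel.
    print(bland_altman_bias([10.0, 12.0], [4.0, 6.0]))
```

A bias near zero after harmonization would indicate that the two kernels yield comparable emphysema scores on the same scans.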
-
Investigating the impact of kernel harmonization and deformable registration on inspiratory and expiratory chest CT images for people with COPD
Authors:
Aravind R. Krishnan,
Yihao Liu,
Kaiwen Xu,
Michael E. Kim,
Lucas W. Remedios,
Gaurav Rudravaram,
Adam M. Saunders,
Bradley W. Richmond,
Kim L. Sandler,
Fabien Maldonado,
Bennett A. Landman,
Lianrui Zuo
Abstract:
Paired inspiratory-expiratory CT scans enable the quantification of gas trapping due to small airway disease and emphysema by analyzing lung tissue motion in COPD patients. Deformable image registration of these scans assesses regional lung volumetric changes. However, variations in reconstruction kernels between paired scans introduce errors in quantitative analysis. This work proposes a two-stage pipeline to harmonize reconstruction kernels and perform deformable image registration using data acquired from the COPDGene study. We use a cycle generative adversarial network (GAN) to harmonize inspiratory scans reconstructed with a hard kernel (BONE) to match expiratory scans reconstructed with a soft kernel (STANDARD). We then deformably register the expiratory scans to inspiratory scans. We validate harmonization by measuring emphysema using a publicly available segmentation algorithm before and after harmonization. Results show harmonization significantly reduces emphysema measurement inconsistencies, decreasing median emphysema scores from 10.479% to 3.039%, with a reference median score of 1.305% from the STANDARD kernel as the target. Registration accuracy is evaluated via Dice overlap between emphysema regions on inspiratory, expiratory, and deformed images. The Dice coefficient between inspiratory emphysema masks and deformably registered emphysema masks increases significantly across registration stages (p<0.001). Additionally, we demonstrate that deformable registration is robust to kernel variations.
Submitted 7 February, 2025;
originally announced February 2025.
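Illustrative sketch (not taken from the paper): the abstract above scores registration by the Dice overlap between emphysema masks. The snippet below computes the Dice coefficient for two binary masks given as flat sequences. The function name, the flattened input format, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice overlap 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    a = [bool(x) for x in mask_a]
    b = [bool(x) for x in mask_b]
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    inspiratory = [1, 1, 0, 0]
    registered = [1, 0, 1, 0]
    print(dice_coefficient(inspiratory, registered))
```

For 3D volumes, the masks would be flattened in the same voxel order before the call.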
-
Robust Body Composition Analysis by Generating 3D CT Volumes from Limited 2D Slices
Authors:
Lianrui Zuo,
Xin Yu,
Dingjie Su,
Kaiwen Xu,
Aravind R. Krishnan,
Yihao Liu,
Shunxing Bao,
Fabien Maldonado,
Luigi Ferrucci,
Bennett A. Landman
Abstract:
Body composition analysis provides valuable insights into aging, disease progression, and overall health conditions. Due to concerns of radiation exposure, two-dimensional (2D) single-slice computed tomography (CT) imaging has been used repeatedly for body composition analysis. However, this approach introduces significant spatial variability that can impact the accuracy and robustness of the analysis. To mitigate this issue and facilitate body composition analysis, this paper presents a novel method to generate 3D CT volumes from a limited number of 2D slices using a latent diffusion model (LDM). Our approach first maps 2D slices into a latent representation space using a variational autoencoder. An LDM is then trained to capture the 3D context of a stack of these latent representations. To accurately interpolate intermediate slices and construct a full 3D volume, we utilize body part regression to determine the spatial location and distance between the acquired slices. Experiments on both in-house and public 3D abdominal CT datasets demonstrate that the proposed method significantly enhances body composition analysis compared to traditional 2D-based analysis, with a reduced error rate from 23.3% to 15.2%.
Submitted 22 January, 2025;
originally announced January 2025.
-
Beyond the Lungs: Extending the Field of View in Chest CT with Latent Diffusion Models
Authors:
Lianrui Zuo,
Kaiwen Xu,
Dingjie Su,
Xin Yu,
Aravind R. Krishnan,
Yihao Liu,
Shunxing Bao,
Thomas Li,
Kim L. Sandler,
Fabien Maldonado,
Bennett A. Landman
Abstract:
The interconnection between the human lungs and other organs, such as the liver and kidneys, is crucial for understanding the underlying risks and effects of lung diseases and improving patient care. However, most research chest CT imaging is focused solely on the lungs due to considerations of cost and radiation dose. This restricted field of view (FOV) in the acquired images poses challenges to comprehensive analysis and hinders the ability to gain insights into the impact of lung diseases on other organs. To address this, we propose SCOPE (Spatial Coverage Optimization with Prior Encoding), a novel approach to capture the inter-organ relationships from CT images and extend the FOV of chest CT images. Our approach first trains a variational autoencoder (VAE) to encode 2D axial CT slices individually, then stacks the latent representations of the VAE to form a 3D context for training a latent diffusion model. Once trained, our approach extends the FOV of CT images in the z-direction by generating new axial slices in a zero-shot manner. We evaluated our approach on the National Lung Screening Trial (NLST) dataset, and results suggest that it effectively extends the FOV to include the liver and kidneys, which are not completely covered in the original NLST data acquisition. Quantitative results on a held-out whole-body dataset demonstrate that the generated slices exhibit high fidelity with acquired data, achieving an SSIM of 0.81.
Submitted 22 January, 2025;
originally announced January 2025.
-
Inter-vendor harmonization of Computed Tomography (CT) reconstruction kernels using unpaired image translation
Authors:
Aravind R. Krishnan,
Kaiwen Xu,
Thomas Li,
Chenyu Gao,
Lucas W. Remedios,
Praitayini Kanakaraj,
Ho Hin Lee,
Shunxing Bao,
Kim L. Sandler,
Fabien Maldonado,
Ivana Isgum,
Bennett A. Landman
Abstract:
The reconstruction kernel in computed tomography (CT) generation determines the texture of the image. Consistency in reconstruction kernels is important as the underlying CT texture can impact measurements during quantitative image analysis. Harmonization (i.e., kernel conversion) minimizes differences in measurements due to inconsistent reconstruction kernels. Existing methods investigate harmonization of CT scans in single or multiple manufacturers. However, these methods require paired scans of hard and soft reconstruction kernels that are spatially and anatomically aligned. Additionally, a large number of models need to be trained across different kernel pairs within manufacturers. In this study, we adopt an unpaired image translation approach to investigate harmonization between and across reconstruction kernels from different manufacturers by constructing a multipath cycle generative adversarial network (GAN). We use hard and soft reconstruction kernels from the Siemens and GE vendors from the National Lung Screening Trial dataset. We use 50 scans from each reconstruction kernel and train a multipath cycle GAN. To evaluate the effect of harmonization on the reconstruction kernels, we harmonize 50 scans each from Siemens hard kernel, GE soft kernel and GE hard kernel to a reference Siemens soft kernel (B30f) and evaluate percent emphysema. We fit a linear model by considering the age, smoking status, sex and vendor and perform an analysis of variance (ANOVA) on the emphysema scores. Our approach minimizes differences in emphysema measurement and highlights the impact of age, sex, smoking status and vendor on emphysema quantification.
Submitted 26 January, 2024; v1 submitted 22 September, 2023;
originally announced September 2023.
-
Zero-shot CT Field-of-view Completion with Unconditional Generative Diffusion Prior
Authors:
Kaiwen Xu,
Aravind R. Krishnan,
Thomas Z. Li,
Yuankai Huo,
Kim L. Sandler,
Fabien Maldonado,
Bennett A. Landman
Abstract:
Anatomically consistent field-of-view (FOV) completion to recover truncated body sections has important applications in quantitative analyses of computed tomography (CT) with limited FOV. Existing solutions based on conditional generative models rely on the fidelity of synthetic truncation patterns in the training phase, which limits the generalizability of the method to unknown types of truncation. In this study, we evaluate a zero-shot method based on a pretrained unconditional generative diffusion prior, where truncation patterns of arbitrary form can be specified in the inference phase. In an evaluation on simulated chest CT slices with synthetic FOV truncation, the method is capable of recovering anatomically consistent body sections and correcting the subcutaneous adipose tissue measurement error caused by FOV truncation. However, the correction accuracy is inferior to that of the conditionally trained counterpart.
Submitted 7 April, 2023;
originally announced April 2023.
-
Longitudinal Multimodal Transformer Integrating Imaging and Latent Clinical Signatures From Routine EHRs for Pulmonary Nodule Classification
Authors:
Thomas Z. Li,
John M. Still,
Kaiwen Xu,
Ho Hin Lee,
Leon Y. Cai,
Aravind R. Krishnan,
Riqiang Gao,
Mirza S. Khan,
Sanja Antic,
Michael Kammer,
Kim L. Sandler,
Fabien Maldonado,
Bennett A. Landman,
Thomas A. Lasko
Abstract:
The accuracy of predictive models for solitary pulmonary nodule (SPN) diagnosis can be greatly increased by incorporating repeat imaging and medical context, such as electronic health records (EHRs). However, clinically routine modalities such as imaging and diagnostic codes can be asynchronous and irregularly sampled over different time scales, which are obstacles to longitudinal multimodal learning. In this work, we propose a transformer-based multimodal strategy to integrate repeat imaging with longitudinal clinical signatures from routinely collected EHRs for SPN classification. We perform unsupervised disentanglement of latent clinical signatures and leverage time-distance scaled self-attention to jointly learn from clinical signature expressions and chest computed tomography (CT) scans. Our classifier is pretrained on 2,668 scans from a public dataset and 1,149 subjects with longitudinal chest CTs, billing codes, medications, and laboratory tests from EHRs of our home institution. Evaluation on 227 subjects with challenging SPNs revealed a significant AUC improvement over a longitudinal multimodal baseline (0.824 vs 0.752 AUC), as well as improvements over a single cross-section multimodal scenario (0.809 AUC) and a longitudinal imaging-only scenario (0.741 AUC). This work demonstrates significant advantages with a novel approach for co-learning longitudinal imaging and non-imaging phenotypes with transformers. Code available at https://github.com/MASILab/lmsignatures.
Submitted 29 June, 2023; v1 submitted 5 April, 2023;
originally announced April 2023.
-
Time-distance vision transformers in lung cancer diagnosis from longitudinal computed tomography
Authors:
Thomas Z. Li,
Kaiwen Xu,
Riqiang Gao,
Yucheng Tang,
Thomas A. Lasko,
Fabien Maldonado,
Kim Sandler,
Bennett A. Landman
Abstract:
Features learned from single radiologic images are unable to provide information about whether and how much a lesion may be changing over time. Time-dependent features computed from repeated images can capture those changes and help identify malignant lesions by their temporal behavior. However, longitudinal medical imaging presents the unique challenge of sparse, irregular time intervals in data acquisition. While self-attention has been shown to be a versatile and efficient learning mechanism for time series and natural images, its potential for interpreting temporal distance between sparse, irregularly sampled spatial features has not been explored. In this work, we propose two interpretations of a time-distance vision transformer (ViT) by using (1) vector embeddings of continuous time and (2) a temporal emphasis model to scale self-attention weights. The two algorithms are evaluated based on benign versus malignant lung cancer discrimination of synthetic pulmonary nodules and lung screening computed tomography studies from the National Lung Screening Trial (NLST). Experiments evaluating the time-distance ViTs on synthetic nodules show a fundamental improvement in classifying irregularly sampled longitudinal images when compared to standard ViTs. In cross-validation on screening chest CTs from the NLST, our methods (0.785 and 0.786 AUC respectively) significantly outperform a cross-sectional approach (0.734 AUC) and match the discriminative performance of the leading longitudinal medical imaging algorithm (0.779 AUC) on benign versus malignant classification. This work represents the first self-attention-based framework for classifying longitudinal medical images. Our code is available at https://github.com/tom1193/time-distance-transformer.
Submitted 4 September, 2022;
originally announced September 2022.
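Illustrative sketch (not the authors' implementation, which is in the repository linked above): the abstract describes a temporal emphasis model that scales self-attention weights by the time distance between irregularly sampled scans. The snippet below shows one possible form of that idea in plain Python. The decreasing logistic emphasis function, its `slope` and `offset` parameters, the renormalization step, and all function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_emphasis(dt: float, slope: float = 1.0, offset: float = 0.0) -> float:
    """Hypothetical decreasing logistic weight in (0, 1) for a time gap dt."""
    z = slope * dt - offset
    if z >= 0:
        e = math.exp(-z)  # stable branch for large gaps
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def time_scaled_attention(
    queries: Sequence[Sequence[float]],
    keys: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
    times: Sequence[float],
    slope: float = 1.0,
    offset: float = 0.0,
) -> List[List[float]]:
    """Scaled dot-product attention whose weights are damped by time distance."""
    d = len(queries[0])
    width = len(values[0])
    out = []
    for i, q in enumerate(queries):
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Damp each weight by the gap between token i and token j, then renormalize.
        scaled = [
            w * temporal_emphasis(abs(times[i] - times[j]), slope, offset)
            for j, w in enumerate(weights)
        ]
        norm = sum(scaled)
        weights = [w / norm for w in scaled]
        out.append([sum(w * v[c] for w, v in zip(weights, values)) for c in range(width)])
    return out


if __name__ == "__main__":
    q = k = [[0.0], [0.0]]
    v = [[1.0], [3.0]]
    print(time_scaled_attention(q, k, v, times=[0.0, 0.0]))   # equal gaps
    print(time_scaled_attention(q, k, v, times=[0.0, 10.0]))  # distant scans
```

With equal acquisition times the two tokens are averaged; with a large gap each token attends almost entirely to itself, which is the qualitative behavior a time-distance scaling is meant to produce.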
-
Body Composition Assessment with Limited Field-of-view Computed Tomography: A Semantic Image Extension Perspective
Authors:
Kaiwen Xu,
Thomas Li,
Mirza S. Khan,
Riqiang Gao,
Sanja L. Antic,
Yuankai Huo,
Kim L. Sandler,
Fabien Maldonado,
Bennett A. Landman
Abstract:
Field-of-view (FOV) tissue truncation beyond the lungs is common in routine lung screening computed tomography (CT). This poses limitations for opportunistic CT-based body composition (BC) assessment as key anatomical structures are missing. Traditionally, extending the FOV of CT is considered as a CT reconstruction problem using limited data. However, this approach relies on the projection domain data, which might not be available in application. In this work, we formulate the problem from the semantic image extension perspective, which only requires image data as inputs. The proposed two-stage method identifies a new FOV border based on the estimated extent of the complete body and imputes missing tissues in the truncated region. The training samples are simulated using CT slices with complete body in FOV, making the model development self-supervised. We evaluate the validity of the proposed method in automatic BC assessment using lung screening CT with limited FOV. The proposed method effectively restores the missing tissues and reduces BC assessment error introduced by FOV tissue truncation. In the BC assessment for a large-scale lung screening CT dataset, this correction improves both the intra-subject consistency and the correlation with anthropometric approximations. The developed method is available at https://github.com/MASILab/S-EFOV.
Submitted 15 April, 2023; v1 submitted 13 July, 2022;
originally announced July 2022.
-
A Recurrent Neural Network Approach to Roll Estimation for Needle Steering
Authors:
Maxwell Emerson,
James M. Ferguson,
Tayfun Efe Ertop,
Margaret Rox,
Josephine Granna,
Michael Lester,
Fabien Maldonado,
Erin A. Gillaspie,
Ron Alterovitz,
Robert J. Webster III,
Alan Kuntz
Abstract:
Steerable needles are a promising technology for delivering targeted therapies in the body in a minimally invasive fashion, as they can curve around anatomical obstacles and home in on anatomical targets. In order to accurately steer them, controllers must have full knowledge of the needle tip's orientation. However, current sensors either do not provide full orientation information or interfere with the needle's ability to deliver therapy. Further, torsional dynamics can vary and depend on many parameters, making steerable needles difficult to accurately model and limiting the effectiveness of traditional observer methods. To overcome these limitations, we propose a model-free, learned method that leverages LSTM neural networks to estimate the needle tip's orientation online. We validate our method by integrating it into a sliding-mode controller and steering the needle to targets in gelatin and ex vivo ovine brain tissue. We compare our method's performance against an Extended Kalman Filter, a model-based observer, achieving significantly lower targeting errors.
Submitted 12 January, 2021;
originally announced January 2021.