-
Multi-View Transformers for Airway-To-Lung Ratio Inference on Cardiac CT Scans: The C4R Study
Authors:
Sneha N. Naik,
Elsa D. Angelini,
Eric A. Hoffman,
Elizabeth C. Oelsner,
R. Graham Barr,
Benjamin M. Smith,
Andrew F. Laine
Abstract:
The ratio of airway tree lumen to lung size (ALR), assessed at full inspiration on high resolution full-lung computed tomography (CT), is a major risk factor for chronic obstructive pulmonary disease (COPD). There is growing interest to infer ALR from cardiac CT images, which are widely available in epidemiological cohorts, to investigate the relationship of ALR to severe COVID-19 and post-acute s…
▽ More
The ratio of airway tree lumen to lung size (ALR), assessed at full inspiration on high resolution full-lung computed tomography (CT), is a major risk factor for chronic obstructive pulmonary disease (COPD). There is growing interest to infer ALR from cardiac CT images, which are widely available in epidemiological cohorts, to investigate the relationship of ALR to severe COVID-19 and post-acute sequelae of SARS-CoV-2 infection (PASC). Previously, cardiac scans included approximately 2/3 of the total lung volume with 5-6x greater slice thickness than high-resolution (HR) full-lung (FL) CT. In this study, we present a novel attention-based Multi-view Swin Transformer to infer FL ALR values from segmented cardiac CT scans. For the supervised training we exploit paired full-lung and cardiac CTs acquired in the Multi-Ethnic Study of Atherosclerosis (MESA). Our network significantly outperforms a proxy direct ALR inference on segmented cardiac CT scans and achieves accuracy and reproducibility comparable with a scan-rescan reproducibility of the FL ALR ground-truth.
△ Less
Submitted 15 January, 2025;
originally announced January 2025.
-
Robust Quantification of Percent Emphysema on CT via Domain Attention: the Multi-Ethnic Study of Atherosclerosis (MESA) Lung Study
Authors:
Xuzhe Zhang,
Elsa D. Angelini,
Eric A. Hoffman,
Karol E. Watson,
Benjamin M. Smith,
R. Graham Barr,
Andrew F. Laine
Abstract:
Robust quantification of pulmonary emphysema on computed tomography (CT) remains challenging for large-scale research studies that involve scans from different scanner types and for translation to clinical scans. Existing studies have explored several directions to tackle this challenge, including density correction, noise filtering, regression, hidden Markov measure field (HMMF) model-based segme…
▽ More
Robust quantification of pulmonary emphysema on computed tomography (CT) remains challenging for large-scale research studies that involve scans from different scanner types and for translation to clinical scans. Existing studies have explored several directions to tackle this challenge, including density correction, noise filtering, regression, hidden Markov measure field (HMMF) model-based segmentation, and volume-adjusted lung density. Despite some promising results, previous studies either required a tedious workflow or limited opportunities for downstream emphysema subtyping, limiting efficient adaptation on a large-scale study. To alleviate this dilemma, we developed an end-to-end deep learning framework based on an existing HMMF segmentation framework. We first demonstrate that a regular UNet cannot replicate the existing HMMF results because of the lack of scanner priors. We then design a novel domain attention block to fuse image feature with quantitative scanner priors which significantly improves the results.
△ Less
Submitted 6 March, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Mesh Strikes Back: Fast and Efficient Human Reconstruction from RGB videos
Authors:
Rohit Jena,
Pratik Chaudhari,
James Gee,
Ganesh Iyer,
Siddharth Choudhary,
Brandon M. Smith
Abstract:
Human reconstruction and synthesis from monocular RGB videos is a challenging problem due to clothing, occlusion, texture discontinuities and sharpness, and framespecific pose changes. Many methods employ deferred rendering, NeRFs and implicit methods to represent clothed humans, on the premise that mesh-based representations cannot capture complex clothing and textures from RGB, silhouettes, and…
▽ More
Human reconstruction and synthesis from monocular RGB videos is a challenging problem due to clothing, occlusion, texture discontinuities and sharpness, and framespecific pose changes. Many methods employ deferred rendering, NeRFs and implicit methods to represent clothed humans, on the premise that mesh-based representations cannot capture complex clothing and textures from RGB, silhouettes, and keypoints alone. We provide a counter viewpoint to this fundamental premise by optimizing a SMPL+D mesh and an efficient, multi-resolution texture representation using only RGB images, binary silhouettes and sparse 2D keypoints. Experimental results demonstrate that our approach is more capable of capturing geometric details compared to visual hull, mesh-based methods. We show competitive novel view synthesis and improvements in novel pose synthesis compared to NeRF-based methods, which introduce noticeable, unwanted artifacts. By restricting the solution space to the SMPL+D model combined with differentiable rendering, we obtain dramatic speedups in compute, training times (up to 24x) and inference times (up to 192x). Our method therefore can be used as is or as a fast initialization to NeRF-based methods.
△ Less
Submitted 15 March, 2023;
originally announced March 2023.
-
Novel Subtypes of Pulmonary Emphysema Based on Spatially-Informed Lung Texture Learning
Authors:
Jie Yang,
Elsa D. Angelini,
Pallavi P. Balte,
Eric A. Hoffman,
John H. M. Austin,
Benjamin M. Smith,
R. Graham Barr,
Andrew F. Laine
Abstract:
Pulmonary emphysema overlaps considerably with chronic obstructive pulmonary disease (COPD), and is traditionally subcategorized into three subtypes previously identified on autopsy. Unsupervised learning of emphysema subtypes on computed tomography (CT) opens the way to new definitions of emphysema subtypes and eliminates the need of thorough manual labeling. However, CT-based emphysema subtypes…
▽ More
Pulmonary emphysema overlaps considerably with chronic obstructive pulmonary disease (COPD), and is traditionally subcategorized into three subtypes previously identified on autopsy. Unsupervised learning of emphysema subtypes on computed tomography (CT) opens the way to new definitions of emphysema subtypes and eliminates the need of thorough manual labeling. However, CT-based emphysema subtypes have been limited to texture-based patterns without considering spatial location. In this work, we introduce a standardized spatial mapping of the lung for quantitative study of lung texture location, and propose a novel framework for combining spatial and texture information to discover spatially-informed lung texture patterns (sLTPs) that represent novel emphysema subtypes. Exploiting two cohorts of full-lung CT scans from the MESA COPD and EMCAP studies, we first show that our spatial mapping enables population-wide study of emphysema spatial location. We then evaluate the characteristics of the sLTPs discovered on MESA COPD, and show that they are reproducible, able to encode standard emphysema subtypes, and associated with physiological symptoms.
△ Less
Submitted 9 July, 2020;
originally announced July 2020.
-
Differential Scene Flow from Light Field Gradients
Authors:
Sizhuo Ma,
Brandon M. Smith,
Mohit Gupta
Abstract:
This paper presents novel techniques for recovering 3D dense scene flow, based on differential analysis of 4D light fields. The key enabling result is a per-ray linear equation, called the ray flow equation, that relates 3D scene flow to 4D light field gradients. The ray flow equation is invariant to 3D scene structure and applicable to a general class of scenes, but is under-constrained (3 unknow…
▽ More
This paper presents novel techniques for recovering 3D dense scene flow, based on differential analysis of 4D light fields. The key enabling result is a per-ray linear equation, called the ray flow equation, that relates 3D scene flow to 4D light field gradients. The ray flow equation is invariant to 3D scene structure and applicable to a general class of scenes, but is under-constrained (3 unknowns per equation). Thus, additional constraints must be imposed to recover motion. We develop two families of scene flow algorithms by leveraging the structural similarity between ray flow and optical flow equations: local 'Lucas-Kanade' ray flow and global 'Horn-Schunck' ray flow, inspired by corresponding optical flow methods. We also develop a combined local-global method by utilizing the correspondence structure in the light fields. We demonstrate high precision 3D scene flow recovery for a wide range of scenarios, including rotation and non-rigid motion. We analyze the theoretical and practical performance limits of the proposed techniques via the light field structure tensor, a 3x3 matrix that encodes the local structure of light fields. We envision that the proposed analysis and algorithms will lead to design of future light-field cameras that are optimized for motion sensing, in addition to depth sensing.
△ Less
Submitted 29 July, 2019; v1 submitted 26 July, 2019;
originally announced July 2019.
-
Explaining Radiological Emphysema Subtypes with Unsupervised Texture Prototypes: MESA COPD Study
Authors:
Jie Yang,
Elsa D. Angelini,
Benjamin M. Smith,
John H. M. Austin,
Eric A. Hoffman,
David A. Bluemke,
R. Graham Barr,
Andrew F. Laine
Abstract:
Pulmonary emphysema is traditionally subcategorized into three subtypes, which have distinct radiological appearances on computed tomography (CT) and can help with the diagnosis of chronic obstructive pulmonary disease (COPD). Automated texture-based quantification of emphysema subtypes has been successfully implemented via supervised learning of these three emphysema subtypes. In this work, we de…
▽ More
Pulmonary emphysema is traditionally subcategorized into three subtypes, which have distinct radiological appearances on computed tomography (CT) and can help with the diagnosis of chronic obstructive pulmonary disease (COPD). Automated texture-based quantification of emphysema subtypes has been successfully implemented via supervised learning of these three emphysema subtypes. In this work, we demonstrate that unsupervised learning on a large heterogeneous database of CT scans can generate texture prototypes that are visually homogeneous and distinct, reproducible across subjects, and capable of predicting accurately the three standard radiological subtypes. These texture prototypes enable automated labeling of lung volumes, and open the way to new interpretations of lung CT scans with finer subtyping of emphysema.
△ Less
Submitted 5 December, 2016;
originally announced December 2016.
-
Efficient Branching Cascaded Regression for Face Alignment under Significant Head Rotation
Authors:
Brandon M. Smith,
Charles R. Dyer
Abstract:
Despite much interest in face alignment in recent years, the large majority of work has focused on near-frontal faces. Algorithms typically break down on profile faces, or are too slow for real-time applications. In this work we propose an efficient approach to face alignment that can handle 180 degrees of head rotation in a unified way (e.g., without resorting to view-based models) using 2D train…
▽ More
Despite much interest in face alignment in recent years, the large majority of work has focused on near-frontal faces. Algorithms typically break down on profile faces, or are too slow for real-time applications. In this work we propose an efficient approach to face alignment that can handle 180 degrees of head rotation in a unified way (e.g., without resorting to view-based models) using 2D training data. The foundation of our approach is cascaded shape regression (CSR), which has emerged recently as the leading strategy. We propose a generalization of conventional CSRs that we call branching cascaded regression (BCR). Conventional CSRs are single-track; that is, they progress from one cascade level to the next in a straight line, with each regressor attempting to fit the entire dataset. We instead split the regression problem into two or more simpler ones after each cascade level. Intuitively, each regressor can then operate on a simpler objective function (i.e., with fewer conflicting gradient directions). Within the BCR framework, we model and infer pose-related landmark visibility and face shape simultaneously using Structured Point Distribution Models (SPDMs). We propose to learn task-specific feature mapping functions that are adaptive to landmark visibility, and that use SPDM parameters as regression targets instead of 2D landmark coordinates. Additionally, we introduce a new in-the-wild dataset of profile faces to validate our approach.
△ Less
Submitted 9 November, 2016; v1 submitted 4 November, 2016;
originally announced November 2016.
-
Dual Modelling of Permutation and Injection Problems
Authors:
B. Hnich,
B. M. Smith,
T. Walsh
Abstract:
When writing a constraint program, we have to choose which variables should be the decision variables, and how to represent the constraints on these variables. In many cases, there is considerable choice for the decision variables. Consider, for example, permutation problems in which we have as many values as variables, and each variable takes an unique value. In such problems, we can choose…
▽ More
When writing a constraint program, we have to choose which variables should be the decision variables, and how to represent the constraints on these variables. In many cases, there is considerable choice for the decision variables. Consider, for example, permutation problems in which we have as many values as variables, and each variable takes an unique value. In such problems, we can choose between a primal and a dual viewpoint. In the dual viewpoint, each dual variable represents one of the primal values, whilst each dual value represents one of the primal variables. Alternatively, by means of channelling constraints to link the primal and dual variables, we can have a combined model with both sets of variables. In this paper, we perform an extensive theoretical and empirical study of such primal, dual and combined models for two classes of problems: permutation problems and injection problems. Our results show that it often be advantageous to use multiple viewpoints, and to have constraints which channel between them to maintain consistency. They also illustrate a general methodology for comparing different constraint models.
△ Less
Submitted 30 June, 2011;
originally announced July 2011.
-
Symmetry Breaking with Polynomial Delay
Authors:
Tim januschowski,
Barbara M. Smith,
M. R. C. van Dongen
Abstract:
A conservative class of constraint satisfaction problems CSPs is a class for which membership is preserved under arbitrary domain reductions. Many well-known tractable classes of CSPs are conservative. It is well known that lexleader constraints may significantly reduce the number of solutions by excluding symmetric solutions of CSPs. We show that adding certain lexleader constraints to any instan…
▽ More
A conservative class of constraint satisfaction problems CSPs is a class for which membership is preserved under arbitrary domain reductions. Many well-known tractable classes of CSPs are conservative. It is well known that lexleader constraints may significantly reduce the number of solutions by excluding symmetric solutions of CSPs. We show that adding certain lexleader constraints to any instance of any conservative class of CSPs still allows us to find all solutions with a time which is polynomial between successive solutions. The time is polynomial in the total size of the instance and the additional lexleader constraints. It is well known that for complete symmetry breaking one may need an exponential number of lexleader constraints. However, in practice, the number of additional lexleader constraints is typically polynomial number in the size of the instance. For polynomially many lexleader constraints, we may in general not have complete symmetry breaking but polynomially many lexleader constraints may provide practically useful symmetry breaking -- and they sometimes exclude super-exponentially many solutions. We prove that for any instance from a conservative class, the time between finding successive solutions of the instance with polynomially many additional lexleader constraints is polynomial even in the size of the instance without lexleaderconstraints.
△ Less
Submitted 27 December, 2010;
originally announced December 2010.