Search | arXiv e-print repository

Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling

Authors: Abril Corona-Figueroa, Hubert P. H. Shum, Chris G. Willcocks

Abstract: This paper investigates a 2D to 3D image translation method with a straightforward technique, enabling correlated 2D X-ray to 3D CT-like reconstruction. We observe that existing approaches, which integrate information across multiple 2D views in the latent space, lose valuable signal information during latent encoding. Instead, we simply repeat and concatenate the 2D views into higher-channel 3D v… ▽ More This paper investigates a 2D to 3D image translation method with a straightforward technique, enabling correlated 2D X-ray to 3D CT-like reconstruction. We observe that existing approaches, which integrate information across multiple 2D views in the latent space, lose valuable signal information during latent encoding. Instead, we simply repeat and concatenate the 2D views into higher-channel 3D volumes and approach the 3D reconstruction challenge as a straightforward 3D to 3D generative modeling problem, sidestepping several complex modeling issues. This method enables the reconstructed 3D volume to retain valuable information from the 2D inputs, which are passed between channel states in a Swin UNETR backbone. Our approach applies neural optimal transport, which is fast and stable to train, effectively integrating signal information across multiple views without the requirement for precise alignment; it produces non-collapsed reconstructions that are highly faithful to the 2D views, even after limited training. We demonstrate correlated results, both qualitatively and quantitatively, having trained our model on a single dataset and evaluated its generalization ability across six datasets, including out-of-distribution samples. △ Less

Submitted 26 June, 2024; originally announced June 2024.

Comments: CVPRW 2024 - DCA in MI; Best Paper Award

arXiv:2202.01020 [pdf, other]

MedNeRF: Medical Neural Radiance Fields for Reconstructing 3D-aware CT-Projections from a Single X-ray

Authors: Abril Corona-Figueroa, Jonathan Frawley, Sam Bond-Taylor, Sarath Bethapudi, Hubert P. H. Shum, Chris G. Willcocks

Abstract: Computed tomography (CT) is an effective medical imaging modality, widely used in the field of clinical medicine for the diagnosis of various pathologies. Advances in Multidetector CT imaging technology have enabled additional functionalities, including generation of thin slice multiplanar cross-sectional body imaging and 3D reconstructions. However, this involves patients being exposed to a consi… ▽ More Computed tomography (CT) is an effective medical imaging modality, widely used in the field of clinical medicine for the diagnosis of various pathologies. Advances in Multidetector CT imaging technology have enabled additional functionalities, including generation of thin slice multiplanar cross-sectional body imaging and 3D reconstructions. However, this involves patients being exposed to a considerable dose of ionising radiation. Excessive ionising radiation can lead to deterministic and harmful effects on the body. This paper proposes a Deep Learning model that learns to reconstruct CT projections from a few or even a single-view X-ray. This is based on a novel architecture that builds from neural radiance fields, which learns a continuous representation of CT scans by disentangling the shape and volumetric depth of surface and internal anatomical structures from 2D images. Our model is trained on chest and knee datasets, and we demonstrate qualitative and quantitative high-fidelity renderings and compare our approach to other recent radiance field-based methods. Our code and link to our datasets are available at https://github.com/abrilcf/mednerf △ Less

Submitted 8 April, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

Comments: 6 pages, 4 figures, accepted at IEEE EMBC 2022

ACM Class: I.4; J.7

arXiv:2104.05358 [pdf, other]

UNIT-DDPM: UNpaired Image Translation with Denoising Diffusion Probabilistic Models

Authors: Hiroshi Sasaki, Chris G. Willcocks, Toby P. Breckon

Abstract: We propose a novel unpaired image-to-image translation method that uses denoising diffusion probabilistic models without requiring adversarial training. Our method, UNpaired Image Translation with Denoising Diffusion Probabilistic Models (UNIT-DDPM), trains a generative model to infer the joint distribution of images over both domains as a Markov chain by minimising a denoising score matching obje… ▽ More We propose a novel unpaired image-to-image translation method that uses denoising diffusion probabilistic models without requiring adversarial training. Our method, UNpaired Image Translation with Denoising Diffusion Probabilistic Models (UNIT-DDPM), trains a generative model to infer the joint distribution of images over both domains as a Markov chain by minimising a denoising score matching objective conditioned on the other domain. In particular, we update both domain translation models simultaneously, and we generate target domain images by a denoising Markov Chain Monte Carlo approach that is conditioned on the input source domain images, based on Langevin dynamics. Our approach provides stable model training for image-to-image translation and generates high-quality image outputs. This enables state-of-the-art Fréchet Inception Distance (FID) performance on several public datasets, including both colour and multispectral imagery, significantly outperforming the contemporary adversarial image-to-image translation methods. △ Less

Submitted 12 April, 2021; originally announced April 2021.

Comments: 10 pages, 8 figures

arXiv:2103.01299 [pdf, other]

Robust 3D U-Net Segmentation of Macular Holes

Authors: Jonathan Frawley, Chris G. Willcocks, Maged Habib, Caspar Geenen, David H. Steel, Boguslaw Obara

Abstract: Macular holes are a common eye condition which result in visual impairment. We look at the application of deep convolutional neural networks to the problem of macular hole segmentation. We use the 3D U-Net architecture as a basis and experiment with a number of design variants. Manually annotating and measuring macular holes is time consuming and error prone. Previous automated approaches to macul… ▽ More Macular holes are a common eye condition which result in visual impairment. We look at the application of deep convolutional neural networks to the problem of macular hole segmentation. We use the 3D U-Net architecture as a basis and experiment with a number of design variants. Manually annotating and measuring macular holes is time consuming and error prone. Previous automated approaches to macular hole segmentation take minutes to segment a single 3D scan. Our proposed model generates significantly more accurate segmentations in less than a second. We found that an approach of architectural simplification, by greatly simplifying the network capacity and depth, exceeds both expert performance and state-of-the-art models such as residual 3D U-Nets. △ Less

Submitted 7 April, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

arXiv:2010.01942 [pdf, other]

Unsupervised Region-based Anomaly Detection in Brain MRI with Adversarial Image Inpainting

Authors: Bao Nguyen, Adam Feldman, Sarath Bethapudi, Andrew Jennings, Chris G. Willcocks

Abstract: Medical segmentation is performed to determine the bounds of regions of interest (ROI) prior to surgery. By allowing the study of growth, structure, and behaviour of the ROI in the planning phase, critical information can be obtained, increasing the likelihood of a successful operation. Usually, segmentations are performed manually or via machine learning methods trained on manual annotations. In… ▽ More Medical segmentation is performed to determine the bounds of regions of interest (ROI) prior to surgery. By allowing the study of growth, structure, and behaviour of the ROI in the planning phase, critical information can be obtained, increasing the likelihood of a successful operation. Usually, segmentations are performed manually or via machine learning methods trained on manual annotations. In contrast, this paper proposes a fully automatic, unsupervised inpainting-based brain tumour segmentation system for T1-weighted MRI. First, a deep convolutional neural network (DCNN) is trained to reconstruct missing healthy brain regions. Then, upon application, anomalous regions are determined by identifying areas of highest reconstruction loss. Finally, superpixel segmentation is performed to segment those regions. We show the proposed system is able to segment various sized and abstract tumours and achieves a mean and standard deviation Dice score of 0.771 and 0.176, respectively. △ Less

Submitted 5 October, 2020; originally announced October 2020.

Comments: 5 pages, 6 figures

ACM Class: I.5.0; I.4.0

arXiv:2005.04697 [pdf, other]

Segmentation of Macular Edema Datasets with Small Residual 3D U-Net Architectures

Authors: Jonathan Frawley, Chris G. Willcocks, Maged Habib, Caspar Geenen, David H. Steel, Boguslaw Obara

Abstract: This paper investigates the application of deep convolutional neural networks with prohibitively small datasets to the problem of macular edema segmentation. In particular, we investigate several different heavily regularized architectures. We find that, contrary to popular belief, neural architectures within this application setting are able to achieve close to human-level performance on unseen t… ▽ More This paper investigates the application of deep convolutional neural networks with prohibitively small datasets to the problem of macular edema segmentation. In particular, we investigate several different heavily regularized architectures. We find that, contrary to popular belief, neural architectures within this application setting are able to achieve close to human-level performance on unseen test images without requiring large numbers of training examples. Annotating these 3D datasets is difficult, with multiple criteria required. It takes an experienced clinician two days to annotate a single 3D image, whereas our trained model achieves similar performance in less than a second. We found that an approach which uses targeted dataset augmentation, alongside architectural simplification with an emphasis on residual design, has acceptable generalization performance - despite relying on fewer than 15 training examples. △ Less

Submitted 10 May, 2020; originally announced May 2020.

Comments: 7 pages, 5 figures

arXiv:2005.02436 [pdf, other]

Data Augmentation via Mixed Class Interpolation using Cycle-Consistent Generative Adversarial Networks Applied to Cross-Domain Imagery

Authors: Hiroshi Sasaki, Chris G. Willcocks, Toby P. Breckon

Abstract: Machine learning driven object detection and classification within non-visible imagery has an important role in many fields such as night vision, all-weather surveillance and aviation security. However, such applications often suffer due to the limited quantity and variety of non-visible spectral domain imagery, in contrast to the high data availability of visible-band imagery that readily enables… ▽ More Machine learning driven object detection and classification within non-visible imagery has an important role in many fields such as night vision, all-weather surveillance and aviation security. However, such applications often suffer due to the limited quantity and variety of non-visible spectral domain imagery, in contrast to the high data availability of visible-band imagery that readily enables contemporary deep learning driven detection and classification approaches. To address this problem, this paper proposes and evaluates a novel data augmentation approach that leverages the more readily available visible-band imagery via a generative domain transfer model. The model can synthesise large volumes of non-visible domain imagery by image-to-image (I2I) translation from the visible image domain. Furthermore, we show that the generation of interpolated mixed class (non-visible domain) image examples via our novel Conditional CycleGAN Mixup Augmentation (C2GMA) methodology can lead to a significant improvement in the quality of non-visible domain classification tasks that otherwise suffer due to limited data availability. Focusing on classification within the Synthetic Aperture Radar (SAR) domain, our approach is evaluated on a variation of the Statoil/C-CORE Iceberg Classifier Challenge dataset and achieves 75.4% accuracy, demonstrating a significant improvement when compared against traditional data augmentation strategies (Rotation, Mixup, and MixCycleGAN). △ Less

Submitted 1 January, 2021; v1 submitted 5 May, 2020; originally announced May 2020.

Comments: 9 pages, 9 figures, accepted at the 25th International Conference on Pattern Recognition (ICPR 2020)

Showing 1–7 of 7 results for author: Willcocks, C G