Memory-efficient High-resolution OCT Volume Synthesis with Cascaded Amortized Latent Diffusion Models

Huang, Kun; Ma, Xiao; Zhang, Yuhan; Su, Na; Yuan, Songtao; Liu, Yong; Chen, Qiang; Fu, Huazhu

arXiv:2405.16516 (eess)

[Submitted on 26 May 2024]

Abstract: Optical coherence tomography (OCT) image analysis plays an important role in ophthalmology. Current successful analysis models rely on large available datasets, which can be challenging to obtain for certain tasks. The use of deep generative models to create realistic data is emerging as a promising approach. However, due to limitations in hardware resources, it is still difficult to synthesize high-resolution OCT volumes. In this paper, we introduce a cascaded amortized latent diffusion model (CA-LDM) that can synthesize high-resolution OCT volumes in a memory-efficient way. First, we propose non-holistic autoencoders to efficiently build a bidirectional mapping between the high-resolution volume space and the low-resolution latent space. In tandem with the autoencoders, we propose cascaded diffusion processes to synthesize high-resolution OCT volumes with a global-to-local refinement process, amortizing the memory and computational demands. Experiments on a public high-resolution OCT dataset show that our synthetic data have realistic high-resolution and global features, surpassing the capabilities of existing methods. Moreover, performance gains on two downstream fine-grained segmentation tasks demonstrate the benefit of the proposed method in training deep learning models for medical imaging tasks. The code is publicly available at: this https URL.

Comments:	Provisionally accepted for Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2024
Subjects:	Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2405.16516 [eess.IV]
	(or arXiv:2405.16516v1 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2405.16516

Submission history

From: Kun Huang
[v1] Sun, 26 May 2024 10:58:22 UTC (36,633 KB)
