Search | arXiv e-print repository

Neural incomplete factorization: learning preconditioners for the conjugate gradient method

Authors: Paul Häusner, Ozan Öktem, Jens Sjölund

Abstract: The convergence of the conjugate gradient method for solving large-scale and sparse linear equation systems depends on the spectral properties of the system matrix, which can be improved by preconditioning. In this paper, we develop a computationally efficient data-driven approach to accelerate the generation of effective preconditioners. We, therefore, replace the typically hand-engineered precon… ▽ More The convergence of the conjugate gradient method for solving large-scale and sparse linear equation systems depends on the spectral properties of the system matrix, which can be improved by preconditioning. In this paper, we develop a computationally efficient data-driven approach to accelerate the generation of effective preconditioners. We, therefore, replace the typically hand-engineered preconditioners by the output of graph neural networks. Our method generates an incomplete factorization of the matrix and is, therefore, referred to as neural incomplete factorization (NeuralIF). Optimizing the condition number of the linear system directly is computationally infeasible. Instead, we utilize a stochastic approximation of the Frobenius loss which only requires matrix-vector multiplications for efficient training. At the core of our method is a novel message-passing block, inspired by sparse matrix theory, that aligns with the objective of finding a sparse factorization of the matrix. We evaluate our proposed method on both synthetic problem instances and on problems arising from the discretization of the Poisson equation on varying domains. Our experiments show that by using data-driven preconditioners within the conjugate gradient method we are able to speed up the convergence of the iterative procedure. The code is available at https://github.com/paulhausner/neural-incomplete-factorization. △ Less

Submitted 24 October, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: 26 pages, 8 figures, accepted in Transactions on Machine Learning Research (TMLR)

arXiv:2209.05546 [pdf, other]

doi 10.1088/1361-6420/acb2ba

Spectral decomposition of atomic structures in heterogeneous cryo-EM

Authors: Carlos Esteve-Yagüe, Willem Diepeveen, Ozan Öktem, Carola-Bibiane Schönlieb

Abstract: We consider the problem of recovering the three-dimensional atomic structure of a flexible macromolecule from a heterogeneous cryo-EM dataset. The dataset contains noisy tomographic projections of the electrostatic potential of the macromolecule, taken from different viewing directions, and in the heterogeneous case, each image corresponds to a different conformation of the macromolecule. Under th… ▽ More We consider the problem of recovering the three-dimensional atomic structure of a flexible macromolecule from a heterogeneous cryo-EM dataset. The dataset contains noisy tomographic projections of the electrostatic potential of the macromolecule, taken from different viewing directions, and in the heterogeneous case, each image corresponds to a different conformation of the macromolecule. Under the assumption that the macromolecule can be modelled as a chain, or discrete curve (as it is for instance the case for a protein backbone with a single chain of amino-acids), we introduce a method to estimate the deformation of the atomic model with respect to a given conformation, which is assumed to be known a priori. Our method consists on estimating the torsion and bond angles of the atomic model in each conformation as a linear combination of the eigenfunctions of the Laplace operator in the manifold of conformations. These eigenfunctions can be approximated by means of a well-known technique in manifold learning, based on the construction of a graph Laplacian using the cryo-EM dataset. Finally, we test our approach with synthetic datasets, for which we recover the atomic model of two-dimensional and three-dimensional flexible structures from noisy tomographic projections. △ Less

Submitted 27 December, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

Comments: 35 pages,20 figures

arXiv:2108.11730 [pdf, other]

doi 10.1137/21M1445697

Deep learning based dictionary learning and tomographic image reconstruction

Authors: Jevgenija Rudzusika, Thomas Koehler, Ozan Öktem

Abstract: This work presents an approach for image reconstruction in clinical low-dose tomography that combines principles from sparse signal processing with ideas from deep learning. First, we describe sparse signal representation in terms of dictionaries from a statistical perspective and interpret dictionary learning as a process of aligning distribution that arises from a generative model with empirical… ▽ More This work presents an approach for image reconstruction in clinical low-dose tomography that combines principles from sparse signal processing with ideas from deep learning. First, we describe sparse signal representation in terms of dictionaries from a statistical perspective and interpret dictionary learning as a process of aligning distribution that arises from a generative model with empirical distribution of true signals. As a result we can see that sparse coding with learned dictionaries resembles a specific variational autoencoder, where the decoder is a linear function and the encoder is a sparse coding algorithm. Next, we show that dictionary learning can also benefit from computational advancements introduced in the context of deep learning, such as parallelism and as stochastic optimization. Finally, we show that regularization by dictionaries achieves competitive performance in computed tomography (CT) reconstruction comparing to state-of-the-art model based and data driven approaches. △ Less

Submitted 26 August, 2021; originally announced August 2021.

Comments: 34 pages, 5 figures

Journal ref: SIAM Journal on Imaging Sciences, Vol 15, Iss 4. (2002)

arXiv:2008.02839 [pdf, other]

Learned convex regularizers for inverse problems

Authors: Subhadip Mukherjee, Sören Dittmer, Zakhar Shumaylov, Sebastian Lunz, Ozan Öktem, Carola-Bibiane Schönlieb

Abstract: We consider the variational reconstruction framework for inverse problems and propose to learn a data-adaptive input-convex neural network (ICNN) as the regularization functional. The ICNN-based convex regularizer is trained adversarially to discern ground-truth images from unregularized reconstructions. Convexity of the regularizer is desirable since (i) one can establish analytical convergence g… ▽ More We consider the variational reconstruction framework for inverse problems and propose to learn a data-adaptive input-convex neural network (ICNN) as the regularization functional. The ICNN-based convex regularizer is trained adversarially to discern ground-truth images from unregularized reconstructions. Convexity of the regularizer is desirable since (i) one can establish analytical convergence guarantees for the corresponding variational reconstruction problem and (ii) devise efficient and provable algorithms for reconstruction. In particular, we show that the optimal solution to the variational problem converges to the ground-truth if the penalty parameter decays sub-linearly with respect to the norm of the noise. Further, we prove the existence of a sub-gradient-based algorithm that leads to a monotonically decreasing error in the parameter space with iterations. To demonstrate the performance of our approach for solving inverse problems, we consider the tasks of deblurring natural images and reconstructing images in computed tomography (CT), and show that the proposed convex regularizer is at least competitive with and sometimes superior to state-of-the-art data-driven techniques for inverse problems. △ Less

Submitted 1 March, 2021; v1 submitted 6 August, 2020; originally announced August 2020.

arXiv:1901.01388 [pdf, other]

Extraction of digital wavefront sets using applied harmonic analysis and deep neural networks

Authors: Héctor Andrade-Loarca, Gitta Kutyniok, Ozan Öktem, Philipp Petersen

Abstract: Microlocal analysis provides deep insight into singularity structures and is often crucial for solving inverse problems, predominately, in imaging sciences. Of particular importance is the analysis of wavefront sets and the correct extraction of those. In this paper, we introduce the first algorithmic approach to extract the wavefront set of images, which combines data-based and model-based method… ▽ More Microlocal analysis provides deep insight into singularity structures and is often crucial for solving inverse problems, predominately, in imaging sciences. Of particular importance is the analysis of wavefront sets and the correct extraction of those. In this paper, we introduce the first algorithmic approach to extract the wavefront set of images, which combines data-based and model-based methods. Based on a celebrated property of the shearlet transform to unravel information on the wavefront set, we extract the wavefront set of an image by first applying a discrete shearlet transform and then feeding local patches of this transform to a deep convolutional neural network trained on labeled data. The resulting algorithm outperforms all competing algorithms in edge-orientation and ramp-orientation detection. △ Less

Submitted 10 July, 2019; v1 submitted 5 January, 2019; originally announced January 2019.

MSC Class: 35A18; 65T60; 68T10

arXiv:1811.05910 [pdf, other]

doi 10.1515/9783111251233-011

Deep Bayesian Inversion

Authors: Jonas Adler, Ozan Öktem

Abstract: Characterizing statistical properties of solutions of inverse problems is essential for decision making. Bayesian inversion offers a tractable framework for this purpose, but current approaches are computationally unfeasible for most realistic imaging applications in the clinic. We introduce two novel deep learning based methods for solving large-scale inverse problems using Bayesian inversion: a… ▽ More Characterizing statistical properties of solutions of inverse problems is essential for decision making. Bayesian inversion offers a tractable framework for this purpose, but current approaches are computationally unfeasible for most realistic imaging applications in the clinic. We introduce two novel deep learning based methods for solving large-scale inverse problems using Bayesian inversion: a sampling based method using a WGAN with a novel mini-discriminator and a direct approach that trains a neural network using a novel loss function. The performance of both methods is demonstrated on image reconstruction in ultra low dose 3D helical CT. We compute the posterior mean and standard deviation of the 3D images followed by a hypothesis test to assess whether a "dark spot" in the liver of a cancer stricken patient is present. Both methods are computationally efficient and our evaluation shows very promising performance that clearly supports the claim that Bayesian inversion is usable for 3D imaging in time critical applications. △ Less

Submitted 14 November, 2018; originally announced November 2018.

Journal ref: Data-driven Models in Inverse Problems 2025

arXiv:1805.11572 [pdf, other]

Adversarial Regularizers in Inverse Problems

Authors: Sebastian Lunz, Ozan Öktem, Carola-Bibiane Schönlieb

Abstract: Inverse Problems in medical imaging and computer vision are traditionally solved using purely model-based methods. Among those variational regularization models are one of the most popular approaches. We propose a new framework for applying data-driven approaches to inverse problems, using a neural network as a regularization functional. The network learns to discriminate between the distribution… ▽ More Inverse Problems in medical imaging and computer vision are traditionally solved using purely model-based methods. Among those variational regularization models are one of the most popular approaches. We propose a new framework for applying data-driven approaches to inverse problems, using a neural network as a regularization functional. The network learns to discriminate between the distribution of ground truth images and the distribution of unregularized reconstructions. Once trained, the network is applied to the inverse problem by solving the corresponding variational problem. Unlike other data-based approaches for inverse problems, the algorithm can be applied even if only unsupervised training data is available. Experiments demonstrate the potential of the framework for denoising on the BSDS dataset and for computed tomography reconstruction on the LIDC dataset. △ Less

Submitted 11 January, 2019; v1 submitted 29 May, 2018; originally announced May 2018.

Comments: published at NeurIPS 2018

Showing 1–7 of 7 results for author: Öktem, O