Search | arXiv e-print repository

Efficient high-resolution refinement in cryo-EM with stochastic gradient descent

Authors: Bogdan Toader, Marcus A. Brubaker, Roy R. Lederman

Abstract: Electron cryomicroscopy (cryo-EM) is an imaging technique widely used in structural biology to determine the three-dimensional structure of biological molecules from noisy two-dimensional projections with unknown orientations. As the typical pipeline involves processing large amounts of data, efficient algorithms are crucial for fast and reliable results. The stochastic gradient descent (SGD) algorithm has been used to improve the speed of ab initio reconstruction, which results in a first, low-resolution estimation of the volume representing the molecule of interest, but has yet to be applied successfully in the high-resolution regime, where expectation-maximization algorithms achieve state-of-the-art results, at a high computational cost. In this article, we investigate the conditioning of the optimization problem and show that the large condition number prevents the successful application of gradient descent-based methods at high resolution. Our results include a theoretical analysis of the condition number of the optimization problem in a simplified setting where the individual projection directions are known, an algorithm based on computing a diagonal preconditioner using Hutchinson's diagonal estimator, and numerical experiments showing the improvement in the convergence speed when using the estimated preconditioner with SGD. The preconditioned SGD approach can potentially enable a simple and unified approach to ab initio reconstruction and high-resolution refinement with faster convergence speed and higher flexibility, and our results are a promising step in this direction.

Submitted 30 October, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: 22 pages, 7 figures

arXiv:2304.14248 [pdf, other]

On Manifold Learning in Plato's Cave: Remarks on Manifold Learning and Physical Phenomena

Authors: Roy R. Lederman, Bogdan Toader

Abstract: Many techniques in machine learning attempt explicitly or implicitly to infer a low-dimensional manifold structure of an underlying physical phenomenon from measurements without an explicit model of the phenomenon or the measurement apparatus. This paper presents a cautionary tale regarding the discrepancy between the geometry of measurements and the geometry of the underlying phenomenon in a benign setting. The deformation in the metric illustrated in this paper is mathematically straightforward and unavoidable in the general case, and it is only one of several similar effects. While this is not always problematic, we provide an example of an arguably standard and harmless data processing procedure where this effect leads to an incorrect answer to a seemingly simple question. Although we focus on manifold learning, these issues apply broadly to dimensionality reduction and unsupervised learning.

Submitted 30 June, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

Comments: 7 pages, 9 figures

arXiv:2211.10744 [pdf, other]

Methods for Cryo-EM Single Particle Reconstruction of Macromolecules having Continuous Heterogeneity

Authors: Bogdan Toader, Fred J. Sigworth, Roy R. Lederman

Abstract: Macromolecules change their shape (conformation) in the process of carrying out their functions. The imaging by cryo-electron microscopy of rapidly-frozen, individual copies of macromolecules (single particles) is a powerful and general approach to understanding the motions and energy landscapes of macromolecules. Widely-used computational methods already allow the recovery of a few distinct conformations from heterogeneous single-particle samples, but the treatment of complex forms of heterogeneity such as the continuum of possible transitory states and flexible regions remains largely an open problem. In recent years there has been a surge of new approaches for treating the more general problem of continuous heterogeneity. This paper surveys the current state of the art in this area.

Submitted 19 November, 2022; originally announced November 2022.

Comments: 20 pages, 2 figures

arXiv:2211.10518 [pdf]

Integrating molecular models into CryoEM heterogeneity analysis using scalable high-resolution deep Gaussian mixture models

Authors: Muyuan Chen, Bogdan Toader, Roy Lederman

Abstract: Resolving the structural variability of proteins is often key to understanding the structure-function relationship of those macromolecular machines. Single particle analysis using Cryogenic electron microscopy (CryoEM), combined with machine learning algorithms, provides a way to reveal the dynamics within the protein system from noisy micrographs. Here, we introduce an improved computational method that uses Gaussian mixture models for protein structure representation and deep neural networks for conformation space embedding. By integrating information from molecular models into the heterogeneity analysis, we can resolve complex protein conformational changes at near atomic resolution and present the results in a more interpretable form.

Submitted 18 November, 2022; originally announced November 2022.

arXiv:2108.03642 [pdf, other]

Image reconstruction in light-sheet microscopy: spatially varying deconvolution and mixed noise

Authors: Bogdan Toader, Jerome Boulanger, Yury Korolev, Martin O. Lenz, James Manton, Carola-Bibiane Schonlieb, Leila Muresan

Abstract: We study the problem of deconvolution for light-sheet microscopy, where the data is corrupted by spatially varying blur and a combination of Poisson and Gaussian noise. The spatial variation of the point spread function (PSF) of a light-sheet microscope is determined by the interaction between the excitation sheet and the detection objective PSF. First, we introduce a model of the image formation process that incorporates this interaction, therefore capturing the main characteristics of this imaging modality. Then, we formulate a variational model that accounts for the combination of Poisson and Gaussian noise through a data fidelity term consisting of the infimal convolution of the single noise fidelities, first introduced in L. Calatroni et al. "Infimal convolution of data discrepancies for mixed noise removal", SIAM Journal on Imaging Sciences 10.3 (2017), 1196-1233. We establish convergence rates in a Bregman distance under a source condition for the infimal convolution fidelity and a discrepancy principle for choosing the value of the regularisation parameter. The inverse problem is solved by applying the primal-dual hybrid gradient (PDHG) algorithm in a novel way. Finally, numerical experiments performed on both simulated and real data show superior reconstruction results in comparison with other methods.

Submitted 8 August, 2021; originally announced August 2021.

Comments: 34 pages, 13 figures

arXiv:2007.02708 [pdf, other]

The dual approach to non-negative super-resolution: perturbation analysis

Authors: Stéphane Chrétien, Andrew Thompson, Bogdan Toader

Abstract: We study the problem of super-resolution, where we recover the locations and weights of non-negative point sources from a few samples of their convolution with a Gaussian kernel. It has been shown that exact recovery is possible by minimising the total variation norm of the measure, and a practical way of achieving this is by solving the dual problem. In this paper, we study the stability of solutions with respect to the solutions to the dual problem, both in the case of exact measurements and in the case of measurements with additive noise. In particular, we establish a relationship between perturbations in the dual variable and perturbations in the primal variable around the optimiser and a similar relationship between perturbations in the dual variable around the optimiser and the magnitude of the additive noise in the measurements. Our analysis is based on a quantitative version of the implicit function theorem.

Submitted 4 July, 2023; v1 submitted 6 July, 2020; originally announced July 2020.

Comments: 35 pages, 5 figures

arXiv:1904.01926 [pdf, ps, other]

The dual approach to non-negative super-resolution: impact on primal reconstruction accuracy

Authors: Stephane Chretien, Andrew Thompson, Bogdan Toader

Abstract: We study the problem of super-resolution, where we recover the locations and weights of non-negative point sources from a few samples of their convolution with a Gaussian kernel. It has been recently shown that exact recovery is possible by minimising the total variation norm of the measure. An alternative practical approach is to solve its dual. In this paper, we study the stability of solutions with respect to the solutions to the dual problem. In particular, we establish a relationship between perturbations in the dual variable and the primal variables around the optimiser. This is achieved by applying a quantitative version of the implicit function theorem in a non-trivial way.

Submitted 8 May, 2019; v1 submitted 3 April, 2019; originally announced April 2019.

Comments: 4 pages double column

arXiv:1804.01490 [pdf, ps, other]

Sparse non-negative super-resolution -- simplified and stabilised

Authors: Armin Eftekhari, Jared Tanner, Andrew Thompson, Bogdan Toader, Hemant Tyagi

Abstract: The convolution of a discrete measure, $x=\sum_{i=1}^ka_iδ_{t_i}$, with a local window function, $φ(s-t)$, is a common model for a measurement device whose resolution is substantially lower than that of the objects being observed. Super-resolution concerns localising the point sources $\{a_i,t_i\}_{i=1}^k$ with an accuracy beyond the essential support of $φ(s-t)$, typically from $m$ samples $y(s_j)=\sum_{i=1}^k a_iφ(s_j-t_i)+η_j$, where $η_j$ indicates an inexactness in the sample value. We consider the setting of $x$ being non-negative and seek to characterise all non-negative measures approximately consistent with the samples. We first show that $x$ is the unique non-negative measure consistent with the samples provided the samples are exact, i.e. $η_j=0$, $m\ge 2k+1$ samples are available, and $φ(s-t)$ generates a Chebyshev system. This is independent of how close the sample locations are and {\em does not rely on any regulariser beyond non-negativity}; as such, it extends and clarifies the work by Schiebinger et al. and De Castro et al., who achieve the same results but require a total variation regulariser, which we show is unnecessary. Moreover, we characterise non-negative solutions $\hat{x}$ consistent with the samples within the bound $\sum_{j=1}^mη_j^2\le δ^2$. Any such non-negative measure is within ${\mathcal O}(δ^{1/7})$ of the discrete measure $x$ generating the samples in the generalised Wasserstein distance, converging to one another as $δ$ approaches zero. We also show how to make these general results, for windows that form a Chebyshev system, precise for the case of $φ(s-t)$ being a Gaussian window. The main innovation of these results is that non-negativity alone is sufficient to localise point sources beyond the essential sensor resolution.

Submitted 26 November, 2019; v1 submitted 4 April, 2018; originally announced April 2018.

Comments: 59 pages, 7 figures

arXiv:1001.4606 [pdf, ps, other]

On the dimension of the space of integrals on coalgebras

Authors: S. Dăscălescu, C. Năstăsescu, B. Toader

Abstract: We study the injective envelopes of the simple right $C$-comodules, and their duals, where $C$ is a coalgebra. This is used to give a short proof and to extend a result of Iovanov on the dimension of the space of integrals on coalgebras. We show that if $C$ is right co-Frobenius, then the dimension of the space of left $M$-integrals on $C$ is $\leq {\rm dim}M$ for any left $C$-comodule $M$ of finite support, and the dimension of the space of right $N$-integrals on $C$ is $\geq {\rm dim}N$ for any right $C$-comodule $N$ of finite support. For a coalgebra $C$, we discuss how far the dual algebra $C^*$ is from being semiperfect. Some examples of integrals are computed for incidence coalgebras.

Submitted 26 January, 2010; originally announced January 2010.

MSC Class: 16W30

Showing 1–9 of 9 results for author: Toader, B