adabmDCA 2.0 -- a flexible but easy-to-use package for Direct Coupling Analysis

Authors: Lorenzo Rosset, Roberto Netti, Anna Paola Muntoni, Martin Weigt, Francesco Zamponi

Abstract: In this methods article, we provide a flexible but easy-to-use implementation of Direct Coupling Analysis (DCA) based on Boltzmann machine learning, together with a tutorial on how to use it. The package \texttt{adabmDCA 2.0} is available in different programming languages (C++, Julia, Python) usable on different architectures (single-core and multi-core CPU, GPU) using a common front-end interface. In addition to several learning protocols for dense and sparse generative DCA models, it allows users to directly address common downstream tasks like residue-residue contact prediction, mutational-effect prediction, scoring of sequence libraries and generation of artificial sequences for sequence design. It is readily applicable to protein and RNA sequence data.

Submitted 30 January, 2025; originally announced January 2025.

arXiv:2312.00910 [pdf, other]

doi 10.1093/pnasnexus/pgae377

Effectiveness of probabilistic contact tracing in epidemic containment: the role of super-spreaders and transmission path reconstruction

Authors: A. P. Muntoni, F. Mazza, A. Braunstein, G. Catania, L. Dall'Asta

Abstract: The recent COVID-19 pandemic underscores the significance of early-stage non-pharmacological intervention strategies. The widespread use of masks and the systematic implementation of contact tracing strategies provide a potentially equally effective and socially less impactful alternative to more conventional approaches, such as large-scale mobility restrictions. However, manual contact tracing faces strong limitations in accessing the network of contacts, and the scalability of currently implemented protocols for smartphone-based digital contact tracing becomes impractical during the rapid expansion phases of the outbreaks, due to the surge in exposure notifications and associated tests. A substantial improvement in digital contact tracing can be obtained through the integration of probabilistic techniques for risk assessment that can more effectively guide the allocation of new diagnostic tests. In this study, we first quantitatively analyze the diagnostic and social costs associated with these containment measures based on contact tracing, employing three state-of-the-art models of SARS-CoV-2 spreading. Our results suggest that probabilistic techniques allow for more effective mitigation at a lower cost. Secondly, our findings reveal a remarkable efficacy of probabilistic contact-tracing techniques in performing backward and multi-step tracing and capturing super-spreading events.

Submitted 30 August, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

arXiv:2309.01540 [pdf, other]

doi 10.1093/bioinformatics/btad537

DCAlign v1.0: Aligning biological sequences using co-evolution models and informed priors

Authors: Anna Paola Muntoni, Andrea Pagnani

Abstract: DCAlign is a new alignment method able to cope with the conservation and the co-evolution signals that characterize the columns of multiple sequence alignments of homologous sequences. However, the pre-processing steps required to align a candidate sequence are computationally demanding. We show in v1.0 how to dramatically reduce the overall computing time by including an empirical prior over an informative set of variables mirroring the presence of insertions and deletions.

Submitted 4 September, 2023; originally announced September 2023.

arXiv:2210.11167 [pdf, other]

doi 10.1088/1478-3975/acc1bc

Optimal metabolic strategies for microbial growth in stationary random environments

Authors: Anna Paola Muntoni, Andrea De Martino

Abstract: In order to grow in any given environment, bacteria need to collect information about the medium composition and implement suitable growth strategies by adjusting their regulatory and metabolic degrees of freedom. In the standard sense, optimal strategy selection is achieved when bacteria grow at the fastest rate possible in that medium. While this view of optimality is well suited for cells that have perfect knowledge about their surroundings (e.g. nutrient levels), things are more involved in uncertain or fluctuating conditions, especially when changes occur over timescales comparable to (or faster than) those required to organize a response. Information theory however provides recipes for how cells can choose the optimal growth strategy under uncertainty about the stress levels they will face. Here we analyse the theoretically optimal scenarios for a coarse-grained, experiment-inspired model of bacterial metabolism for growth in a medium described by the (static) probability density of a single variable (the `stress level'). We show that heterogeneity in growth rates consistently emerges as the optimal response when the environment is sufficiently complex and/or when perfect adjustment of metabolic degrees of freedom is not possible (e.g. due to limited resources). In addition, outcomes close to those achievable with unlimited resources are often attained effectively with a modest amount of fine tuning. In other terms, heterogeneous population structures in complex media may be rather robust with respect to the resources available to probe the environment and adjust reaction rates.

Submitted 21 March, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

arXiv:2210.10179 [pdf, other]

doi 10.1038/s41598-023-33770-3

Inference in conditioned dynamics through causality restoration

Authors: Alfredo Braunstein, Giovanni Catania, Luca Dall'Asta, Matteo Mariani, Anna Paola Muntoni

Abstract: Computing observables from conditioned dynamics is typically computationally hard, because, although obtaining independent samples efficiently from the unconditioned dynamics is usually feasible, generally most of the samples must be discarded (in a form of importance sampling) because they do not satisfy the imposed conditions. Sampling directly from the conditioned distribution is non-trivial, as conditioning breaks the causal properties of the dynamics which ultimately render the sampling procedure efficient. One standard way of achieving it is through a Metropolis Monte-Carlo procedure, but this procedure is normally slow and a very large number of Monte-Carlo steps is needed to obtain a small number of statistically independent samples. In this work, we propose an alternative method to produce independent samples from a conditioned distribution. The method learns the parameters of a generalized dynamical model that optimally describe the conditioned distribution in a variational sense. The outcome is an effective, unconditioned, dynamical model, from which one can trivially obtain independent samples, effectively restoring causality of the conditioned distribution. The consequences are twofold: on the one hand, it allows us to efficiently compute observables from the conditioned dynamics by simply averaging over independent samples. On the other hand, the method gives an effective unconditioned distribution which is easier to interpret. The method is flexible and can be applied virtually to any dynamics. We discuss an important application of the method, namely the problem of epidemic risk assessment from (imperfect) clinical tests, for a large family of time-continuous epidemic models endowed with a Gillespie-like sampler. We show that the method compares favorably against the state of the art, including the soft-margin approach and mean-field methods.

Submitted 30 March, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

Comments: 22 pages, 7 figures

arXiv:2109.04105 [pdf, other]

doi 10.1186/s12859-021-04441-9

adabmDCA: Adaptive Boltzmann machine learning for biological sequences

Authors: Anna Paola Muntoni, Andrea Pagnani, Martin Weigt, Francesco Zamponi

Abstract: Boltzmann machines are energy-based models that have been shown to provide an accurate statistical description of domains of evolutionarily related protein and RNA families. They are parametrized in terms of local biases accounting for residue conservation, and pairwise terms to model epistatic coevolution between residues. From the model parameters, it is possible to extract an accurate prediction of the three-dimensional contact map of the target domain. More recently, the accuracy of these models has also been assessed in terms of their ability to predict mutational effects and generate in silico functional sequences. Our adaptive implementation of Boltzmann machine learning, adabmDCA, can be generally applied to both protein and RNA families and accomplishes several learning set-ups, depending on the complexity of the input data and on the user requirements. The code is fully available at https://github.com/anna-pa-m/adabmDCA. As an example, we have performed the learning of three Boltzmann machines modeling the Kunitz and Beta-lactamase2 protein domains and TPP-riboswitch RNA domain. The models learned by adabmDCA are comparable to those obtained by state-of-the-art techniques for this task, in terms of the quality of the inferred contact map as well as of the synthetically generated sequences. In addition, the code implements both equilibrium and out-of-equilibrium learning, which allows for an accurate and lossless training when the equilibrium one is prohibitive in terms of computational time, and allows for pruning irrelevant parameters using an information-based criterion.

Submitted 2 November, 2021; v1 submitted 9 September, 2021; originally announced September 2021.

Journal ref: BMC Bioinformatics 22, 528 (2021)

arXiv:2104.02594 [pdf, other]

doi 10.1016/j.bpj.2022.04.012

Relationship between fitness and heterogeneity in exponentially growing microbial populations

Authors: Anna Paola Muntoni, Alfredo Braunstein, Andrea Pagnani, Daniele De Martino, Andrea De Martino

Abstract: Despite major environmental and genetic differences, microbial metabolic networks are known to generate consistent physiological outcomes across vastly different organisms. This remarkable robustness suggests that, at least in bacteria, metabolic activity may be guided by universal principles. The constrained optimization of evolutionarily-motivated objective functions like the growth rate has emerged as the key theoretical assumption for the study of bacterial metabolism. While conceptually and practically useful in many situations, the idea that certain functions are optimized is hard to validate in data. Moreover, it is not always clear how optimality can be reconciled with the high degree of single-cell variability observed in experiments within microbial populations. To shed light on these issues, we develop an inverse modeling framework that connects the fitness of a population of cells (represented by the mean single-cell growth rate) to the underlying metabolic variability through the Maximum-Entropy inference of the distribution of metabolic phenotypes from data. While no clear objective function emerges, we find that, as the medium gets richer, the fitness and inferred variability for Escherichia coli populations follow and slowly approach the theoretically optimal bound defined by minimal reduction of variability at given fitness. These results suggest that bacterial metabolism may be crucially shaped by a population-level trade-off between growth and heterogeneity.

Submitted 7 April, 2022; v1 submitted 6 April, 2021; originally announced April 2021.

Comments: 12+30 pages (includes Supporting Text)

arXiv:2011.11259 [pdf, other]

doi 10.1103/PhysRevE.104.024407

Sparse generative modeling via parameter-reduction of Boltzmann machines: application to protein-sequence families

Authors: Pierre Barrat-Charlaix, Anna Paola Muntoni, Kai Shimagaki, Martin Weigt, Francesco Zamponi

Abstract: Boltzmann machines (BM) are widely used as generative models. For example, pairwise Potts models (PM), which are instances of the BM class, provide accurate statistical models of families of evolutionarily related protein sequences. Their parameters are the local fields, which describe site-specific patterns of amino-acid conservation, and the two-site couplings, which mirror the coevolution between pairs of sites. This coevolution reflects structural and functional constraints acting on protein sequences during evolution. The most conservative choice to describe the coevolution signal is to include all possible two-site couplings into the PM. This choice, typical of what is known as Direct Coupling Analysis, has been successful for predicting residue contacts in the three-dimensional structure, mutational effects, and in generating new functional sequences. However, the resulting PM suffers from important over-fitting effects: many couplings are small, noisy and hardly interpretable; the PM is close to a critical point, meaning that it is highly sensitive to small parameter perturbations. In this work, we introduce a general parameter-reduction procedure for BMs, via a controlled iterative decimation of the less statistically significant couplings, identified by an information-based criterion that selects either weak or statistically unsupported couplings. For several protein families, our procedure allows one to remove more than $90\%$ of the PM couplings, while preserving the predictive and generative properties of the original dense PM, and the resulting model is far away from criticality, hence more robust to noise.

Submitted 30 July, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

Comments: 7 pages, 5 figures, plus Appendix

Journal ref: Phys. Rev. E 104, 024407 (2021)

arXiv:2010.13746 [pdf, other]

doi 10.1088/1742-5468/abed43

A Density Consistency approach to the inverse Ising problem

Authors: Alfredo Braunstein, Giovanni Catania, Luca Dall'Asta, Anna Paola Muntoni

Abstract: We propose a novel approach to the inverse Ising problem which employs the recently introduced Density Consistency approximation (DC) to determine the model parameters (couplings and external fields) maximizing the likelihood of given empirical data. This method allows for closed-form expressions of the inferred parameters as a function of the first and second empirical moments. Such expressions have a similar structure to the small-correlation expansion derived by Sessak and Monasson, of which they provide an improvement in the case of non-zero magnetization at low temperatures, as well as in the presence of random external fields. The present work provides an extensive comparison with the most common inference methods used to reconstruct the model parameters in several regimes, i.e. by varying both the network topology and the distribution of fields and couplings. The comparison shows that no method is uniformly better than every other one, but DC appears nevertheless as one of the most accurate and reliable approaches to infer couplings and fields from first and second moments in a significant range of parameters.

Submitted 19 January, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

Comments: 15 pages, 4 figures

ACM Class: I.2.6; G.3

Journal ref: J. Stat. Mech. (2021) 033416

arXiv:2009.09422 [pdf, other]

doi 10.1073/pnas.2106548118

Epidemic mitigation by statistical inference from contact tracing data

Authors: Antoine Baker, Indaco Biazzo, Alfredo Braunstein, Giovanni Catania, Luca Dall'Asta, Alessandro Ingrosso, Florent Krzakala, Fabio Mazza, Marc Mézard, Anna Paola Muntoni, Maria Refinetti, Stefano Sarao Mannelli, Lenka Zdeborová

Abstract: Contact-tracing is an essential tool in order to mitigate the impact of pandemics such as COVID-19. In order to achieve efficient and scalable contact-tracing in real time, digital devices can play an important role. While a lot of attention has been paid to analyzing the privacy and ethical risks of the associated mobile applications, so far much less research has been devoted to optimizing their performance and assessing their impact on the mitigation of the epidemic. We develop Bayesian inference methods to estimate the risk that an individual is infected. This inference is based on the list of his recent contacts and their own risk levels, as well as personal information such as results of tests or presence of syndromes. We propose to use probabilistic risk estimation in order to optimize testing and quarantining strategies for the control of an epidemic. Our results show that in some range of epidemic spreading (typically when the manual tracing of all contacts of infected people becomes practically impossible, but before the fraction of infected people reaches the scale where a lock-down becomes unavoidable), this inference of individuals at risk could be an efficient way to mitigate the epidemic. Our approaches translate into fully distributed algorithms that only require communication between individuals who have recently been in contact. Such communication may be encrypted and anonymized and thus compatible with privacy preserving standards. We conclude that probabilistic risk estimation is capable of enhancing the performance of digital contact tracing and should be considered in the currently developed mobile applications.

Submitted 20 September, 2020; originally announced September 2020.

Comments: 21 pages, 7 figures

ACM Class: G.3; G.4; I.2.11; J.3

Journal ref: PNAS 2021 Vol. 118 No. 32 e2106548118

arXiv:2005.08500 [pdf, other]

doi 10.1103/PhysRevE.102.062409

Aligning biological sequences by exploiting residue conservation and coevolution

Authors: Anna Paola Muntoni, Andrea Pagnani, Martin Weigt, Francesco Zamponi

Abstract: Sequences of nucleotides (for DNA and RNA) or amino acids (for proteins) are central objects in biology. Among the most important computational problems is that of sequence alignment, i.e. arranging sequences from different organisms in such a way as to identify similar regions, to detect evolutionary relationships between sequences, and to predict biomolecular structure and function. This is typically addressed through profile models, which capture position-specificities like conservation in sequences, but assume an independent evolution of different positions. Over the last years, it has been well established that coevolution of different amino-acid positions is essential for maintaining three-dimensional structure and function. Modeling approaches based on inverse statistical physics can catch the coevolution signal in sequence ensembles; and they are now widely used in predicting protein structure, protein-protein interactions, and mutational landscapes. Here, we present DCAlign, an efficient alignment algorithm based on an approximate message-passing strategy, which is able to overcome the limitations of profile models, to include coevolution among positions in a general way, and to be therefore universally applicable to protein- and RNA-sequence alignment without the need of using complementary structural information. The potential of DCAlign is carefully explored using well-controlled simulated data, as well as real protein and RNA sequences.

Submitted 13 November, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

Comments: 20 pages, 11 figures + Supplementary Information

Journal ref: Phys. Rev. E 102, 062409 (2020)

arXiv:1904.05777 [pdf, other]

doi 10.1088/1751-8121/ab3065

Compressed sensing reconstruction using Expectation Propagation

Authors: Alfredo Braunstein, Anna Paola Muntoni, Andrea Pagnani, Mirko Pieropan

Abstract: Many interesting problems in fields ranging from telecommunications to computational biology can be formalized in terms of large underdetermined systems of linear equations with additional constraints or regularizers. One of the most studied ones, the Compressed Sensing problem (CS), consists in finding the solution with the smallest number of non-zero components of a given system of linear equations $\boldsymbol y = \mathbf{F} \boldsymbol{w}$ for known measurement vector $\boldsymbol{y}$ and sensing matrix $\mathbf{F}$. Here, we will address the compressed sensing problem within a Bayesian inference framework where the sparsity constraint is remapped into a singular prior distribution (called Spike-and-Slab or Bernoulli-Gauss). Solution to the problem is attempted through the computation of marginal distributions via Expectation Propagation (EP), an iterative computational scheme originally developed in Statistical Physics. We will show that this strategy is comparatively more accurate than the alternatives in solving instances of CS generated from statistically correlated measurement matrices. For computational strategies based on the Bayesian framework such as variants of Belief Propagation, this is to be expected, as they implicitly rely on the hypothesis of statistical independence among the entries of the sensing matrix. Perhaps surprisingly, the method also uniformly outperforms all the other state-of-the-art methods in our tests.

Submitted 3 August, 2019; v1 submitted 10 April, 2019; originally announced April 2019.

Comments: 20 pages, 6 figures

arXiv:1809.03958 [pdf, other]

doi 10.1103/PhysRevE.100.032134

Non-convex image reconstruction via Expectation Propagation

Authors: Anna Paola Muntoni, Rafael Díaz Hernández Rojas, Alfredo Braunstein, Andrea Pagnani, Isaac Pérez Castillo

Abstract: Tomographic image reconstruction can be mapped to a problem of finding solutions to a large system of linear equations which maximize a function that includes \textit{a priori} knowledge regarding features of typical images such as smoothness or sharpness. This maximization can be performed with standard local optimization tools when the function is concave, but it is generally intractable for realistic priors, which are non-concave. We introduce a new method to reconstruct images obtained from Radon projections by using Expectation Propagation, which allows us to reframe the problem from a Bayesian inference perspective. We show, by means of extensive simulations, that, compared to state-of-the-art algorithms for this task, Expectation Propagation paired with very simple but non log-concave priors is often able to reconstruct images up to a smaller error while using a lower amount of information per pixel. We provide estimates for the critical rate of information per pixel above which recovery is error-free by means of simulations on ensembles of phantom and real images.

Submitted 11 September, 2018; originally announced September 2018.

Comments: 12 pages, 6 figures

Journal ref: Phys. Rev. E 100, 032134 (2019)

arXiv:1712.07041 [pdf, other]

doi 10.1088/1742-5468/aaeb3f

The cavity approach for Steiner trees packing problems

Authors: Alfredo Braunstein, Anna Paola Muntoni

Abstract: The Belief Propagation approximation, or cavity method, has been recently applied to several combinatorial optimization problems in its zero-temperature implementation, the max-sum algorithm. In particular, recent developments to solve the edge-disjoint paths problem and the prize-collecting Steiner tree problem on graphs have shown remarkable results for several classes of graphs and for benchmark instances. Here we propose a generalization of these techniques for two variants of the Steiner trees packing problem where multiple "interacting" trees have to be sought within a given graph. Depending on the interaction among trees we distinguish the vertex-disjoint Steiner trees problem, where trees cannot share nodes, from the edge-disjoint Steiner trees problem, where edges cannot be shared by trees but nodes can be members of multiple trees. Several practical problems of huge interest in network design can be mapped into these two variants, for instance, the physical design of Very Large Scale Integration (VLSI) chips. The formalism described here relies on two-component edge variables that allow us to formulate a message-passing algorithm for the V-DStP and two algorithms for the E-DStP differing in the scaling of the computational time with respect to some relevant parameters. We will show that one of the two formalisms used for the edge-disjoint variant allows us to map the max-sum update equations into a weighted maximum matching problem over proper bipartite graphs. We developed a heuristic procedure based on the max-sum equations that shows excellent performance in synthetic networks (in particular outperforming standard multi-step greedy procedures by large margins) and on large benchmark instances of VLSI for which the optimal solution is known, on which the algorithm found the optimum in two cases and the gap to optimality was never larger than 4%.

Submitted 3 January, 2019; v1 submitted 19 December, 2017; originally announced December 2017.

Journal ref: J. Stat. Mech. (2018) 123401

arXiv:1702.05400 [pdf, other]

doi 10.1038/ncomms14915

An analytic approximation of the feasible space of metabolic networks

Authors: Alfredo Braunstein, Anna Paola Muntoni, Andrea Pagnani

Abstract: Assuming a steady-state condition within a cell, metabolic fluxes satisfy an under-determined linear system of stoichiometric equations. Characterizing the space of fluxes that satisfy such equations along with given bounds (and possibly additional relevant constraints) is considered of utmost importance for the understanding of cellular metabolism. Extreme values for each individual flux can be computed with Linear Programming (as Flux Balance Analysis), and their marginal distributions can be approximately computed with Monte-Carlo sampling. Here we present an approximate analytic method for the latter task based on Expectation Propagation equations that does not involve sampling and can achieve much better predictions than other existing analytic methods. The method is iterative, and its computation time is dominated by one matrix inversion per iteration. With respect to sampling, we show through extensive simulation that it has some advantages including computation time, and the ability to efficiently fix empirically estimated distributions of fluxes.

Submitted 6 April, 2017; v1 submitted 17 February, 2017; originally announced February 2017.

Journal ref: Nature Communications 8 14915, 2017

arXiv:1609.00432 [pdf, other]

doi 10.1098/rsif.2018.0844

Network reconstruction from infection cascades

Authors: Alfredo Braunstein, Alessandro Ingrosso, Anna Paola Muntoni

Abstract: Accessing the network through which a propagation dynamics diffuses is essential for understanding and controlling it. In a few cases, such information is available through direct experiments or thanks to the very nature of propagation data. In a majority of cases however, available information about the network is indirect and comes from partial observations of the dynamics, rendering the network reconstruction a fundamental inverse problem. Here we show that it is possible to reconstruct the whole structure of an interaction network and to simultaneously infer the complete time course of activation spreading, relying just on single epoch (i.e. snapshot) or time-scattered observations of a small number of activity cascades. The method that we present is built on a Belief Propagation approximation, that has shown impressive accuracy in a wide variety of relevant cases, and is able to infer interactions in the presence of incomplete time-series data by providing a detailed modeling of the posterior distribution of trajectories conditioned on the observations. Furthermore, we show by experiments that the information content of full cascades is relatively smaller than that of sparse observations or single snapshots.

Submitted 12 February, 2018; v1 submitted 1 September, 2016; originally announced September 2016.

Comments: 18 pages, 10 figures (main text: 13 pages, 9 figures; Appendix: 4 pages, 1 figure)

Showing 1–16 of 16 results for author: Muntoni, A P