-
Characterisation of conserved and reacting moieties in chemical reaction networks
Authors:
Hadjar Rahou,
Hulda S. Haraldsdottir,
Filippo Martinelli,
Ines Thiele,
Ronan M. T. Fleming
Abstract:
A detailed understanding of biochemical networks at the molecular level is essential for studying complex cellular processes. In this paper, we provide a comprehensive description of biochemical networks by considering individual atoms and chemical bonds. To address combinatorial complexity, we introduce a well-established approach to group similar types of information within biochemical networks. A conserved moiety is a set of atoms whose association is invariant across all reactions in a network. A reacting moiety is a set of bonds that are either broken, formed, or undergo a change in bond order in at least one reaction in the network. By mathematically identifying these moieties, we establish the biological significance of conserved and reacting moieties according to the mathematical properties of the stoichiometric matrix. We also present a novel decomposition of the stoichiometric matrix based on conserved moieties. This approach bridges the gap between graph theory, linear algebra, and biological interpretation, thus opening up new horizons in the study of chemical reaction networks.
Submitted 7 April, 2025;
originally announced April 2025.
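The link between conserved moieties and the stoichiometric matrix described in the abstract above can be illustrated with a small check. This is a minimal sketch, not the authors' method: it assumes a closed toy network with one hexokinase-like reaction, and it relies on the standard observation that a vector counting copies of a conserved moiety per species is orthogonal to every column of the stoichiometric matrix. The names `SPECIES`, `S`, and `is_conserved` are hypothetical.

```python
# Toy network with a single reaction (hexokinase-like), chosen for illustration:
#   ATP + glucose -> ADP + G6P
# Rows are species, columns are reactions.
SPECIES = ["ATP", "ADP", "glucose", "G6P"]
S = [
    [-1],  # ATP is consumed
    [+1],  # ADP is produced
    [-1],  # glucose is consumed
    [+1],  # G6P is produced
]


def is_conserved(moiety_counts, stoich):
    """Return True if moiety_counts^T * stoich is zero for every reaction."""
    n_species = len(stoich)
    n_reactions = len(stoich[0])
    return all(
        sum(moiety_counts[i] * stoich[i][j] for i in range(n_species)) == 0
        for j in range(n_reactions)
    )


# Copies of each candidate moiety per species, in the order of SPECIES.
adenosine = [1, 1, 0, 0]  # one copy in ATP and in ADP
phosphate = [3, 2, 0, 1]  # three in ATP, two in ADP, one in G6P
atp_only = [1, 0, 0, 0]   # counts whole ATP molecules, which are consumed

print(is_conserved(adenosine, S))  # True
print(is_conserved(phosphate, S))  # True
print(is_conserved(atp_only, S))   # False
```

The check only verifies a proposed moiety vector; identifying moieties from atom and bond mappings, as the paper describes, is a separate and harder step.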
-
Data Augmentation Scheme for Raman Spectra with Highly Correlated Annotations
Authors:
Christoph Lange,
Isabel Thiele,
Lara Santolin,
Sebastian L. Riedel,
Maxim Borisyak,
Peter Neubauer,
M. Nicolas Cruz Bournazou
Abstract:
In biotechnology, Raman spectroscopy is rapidly gaining popularity as a process analytical technology (PAT) that measures cell densities as well as substrate and product concentrations. As it records vibrational modes of molecules, it provides that information non-invasively in a single spectrum. Typically, partial least squares (PLS) is the model of choice to infer information about variables of interest from the spectra. However, biological processes are known for their complexity, where convolutional neural networks (CNN) present a powerful alternative. They can handle non-Gaussian noise and account for beam misalignment, pixel malfunctions or the presence of additional substances. However, they require a lot of data during model training, and they pick up non-linear dependencies in the process variables. In this work, we exploit the additive nature of spectra in order to generate additional data points from a given dataset that have statistically independent labels so that a network trained on such data exhibits low correlations between the model predictions. We show that training a CNN on these generated data points improves the performance on datasets where the annotations do not bear the same correlation as the dataset that was used for model training. This data augmentation technique enables us to reuse spectra as training data for new contexts that exhibit different correlations. The additional data allows for building a better and more robust model. This is of interest in scenarios where large amounts of historical data are available but are currently not used for model training. We demonstrate the capabilities of the proposed method using synthetic spectra of Ralstonia eutropha batch cultivations to monitor substrate, biomass and polyhydroxyalkanoate (PHA) biopolymer concentrations during the experiments.
Submitted 1 February, 2024;
originally announced February 2024.
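The additivity that the Raman abstract above relies on can be sketched in a few lines. This is an assumption-laden illustration, not the paper's augmentation scheme: it assumes a spectrum is linear in the underlying concentrations, so a weighted sum of two spectra corresponds to the same weighted sum of their labels. The function `mix` and the sample values are hypothetical, and the sketch does not implement the step that makes the generated labels statistically independent.

```python
def mix(sample_a, sample_b, weight_a, weight_b):
    """Linearly combine two (spectrum, labels) pairs.

    Assumes spectra are additive in the labelled concentrations, so the
    same weights apply to the intensities and to the labels.
    """
    spectrum_a, labels_a = sample_a
    spectrum_b, labels_b = sample_b
    spectrum = [weight_a * x + weight_b * y for x, y in zip(spectrum_a, spectrum_b)]
    labels = [weight_a * x + weight_b * y for x, y in zip(labels_a, labels_b)]
    return spectrum, labels


# Two made-up samples: (intensities per wavenumber bin, [substrate, biomass]).
sample_1 = ([1.0, 2.0, 3.0], [1.0, 0.0])
sample_2 = ([0.0, 1.0, 0.0], [0.0, 2.0])

new_spectrum, new_labels = mix(sample_1, sample_2, 0.5, 0.5)
print(new_spectrum)  # [0.5, 1.5, 1.5]
print(new_labels)    # [0.5, 1.0]
```

How the weights are drawn determines the correlation structure of the generated labels; that choice is the substance of the paper and is left open here.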
-
DEMETER: Efficient simultaneous curation of genome-scale reconstructions guided by experimental data and refined gene annotations
Authors:
Almut Heinken,
Stefanía Magnúsdóttir,
Ronan M. T. Fleming,
Ines Thiele
Abstract:
Motivation: Manual curation of genome-scale reconstructions is laborious, yet existing automated curation tools typically do not take species-specific experimental data and manually refined genome annotations into account. Results: We developed DEMETER, a COBRA Toolbox extension that enables the efficient simultaneous refinement of thousands of draft genome-scale reconstructions while ensuring adherence to the quality standards in the field, agreement with available experimental data, and refinement of pathways based on manually refined genome annotations. Availability: DEMETER and tutorials are available at https://github.com/opencobra/cobratoolbox.
Submitted 11 June, 2021;
originally announced June 2021.
-
Creation and analysis of biochemical constraint-based models: the COBRA Toolbox v3.0
Authors:
Laurent Heirendt,
Sylvain Arreckx,
Thomas Pfau,
Sebastián N. Mendoza,
Anne Richelle,
Almut Heinken,
Hulda S. Haraldsdóttir,
Jacek Wachowiak,
Sarah M. Keating,
Vanja Vlasov,
Stefania Magnusdóttir,
Chiam Yu Ng,
German Preciat,
Alise Žagare,
Siu H. J. Chan,
Maike K. Aurich,
Catherine M. Clancy,
Jennifer Modamio,
John T. Sauls,
Alberto Noronha,
Aarash Bordbar,
Benjamin Cousins,
Diana C. El Assal,
Luis V. Valcarcel,
Iñigo Apaolaza,
et al. (30 additional authors not shown)
Abstract:
COnstraint-Based Reconstruction and Analysis (COBRA) provides a molecular mechanistic framework for integrative analysis of experimental data and quantitative prediction of physicochemically and biochemically feasible phenotypic states. The COBRA Toolbox is a comprehensive software suite of interoperable COBRA methods. It has found widespread applications in biology, biomedicine, and biotechnology because its functions can be flexibly combined to implement tailored COBRA protocols for any biochemical network. Version 3.0 includes new methods for quality controlled reconstruction, modelling, topological analysis, strain and experimental design, network visualisation as well as network integration of chemoinformatic, metabolomic, transcriptomic, proteomic, and thermochemical data. New multi-lingual code integration also enables an expansion in COBRA application scope via high-precision, high-performance, and nonlinear numerical optimisation solvers for multi-scale, multi-cellular and reaction kinetic modelling, respectively. This protocol can be adapted for the generation and analysis of a constraint-based model in a wide variety of molecular systems biology scenarios. This protocol is an update to the COBRA Toolbox 1.0 and 2.0. The COBRA Toolbox 3.0 provides an unparalleled depth of constraint-based reconstruction and analysis methods.
Submitted 23 February, 2018; v1 submitted 11 October, 2017;
originally announced October 2017.
-
From metagenomic data to personalized computational microbiotas: Predicting dietary supplements for Crohn's disease
Authors:
Eugen Bauer,
Ines Thiele
Abstract:
Crohn's disease (CD) is associated with an ecological imbalance of the intestinal microbiota, consisting of hundreds of species. The underlying complexity as well as individual differences between patients contribute to the difficulty of defining a standardized treatment. Computational modeling can systematically investigate metabolic interactions between gut microbes to unravel novel mechanistic insights. In this study, we integrated metagenomic data of CD patients and healthy controls with genome-scale metabolic models into personalized in silico microbiotas. We predicted short chain fatty acid (SCFA) levels for patients and controls, which were overall congruent with experimental findings. As an emergent property, low concentrations of SCFA were predicted for CD patients and the SCFA signatures were unique to each patient. Consequently, we suggest personalized dietary treatments that could improve each patient's SCFA levels. The underlying modeling approach could aid clinical practice to find novel dietary treatment and guide recovery by rationally proposing food aliments.
Submitted 18 September, 2017;
originally announced September 2017.
-
Comparative genomic analysis of the human gut microbiome reveals a broad distribution of metabolic pathways for the degradation of host-synthetized mucin glycans
Authors:
Dmitry A. Ravcheev,
Ines Thiele
Abstract:
The colonic mucus layer is a dynamic and complex structure formed by secreted and transmembrane mucins, which are high-molecular-weight and heavily glycosylated proteins. Colonic mucus consists of a loose outer layer and a dense epithelium-attached layer. The outer layer is inhabited by various representatives of the human gut microbiota (HGM). Glycans of the colonic mucus can be used by the HGM as a source of carbon and energy when dietary fibers are not sufficiently available. Here, we analyzed 397 individual HGM genomes to identify pathways for the cleavage of host-synthetized mucin glycans to monosaccharides as well as for the catabolism of the derived monosaccharides. Our key results are as follows: (i) Genes for the cleavage of mucin glycans were found in 86% of the analyzed genomes, whereas genes for the catabolism of derived monosaccharides were found in 89% of the analyzed genomes. (ii) Comparative genomic analysis identified four alternative forms of the monosaccharide-catabolizing enzymes and four alternative forms of monosaccharide transporters. (iii) Eighty-five percent of the analyzed genomes may be involved in exchange pathways for the monosaccharides derived from cleaved mucin glycans. (iv) The analyzed genomes demonstrated different abilities to degrade known mucin glycans. Generally, the ability to degrade at least one type of mucin glycan was predicted for 81% of the analyzed genomes. (v) Eighty-two percent of the analyzed genomes can form mutualistic pairs that are able to degrade mucin glycans that are not degradable by either of the paired organisms alone. Taken together, these findings provide further insight into the inter-microbial communications of the HGM as well as into host-HGM interactions.
Submitted 30 March, 2017;
originally announced March 2017.
-
DistributedFBA.jl: High-level, high-performance flux balance analysis in Julia
Authors:
Laurent Heirendt,
Ronan M. T. Fleming,
Ines Thiele
Abstract:
Motivation:
Flux balance analysis, and its variants, are widely used methods for predicting steady-state reaction rates in biochemical reaction networks. The exploration of high dimensional networks with such methods is currently hampered by software performance limitations.
Results:
DistributedFBA.jl is a high-level, high-performance, open-source implementation of flux balance analysis in Julia. It is tailored to solve multiple flux balance analyses on a subset or all the reactions of large and huge-scale networks, on any number of threads or nodes.
Availability:
The code and benchmark data are freely available on http://github.com/opencobra/COBRA.jl. The documentation can be found at http://opencobra.github.io/COBRA.jl
Submitted 15 November, 2016;
originally announced November 2016.
-
MetaboTools: A comprehensive toolbox for analysis of genome-scale metabolic models
Authors:
Maike K. Aurich,
Ronan M. T. Fleming,
Ines Thiele
Abstract:
Metabolomic data sets provide a direct read-out of cellular phenotypes and are increasingly generated to study biological questions. Our previous work revealed the potential of analyzing extracellular metabolomic data in the context of the metabolic model using constraint-based modeling. Through this work, which consists of a protocol, a toolbox, and tutorials of two use cases, we make our methods available to the broader scientific community. The protocol describes, in a step-wise manner, the workflow of data integration and computational analysis. The MetaboTools comprise the Matlab code required to complete the workflow described in the protocol. Tutorials explain the computational steps for integration of two different data sets and demonstrate a comprehensive set of methods for the computational analysis of metabolic models and stratification thereof into different phenotypes. The presented workflow supports integrative analysis of multiple omics data sets. Importantly, all analysis tools can be applied to metabolic models without performing the entire workflow. Taken together, this protocol constitutes a comprehensive guide to the intra-model analysis of extracellular metabolomic data and a resource offering a broad set of computational analysis tools for a wide biomedical and non-biomedical research community.
Submitted 9 June, 2016;
originally announced June 2016.
-
Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression
Authors:
Ding Ma,
Laurence Yang,
Ronan M. T. Fleming,
Ines Thiele,
Bernhard O. Palsson,
Michael A. Saunders
Abstract:
Constraint-Based Reconstruction and Analysis (COBRA) is currently the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome-scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many orders of magnitude. Standard double-precision solvers may return inaccurate solutions or report that no solution exists. Exact simplex solvers are extremely slow and hence not practical for ME models that currently have 70,000 constraints and variables and will grow larger. We have developed a quadruple-precision version of our linear and nonlinear optimizer MINOS, and a solution procedure (DQQ) involving Double and Quad MINOS that achieves efficiency and reliability for ME models. DQQ enables extensive use of large, multiscale, linear and nonlinear models in systems biology and many other applications.
Submitted 26 September, 2016; v1 submitted 31 May, 2016;
originally announced June 2016.
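The precision problem motivating the DQQ abstract above can be shown with a tiny sum. This is only an illustration of why values spread over many orders of magnitude strain double precision; it is not the DQQ procedure and does not use MINOS. The flux values are made up, and Python's `decimal` module with 34 significant digits is used as a rough stand-in for quadruple precision.

```python
from decimal import Decimal, getcontext

# Made-up "fluxes" spanning 16 orders of magnitude; the exact sum is 1.
fluxes = [1e16, 1.0, -1e16]

# Double precision: the small term is lost when added to the large one.
double_total = 0.0
for value in fluxes:
    double_total += value

# About 34 significant decimal digits, comparable to quadruple precision.
getcontext().prec = 34
wide_total = Decimal(0)
for value in fluxes:
    wide_total += Decimal(value)

print(double_total)  # 0.0
print(wide_total)    # 1
```

In a linear program the same loss of small terms can make a solver misjudge feasibility or optimality, which is consistent with the abstract's remark that standard double-precision solvers may return inaccurate solutions.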
-
ReconMap: An interactive visualisation of human metabolism
Authors:
Alberto Noronha,
Anna Dröfn Danielsdóttir,
Freyr Jóhannsson,
Soffia Jónsdóttir,
Sindri Jarlsson,
Jón Pétur Gunnarsson,
Sigurður Brynjólfsson,
Piotr Gawron,
Reinhard Schneider,
Ines Thiele,
Ronan M. T. Fleming
Abstract:
A genome-scale reconstruction of human metabolism, Recon 2, is available but no interface exists to interactively visualise its content integrated with omics data and simulation results. We manually drew a comprehensive map, ReconMap 2.0, that is consistent with the content of Recon 2. We present it within a web interface that allows content query, visualization of custom datasets and submission of feedback to manual curators. ReconMap can be accessed via http://vmh.uni.lu, with network export in a Systems Biology Graphical Notation compliant format. A Constraint-Based Reconstruction and Analysis (COBRA) Toolbox extension to interact with ReconMap is available via https://github.com/opencobra/cobratoolbox.
Submitted 31 May, 2016;
originally announced June 2016.
-
Conditions for duality between fluxes and concentrations in biochemical networks
Authors:
Ronan M. T. Fleming,
Nikos Vlassis,
Ines Thiele,
Michael A. Saunders
Abstract:
Mathematical and computational modelling of biochemical networks is often done in terms of either the concentrations of molecular species or the fluxes of biochemical reactions. When is mathematical modelling from either perspective equivalent to the other? Mathematical duality translates concepts, theorems or mathematical structures into other concepts, theorems or structures, in a one-to-one manner. We present a novel stoichiometric condition that is necessary and sufficient for duality between unidirectional fluxes and concentrations. Our numerical experiments, with computational models derived from a range of genome-scale biochemical networks, suggest that this flux-concentration duality is a pervasive property of biochemical networks. We also provide a combinatorial characterisation that is sufficient to ensure flux-concentration duality. That is, for every two disjoint sets of molecular species, there is at least one reaction complex that involves species from only one of the two sets. When unidirectional fluxes and molecular species concentrations are dual vectors, this implies that the behaviour of the corresponding biochemical network can be described entirely in terms of either concentrations or unidirectional fluxes.
Submitted 8 December, 2015;
originally announced December 2015.
-
Mass conserved elementary kinetics is sufficient for the existence of a non-equilibrium steady state concentration
Authors:
Ronan M. T. Fleming,
Ines Thiele
Abstract:
Living systems are forced away from thermodynamic equilibrium by exchange of mass and energy with their environment. In order to model a biochemical reaction network in a non-equilibrium state, one requires a mathematical formulation to mimic this forcing. We provide a general formulation to force an arbitrarily large kinetic model in a manner that is still consistent with the existence of a non-equilibrium steady state. We can guarantee the existence of a non-equilibrium steady state assuming only two conditions: that every reaction is mass balanced and that continuous kinetic reaction rate laws never lead to a negative molecule concentration. These conditions can be verified in polynomial time and are flexible enough to permit one to force a system away from equilibrium. In an expository biochemical example we show how a reversible, mass balanced perpetual reaction, with thermodynamically infeasible kinetic parameters, can be used to perpetually force a kinetic model of anaerobic glycolysis in a manner consistent with the existence of a steady state. Easily testable existence conditions are foundational for efforts to reliably compute non-equilibrium steady states in genome-scale biochemical kinetic models.
Submitted 17 February, 2012; v1 submitted 21 September, 2011;
originally announced September 2011.