-
From FAIR to CURE: Guidelines for Computational Models of Biological Systems
Authors:
Herbert M. Sauro,
Eran Agmon,
Michael L. Blinov,
John H. Gennari,
Joe Hellerstein,
Adel Heydarabadipour,
Peter Hunter,
Bartholomew E. Jardine,
Elebeoba May,
David P. Nickerson,
Lucian P. Smith,
Gary D Bader,
Frank Bergmann,
Patrick M. Boyle,
Andreas Drager,
James R. Faeder,
Song Feng,
Juliana Freire,
Fabian Frohlich,
James A. Glazier,
Thomas E. Gorochowski,
Tomas Helikar,
Stefan Hoops,
Princess Imoukhuede,
Sarah M. Keating, et al. (26 additional authors not shown)
Abstract:
Guidelines for managing scientific data have been established under the FAIR principles requiring that data be Findable, Accessible, Interoperable, and Reusable. In many scientific disciplines, especially computational biology, both data and models are key to progress. For this reason, and recognizing that such models are a very special type of 'data', we argue that computational models, especially mechanistic models prevalent in medicine, physiology and systems biology, deserve a complementary set of guidelines. We propose the CURE principles, emphasizing that models should be Credible, Understandable, Reproducible, and Extensible. We delve into each principle, discussing verification, validation, and uncertainty quantification for model credibility; the clarity of model descriptions and annotations for understandability; adherence to standards and open science practices for reproducibility; and the use of open standards and modular code for extensibility and reuse. We outline recommended and baseline requirements for each aspect of CURE, aiming to enhance the impact and trustworthiness of computational models, particularly in biomedical applications where credibility is paramount. Our perspective underscores the need for a more disciplined approach to modeling, aligning with emerging trends such as Digital Twins and emphasizing the importance of data and modeling standards for interoperability and reuse. Finally, we emphasize that, given the non-trivial effort required to implement the guidelines, the community should move to automate as many of them as possible.
Submitted 21 February, 2025;
originally announced February 2025.
-
BioSimulators: a central registry of simulation engines and services for recommending specific tools
Authors:
Bilal Shaikh,
Lucian P. Smith,
Dan Vasilescu,
Gnaneswara Marupilla,
Michael Wilson,
Eran Agmon,
Henry Agnew,
Steven S. Andrews,
Azraf Anwar,
Moritz E. Beber,
Frank T. Bergmann,
David Brooks,
Lutz Brusch,
Laurence Calzone,
Kiri Choi,
Joshua Cooper,
John Detloff,
Brian Drawert,
Michel Dumontier,
G. Bard Ermentrout,
James R. Faeder,
Andrew P. Freiburger,
Fabian Fröhlich,
Akira Funahashi,
Alan Garny, et al. (46 additional authors not shown)
Abstract:
Computational models have great potential to accelerate bioscience, bioengineering, and medicine. However, it remains challenging to reproduce and reuse simulations, in part, because the numerous formats and methods for simulating various subsystems and scales remain siloed by different software tools. For example, each tool must be executed through a distinct interface. To help investigators find and use simulation tools, we developed BioSimulators (https://biosimulators.org), a central registry of the capabilities of simulation tools and consistent Python, command-line, and containerized interfaces to each version of each tool. The foundation of BioSimulators is standards, such as CellML, SBML, SED-ML, and the COMBINE archive format, and validation tools for simulation projects and simulation tools that ensure these standards are used consistently. To help modelers find tools for particular projects, we have also used the registry to develop recommendation services. We anticipate that BioSimulators will help modelers exchange, reproduce, and combine simulations.
Submitted 13 March, 2022;
originally announced March 2022.
-
SED-ML Validator: tool for debugging simulation experiments
Authors:
Bilal Shaikh,
Andrew Philip Freiburger,
Matthias König,
Frank T. Bergmann,
David P. Nickerson,
Herbert M. Sauro,
Michael L. Blinov,
Lucian P. Smith,
Ion I. Moraru,
Jonathan R. Karr
Abstract:
Summary: More sophisticated models are needed to address problems in bioscience, synthetic biology, and precision medicine. To help facilitate the collaboration needed for such models, the community developed the Simulation Experiment Description Markup Language (SED-ML), a common format for describing simulations. However, the utility of SED-ML has been hampered by limited support for SED-ML among modeling software tools and by different interpretations of SED-ML among the tools that support the format. To help modelers debug their simulations and to push the community to use SED-ML consistently, we developed a tool for validating SED-ML files. We have used the validator to correct the official SED-ML example files. We plan to use the validator to correct the files in the BioModels database so that they can be simulated. We anticipate that the validator will be a valuable tool for developing more predictive simulations and that the validator will help increase the adoption and interoperability of SED-ML.
Availability: The validator is freely available as a webform, HTTP API, command-line program, and Python package at https://run.biosimulations.org/utils/validate and https://pypi.org/project/biosimulators-utils. The validator is also embedded into interfaces to 11 simulation tools. The source code is openly available as described in the Supplementary data.
Contact: [email protected]
Submitted 1 June, 2021;
originally announced June 2021.
-
Practical Resources for Enhancing the Reproducibility of Mechanistic Modeling in Systems Biology
Authors:
Michael L. Blinov,
John H. Gennari,
Jonathan R. Karr,
Ion I. Moraru,
David P. Nickerson,
Herbert M. Sauro
Abstract:
Although reproducibility is a core tenet of the scientific method, it remains challenging to reproduce many results. Surprisingly, this also holds true for computational results in domains such as systems biology where there have been extensive standardization efforts. For example, Tiwari et al. recently found that they could only repeat 50% of published simulation results in systems biology. Toward improving the reproducibility of computational systems research, we identified several resources that investigators can leverage to make their research more accessible, executable, and comprehensible by others. In particular, we identified several domain standards and curation services, as well as powerful approaches pioneered by the software engineering industry that we believe many investigators could adopt. Together, we believe these approaches could substantially enhance the reproducibility of systems biology research. In turn, we believe enhanced reproducibility would accelerate the development of more sophisticated models that could inform precision medicine and synthetic biology.
Submitted 9 April, 2021;
originally announced April 2021.
-
Combinatorial complexity and dynamical restriction of network flows in signal transduction
Authors:
James R. Faeder,
Michael L. Blinov,
Byron Goldstein,
William S. Hlavacek
Abstract:
The activities and interactions of proteins that govern the cellular response to a signal generate a multitude of protein phosphorylation states and heterogeneous protein complexes. Here, using a computational model that accounts for 307 molecular species implied by specified interactions of four proteins involved in signalling by the immunoreceptor FcεRI, we determine the relative importance of molecular species that can be generated during signalling, chemical transitions among these species, and reaction paths that lead to activation of the protein tyrosine kinase (PTK) Syk. By all of these measures and over 2- and 10-fold ranges of model parameters--rate constants and initial concentrations--only a small portion of the biochemical network is active. The spectrum of active complexes, however, can be shifted dramatically, even by a change in the concentration of a single protein, which suggests that the network can produce qualitatively different responses under different cellular conditions and in response to different inputs. Reduced models that reproduce predictions of the full model for a particular set of parameters lose their predictive capacity when parameters are varied over 2-fold ranges.
Submitted 5 November, 2004;
originally announced November 2004.