-
Implementation of a practical Markov chain Monte Carlo sampling algorithm in PyBioNetFit
Authors:
Jacob Neumann,
Yen Ting Lin,
Abhishek Mallela,
Ely F. Miller,
Joshua Colvin,
Abell T. Duprat,
Ye Chen,
William S. Hlavacek,
Richard G. Posner
Abstract:
Bayesian inference in biological modeling commonly relies on Markov chain Monte Carlo (MCMC) sampling of a multidimensional and non-Gaussian posterior distribution that is not analytically tractable. Here, we present the implementation of a practical MCMC method in the open-source software package PyBioNetFit (PyBNF), which is designed to support parameterization of mathematical models for biological systems. The new MCMC method, am, incorporates an adaptive move proposal distribution. For warm starts, sampling can be initiated at a specified location in parameter space and with a multivariate Gaussian proposal distribution defined initially by a specified covariance matrix. Multiple chains can be generated in parallel using a computer cluster. We demonstrate that am can be used to successfully solve real-world Bayesian inference problems, including forecasting of new Coronavirus Disease 2019 case detection with Bayesian quantification of forecast uncertainty. PyBNF version 1.1.9, the first stable release with am, is available at PyPI and can be installed using the pip package-management system on platforms that have a working installation of Python 3. PyBNF relies on libRoadRunner and BioNetGen for simulations (e.g., numerical integration of ordinary differential equations defined in SBML or BNGL files) and Dask.Distributed for task scheduling on Linux computer clusters.
Submitted 29 September, 2021;
originally announced September 2021.
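The adaptive proposal idea described in this abstract can be illustrated with a minimal, generic sketch of an adaptive Metropolis sampler. This is not PyBNF's am implementation. The function names (`adaptive_metropolis`, `cholesky`), the adaptation schedule (`adapt_start`, `adapt_every`), and the 2.38^2/d scaling are assumptions chosen for illustration. The sampler starts at a given point `x0` with a given proposal covariance `cov0` (a warm start) and later replaces the proposal covariance with a scaled empirical covariance of the chain so far.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def adaptive_metropolis(log_post, x0, cov0, n_steps,
                        adapt_start=200, adapt_every=50, eps=1e-8, seed=0):
    """Illustrative adaptive Metropolis sampler (hypothetical helper).

    log_post: function returning the unnormalized log posterior at a point.
    x0, cov0: starting point and initial proposal covariance (warm start).
    Returns (list of samples, acceptance rate).
    """
    rng = random.Random(seed)
    d = len(x0)
    scale = 2.38 ** 2 / d          # commonly used scaling heuristic (assumption)
    x, lp = list(x0), log_post(x0)
    mean = [0.0] * d               # running mean of the chain
    m2 = [[0.0] * d for _ in range(d)]  # running sum of outer products
    chol = cholesky(cov0)
    samples, accepted = [], 0
    for t in range(1, n_steps + 1):
        # Propose x + L z, where L L^T is the current proposal covariance.
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        prop = [x[i] + sum(chol[i][k] * z[k] for k in range(i + 1))
                for i in range(d)]
        lp_prop = log_post(prop)
        # Metropolis acceptance test for a symmetric proposal.
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
            accepted += 1
        samples.append(list(x))
        # Welford-style update of the running mean and covariance sums.
        delta = [x[i] - mean[i] for i in range(d)]
        mean = [mean[i] + delta[i] / t for i in range(d)]
        for i in range(d):
            for j in range(d):
                m2[i][j] += delta[i] * (x[j] - mean[j])
        # Periodically adapt the proposal to the empirical covariance.
        if t >= adapt_start and t % adapt_every == 0:
            cov = [[scale * m2[i][j] / (t - 1) + (scale * eps if i == j else 0.0)
                    for j in range(d)] for i in range(d)]
            chol = cholesky(cov)
    return samples, accepted / n_steps
```

As a usage example, `adaptive_metropolis(lambda x: -0.5 * (x[0] ** 2 + x[1] ** 2), [0.5, -0.5], [[0.1, 0.0], [0.0, 0.1]], 5000)` samples a standard bivariate Gaussian. Independent chains could be produced by calling the function with different `seed` values.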
-
Daily Forecasting of New Cases for Regional Epidemics of Coronavirus Disease 2019 with Bayesian Uncertainty Quantification
Authors:
Yen Ting Lin,
Jacob Neumann,
Ely Miller,
Richard G. Posner,
Abhishek Mallela,
Cosmin Safta,
Jaideep Ray,
Gautam Thakur,
Supriya Chinthavali,
William S. Hlavacek
Abstract:
To increase situational awareness and support evidence-based policy-making, we formulated two types of mathematical models for COVID-19 transmission within a regional population. One is a fitting function that can be calibrated to reproduce an epidemic curve with two timescales (e.g., fast growth and slow decay). The other is a compartmental model that accounts for quarantine, self-isolation, social distancing, a non-exponentially distributed incubation period, asymptomatic individuals, and mild and severe forms of symptomatic disease. Using Bayesian inference, we have been calibrating our models daily for consistency with new reports of confirmed cases from the 15 most populous metropolitan statistical areas in the United States and quantifying uncertainty in parameter estimates and predictions of future case reports. This online learning approach allows for early identification of new trends despite considerable variability in case reporting. We infer new significant upward trends for five of the metropolitan areas starting between 19-April-2020 and 12-June-2020.
Submitted 20 July, 2020;
originally announced July 2020.
-
PyBioNetFit and the Biological Property Specification Language
Authors:
Eshan D. Mitra,
Ryan Suderman,
Joshua Colvin,
Alexander Ionkov,
Andrew Hu,
Herbert M. Sauro,
Richard G. Posner,
William S. Hlavacek
Abstract:
In systems biology modeling, important steps include model parameterization, uncertainty quantification, and evaluation of agreement with experimental observations. To help modelers perform these steps, we developed the software PyBioNetFit. PyBioNetFit is designed for parameterization, and also supports uncertainty quantification, checking models against known system properties, and solving design problems. PyBioNetFit introduces the Biological Property Specification Language (BPSL) for the formal declaration of system properties. BPSL allows qualitative data to be used alone or in combination with quantitative data for parameterization, model checking, and design. PyBioNetFit performs parameterization with parallelized metaheuristic optimization algorithms (differential evolution, particle swarm optimization, scatter search) that work directly with existing model definition standards: BioNetGen Language (BNGL) and Systems Biology Markup Language (SBML). We demonstrate PyBioNetFit's capabilities by solving 31 example problems, including the challenging problem of parameterizing a model of cell cycle control in yeast. We benchmark PyBioNetFit's parallelization efficiency on computer clusters, using up to 288 cores. Finally, we demonstrate the model checking and design applications of PyBioNetFit and BPSL by analyzing a model of therapeutic interventions in autophagy signaling.
Submitted 18 March, 2019;
originally announced March 2019.
-
A Step-by-Step Guide to Using BioNetFit
Authors:
William S. Hlavacek,
Jennifer Longo,
Lewis R. Baker,
María del Carmen Ramos Álamo,
Alexander Ionkov,
Eshan D. Mitra,
Ryan Suderman,
Keesha E. Erickson,
Raquel Dias,
Joshua Colvin,
Brandon R. Thomas,
Richard G. Posner
Abstract:
BioNetFit is a software tool designed for solving parameter identification problems that arise in the development of rule-based models. It solves these problems through curve fitting (i.e., nonlinear regression). BioNetFit is compatible with deterministic and stochastic simulators that accept BioNetGen language (BNGL)-formatted files as inputs, such as those available within the BioNetGen framework. BioNetFit can be used on a laptop or standalone multicore workstation as well as on many Linux clusters, such as those that use the Slurm Workload Manager to schedule jobs. BioNetFit implements a metaheuristic population-based global optimization procedure, an evolutionary algorithm (EA), to minimize a user-defined objective function, such as a residual sum of squares (RSS) function. BioNetFit also implements a bootstrapping procedure for determining confidence intervals for parameter estimates. Here, we provide step-by-step instructions for using BioNetFit to estimate the values of parameters of a BNGL-encoded model and to define bootstrap confidence intervals. The process entails the use of several plain-text files, which are processed by BioNetFit and BioNetGen. In general, these files include 1) one or more EXP files, each of which contains (experimental) data to be used in parameter identification/bootstrapping; 2) a BNGL file containing a model section, which defines a (rule-based) model, and an actions section, which defines simulation protocols that generate GDAT and/or SCAN files with model predictions corresponding to the data in the EXP file(s); and 3) a CONF file that configures the fitting/bootstrapping job and that defines algorithmic parameter settings.
Submitted 21 September, 2018;
originally announced September 2018.
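The two ideas named in this abstract, an RSS objective and bootstrap confidence intervals, can be sketched generically as follows. This is not BioNetFit's code or file format. The one-parameter model y = k*x, the closed-form fit standing in for the evolutionary algorithm, and the helper names (`rss`, `fit_slope`, `bootstrap_ci`) are all assumptions made for illustration. The sketch uses a percentile bootstrap: data points are resampled with replacement, the model is refit to each resample, and the interval is read from the sorted estimates.

```python
import random


def rss(k, xs, ys):
    """Residual sum of squares for the illustrative model y = k * x."""
    return sum((y - k * x) ** 2 for x, y in zip(xs, ys))


def fit_slope(xs, ys):
    """Closed-form least-squares estimate of k (minimizes rss)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def bootstrap_ci(xs, ys, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for k."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        # Resample data points with replacement and refit.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(fit_slope([xs[i] for i in idx], [ys[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic data (fabricated for the example): y = 2x with small alternating noise.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x + (0.1 if i % 2 == 0 else -0.1) for i, x in enumerate(xs)]
k_hat = fit_slope(xs, ys)
ci_low, ci_high = bootstrap_ci(xs, ys)
```

For a model without a closed-form fit, `fit_slope` would be replaced by a numerical optimizer that minimizes `rss`, and the bootstrap loop would stay the same.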