-
A unified approach to reverse engineering and data selection for unique network identification
Authors:
Alan Veliz-Cuba,
Vanessa Newsome-Slade,
Elena S. Dimitrova
Abstract:
Due to cost concerns, it is optimal to gain insight into the connectivity of biological and other networks using as few experiments as possible. Data selection for unique network connectivity identification has been an open problem since algebraic methods for reverse engineering were introduced almost two decades ago. In this manuscript we determine what data sets uniquely identify the unsigned wiring diagram corresponding to a system that is discrete in time and space. Furthermore, we answer the question of uniqueness for signed wiring diagrams for Boolean networks. Computationally, unsigned and signed wiring diagrams have been studied separately, and in this manuscript we also show that there exists an ideal capable of encoding both unsigned and signed information. This provides a unified approach to studying reverse engineering that also gives significant computational benefits.
Submitted 12 December, 2022; v1 submitted 10 December, 2022;
originally announced December 2022.
-
Estimating the Long-term Behavior of Biologically Inspired Agent-based Models
Authors:
Daniel A. Cruz,
Jack Toppen,
Eunbi Park,
Melissa L. Kemp,
Elena S. Dimitrova
Abstract:
An agent-based model (ABM) is a computational model in which the local interactions of autonomous agents with each other and with their environment give rise to global properties within a given domain. As the detail and complexity of these models have grown, so too has the computational expense of running several simulations to perform sensitivity analysis and evaluate long-term model behavior. Here, we generalize a framework for mathematically formalizing ABMs to explicitly incorporate features commonly found in biological systems: appearance of agents (birth), removal of agents (death), and locally dependent state changes. We then use our broader framework to extend an approach for estimating long-term behavior without simulations, specifically changes in population densities over time. The approach is probabilistic and relies on treating the discrete, incremental update of an ABM via "time steps" as a Markov process to generate expected values for agents at each time step. As case studies, we apply our extensions to both a simple ABM based on the Game of Life and a published ABM of rib development in vertebrates.
Submitted 30 November, 2022; v1 submitted 1 November, 2022;
originally announced November 2022.
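To make the idea of estimating population density without simulation concrete, here is a minimal sketch for the Game of Life case study mentioned in the abstract above. It is not the authors' framework: it is the standard mean-field approximation, which assumes (as a simplification) that every cell is alive independently with the same probability rho, so that the number of live neighbors is Binomial(8, rho). The function names `next_density` and `density_trajectory` are hypothetical.

```python
from math import comb


def next_density(rho):
    """One mean-field update of the live-cell density in Conway's Game of Life.

    Assumption (not from the paper): cells are independent and each is
    alive with probability rho, so live-neighbor counts are Binomial(8, rho).
    """
    def p_neighbors(k):
        # Probability that exactly k of the 8 neighbors are alive.
        return comb(8, k) * rho**k * (1 - rho) ** (8 - k)

    survive = p_neighbors(2) + p_neighbors(3)  # live cell stays alive
    born = p_neighbors(3)                      # dead cell becomes alive
    return rho * survive + (1 - rho) * born


def density_trajectory(rho0, steps):
    """Iterate the mean-field update and return the list of densities."""
    trajectory = [rho0]
    for _ in range(steps):
        trajectory.append(next_density(trajectory[-1]))
    return trajectory


if __name__ == "__main__":
    for t, rho in enumerate(density_trajectory(0.5, 5)):
        print(t, round(rho, 4))
```

Each step costs a handful of arithmetic operations regardless of grid size, which is the appeal of expected-value estimates over repeated simulation. The independence assumption ignores spatial correlations, so the numbers should be read as a rough estimate only.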
-
Algebraic Experimental Design: Theory and Computation
Authors:
Elena S. Dimitrova,
Cameron H. Fredrickson,
Nicholas A. Rondoni,
Brandilyn Stigler,
Alan Veliz-Cuba
Abstract:
Over the past several decades, algebraic geometry has provided innovative approaches to biological experimental design that resolved theoretical questions and improved computational efficiency. However, guaranteeing uniqueness and perfect recovery of models are still open problems. In this work we study the problem of uniqueness of wiring diagrams. We use as a modeling framework polynomial dynamical systems and utilize the correspondence between simplicial complexes and square-free monomial ideals from Stanley-Reisner theory to develop theory and construct an algorithm for identifying input data sets $V\subset \mathbb F_p^n$ that are guaranteed to correspond to a unique minimal wiring diagram regardless of the experimental output. We apply the results to a tumor-suppression network mediated by epidermal derived growth factor receptor and demonstrate how careful experimental design decisions can lead to a unique minimal wiring diagram identification. One of the insights of the theoretical work is the connection between the uniqueness of a wiring diagram for a given $V\subset \mathbb F_p^n$ and the uniqueness of the reduced Gröbner basis of the polynomial ideal $I(V)\subset \mathbb F_p[x_1,\ldots, x_n]$. We discuss existing results and introduce a new necessary condition on the points in $V$ for uniqueness of the reduced Gröbner basis of $I(V)$. These results also point to the effect that the relative proximity of the experimental input points has on the number of minimal wiring diagrams, which we then study computationally. We find that there is a concrete heuristic way to generate data that tends to result in fewer minimal wiring diagrams.
Submitted 4 August, 2022;
originally announced August 2022.
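As a small illustration of what "unique minimal wiring diagram" means in the abstract above, the sketch below enumerates minimal wiring diagrams for one fixed set of input-output data by brute force. It relies on the known characterization that a set of variables is consistent with the data exactly when, for every pair of inputs with different outputs, the set contains at least one coordinate where the two inputs differ (a hitting-set condition). This is not the Stanley-Reisner algorithm of the paper, it treats a single fixed output rather than all possible outputs, and the names `difference_sets` and `minimal_wiring_diagrams` are hypothetical.

```python
from itertools import combinations


def difference_sets(inputs, outputs):
    """Coordinates on which two inputs differ, for each pair with different outputs."""
    sets = []
    for (u, fu), (v, fv) in combinations(list(zip(inputs, outputs)), 2):
        if fu != fv:
            sets.append(frozenset(i for i, (a, b) in enumerate(zip(u, v)) if a != b))
    return sets


def minimal_wiring_diagrams(inputs, outputs):
    """Brute-force all minimal variable sets that hit every difference set."""
    n = len(inputs[0])
    diffs = difference_sets(inputs, outputs)
    if any(len(d) == 0 for d in diffs):
        raise ValueError("inconsistent data: same input with different outputs")
    minimal = []
    # Enumerate by increasing size so supersets of found sets can be skipped.
    for size in range(n + 1):
        for candidate in combinations(range(n), size):
            s = set(candidate)
            if any(m <= s for m in minimal):
                continue  # contains a smaller consistent set, so not minimal
            if all(s & d for d in diffs):
                minimal.append(frozenset(s))
    return minimal


if __name__ == "__main__":
    inputs = [(0, 0, 0), (1, 0, 0), (0, 1, 1)]
    outputs = [0, 1, 1]
    print(minimal_wiring_diagrams(inputs, outputs))
    # One extra experiment can remove the ambiguity.
    print(minimal_wiring_diagrams(inputs + [(0, 1, 0)], outputs + [0]))
```

In this toy data set over F_2^3 (variables indexed from 0), the first call finds two minimal wiring diagrams and the second, with one added data point, finds one. The search is exponential in the number of variables and is only meant for small examples.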
-
Algebraic Model Selection and Experimental Design in Biological Data Science
Authors:
Anyu Zhang,
Jingzhen Hu,
Qingzhong Liang,
Elena S. Dimitrova,
Brandilyn Stigler
Abstract:
Design of experiments and model selection, though essential steps in data science, are usually viewed as unrelated processes in the study and analysis of biological networks. Not accounting for their inter-relatedness has the potential to introduce bias and increase the risk of missing salient features in the modeling process. We propose a data-driven computational framework to unify experimental design and model selection for discrete data sets and minimal polynomial models. We use a special affine transformation, called a linear shift, to provide both the data sets and the polynomial terms that form a basis for a model. This framework enables us to address two important questions that arise in biological data science research: finding the data which identify a set of known interactions and finding identifiable interactions given a set of data. We present the theoretical foundation for a web-accessible database. As an example, we apply this methodology to a previously constructed pharmacodynamic model of epidermal derived growth factor receptor (EGFR) signaling.
Submitted 22 January, 2021;
originally announced January 2021.
-
Geometric characterization of data sets with unique reduced Gröbner bases
Authors:
Elena S. Dimitrova,
Qijun He,
Brandilyn Stigler,
Anyu Zhang
Abstract:
Model selection based on experimental data is an important challenge in biological data science. Particularly when collecting data is expensive or time-consuming, as is often the case with clinical trial and biomolecular experiments, the problem of selecting information-rich data becomes crucial for creating relevant models. We identify geometric properties of input data that result in a unique algebraic model and we show that if the data form a staircase, or a so-called linear shift of a staircase, the ideal of the points has a unique reduced Gröbner basis and thus corresponds to a unique model. We use linear shifts to partition data into equivalence classes with the same basis. We demonstrate the utility of the results by applying them to a Boolean model of the well-studied lac operon in E. coli.
Submitted 2 June, 2020; v1 submitted 2 November, 2018;
originally announced November 2018.
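The staircase condition in the abstract above can be checked directly on a small data set. The sketch below assumes the usual meaning of a staircase: a set of points with non-negative integer coordinates that is closed under componentwise decrease (if a point is in the set, so is every point obtained by lowering any coordinate). That reading, and the function name `is_staircase`, are assumptions here. Linear shifts of staircases are not handled.

```python
def is_staircase(points):
    """Return True if the point set is closed under componentwise decrease.

    Assumption: coordinates are non-negative integers (for example, elements
    of F_p written as 0..p-1). It suffices to check that lowering any single
    positive coordinate by one stays inside the set.
    """
    pts = {tuple(p) for p in points}
    for p in pts:
        for i, c in enumerate(p):
            if c > 0:
                below = p[:i] + (c - 1,) + p[i + 1:]
                if below not in pts:
                    return False
    return True


if __name__ == "__main__":
    print(is_staircase([(0, 0), (1, 0), (0, 1)]))  # closed under decrease
    print(is_staircase([(0, 0), (1, 1)]))          # (1, 0) and (0, 1) are missing
```

Checking only the immediate predecessors is enough because any smaller point can be reached by a chain of single-coordinate decrements.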
-
Molecular Network Control Through Boolean Canalization
Authors:
David Murrugarra,
Elena S. Dimitrova
Abstract:
Boolean networks are an important class of computational models for molecular interaction networks. Boolean canalization, a type of hierarchical clustering of the inputs of a Boolean function, has been extensively studied in the context of network modeling where each layer of canalization adds a degree of stability in the dynamics of the network. Recently, dynamic network control approaches have been used for the design of new therapeutic interventions and for other applications such as stem cell reprogramming. This work studies the role of canalization in the control of Boolean molecular networks. It provides a method for identifying the potential edges to control in the wiring diagram of a network for avoiding undesirable state transitions. The method is based on identifying appropriate input-output combinations on undesirable transitions that can be modified using the edges in the wiring diagram of the network. Moreover, a method for estimating the number of changed transitions in the state space of the system as a result of an edge deletion in the wiring diagram is presented. The control methods of this paper were applied to a mutated cell-cycle model and to a p53-mdm2 model to identify potential control targets.
Submitted 24 October, 2015; v1 submitted 12 August, 2015;
originally announced August 2015.
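For readers unfamiliar with canalization as used in the abstract above, the sketch below tests a Boolean function for a single layer of canalization: a variable x_i is canalizing if fixing it to some value a forces the output to a value b regardless of the other inputs. This only illustrates the definition and is not the control method of the paper. The function name `canalizing_variables` is hypothetical, and the check enumerates the full truth table, so it suits small functions only.

```python
from itertools import product


def canalizing_variables(f, n):
    """List all (i, a, b) such that x_i = a forces f(x) = b.

    f takes a tuple of n values in {0, 1} and returns 0 or 1.
    """
    found = []
    for i in range(n):
        for a in (0, 1):
            # Outputs seen over all inputs whose i-th coordinate equals a.
            outs = {f(x) for x in product((0, 1), repeat=n) if x[i] == a}
            if len(outs) == 1:
                found.append((i, a, outs.pop()))
    return found


if __name__ == "__main__":
    # x0 AND (x1 OR x2): setting x0 = 0 forces the output to 0.
    print(canalizing_variables(lambda x: x[0] & (x[1] | x[2]), 3))
    # XOR has no canalizing variable.
    print(canalizing_variables(lambda x: x[0] ^ x[1], 2))
```

Nested (layered) canalization would repeat the same test on the function obtained after fixing the canalizing variable to its non-canalizing value.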
-
Discretization of Time Series Data
Authors:
Elena S. Dimitrova,
John J. McGee,
Reinhard C. Laubenbacher
Abstract:
Data discretization, also known as binning, is a frequently used technique in computer science, statistics, and their applications to biological data analysis. We present a new method for the discretization of real-valued data into a finite number of discrete values. Novel aspects of the method are the incorporation of an information-theoretic criterion and a criterion to determine the optimal number of values. While the method can be used for data clustering, the motivation for its development is the need for a discretization algorithm for several multivariate time series of heterogeneous data, such as transcript, protein, and metabolite concentration measurements. As several modeling methods for biochemical networks employ discrete variable states, the method needs to preserve correlations between variables as well as the dynamic features of the time series. A C++ implementation of the algorithm is available from the authors at http://polymath.vbi.vt.edu/discretization.
Submitted 29 August, 2005; v1 submitted 15 May, 2005;
originally announced May 2005.