
A unified approach to reverse engineering and data selection for unique network identification

Authors: Alan Veliz-Cuba, Vanessa Newsome-Slade, Elena S. Dimitrova

Abstract: Due to cost concerns, it is optimal to gain insight into the connectivity of biological and other networks using as few experiments as possible. Data selection for unique network connectivity identification has been an open problem since the introduction of algebraic methods for reverse engineering almost two decades ago. In this manuscript we determine what data sets uniquely identify the unsigned wiring diagram corresponding to a system that is discrete in time and space. Furthermore, we answer the question of uniqueness for signed wiring diagrams for Boolean networks. Computationally, unsigned and signed wiring diagrams have been studied separately, and in this manuscript we also show that there exists an ideal capable of encoding both unsigned and signed information. This provides a unified approach to studying reverse engineering that also gives significant computational benefits.

Submitted 12 December, 2022; v1 submitted 10 December, 2022; originally announced December 2022.

Comments: 21 pages

arXiv:2211.00630

Estimating the Long-term Behavior of Biologically Inspired Agent-based Models

Authors: Daniel A. Cruz, Jack Toppen, Eunbi Park, Melissa L. Kemp, Elena S. Dimitrova

Abstract: An agent-based model (ABM) is a computational model in which the local interactions of autonomous agents with each other and with their environment give rise to global properties within a given domain. As the detail and complexity of these models have grown, so too has the computational expense of running several simulations to perform sensitivity analysis and evaluate long-term model behavior. Here, we generalize a framework for mathematically formalizing ABMs to explicitly incorporate features commonly found in biological systems: appearance of agents (birth), removal of agents (death), and locally dependent state changes. We then use our broader framework to extend an approach for estimating long-term behavior without simulations, specifically changes in population densities over time. The approach is probabilistic and relies on treating the discrete, incremental update of an ABM via "time steps" as a Markov process to generate expected values for agents at each time step. As case studies, we apply our extensions to both a simple ABM based on the Game of Life and a published ABM of rib development in vertebrates.

Submitted 30 November, 2022; v1 submitted 1 November, 2022; originally announced November 2022.

MSC Class: 03D20; 60J05; 68Q80; 68U20; 92B05; 92C15

arXiv:2208.02726

doi 10.1137/22M1513241

Algebraic Experimental Design: Theory and Computation

Authors: Elena S. Dimitrova, Cameron H. Fredrickson, Nicholas A. Rondoni, Brandilyn Stigler, Alan Veliz-Cuba

Abstract: Over the past several decades, algebraic geometry has provided innovative approaches to biological experimental design that resolved theoretical questions and improved computational efficiency. However, guaranteeing uniqueness and perfect recovery of models are still open problems. In this work we study the problem of uniqueness of wiring diagrams. We use as a modeling framework polynomial dynamical systems and utilize the correspondence between simplicial complexes and square-free monomial ideals from Stanley-Reisner theory to develop theory and construct an algorithm for identifying input data sets $V\subset \mathbb F_p^n$ that are guaranteed to correspond to a unique minimal wiring diagram regardless of the experimental output. We apply the results on a tumor-suppression network mediated by epidermal derived growth factor receptor and demonstrate how careful experimental design decisions can lead to a unique minimal wiring diagram identification. One of the insights of the theoretical work is the connection between the uniqueness of a wiring diagram for a given $V\subset \mathbb F_p^n$ and the uniqueness of the reduced Gröbner basis of the polynomial ideal $I(V)\subset \mathbb F_p[x_1,\ldots, x_n]$. We discuss existing results and introduce a new necessary condition on the points in $V$ for uniqueness of the reduced Gröbner basis of $I(V)$. These results also point to the importance of the relative proximity of the experimental input points on the number of minimal wiring diagrams, which we then study computationally. We find that there is a concrete heuristic way to generate data that tends to result in fewer minimal wiring diagrams.

Submitted 4 August, 2022; originally announced August 2022.

Comments: 16 pages, 2 figures (one in color but prints well in b&w), 2 tables

MSC Class: 11T06; 92C42; 37N25; 13F55

Journal ref: SIAM Journal on Applied Algebra and Geometry, 8 (2024), 284-301

arXiv:2101.09384

doi 10.1016/j.aam.2021.102282

Algebraic Model Selection and Experimental Design in Biological Data Science

Authors: Anyu Zhang, Jingzhen Hu, Qingzhong Liang, Elena S. Dimitrova, Brandilyn Stigler

Abstract: Design of experiments and model selection, though essential steps in data science, are usually viewed as unrelated processes in the study and analysis of biological networks. Not accounting for their inter-relatedness has the potential to introduce bias and increase the risk of missing salient features in the modeling process. We propose a data-driven computational framework to unify experimental design and model selection for discrete data sets and minimal polynomial models. We use a special affine transformation, called a linear shift, to provide both the data sets and the polynomial terms that form a basis for a model. This framework enables us to address two important questions that arise in biological data science research: finding the data which identify a set of known interactions and finding identifiable interactions given a set of data. We present the theoretical foundation for a web-accessible database. As an example, we apply this methodology to a previously constructed pharmacodynamic model of epidermal derived growth factor receptor (EGFR) signaling.

Submitted 22 January, 2021; originally announced January 2021.

Comments: 22 pages, 8 figures, 6 tables

MSC Class: 13P10; 14G15; 93B15; 93B20

Journal ref: Advances in Applied Mathematics, 133 (2022)

arXiv:1811.01114

doi 10.1007/s11538-019-00624-x

Geometric characterization of data sets with unique reduced Gröbner bases

Authors: Elena S. Dimitrova, Qijun He, Brandilyn Stigler, Anyu Zhang

Abstract: Model selection based on experimental data is an important challenge in biological data science. Particularly when collecting data is expensive or time consuming, as is often the case with clinical trials and biomolecular experiments, the problem of selecting information-rich data becomes crucial for creating relevant models. We identify geometric properties of input data that result in a unique algebraic model and we show that if the data form a staircase, or a so-called linear shift of a staircase, the ideal of the points has a unique reduced Gröbner basis and thus corresponds to a unique model. We use linear shifts to partition data into equivalence classes with the same basis. We demonstrate the utility of the results by applying them to a Boolean model of the well-studied lac operon in E. coli.

Submitted 2 June, 2020; v1 submitted 2 November, 2018; originally announced November 2018.

Journal ref: Bulletin of Mathematical Biology 81 (2019) 2691-2705

arXiv:1508.03026

doi 10.1186/s13637-015-0029-2

Molecular Network Control Through Boolean Canalization

Authors: David Murrugarra, Elena S. Dimitrova

Abstract: Boolean networks are an important class of computational models for molecular interaction networks. Boolean canalization, a type of hierarchical clustering of the inputs of a Boolean function, has been extensively studied in the context of network modeling where each layer of canalization adds a degree of stability in the dynamics of the network. Recently, dynamic network control approaches have been used for the design of new therapeutic interventions and for other applications such as stem cell reprogramming. This work studies the role of canalization in the control of Boolean molecular networks. It provides a method for identifying the potential edges to control in the wiring diagram of a network for avoiding undesirable state transitions. The method is based on identifying appropriate input-output combinations on undesirable transitions that can be modified using the edges in the wiring diagram of the network. Moreover, a method for estimating the number of changed transitions in the state space of the system as a result of an edge deletion in the wiring diagram is presented. The control methods of this paper were applied to a mutated cell-cycle model and to a p53-mdm2 model to identify potential control targets.

Submitted 24 October, 2015; v1 submitted 12 August, 2015; originally announced August 2015.

Journal ref: EURASIP Journal on Bioinformatics and Systems Biology, 2015:9, 2015

arXiv:q-bio/0505028

Discretization of Time Series Data

Authors: Elena S. Dimitrova, John J. McGee, Reinhard C. Laubenbacher

Abstract: Data discretization, also known as binning, is a frequently used technique in computer science, statistics, and their applications to biological data analysis. We present a new method for the discretization of real-valued data into a finite number of discrete values. Novel aspects of the method are the incorporation of an information-theoretic criterion and a criterion to determine the optimal number of values. While the method can be used for data clustering, the motivation for its development is the need for a discretization algorithm for several multivariate time series of heterogeneous data, such as transcript, protein, and metabolite concentration measurements. As several modeling methods for biochemical networks employ discrete variable states, the method needs to preserve correlations between variables as well as the dynamic features of the time series. A C++ implementation of the algorithm is available from the authors at http://polymath.vbi.vt.edu/discretization.

Submitted 29 August, 2005; v1 submitted 15 May, 2005; originally announced May 2005.

Comments: Number of pages: 8; number of figures: 5; number of tables: 1; software download at http://polymath.vbi.vt.edu/discretization

Showing 1–7 of 7 results for author: Dimitrova, E S