-
Experimental design for causal query estimation in partially observed biomolecular networks
Authors:
Sara Mohammad-Taheri,
Vartika Tewari,
Rohan Kapre,
Ehsan Rahiminasab,
Karen Sachs,
Charles Tapley Hoyt,
Jeremy Zucker,
Olga Vitek
Abstract:
Estimating a causal query from observational data is an essential task in the analysis of biomolecular networks. Estimation takes as input a network topology, a query estimation method, and observational measurements on the network variables. However, estimations involving many variables can be experimentally expensive and computationally intractable. Moreover, using the full set of variables can be detrimental, leading to bias or increasing the variance in the estimation. Therefore, designing an experiment based on a well-chosen subset of network components can increase estimation accuracy and reduce experimental and computational costs. We propose a simulation-based algorithm for selecting sub-networks that support unbiased estimators of the causal query under a cost constraint, ranked with respect to the variance of the estimators. The simulations are constructed based on historical experimental data, or based on known properties of the biological system. Three case studies demonstrated the effectiveness of well-chosen network subsets for estimating causal queries from observational data. All the case studies are reproducible and available at https://github.com/srtaheri/Simplified_LVM.
Submitted 28 November, 2022; v1 submitted 24 October, 2022;
originally announced October 2022.
-
Perspectives for self-driving labs in synthetic biology
Authors:
Hector Garcia Martin,
Tijana Radivojevic,
Jeremy Zucker,
Kristofer Bouchard,
Jess Sustarich,
Sean Peisert,
Dan Arnold,
Nathan Hillson,
Gyorgy Babnigg,
Jose Manuel Marti,
Christopher J. Mungall,
Gregg T. Beckham,
Lucas Waldburger,
James Carothers,
ShivShankar Sundaram,
Deb Agarwal,
Blake A. Simmons,
Tyler Backman,
Deepanwita Banerjee,
Deepti Tanjore,
Lavanya Ramakrishnan,
Anup Singh
Abstract:
Self-driving labs (SDLs) combine fully automated experiments with artificial intelligence (AI) that decides the next set of experiments. Taken to their ultimate expression, SDLs could usher in a new paradigm of scientific research, where the world is probed, interpreted, and explained by machines for human benefit. While there are functioning SDLs in the fields of chemistry and materials science, we contend that synthetic biology provides a unique opportunity since the genome provides a single target for affecting the incredibly wide repertoire of biological cell behavior. However, the level of investment required for the creation of biological SDLs is only warranted if directed towards solving difficult and enabling biological questions. Here, we discuss challenges and opportunities in creating SDLs for synthetic biology.
Submitted 1 November, 2022; v1 submitted 14 October, 2022;
originally announced October 2022.
-
Leveraging Structured Biological Knowledge for Counterfactual Inference: a Case Study of Viral Pathogenesis
Authors:
Jeremy Zucker,
Kaushal Paneri,
Sara Mohammad-Taheri,
Somya Bhargava,
Pallavi Kolambkar,
Craig Bakker,
Jeremy Teuton,
Charles Tapley Hoyt,
Kristie Oxford,
Robert Ness,
Olga Vitek
Abstract:
Counterfactual inference is a useful tool for comparing outcomes of interventions on complex systems. It requires us to represent the system in the form of a structural causal model, complete with a causal diagram, probabilistic assumptions on exogenous variables, and functional assignments. Specifying such models can be extremely difficult in practice. The process requires substantial domain expertise, and does not scale easily to large systems, multiple systems, or novel system modifications. At the same time, many application domains, such as molecular biology, are rich in structured causal knowledge that is qualitative in nature. This manuscript proposes a general approach for querying a causal biological knowledge graph, and converting the qualitative result into a quantitative structural causal model that can learn from data to answer the question. We demonstrate the feasibility, accuracy and versatility of this approach using two case studies in systems biology. The first demonstrates the appropriateness of the underlying assumptions and the accuracy of the results. The second demonstrates the versatility of the approach by querying a knowledge base for the molecular determinants of a severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-induced cytokine storm, and performing counterfactual inference to estimate the causal effect of medical countermeasures for severely ill patients.
Submitted 13 January, 2021;
originally announced January 2021.
-
The new science of metagenomics and the challenges of its use in both developed and developing countries
Authors:
Edi Prifti,
Jean-Daniel Zucker
Abstract:
Our view of the microbial world and its impact on human health is changing radically with the ability to sequence uncultured or unculturable microbes sampled directly from their habitats, an ability made possible by fast and cheap next-generation sequencing technologies. Such recent developments represent a paradigmatic shift in the analysis of habitat biodiversity, be it the human, soil or ocean microbiome. We review here some research examples and results that indicate the importance of the microbiome in our lives and then discuss some of the challenges faced by metagenomic experiments and the subsequent analysis of the generated data. We then analyze the economic and social impact on genomic medicine and research in both developing and developed countries. We support the idea that there are significant benefits in building capacities for developing high-level scientific research in metagenomics in developing countries. Indeed, the notion that developing countries should wait for developed countries to make advances in science and technology that they later import at great cost has recently been challenged.
Submitted 10 May, 2013;
originally announced May 2013.