-
Opportunities in deep learning methods development for computational biology
Authors:
Alex Jihun Lee,
Reza Abbasi-Asl
Abstract:
Advances in molecular technologies underlie an enormous growth in the size of data sets pertaining to biology and biomedicine. These advances parallel those in the deep learning subfield of machine learning. Components in the differentiable programming toolbox that makes deep learning possible are allowing computer scientists to address an increasingly large array of problems with flexible and effective tools. However, many of these tools have not fully proliferated into the computational biology and bioinformatics fields. In this perspective, we survey some of these advances and highlight examples of their use in the biosciences, with the goal of increasing awareness among practitioners of emerging opportunities to blend expert knowledge with new deep learning architectural tools.
Submitted 12 June, 2024;
originally announced June 2024.
-
Machine Learning for Uncovering Biological Insights in Spatial Transcriptomics Data
Authors:
Alex J. Lee,
Robert Cahill,
Reza Abbasi-Asl
Abstract:
Development and homeostasis in multicellular systems both require exquisite control over spatial molecular pattern formation and maintenance. Advances in spatially-resolved and high-throughput molecular imaging methods such as multiplexed immunofluorescence and spatial transcriptomics (ST) provide exciting new opportunities to augment our fundamental understanding of these processes in health and disease. The large and complex datasets resulting from these techniques, particularly ST, have led to rapid development of innovative machine learning (ML) tools primarily based on deep learning techniques. These ML tools are now increasingly featured in integrated experimental and computational workflows to disentangle signals from noise in complex biological systems. However, it can be difficult to understand and balance the different implicit assumptions and methodologies of a rapidly expanding toolbox of analytical tools in ST. To address this, we summarize major ST analysis goals that ML can help address and current analysis trends. We also describe four major data science concepts and related heuristics that can help guide practitioners in their choices of the right tools for the right biological questions.
Submitted 29 March, 2023;
originally announced March 2023.
-
The Coming of Age of Nucleic Acid Vaccines during COVID-19
Authors:
Halie M. Rando,
Ronan Lordan,
Likhitha Kolla,
Elizabeth Sell,
Alexandra J. Lee,
Nils Wellhausen,
Amruta Naik,
Jeremy P. Kamil,
COVID-19 Review Consortium,
Anthony Gitter,
Casey S. Greene
Abstract:
In the 21st century, several emergent viruses have posed a global threat. Each pathogen has emphasized the value of rapid and scalable vaccine development programs. The ongoing SARS-CoV-2 pandemic has made the importance of such efforts especially clear. Recent biotechnological advances in vaccinology allow for vaccines that provide only the nucleic acid building blocks of an antigen, eliminating many safety concerns. During the COVID-19 pandemic, these DNA and RNA vaccines have facilitated the development and deployment of vaccines at an unprecedented pace. This success was attributable at least in part to broader shifts in scientific research relative to prior epidemics; the genome of SARS-CoV-2 was available as early as January 2020, facilitating global efforts in the development of DNA and RNA vaccines within two weeks of the international community becoming aware of the new viral threat. Additionally, these technologies that were previously only theoretical are not only safe but also highly efficacious. Although vaccine development has historically been a slow process, the rapid development of vaccines during the COVID-19 crisis reveals a major shift in vaccine technologies. Here, we provide historical context for the emergence of these paradigm-shifting vaccines. We describe several DNA and RNA vaccines in terms of their efficacy, safety, and approval status. We also discuss patterns in worldwide distribution. The advances made since early 2020 provide an exceptional illustration of how rapidly vaccine development technology has advanced in the last two decades in particular and suggest a new era in vaccines against emerging pathogens.
Submitted 24 January, 2023; v1 submitted 14 October, 2022;
originally announced October 2022.
-
Application of Traditional Vaccine Development Strategies to SARS-CoV-2
Authors:
Halie M. Rando,
Ronan Lordan,
Alexandra J. Lee,
Amruta Naik,
Nils Wellhausen,
Elizabeth Sell,
Likhitha Kolla,
COVID-19 Review Consortium,
Anthony Gitter,
Casey S. Greene
Abstract:
Over the past 150 years, vaccines have revolutionized the relationship between people and disease. During the COVID-19 pandemic, technologies such as mRNA vaccines have received attention due to their novelty and successes. However, more traditional vaccine development platforms have also yielded important tools in the worldwide fight against the SARS-CoV-2 virus. A variety of approaches have been used to develop COVID-19 vaccines that are now authorized for use in countries around the world. In this review, we highlight strategies that focus on the viral capsid and outwards, rather than on the nucleic acids inside. These approaches fall into two broad categories: whole-virus vaccines and subunit vaccines. Whole-virus vaccines use the virus itself, either in an inactivated or attenuated state. Subunit vaccines instead contain an isolated, immunogenic component of the virus. Here, we highlight vaccine candidates that apply these approaches against SARS-CoV-2 in different ways. In a companion manuscript, we review the more recent and novel development of nucleic-acid-based vaccine technologies. We further consider the role that these COVID-19 vaccine development programs have played in prophylaxis at the global scale. Well-established vaccine technologies have proved especially important to making vaccines accessible in low- and middle-income countries. Vaccine development programs that use established platforms have been undertaken in a much wider range of countries than those using nucleic-acid-based technologies, which have been led by wealthy Western countries. Therefore, these vaccine platforms, though less novel from a biotechnological standpoint, have proven to be extremely important to the management of SARS-CoV-2.
Submitted 23 January, 2023; v1 submitted 16 August, 2022;
originally announced August 2022.
-
Using genome-wide expression compendia to study microorganisms
Authors:
Alexandra J. Lee,
Taylor Reiter,
Georgia Doing,
Julia Oh,
Deborah A. Hogan,
Casey S. Greene
Abstract:
A gene expression compendium is a heterogeneous collection of gene expression experiments assembled from data collected for diverse purposes. The widely varied experimental conditions and genetic backgrounds across samples create a tremendous opportunity for gaining a systems-level understanding of the transcriptional responses that influence phenotypes. Variety in experimental design is particularly important for studying microbes, where the transcriptional responses integrate many signals and demonstrate plasticity across strains, including responses to which nutrients are available and which microbes are present. Advances in high-throughput measurement technology have made it feasible to construct compendia for many microbes. In this review, we discuss how these compendia are constructed and analyzed to reveal transcriptional patterns.
Submitted 25 March, 2022;
originally announced March 2022.
-
Ten Quick Tips for Deep Learning in Biology
Authors:
Benjamin D. Lee,
Anthony Gitter,
Casey S. Greene,
Sebastian Raschka,
Finlay Maguire,
Alexander J. Titus,
Michael D. Kessler,
Alexandra J. Lee,
Marc G. Chevrette,
Paul Allen Stewart,
Thiago Britto-Borges,
Evan M. Cofer,
Kun-Hsing Yu,
Juan Jose Carmona,
Elana J. Fertig,
Alexandr A. Kalinin,
Beth Signal,
Benjamin J. Lengerich,
Timothy J. Triche Jr,
Simina M. Boca
Abstract:
Machine learning is a modern approach to problem-solving and task automation. In particular, machine learning is concerned with the development and applications of algorithms that can recognize patterns in data and use them for predictive modeling. Artificial neural networks are a particular class of machine learning algorithms and models that evolved into what is now described as deep learning. Given the computational advances made in the last decade, deep learning can now be applied to massive data sets and in innumerable contexts. Therefore, deep learning has become its own subfield of machine learning. In the context of biological research, it has been increasingly used to derive novel insights from high-dimensional biological data. To make the biological applications of deep learning more accessible to scientists who have some experience with machine learning, we solicited input from a community of researchers with varied biological and deep learning interests. These individuals collaboratively contributed to this manuscript's writing using the GitHub version control platform and the Manubot manuscript generation toolset. The goal was to articulate a practical, accessible, and concise set of guidelines and suggestions to follow when using deep learning. In the course of our discussions, several themes became clear: the importance of understanding and applying machine learning fundamentals as a baseline for utilizing deep learning, the necessity for extensive model comparisons with careful evaluation, and the need for critical thought in interpreting results generated by deep learning, among others.
Submitted 29 May, 2021;
originally announced May 2021.
-
Identification and Development of Therapeutics for COVID-19
Authors:
Halie M. Rando,
Nils Wellhausen,
Soumita Ghosh,
Alexandra J. Lee,
Anna Ada Dattoli,
Fengling Hu,
James Brian Byrd,
Diane N. Rafizadeh,
Ronan Lordan,
Yanjun Qi,
Yuchen Sun,
Christian Brueffer,
Jeffrey M. Field,
Marouen Ben Guebila,
Nafisa M. Jadavji,
Ashwin N. Skelly,
Bharath Ramsundar,
Jinhui Wang,
Rishi Raj Goel,
YoSon Park,
the COVID-19 Review Consortium,
Simina M. Boca,
Anthony Gitter,
Casey S. Greene
Abstract:
After emerging in China in late 2019, the novel Severe acute respiratory syndrome-like coronavirus 2 (SARS-CoV-2) spread worldwide and as of early 2021, continues to significantly impact most countries. Only a small number of coronaviruses are known to infect humans, and only two are associated with the severe outcomes associated with SARS-CoV-2: Severe acute respiratory syndrome-related coronavirus, a closely related species of SARS-CoV-2 that emerged in 2002, and Middle East respiratory syndrome-related coronavirus, which emerged in 2012. Both of these previous epidemics were controlled fairly rapidly through public health measures, and no vaccines or robust therapeutic interventions were identified. However, previous insights into the immune response to coronaviruses gained during the outbreaks of severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) have proved beneficial to identifying approaches to the treatment and prophylaxis of novel coronavirus disease 2019 (COVID-19). A number of potential therapeutics against SARS-CoV-2 and the resultant COVID-19 illness were rapidly identified, leading to a large number of clinical trials investigating a variety of possible therapeutic approaches being initiated early on in the pandemic. As a result, a small number of therapeutics have already been authorized by regulatory agencies such as the Food and Drug Administration (FDA) in the United States, and many other therapeutics remain under investigation. Here, we describe a range of approaches for the treatment of COVID-19, along with their proposed mechanisms of action and the current status of clinical investigation into each candidate. The status of these investigations will continue to evolve, and this review will be updated as progress is made.
Submitted 10 September, 2021; v1 submitted 3 March, 2021;
originally announced March 2021.
-
Pathogenesis, Symptomatology, and Transmission of SARS-CoV-2 through Analysis of Viral Genomics and Structure
Authors:
Halie M. Rando,
Adam L. MacLean,
Alexandra J. Lee,
Ronan Lordan,
Sandipan Ray,
Vikas Bansal,
Ashwin N. Skelly,
Elizabeth Sell,
John J. Dziak,
Lamonica Shinholster,
Lucy D'Agostino McGowan,
Marouen Ben Guebila,
Nils Wellhausen,
Sergey Knyazev,
Simina M. Boca,
Stephen Capone,
Yanjun Qi,
YoSon Park,
Yuchen Sun,
David Mai,
Joel D. Boerckel,
Christian Brueffer,
James Brian Byrd,
Jeremy P. Kamil,
Jinhui Wang,
et al. (9 additional authors not shown)
Abstract:
The novel coronavirus SARS-CoV-2, which emerged in late 2019, has since spread around the world and infected hundreds of millions of people with coronavirus disease 2019 (COVID-19). While this viral species was unknown prior to January 2020, its similarity to other coronaviruses that infect humans has allowed for rapid insight into the mechanisms that it uses to infect human hosts, as well as the ways in which the human immune system can respond. Here, we contextualize SARS-CoV-2 among other coronaviruses and identify what is known and what can be inferred about its behavior once inside a human host. Because the genomic content of coronaviruses, which specifies the virus's structure, is highly conserved, early genomic analysis provided a significant head start in predicting viral pathogenesis and in understanding potential differences among variants. The pathogenesis of the virus offers insights into symptomatology, transmission, and individual susceptibility. Additionally, prior research into interactions between the human immune system and coronaviruses has identified how these viruses can evade the immune system's protective mechanisms. We also explore systems-level research into the regulatory and proteomic effects of SARS-CoV-2 infection and the immune response. Understanding the structure and behavior of the virus serves to contextualize the many facets of the COVID-19 pandemic and can influence efforts to control the virus and treat the disease.
Submitted 3 December, 2021; v1 submitted 1 February, 2021;
originally announced February 2021.