-
SARS-CoV-2 Wastewater Genomic Surveillance: Approaches, Challenges, and Opportunities
Authors:
Viorel Munteanu,
Michael A. Saldana,
David Dreifuss,
Wenhao O. Ouyang,
Jannatul Ferdous,
Fatemeh Mohebbi,
Jessica Schlueter,
Dumitru Ciorba,
Viorel Bostan,
Victor Gordeev,
Justin Maine Su,
Nadiia Kasianchuk,
Nitesh Kumar Sharma,
Sergey Knyazev,
Eva Aßmann,
Andrei Lobiuc,
Mihai Covasa,
Keith A. Crandall,
Nicholas C. Wu,
Christopher E. Mason,
Braden T Tierney,
Alexander G Lucaci,
Roel A. Ophoff,
Cynthia Gibas,
Piotr Rzymski, et al. (7 additional authors not shown)
Abstract:
During the SARS-CoV-2 pandemic, wastewater-based genomic surveillance (WWGS) emerged as an efficient viral surveillance tool that takes into account asymptomatic cases and can identify known and novel mutations and offers the opportunity to assign known virus lineages based on the detected mutation profiles. WWGS can also hint towards novel or cryptic lineages, but it is difficult to clearly identify and define novel lineages from wastewater (WW) alone. While WWGS has significant advantages in monitoring SARS-CoV-2 viral spread, technical challenges remain, including poor sequencing coverage and quality due to viral RNA degradation. As a result, the viral RNAs in wastewater have low concentrations and are often fragmented, making sequencing difficult. WWGS analysis requires advanced computational tools that are yet to be developed and benchmarked. The existing bioinformatics tools used to analyze wastewater sequencing data are often based on previously developed methods for quantifying the expression of transcripts or viral diversity. Those methods were not developed for wastewater sequencing data specifically, and are not optimized to address unique challenges associated with wastewater. While specialized tools for analysis of wastewater sequencing data have also been developed recently, it remains to be seen how they will perform given the ongoing evolution of SARS-CoV-2 and the decline in testing and patient-based genomic surveillance. Here, we discuss opportunities and challenges associated with WWGS, including sample preparation, sequencing technology, and bioinformatics methods.
Submitted 2 March, 2025; v1 submitted 23 September, 2023;
originally announced September 2023.
-
Unlocking capacities of viral genomics for the COVID-19 pandemic response
Authors:
Sergey Knyazev,
Karishma Chhugani,
Varuni Sarwal,
Ram Ayyala,
Harman Singh,
Smruthi Karthikeyan,
Dhrithi Deshpande,
Zoia Comarova,
Angela Lu,
Yuri Porozov,
Aiping Wu,
Malak Abedalthagafi,
Shivashankar Nagaraj,
Adam Smith,
Pavel Skums,
Jason Ladner,
Tommy Tsan-Yuk Lam,
Nicholas Wu,
Alex Zelikovsky,
Rob Knight,
Keith Crandall,
Serghei Mangul
Abstract:
More than any other infectious disease epidemic, the COVID-19 pandemic has been characterized by the generation of large volumes of viral genomic data at an incredible pace due to recent advances in high-throughput sequencing technologies, the rapid global spread of SARS-CoV-2, and its persistent threat to public health. However, distinguishing the most epidemiologically relevant information encoded in these vast amounts of data requires substantial effort across the research and public health communities. Studies of SARS-CoV-2 genomes have been critical in tracking the spread of variants and understanding its epidemic dynamics, and may prove crucial for controlling future epidemics and alleviating significant public health burdens. Together, genomic data and bioinformatics methods enable broad-scale investigations of the spread of SARS-CoV-2 at the local, national, and global scales and allow researchers to efficiently track the emergence of novel variants, reconstruct epidemic dynamics, and provide important insights into drug and vaccine development and disease control. Here, we discuss the tremendous opportunities that genomics offers to unlock the effective use of SARS-CoV-2 genomic data for efficient public health surveillance and guiding timely responses to COVID-19.
Submitted 4 June, 2021; v1 submitted 28 April, 2021;
originally announced April 2021.
-
Pathogenesis, Symptomatology, and Transmission of SARS-CoV-2 through Analysis of Viral Genomics and Structure
Authors:
Halie M. Rando,
Adam L. MacLean,
Alexandra J. Lee,
Ronan Lordan,
Sandipan Ray,
Vikas Bansal,
Ashwin N. Skelly,
Elizabeth Sell,
John J. Dziak,
Lamonica Shinholster,
Lucy D'Agostino McGowan,
Marouen Ben Guebila,
Nils Wellhausen,
Sergey Knyazev,
Simina M. Boca,
Stephen Capone,
Yanjun Qi,
YoSon Park,
Yuchen Sun,
David Mai,
Joel D. Boerckel,
Christian Brueffer,
James Brian Byrd,
Jeremy P. Kamil,
Jinhui Wang, et al. (9 additional authors not shown)
Abstract:
The novel coronavirus SARS-CoV-2, which emerged in late 2019, has since spread around the world and infected hundreds of millions of people with coronavirus disease 2019 (COVID-19). While this viral species was unknown prior to January 2020, its similarity to other coronaviruses that infect humans has allowed for rapid insight into the mechanisms that it uses to infect human hosts, as well as the ways in which the human immune system can respond. Here, we contextualize SARS-CoV-2 among other coronaviruses and identify what is known and what can be inferred about its behavior once inside a human host. Because the genomic content of coronaviruses, which specifies the virus's structure, is highly conserved, early genomic analysis provided a significant head start in predicting viral pathogenesis and in understanding potential differences among variants. The pathogenesis of the virus offers insights into symptomatology, transmission, and individual susceptibility. Additionally, prior research into interactions between the human immune system and coronaviruses has identified how these viruses can evade the immune system's protective mechanisms. We also explore systems-level research into the regulatory and proteomic effects of SARS-CoV-2 infection and the immune response. Understanding the structure and behavior of the virus serves to contextualize the many facets of the COVID-19 pandemic and can influence efforts to control the virus and treat the disease.
Submitted 3 December, 2021; v1 submitted 1 February, 2021;
originally announced February 2021.
-
Technology dictates algorithms: Recent developments in read alignment
Authors:
Mohammed Alser,
Jeremy Rotman,
Kodi Taraszka,
Huwenbo Shi,
Pelin Icer Baykal,
Harry Taegyun Yang,
Victor Xue,
Sergey Knyazev,
Benjamin D. Singer,
Brunilda Balliu,
David Koslicki,
Pavel Skums,
Alex Zelikovsky,
Can Alkan,
Onur Mutlu,
Serghei Mangul
Abstract:
Massively parallel sequencing techniques have revolutionized biological and medical sciences by providing unprecedented insight into the genomes of humans, animals, and microbes. Modern sequencing platforms generate enormous amounts of genomic data in the form of nucleotide sequences or reads. Aligning reads onto reference genomes enables the identification of individual-specific genetic variants and is an essential step of the majority of genomic analysis pipelines. Aligned reads are essential for answering important biological questions, such as detecting mutations driving various human diseases and complex traits as well as identifying species present in metagenomic samples. The read alignment problem is extremely challenging due to the large size of analyzed datasets and numerous technological limitations of sequencing platforms, and researchers have developed novel bioinformatics algorithms to tackle these difficulties. Importantly, computational algorithms have evolved and diversified in accordance with technological advances, leading to today's diverse array of bioinformatics tools. Our review provides a survey of algorithmic foundations and methodologies across 107 alignment methods published between 1988 and 2020, for both short and long reads. We provide rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read aligners. We separately discuss how longer read lengths produce unique advantages and limitations to read alignment techniques. We also discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology, including whole transcriptome, adaptive immune repertoire, and human microbiome studies.
Submitted 9 July, 2020; v1 submitted 28 February, 2020;
originally announced March 2020.