-
SARS-CoV-2 Wastewater Genomic Surveillance: Approaches, Challenges, and Opportunities
Authors:
Viorel Munteanu,
Michael A. Saldana,
David Dreifuss,
Wenhao O. Ouyang,
Jannatul Ferdous,
Fatemeh Mohebbi,
Jessica Schlueter,
Dumitru Ciorba,
Viorel Bostan,
Victor Gordeev,
Justin Maine Su,
Nadiia Kasianchuk,
Nitesh Kumar Sharma,
Sergey Knyazev,
Eva Aßmann,
Andrei Lobiuc,
Mihai Covasa,
Keith A. Crandall,
Nicholas C. Wu,
Christopher E. Mason,
Braden T Tierney,
Alexander G Lucaci,
Roel A. Ophoff,
Cynthia Gibas,
Piotr Rzymski
, et al. (7 additional authors not shown)
Abstract:
During the SARS-CoV-2 pandemic, wastewater-based genomic surveillance (WWGS) emerged as an efficient viral surveillance tool that takes into account asymptomatic cases and can identify known and novel mutations and offers the opportunity to assign known virus lineages based on the detected mutations profiles. WWGS can also hint towards novel or cryptic lineages, but it is difficult to clearly iden…
▽ More
During the SARS-CoV-2 pandemic, wastewater-based genomic surveillance (WWGS) emerged as an efficient viral surveillance tool that takes into account asymptomatic cases and can identify known and novel mutations and offers the opportunity to assign known virus lineages based on the detected mutations profiles. WWGS can also hint towards novel or cryptic lineages, but it is difficult to clearly identify and define novel lineages from wastewater (WW) alone. While WWGS has significant advantages in monitoring SARS-CoV-2 viral spread, technical challenges remain, including poor sequencing coverage and quality due to viral RNA degradation. As a result, the viral RNAs in wastewater have low concentrations and are often fragmented, making sequencing difficult. WWGS analysis requires advanced computational tools that are yet to be developed and benchmarked. The existing bioinformatics tools used to analyze wastewater sequencing data are often based on previously developed methods for quantifying the expression of transcripts or viral diversity. Those methods were not developed for wastewater sequencing data specifically, and are not optimized to address unique challenges associated with wastewater. While specialized tools for analysis of wastewater sequencing data have also been developed recently, it remains to be seen how they will perform given the ongoing evolution of SARS-CoV-2 and the decline in testing and patient-based genomic surveillance. Here, we discuss opportunities and challenges associated with WWGS, including sample preparation, sequencing technology, and bioinformatics methods.
△ Less
Submitted 2 March, 2025; v1 submitted 23 September, 2023;
originally announced September 2023.
-
Coswara: A respiratory sounds and symptoms dataset for remote screening of SARS-CoV-2 infection
Authors:
Debarpan Bhattacharya,
Neeraj Kumar Sharma,
Debottam Dutta,
Srikanth Raj Chetupalli,
Pravin Mote,
Sriram Ganapathy,
Chandrakiran C,
Sahiti Nori,
Suhail K K,
Sadhana Gonuguntla,
Murali Alagesan
Abstract:
This paper presents the Coswara dataset, a dataset containing diverse set of respiratory sounds and rich meta-data, recorded between April-2020 and February-2022 from 2635 individuals (1819 SARS-CoV-2 negative, 674 positive, and 142 recovered subjects). The respiratory sounds contained nine sound categories associated with variants of breathing, cough and speech. The rich metadata contained demogr…
▽ More
This paper presents the Coswara dataset, a dataset containing diverse set of respiratory sounds and rich meta-data, recorded between April-2020 and February-2022 from 2635 individuals (1819 SARS-CoV-2 negative, 674 positive, and 142 recovered subjects). The respiratory sounds contained nine sound categories associated with variants of breathing, cough and speech. The rich metadata contained demographic information associated with age, gender and geographic location, as well as the health information relating to the symptoms, pre-existing respiratory ailments, comorbidity and SARS-CoV-2 test status. Our study is the first of its kind to manually annotate the audio quality of the entire dataset (amounting to 65~hours) through manual listening. The paper summarizes the data collection procedure, demographic, symptoms and audio data information. A COVID-19 classifier based on bi-directional long short-term (BLSTM) architecture, is trained and evaluated on the different population sub-groups contained in the dataset to understand the bias/fairness of the model. This enabled the analysis of the impact of gender, geographic location, date of recording, and language proficiency on the COVID-19 detection performance.
△ Less
Submitted 22 May, 2023;
originally announced May 2023.
-
The Second DiCOVA Challenge: Dataset and performance analysis for COVID-19 diagnosis using acoustics
Authors:
Neeraj Kumar Sharma,
Srikanth Raj Chetupalli,
Debarpan Bhattacharya,
Debottam Dutta,
Pravin Mote,
Sriram Ganapathy
Abstract:
The Second Diagnosis of COVID-19 using Acoustics (DiCOVA) Challenge aimed at accelerating the research in acoustics based detection of COVID-19, a topic at the intersection of acoustics, signal processing, machine learning, and healthcare. This paper presents the details of the challenge, which was an open call for researchers to analyze a dataset of audio recordings consisting of breathing, cough…
▽ More
The Second Diagnosis of COVID-19 using Acoustics (DiCOVA) Challenge aimed at accelerating the research in acoustics based detection of COVID-19, a topic at the intersection of acoustics, signal processing, machine learning, and healthcare. This paper presents the details of the challenge, which was an open call for researchers to analyze a dataset of audio recordings consisting of breathing, cough and speech signals. This data was collected from individuals with and without COVID-19 infection, and the task in the challenge was a two-class classification. The development set audio recordings were collected from 965 (172 COVID-19 positive) individuals, while the evaluation set contained data from 471 individuals (71 COVID-19 positive). The challenge featured four tracks, one associated with each sound category of cough, speech and breathing, and a fourth fusion track. A baseline system was also released to benchmark the participants. In this paper, we present an overview of the challenge, the rationale for the data collection and the baseline system. Further, a performance analysis for the systems submitted by the $16$ participating teams in the leaderboard is also presented.
△ Less
Submitted 11 October, 2021; v1 submitted 4 October, 2021;
originally announced October 2021.
-
Fractal and Multifractal Properties of Electrographic Recordings of Human Brain Activity: Toward Its Use as a Signal Feature for Machine Learning in Clinical Applications
Authors:
Lucas G. S. França,
José G. V. Miranda,
Marco Leite,
Niraj K. Sharma,
Matthew C. Walker,
Louis Lemieux,
Yujiang Wang
Abstract:
The brain is a system operating on multiple time scales, and characterisation of dynamics across time scales remains a challenge. One framework to study such dynamics is that of fractal geometry. However, currently there exists no established method for the study of brain dynamics using fractal geometry, due to the many challenges in the conceptual and technical understanding of the methods. We ai…
▽ More
The brain is a system operating on multiple time scales, and characterisation of dynamics across time scales remains a challenge. One framework to study such dynamics is that of fractal geometry. However, currently there exists no established method for the study of brain dynamics using fractal geometry, due to the many challenges in the conceptual and technical understanding of the methods. We aim to highlight some of the practical challenges of applying fractal geometry to brain dynamics and propose solutions to enable its wider use in neuroscience. Using intracranially recorded EEG and simulated data, we compared monofractal and multifractal methods with regards to their sensitivity to signal variance. We found that both correlate closely with signal variance, thus not offering new information about the signal. However, after applying an epoch-wise standardisation procedure to the signal, we found that multifractal measures could offer non-redundant information compared to signal variance, power and other established EEG signal measures. We also compared different multifractal estimation methods and found that the Chhabra-Jensen algorithm performed best. Finally, we investigated the impact of sampling frequency and epoch length on multifractal properties. Using epileptic seizures as an example event in the EEG, we show that there may be an optimal time scale for detecting temporal changes in multifractal properties around seizures. The practical issues we highlighted and our suggested solutions should help in developing a robust method for the application of fractal geometry in EEG signals. Our analyses and observations also aid the theoretical understanding of the multifractal properties of the brain and might provide grounds for new discoveries in the study of brain signals. These could be crucial for understanding of neurological function and for the developments of new treatments.
△ Less
Submitted 11 December, 2018; v1 submitted 11 June, 2018;
originally announced June 2018.