COGEDAP: A COmprehensive GEnomic Data Analysis Platform
Authors:
Bayram Cevdet Akdeniz,
Oleksandr Frei,
Espen Hagen,
Tahir Tekin Filiz,
Sandeep Karthikeyan,
Joelle Pasman,
Andreas Jangmo,
Jacob Bergsted,
John R. Shorter,
Richard Zetterberg,
Joeri Meijsen,
Ida Elken Sonderby,
Alfonso Buil,
Martin Tesli,
Yi Lu,
Patrick Sullivan,
Ole Andreassen,
Eivind Hovig
Abstract:
Collecting and analyzing sensitive, non-sharable data in large-scale genomic research consortia is complicated. Installing and running software is time-consuming because of differences in operating systems and software dependencies. Easier, more standardized, and automated protocols and platforms can help overcome these issues. We have developed one such solution for genomic data analysis using software container technologies. The platform, COGEDAP, consists of different software tools placed into Singularity containers, with corresponding pipelines and instructions on how to perform genome-wide association studies (GWAS) and other genomic data analyses with those tools. Using a provided helper script written in Python, users can obtain auto-generated scripts to conduct the desired analysis both on high-performance computing (HPC) systems and on personal computers. The analyses are then carried out by running these auto-generated scripts with the software containers. The helper script also performs minor re-formatting of the input/output data, so that the end user can work with a unified file format regardless of which genetic software is used for the analysis. COGEDAP is actively used by researchers in different countries and projects to conduct their genomic data analyses. With this platform, users can run GWAS and other genomic analyses without spending much effort on software installation, data formats, and other technical requirements.
Submitted 28 December, 2022;
originally announced December 2022.
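The abstract above describes a Python helper that emits ready-to-run scripts for containerized analyses. The following is only a minimal sketch of that general idea, not COGEDAP's actual helper: the function name `make_gwas_script`, its parameters, the container file name, the `/data` mount point, and the placeholder scheduler values are all hypothetical. It assumes a Singularity image that provides `plink2`.

```python
import shlex


def make_gwas_script(container, data_dir, bfile_prefix, pheno_file,
                     out_prefix, slurm=False):
    """Return the text of a shell script that runs a GWAS inside a container.

    Hypothetical illustration only; names and defaults are assumptions.
    """
    lines = ["#!/bin/bash"]
    if slurm:
        # Scheduler directives for an HPC run; the values are placeholders.
        lines += ["#SBATCH --job-name=gwas", "#SBATCH --time=01:00:00"]
    lines.append("set -euo pipefail")

    # Bind the host data directory to /data inside the container so the
    # same paths work on a laptop and on a cluster.
    cmd = [
        "singularity", "exec",
        "--bind", f"{data_dir}:/data",
        container,
        "plink2",
        "--bfile", f"/data/{bfile_prefix}",
        "--pheno", f"/data/{pheno_file}",
        "--glm",
        "--out", f"/data/{out_prefix}",
    ]
    # Quote each argument so paths with spaces survive in the shell script.
    lines.append(" ".join(shlex.quote(part) for part in cmd))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: print a script for a local (non-scheduler) run.
    print(make_gwas_script("plink2.sif", "/home/user/data",
                           "cohort", "pheno.txt", "results"))
```

Generating the script as text, rather than running the tool directly, lets the same helper serve both a scheduler submission and a local run; the only difference in this sketch is the optional block of scheduler directives.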
Unlocking capacities of viral genomics for the COVID-19 pandemic response
Authors:
Sergey Knyazev,
Karishma Chhugani,
Varuni Sarwal,
Ram Ayyala,
Harman Singh,
Smruthi Karthikeyan,
Dhrithi Deshpande,
Zoia Comarova,
Angela Lu,
Yuri Porozov,
Aiping Wu,
Malak Abedalthagafi,
Shivashankar Nagaraj,
Adam Smith,
Pavel Skums,
Jason Ladner,
Tommy Tsan-Yuk Lam,
Nicholas Wu,
Alex Zelikovsky,
Rob Knight,
Keith Crandall,
Serghei Mangul
Abstract:
More than any other infectious disease epidemic, the COVID-19 pandemic has been characterized by the generation of large volumes of viral genomic data at an incredible pace due to recent advances in high-throughput sequencing technologies, the rapid global spread of SARS-CoV-2, and its persistent threat to public health. However, distinguishing the most epidemiologically relevant information encoded in these vast amounts of data requires substantial effort across the research and public health communities. Studies of SARS-CoV-2 genomes have been critical in tracking the spread of variants and understanding the virus's epidemic dynamics, and may prove crucial for controlling future epidemics and alleviating significant public health burdens. Together, genomic data and bioinformatics methods enable broad-scale investigations of the spread of SARS-CoV-2 at local, national, and global scales, and allow researchers to efficiently track the emergence of novel variants, reconstruct epidemic dynamics, and gain important insights into drug and vaccine development and disease control. Here, we discuss the tremendous opportunities that genomics offers to unlock the effective use of SARS-CoV-2 genomic data for efficient public health surveillance and for guiding timely responses to COVID-19.
Submitted 4 June, 2021; v1 submitted 28 April, 2021;
originally announced April 2021.