-
Contributions of the Petabyte Scale Sequence Search Codeathon toward efforts to scale sequence-based searches on SRA
Authors:
Priyanka Ghosh,
Kjiersten Fagnan,
Ryan Connor,
Ravinder Pannu,
Travis J. Wheeler,
Mihai Pop,
C. Titus Brown,
Tessa Pierce-Ward,
Rob Patro,
Jacquelyn S. Michaelis,
Thomas L. Madden,
Christiam Camacho,
Olaitan I. Awe,
Arianna I. Krinos,
René KM Xavier,
Rodrigo Ortega Polo,
Jack W. Roddy,
Adelaide Rhodes,
Alexander Sweeten,
Adrian Viehweger,
Bariş Ekim,
Harihara Subrahmaniam Muralidharan,
Amatur Rahman,
Vinícius W. Salazar,
Andrew Tritt, et al. (13 additional authors not shown)
Abstract:
The volume of biological data being generated by the scientific community is growing exponentially, reflecting technological advances and research activities. The National Institutes of Health's (NIH) Sequence Read Archive (SRA), which is maintained by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM), is a rapidly growing public database that researchers use to drive scientific discovery across all domains of life. This increase in available data has great promise for pushing scientific discovery but also introduces new challenges that scientific communities need to address. As genomic datasets have grown in scale and diversity, a parade of new methods and associated software have been developed to address the challenges posed by this growth. These methodological advances are vital for maximally leveraging the power of next-generation sequencing (NGS) technologies. With the goal of laying a foundation for evaluation of methods for petabyte-scale sequence search, the Department of Energy (DOE) Office of Biological and Environmental Research (BER), the NIH Office of Data Science Strategy (ODSS), and NCBI held a virtual codeathon, 'Petabyte Scale Sequence Search: Metagenomics Benchmarking Codeathon', from September 27 to October 1, 2021, to evaluate emerging solutions in petabyte-scale sequence search. The codeathon attracted experts from national laboratories, research institutions, and universities across the world to (a) develop benchmarking approaches to address challenges in conducting large-scale analyses of metagenomic data (which comprises approximately 20% of SRA), (b) identify potential applications that benefit from SRA-wide searches and the tools required to execute the search, and (c) produce community resources, i.e., a public-facing repository with information to rebuild and reproduce the problems addressed by each team challenge.
Submitted 9 May, 2025;
originally announced May 2025.
-
Temperature measurement methods in an experimental setup during bone drilling: A brief review on the comparison of thermocouple and infrared thermography
Authors:
Md Ashequl Islam,
Nur Saifullah Kamarrudin,
Ruslizam Daud,
Ishak Ibrahim,
Anas Rahman,
Fauziah Mat
Abstract:
Predicting thermal response in orthopaedic surgery or dental implantation remains a significant challenge. This study aims to find a practical approach for measuring temperature elevation during a bone drilling experiment by analyzing the existing methods. Traditionally, thermocouples have frequently been used to predict the bone temperature in the drilling process. However, several experimental studies demonstrate that the invasive thermocouple method is impractical in medical conditions and prefer the thermal infrared (IR) camera as a non-invasive method. This work proposes a simplified experimental model that uses the thermocouple to determine temperature rise coupled with the thermal image source approach. Furthermore, our new method provides a significant opportunity to calibrate the thermal IR camera by discovering the undetected heat elevation within the depth of a workpiece.
Submitted 23 August, 2023;
originally announced August 2023.
-
Spatio-temporal models of infectious disease with high rates of asymptomatic transmission
Authors:
Aminur Rahman,
Angela Peace,
Ramesh Kesawan,
Souparno Ghosh
Abstract:
The surprisingly mercurial Covid-19 pandemic has highlighted the need not only to accelerate research on infectious diseases, but also to study them using novel techniques and perspectives. A major contributor to the difficulty of containing the current pandemic is the highly asymptomatic nature of the disease. In this investigation, we develop a modeling framework to study the spatio-temporal evolution of diseases with high rates of asymptomatic transmission, and we apply this framework to a hypothetical country with mathematically tractable geography; namely, square counties uniformly organized into a rectangle. We first derive a model for the temporal dynamics of susceptible, infected, and recovered populations, which is applied at the county level. Next we use likelihood-based parameter estimation to derive temporally varying disease transmission parameters on the state-wide level. While these two methods give us some spatial structure and show the effects of behavioral and policy changes, they miss the evolution of hot zones that have caused significant difficulties in resource allocation during the current pandemic. It is evident that the distribution of cases will not remain statically tied to the population density, as with many other diseases, but will continuously evolve. We model this as a diffusive process where the diffusivity is spatially varying based on the population distribution, and temporally varying based on the current number of simulated asymptomatic cases. With this final addition coupled to the SIR model with temporally varying transmission parameters, we capture the evolution of "hot zones" in our hypothetical setup.
Submitted 20 July, 2022;
originally announced July 2022.
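As a rough illustration of the susceptible-infected-recovered (SIR) building block mentioned in the abstract above, the following minimal sketch integrates the textbook SIR equations with a time-dependent transmission rate. It is not the authors' model: the function name `simulate_sir`, the callable `beta_of_t`, the forward-Euler scheme, and all parameter values are assumptions made for illustration, and the spatial diffusion component is omitted entirely.

```python
def simulate_sir(beta_of_t, gamma, s0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of the textbook SIR equations.

    beta_of_t: hypothetical callable giving the transmission rate at time t
               (stands in for a temporally varying transmission parameter).
    gamma:     recovery rate (assumed constant here).
    Returns a list of (t, S, I, R) tuples.
    """
    n = s0 + i0 + r0                      # total population, conserved
    s, i, r = float(s0), float(i0), float(r0)
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        t = (k - 1) * dt                  # time at the start of this step
        new_infections = beta_of_t(t) * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((k * dt, s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative values only: constant beta = 0.3, gamma = 0.1.
    result = simulate_sir(lambda t: 0.3, 0.1, 990, 10, 0, days=160)
    t_end, s_end, i_end, r_end = result[-1]
    print(f"t={t_end:.0f}  S={s_end:.1f}  I={i_end:.1f}  R={r_end:.1f}")
```

Passing a non-constant `beta_of_t` (for example, a step function that drops after a policy change) is one simple way to mimic temporally varying transmission in this sketch.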
-
SystemMatch: optimizing preclinical drug models to human clinical outcomes via generative latent-space matching
Authors:
Scott Gigante,
Varsha G. Raghavan,
Amanda M. Robinson,
Robert A. Barton,
Adeeb H. Rahman,
Drausin F. Wulsin,
Jacques Banchereau,
Noam Solomon,
Luis F. Voloch,
Fabian J. Theis
Abstract:
Translating the relevance of preclinical models ($\textit{in vitro}$, animal models, or organoids) to their relevance in humans presents an important challenge during drug development. The rising abundance of single-cell genomic data from human tumors and tissue offers a new opportunity to optimize model systems by their similarity to targeted human cell types in disease. In this work, we introduce SystemMatch to assess the fit of preclinical model systems to an $\textit{in sapiens}$ target population and to recommend experimental changes to further optimize these systems. We demonstrate this through an application to developing $\textit{in vitro}$ systems to model human tumor-derived suppressive macrophages. We show with held-out $\textit{in vivo}$ controls that our pipeline successfully ranks macrophage subpopulations by their biological similarity to the target population, and apply this analysis to rank a series of 18 $\textit{in vitro}$ macrophage systems perturbed with a variety of cytokine stimulations. We extend this analysis to predict the behavior of 66 $\textit{in silico}$ model systems generated using a perturbational autoencoder and apply a $k$-medoids approach to recommend a subset of these model systems for further experimental development in order to fully explore the space of possible perturbations. Through this use case, we demonstrate a novel approach to model system development to generate a system more similar to human biology.
Submitted 14 May, 2022;
originally announced May 2022.
-
Feature Fusion of Raman Chemical Imaging and Digital Histopathology using Machine Learning for Prostate Cancer Detection
Authors:
Trevor Doherty,
Susan McKeever,
Nebras Al-Attar,
Tiarnan Murphy,
Claudia Aura,
Arman Rahman,
Amanda O'Neill,
Stephen P Finn,
Elaine Kay,
William M. Gallagher,
R. William G. Watson,
Aoife Gowen,
Patrick Jackman
Abstract:
The diagnosis of prostate cancer is challenging due to the heterogeneity of its presentations, leading to the overdiagnosis and treatment of non-clinically important disease. Accurate diagnosis can directly benefit a patient's quality of life and prognosis. Towards addressing this issue, we present a learning model for the automatic identification of prostate cancer. While many prostate cancer studies have adopted Raman spectroscopy approaches, none have utilised the combination of Raman Chemical Imaging (RCI) and other imaging modalities. This study uses multimodal images formed from stained Digital Histopathology (DP) and unstained RCI. The approach was developed and tested on a set of 178 clinical samples from 32 patients, containing a range of non-cancerous, Gleason grade 3 (G3) and grade 4 (G4) tissue microarray samples. For each histological sample, there is a pathologist-labelled DP-RCI image pair. The hypothesis tested was whether multimodal image models can outperform single-modality baseline models in terms of diagnostic accuracy. Binary non-cancer/cancer models and the more challenging G3/G4 differentiation were investigated. Regarding G3/G4 classification, the multimodal approach achieved a sensitivity of 73.8% and specificity of 88.1%, while the baseline DP model showed a sensitivity and specificity of 54.1% and 84.7%, respectively. The multimodal approach demonstrated a statistically significant 12.7% AUC advantage over the baseline with a value of 85.8% compared to 73.1%, also outperforming models based solely on RCI and median Raman spectra. Feature fusion of DP and RCI does not improve the more trivial task of tumour identification but does deliver an observed advantage in G3/G4 discrimination. Building on these promising findings, future work could include the acquisition of larger datasets for enhanced model generalization.
Submitted 18 January, 2021;
originally announced January 2021.
-
Tumor ablation due to inhomogeneous -- anisotropic diffusion in generic 3-dimensional topologies
Authors:
Erdi Kara,
Aminur Rahman,
Eugenio Aulisa,
Souparno Ghosh
Abstract:
We derive a full 3-dimensional (3-D) model of inhomogeneous -- anisotropic diffusion in a tumor region coupled to a binary population model. The diffusion tensors are acquired using Diffusion Tensor Magnetic Resonance Imaging (DTI) from a patient diagnosed with glioblastoma multiforme (GBM). Then we numerically simulate the full model with the Finite Element Method (FEM) and produce drug concentration heat maps, apoptosis regions, and dose-response curves. Finally, predictions are made about optimal injection locations and volumes, which are presented in a form that can be employed by doctors and oncologists.
Submitted 10 December, 2019;
originally announced December 2019.
-
Modeling of drug diffusion in a solid tumor leading to tumor cell death
Authors:
Aminur Rahman,
Souparno Ghosh,
Ranadip Pal
Abstract:
It has been shown recently that changing the fluidic properties of a drug can improve its efficacy in ablating solid tumors. We develop a modeling framework for tumor ablation, and present the simplest possible model for drug diffusion in a spherical tumor with leaky boundaries, assuming that cell death eventually leads to ablation of that cell, which effectively makes the two quantities numerically equivalent. The death of a cell after a given exposure time depends on both the concentration of the drug and the amount of oxygen available to the cell. Higher oxygen availability leads to cell death at lower drug concentrations. It can be assumed that a minimum concentration is required for a cell to die, effectively connecting diffusion with efficacy. The concentration threshold decreases as exposure time increases, which allows us to compute dose-response curves. Furthermore, these curves can be plotted at much finer time intervals than those of experiments, which is used to produce a dose-threshold-response surface giving an observer a complete picture of the drug's efficacy for an individual. In addition, since the diffusion coefficients, leak coefficients, and the availability of oxygen are different for different individuals and tumors, we produce artificial replication data through bootstrapping to simulate error. While the usual data-driven model with sigmoidal curves uses 12 free parameters, our mechanistic model has only two free parameters, allowing it to be open to scrutiny rather than forcing agreement with data. Even so, the simplest model in our framework, derived here, shows close agreement with the bootstrapped curves, and reproduces well-established relations, such as Haber's rule.
Submitted 4 June, 2018;
originally announced June 2018.
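The abstract above notes that the concentration threshold for cell death falls as exposure time rises, and that the model reproduces Haber's rule. As a minimal sketch, Haber's rule in its usual form states that concentration times exposure time is constant for a fixed effect (C × t = k). The function names, the constant `k`, and the sample times below are hypothetical; this is the rule in isolation, not the authors' diffusion model.

```python
def haber_threshold(k, exposure_time):
    """Threshold concentration under Haber's rule, C * t = k.

    k:             hypothetical toxic-load constant for a fixed effect.
    exposure_time: exposure duration, must be positive.
    """
    if exposure_time <= 0:
        raise ValueError("exposure_time must be positive")
    return k / exposure_time


def threshold_curve(k, times):
    """Pairs of (exposure time, threshold concentration) for each time."""
    return [(t, haber_threshold(k, t)) for t in times]


if __name__ == "__main__":
    # Illustrative values only; units are arbitrary.
    for t, c in threshold_curve(12.0, [1.0, 2.0, 4.0]):
        print(f"exposure {t:g} -> threshold concentration {c:g}")
```

Doubling the exposure time halves the threshold concentration, which matches the qualitative statement that longer exposure lowers the concentration needed for cell death.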
-
Clustering Gene Expression Time Series with Coregionalization: Speed propagation of ALS
Authors:
Muhammad Arifur Rahman,
Paul R. Heath,
Neil D. Lawrence
Abstract:
Clustering of gene expression time series gives insight into which genes may be coregulated, allowing us to discern the activity of pathways in a given microarray experiment. Of particular interest is how a given group of genes varies with different model conditions or genetic background. Amyotrophic lateral sclerosis (ALS), an irreversible and diverse neurodegenerative disorder, shows consistent phenotypic differences, and the disease progression is heterogeneous with significant variability. This paper demonstrates the identification of significant gene expression profiles and their associated or co-regulated clusters of gene expression from four groups of data with different genetic backgrounds or model conditions. Gene enrichment score analysis and pathway analysis of judiciously selected clusters lead toward identifying features underlying the differential speed of disease progression. Gene ontology overrepresentation analysis shows that clusters from the proposed method are less likely to be clustered just by chance. In this paper, we develop a new clustering method that allows each cluster to be parameterised according to whether the behaviour of the genes across conditions is correlated or anti-correlated. Our proposed method unveils the potency of latent information shared between multiple model conditions and their replicates when modelling gene expression data.
Submitted 12 February, 2018; v1 submitted 7 February, 2018;
originally announced February 2018.
-
A Brain-Inspired Trust Management Model to Assure Security in a Cloud based IoT Framework for Neuroscience Applications
Authors:
Mufti Mahmud,
M. Shamim Kaiser,
M. Mostafizur Rahman,
M. Arifur Rahman,
Antesar Shabut,
Shamim Al-Mamun,
Amir Hussain
Abstract:
The rapid rise in popularity of the Internet of Things (IoT) and cloud computing permits neuroscientists to collect multilevel and multichannel brain data to better understand brain functions, diagnose diseases, and devise treatments. To ensure secure and reliable data communication between end-to-end (E2E) devices supported by current IoT and cloud infrastructure, trust management is needed at the IoT and user ends. This paper introduces a Neuro-Fuzzy based Brain-inspired trust management model (TMM) to secure IoT devices and relay nodes, and to ensure data reliability. The proposed TMM utilizes node behavioral trust and data trust, estimated using an Adaptive Neuro-Fuzzy Inference System and weighted-additive methods, respectively, to assess the nodes' trustworthiness. In contrast to the existing fuzzy based TMMs, the NS2 simulation results confirm the robustness and accuracy of the proposed TMM in identifying malicious nodes in the communication network. With the growing usage of cloud based IoT frameworks in Neuroscience research, integrating the proposed TMM into the existing infrastructure will assure secure and reliable data communication among the E2E devices.
Submitted 11 January, 2018;
originally announced January 2018.
-
Control of rodent sleeping sickness disease by surface functionalized amorphous nanosilica
Authors:
Dipankar Seth,
Mritunjay Mandal,
Nitai Debnath,
Ayesha Rahman,
N. K. Sasmal,
Sunit Mukhopadhyaya,
Arunava Goswami
Abstract:
Wild animals, pets, zoo animals and mammals of veterinary importance suffer heavily from trypanosomiasis. Drugs with serious side effects are currently the mainstay of therapies used by veterinarians. Trypanosomiasis is caused by Trypanosoma sp., leading to sleeping sickness in humans. Surface modified (hydrophobic and lipophilic) amorphous nanoporous silica molecules could be effectively used as a therapeutic drug for combating trypanosomiasis. The amorphous nanosilica was developed by a top-down approach using volcanic soil derived silica (Advasan; 50-60 nm size with 3-10 nm inner pore size range) and diatomaceous earth (FS; 60-80 nm size with 3-5 nm inner pore size range) as source materials. According to WHO and USDA standards, amorphous silica has long been used as a feed additive in several veterinary industries and is considered to be safe for human consumption. The basic mechanism of action of these nanosilica molecules is mediated by the physical absorption of HDL components in the lipophilic nanopores of nanosilica. This reduces the supply of the host derived cholesterol, thus limiting the growth of the Trypanosoma sp. in vivo.
Submitted 18 July, 2007;
originally announced July 2007.
-
Nanosilica mop up host lipids and fights baculovirus
Authors:
Ayesha Rahman,
Dipankar Seth,
Nitai Debnath,
C. Ulrichs,
I. Mewis,
R. L. Brahmachary,
A. Goswami
Abstract:
Various types of surface functionalized nanosilica (50-60 nm size with 3-10 nm inner pore size range) have been used to kill insect pests by sucking up cuticular lipids and breaking the water barrier. We have also utilized nanosilica for mopping up host lipids induced by the malarial parasite, P. gallinaceum, in poultry birds; VLDL cholesterol and serum triglycerides are brought back to the normal level with a concomitant check in parasite growth. While this work continues, we have explored another more convenient system, the silkworm (Bombyx mori), which is frequently decimated by a baculovirus, NPV, for which no antidote is known so far. Here, too, viral infection enhances host lipids. Eight different types of nanosilica were injected into virus-infected silkworms (batches of 10 worms) after ensuring 100% survival up to cocoon formation in control larvae (injected with the same volume of ethanol, the medium of nanosilica). Of these eight, AL60102 and AL60106 have the most marked effect on infected silkworms, both as prophylactic and pharmaceutical agents. Normal larvae injected with these nanosilica survive up to cocoon formation.
Submitted 18 July, 2007;
originally announced July 2007.
-
Nanosilica mops up host lipids and fights baculovirus: a B. mori model
Authors:
Ayesha Rahman,
Dipankar Seth,
Nitai Debnath,
C. Ulrichs,
I. Mewis,
R. L. Brahmachary,
A. Goswami
Abstract:
Malaria and other parasites, including viruses, often induce an increase in host lipids which the invaders use to their own advantage. We obtained encouraging results in our investigations on bird malaria with a new approach, namely the use of nanosilica to mop up excess host lipids. While this project is continuing, we have investigated another, simpler system, namely silkworms, which suffer from a deadly baculovirus, BmNPV. This virus decimates the infected population within 24 hours or so, and no known antibiotic antidote or genetically resistant strain of silkworm exists. We report here a partial success, which is worth following up. Our rationale, we believe, has a broad and interdisciplinary appeal, for this nanosilica treatment might be used together with other arsenals on all sorts of viruses that take advantage of enhanced host lipids. It has not escaped our notice that Ebola and HIV also belong to this category. Nanoparticles are being preferentially harnessed because they offer a greater surface area, circulate more easily, and in the lepidopteran system they are removed within 24 hours from the body. Lawry surmised, on cogent theoretical grounds, that particles significantly smaller than micron order would be less harmful in the hemocoele. Furthermore, Hui-peng et al. pointed out that lipase treatment, the only viable option for controlling BmNPV, interferes with hormonal balance and cannot be applied at the pre-molting stage.
Submitted 18 July, 2007;
originally announced July 2007.
-
Control of poultry chicken malaria by surface functionalized amorphous nanosilica
Authors:
Dipankar Seth,
Nitai Debnath,
Ayesha Rahman,
Sunit Mukhopadhyaya,
Inga Mewis,
Christian Ulrichs,
R. L. Bramhachary,
Arunava Goswami
Abstract:
Surface modified amorphous nanoporous silica molecules with hydrophobic as well as hydrophilic character can be effectively used as a therapeutic drug for combating chicken malaria in the poultry industry. The amorphous nanosilica was developed by a top-down approach using volcanic soil derived silica as source material. Amorphous silica has long been used as a feed additive in the poultry industry and is considered to be safe for human consumption by WHO and USDA. The basic mechanism of action of these nanosilica molecules is mediated by the physical absorption of VLDL, serum triglycerides and other serum cholesterol components in the lipophilic nanopores of nanosilica. This reduces the supply of the host derived cholesterol, thus limiting the growth of the malarial parasite in vivo.
Submitted 17 July, 2007;
originally announced July 2007.