Search | arXiv e-print repository

Graph-Based Biomarker Discovery and Interpretation for Alzheimer's Disease

Authors: Maryam Khalid, Fadeel Sher Khan, John Broussard, Arko Barman

Abstract: Early diagnosis and discovery of therapeutic drug targets are crucial objectives for the effective management of Alzheimer's Disease (AD). Current approaches for AD diagnosis and treatment planning are based on radiological imaging and largely inaccessible for population-level screening due to prohibitive costs and limited availability. Recently, blood tests have shown promise in diagnosing AD and highlighting possible biomarkers that can be used as drug targets for AD management. Blood tests are significantly more accessible to disadvantaged populations, cost-effective, and minimally invasive. However, biomarker discovery in the context of AD diagnosis is complex as there exist important associations between various biomarkers. Here, we introduce BRAIN (Biomarker Representation, Analysis, and Interpretation Network), a novel machine learning (ML) framework to jointly optimize the diagnostic accuracy and biomarker discovery processes to identify all relevant biomarkers that contribute to AD diagnosis. Using a holistic graph-based representation for biomarkers, we highlight their inter-dependencies and explain why different ML models identify different discriminative biomarkers. We apply BRAIN to a publicly available blood biomarker dataset, revealing three novel biomarker sub-networks whose interactions vary between the control and AD groups, offering a new paradigm for drug discovery and biomarker analysis for AD.

Submitted 27 November, 2024; originally announced November 2024.

Comments: 9 pages, 7 figures

arXiv:2112.08211 [pdf, other]

TrialGraph: Machine Intelligence Enabled Insight from Graph Modelling of Clinical Trials

Authors: Christopher Yacoumatos, Stefano Bragaglia, Anshul Kanakia, Nils Svangård, Jonathan Mangion, Claire Donoghue, Jim Weatherall, Faisal M. Khan, Khader Shameer

Abstract: A major impediment to successful drug development is the complexity, cost, and scale of clinical trials. The detailed internal structure of clinical trial data can make conventional optimization difficult to achieve. Recent advances in machine learning, specifically graph-structured data analysis, have the potential to enable significant progress in improving the clinical trial design. TrialGraph seeks to apply these methodologies to produce a proof-of-concept framework for developing models which can aid drug development and benefit patients. In this work, we first introduce a curated clinical trial data set compiled from the CT.gov, AACT and TrialTrove databases (n=1191 trials; representing one million patients) and describe the conversion of this data to graph-structured formats. We then detail the mathematical basis and implementation of a selection of graph machine learning algorithms, which typically use standard machine classifiers on graph data embedded in a low-dimensional feature space. We trained these models to predict side effect information for a clinical trial given information on the disease, existing medical conditions, and treatment. The MetaPath2Vec algorithm performed exceptionally well, with standard Logistic Regression, Decision Tree, Random Forest, Support Vector, and Neural Network classifiers exhibiting typical ROC-AUC scores of 0.85, 0.68, 0.86, 0.80, and 0.77, respectively. Remarkably, the best performing classifiers could only produce typical ROC-AUC scores of 0.70 when trained on equivalent array-structured data.
Our work demonstrates that graph modelling can significantly improve prediction accuracy on appropriate datasets. Successive versions of the project that refine modelling assumptions and incorporate more data types can produce excellent predictors with real-world applications in drug development.

Submitted 15 December, 2021; originally announced December 2021.

Comments: 17 pages (Manuscript); 3 pages (Supplemental Data); 9 figures

MSC Class: 68Q04; 05Cxx ACM Class: J.3.1; I.2.0; I.5.1; I.7; H.3

arXiv:1806.00073 [pdf]

Nanomechanical resonance captures pre-melting transition in DNA unravelling

Authors: Keren Jiang, Faheem Khan, Javix Thomas, Arindam Phani, Thomas Thundat

Abstract: Double-stranded DNA unravels thermally through intermediate denatured bubble segments. Intrinsically, fluctuations ensue at the bubble boundaries from non-equilibrium (NE) energy exchanges with the environment. However, such details get obscured by large-population kinetics at the macroscale, which associate an equilibrium pathway with the unravelling landscape. In this work, we capture evidence of fluctuation energetics with picoliter samples in a microfluidic cantilever. We exploit nanomechanical resonance to measure the NE energy exchanges through dissipation, revealing a crucial pre-melting transition at T ~ 42 °C. This signifies that unravelling possibly proceeds via intermediate collapsed-bubble conformations that release energy sufficient to unbind bubble ends, assisting further unbinding. The fluctuation theorem explains the observations, opening further avenues to investigate fluctuation kinetics in other biological phenomena that also proceed through similar NE energetics.

Submitted 20 September, 2018; v1 submitted 31 May, 2018; originally announced June 2018.

Comments: Manuscript and Supplementary Information

arXiv:1001.1984 [pdf]

DNA-MATRIX a tool for DNA motif discovery and weight matrix construction

Authors: Chandra Prakash Singh, Feroz Khan, Sanjay Kumar Singh, Durg Singh Chauhan

Abstract: In computational molecular biology, predicting gene regulatory binding sites across a whole genome remains a challenge for researchers. Nowadays, genome-wide regulatory binding site prediction tools require either a direct pattern sequence or a weight matrix. Although databases of known transcription factor binding sites are available for genome-wide prediction, no tool is available that can construct different weight matrices according to the user's needs, or that can scan large data sets by first aligning the input upstream or promoter sequences and then constructing the matrices at different levels and in different file formats. Considering this, we developed the DNA-MATRIX tool for searching putative regulatory binding sites in gene upstream sequences. The tool uses a simple, biological rule-based heuristic algorithm for weight matrix construction; the matrix can be transformed into different formats after motif alignment, which makes it possible to identify the most likely conserved binding sites in the regulated genes. The user may construct and save specific weight or frequency matrices in different forms and file formats, based on a user-selected conserved aligned block of short sequences ranging from 6 to 20 base pairs and on the prior nucleotide frequency applied before weight scoring.

Submitted 6 February, 2010; v1 submitted 12 January, 2010; originally announced January 2010.

Comments: 3 pages IEEE format, International Journal of Computer Science and Information Security, IJCSIS December 2009, ISSN 1947 5500, http://sites.google.com/site/ijcsis/

Report number: Volume 6, No. 3, ISSN 1947 5500

Journal ref: International Journal of Computer Science and Information Security, IJCSIS, Vol. 6, No. 3, pp. 090-092, December 2009, USA
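
The DNA-MATRIX abstract above mentions building frequency and weight matrices from an aligned block of short sequences together with a prior nucleotide frequency. The abstract does not give the tool's actual heuristic, so the following is only a minimal sketch of the generic textbook approach (per-column counts, then log-odds against a background), not the authors' algorithm. The function names, the uniform background, the pseudocount of 1, and the example sites are all assumptions made for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"


def frequency_matrix(aligned_sites):
    """Count each base in each column of equal-length aligned sites."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all aligned sites must have the same length")
    matrix = []
    for col in range(length):
        counts = Counter(site[col] for site in aligned_sites)
        matrix.append({base: counts.get(base, 0) for base in BASES})
    return matrix


def weight_matrix(freq, background=None, pseudocount=1.0):
    """Convert counts to log2-odds weights against a background frequency.

    The uniform background and the pseudocount are illustrative defaults,
    not values taken from the paper.
    """
    background = background or {base: 0.25 for base in BASES}
    weights = []
    for column in freq:
        total = sum(column.values()) + pseudocount * len(BASES)
        weights.append({
            base: math.log2(((column[base] + pseudocount) / total) / background[base])
            for base in BASES
        })
    return weights


def score(weights, sequence):
    """Sum the per-position weights for a candidate site of matching length."""
    return sum(column[base] for column, base in zip(weights, sequence))


# Hypothetical aligned block of four 6-bp sites (made up for this example).
sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
freq = frequency_matrix(sites)
pwm = weight_matrix(freq)
```

A candidate site that resembles the aligned block, such as `score(pwm, "TATAAT")`, receives a higher score than an unrelated one, such as `score(pwm, "GGGCCC")`. Passing a genome-specific `background` dictionary in place of the uniform default is one way to reflect a prior nucleotide frequency.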

Showing 1–4 of 4 results for author: Khan, F