-
Augmented Curation of Unstructured Clinical Notes from a Massive EHR System Reveals Specific Phenotypic Signature of Impending COVID-19 Diagnosis
Authors:
FNU Shweta,
Karthik Murugadoss,
Samir Awasthi,
AJ Venkatakrishnan,
Arjun Puranik,
Martin Kang,
Brian W. Pickering,
John C. O'Horo,
Philippe R. Bauer,
Raymund R. Razonable,
Paschalis Vergidis,
Zelalem Temesgen,
Stacey Rizza,
Maryam Mahmood,
Walter R. Wilson,
Douglas Challener,
Praveen Anand,
Matt Liebers,
Zainab Doctor,
Eli Silvert,
Hugo Solomon,
Tyler Wagner,
Gregory J. Gores,
Amy W. Williams,
John Halamka
, et al. (2 additional authors not shown)
Abstract:
Understanding the temporal dynamics of COVID-19 patient phenotypes is necessary to derive fine-grained resolution of pathophysiology. Here we use state-of-the-art deep neural networks over an institution-wide machine intelligence platform for the augmented curation of 15.8 million clinical notes from 30,494 patients subjected to COVID-19 PCR diagnostic testing. By contrasting the Electronic Health Record (EHR)-derived clinical phenotypes of COVID-19-positive (COVIDpos, n=635) versus COVID-19-negative (COVIDneg, n=29,859) patients over each day of the week preceding the PCR testing date, we identify anosmia/dysgeusia (37.4-fold), myalgia/arthralgia (2.6-fold), diarrhea (2.2-fold), fever/chills (2.1-fold), respiratory difficulty (1.9-fold), and cough (1.8-fold) as significantly amplified in COVIDpos over COVIDneg patients. The specific combination of cough and diarrhea has a 3.2-fold amplification in COVIDpos patients during the week prior to PCR testing, and along with anosmia/dysgeusia, constitutes the earliest EHR-derived signature of COVID-19 (4-7 days prior to typical PCR testing date). This study introduces an Augmented Intelligence platform for the real-time synthesis of institutional knowledge captured in EHRs. The platform holds tremendous potential for scaling up curation throughput, with minimal need for retraining underlying neural networks, thus promising EHR-powered early diagnosis for a broad spectrum of diseases.
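As a rough illustration of the fold values quoted above, one plausible reading is a ratio of symptom rates between the two cohorts. The sketch below assumes that reading; the abstract does not give the exact formula. The cohort sizes (635 and 29,859) come from the abstract, while the function name and the symptom counts are hypothetical placeholders, not reported data.

```python
def fold_amplification(pos_with, pos_total, neg_with, neg_total):
    """Ratio of the symptom rate in COVIDpos to the rate in COVIDneg.

    Assumed definition for illustration only; the abstract does not
    state how the fold values were computed.
    """
    if pos_total <= 0 or neg_total <= 0:
        raise ValueError("cohort sizes must be positive")
    pos_rate = pos_with / pos_total
    neg_rate = neg_with / neg_total
    if neg_rate == 0:
        raise ValueError("symptom never observed in the negative cohort")
    return pos_rate / neg_rate


# Cohort sizes are from the abstract; the symptom counts are made up.
COVIDPOS_N = 635
COVIDNEG_N = 29_859
hypothetical_pos_with_symptom = 20
hypothetical_neg_with_symptom = 100

fold = fold_amplification(
    hypothetical_pos_with_symptom, COVIDPOS_N,
    hypothetical_neg_with_symptom, COVIDNEG_N,
)
print(f"fold amplification: {fold:.1f}")
```

Because the negative cohort is about 47 times larger, a symptom can be strongly amplified even when more COVIDneg patients report it in absolute terms.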
Submitted 28 April, 2020; v1 submitted 17 April, 2020;
originally announced April 2020.
-
Knowledge synthesis from 100 million biomedical documents augments the deep expression profiling of coronavirus receptors
Authors:
AJ Venkatakrishnan,
Arjun Puranik,
Akash Anand,
David Zemmour,
Xiang Yao,
Xiaoying Wu,
Ramakrishna Chilaka,
Dariusz K. Murakowski,
Kristopher Standish,
Bharathwaj Raghunathan,
Tyler Wagner,
Enrique Garcia-Rivera,
Hugo Solomon,
Abhinav Garg,
Rakesh Barve,
Anuli Anyanwu-Ofili,
Najat Khan,
Venky Soundararajan
Abstract:
The COVID-19 pandemic demands assimilation of all available biomedical knowledge to decode its mechanisms of pathogenicity and transmission. Despite the recent renaissance in unsupervised neural networks for decoding unstructured natural languages, a platform for the real-time synthesis of the exponentially growing biomedical literature and its comprehensive triangulation with deep omic insights is not available. Here, we present the nferX platform for dynamic inference from over 45 quadrillion possible conceptual associations extracted from unstructured biomedical text, and their triangulation with Single Cell RNA-sequencing based insights from over 25 tissues. Using this platform, we identify intersections between the pathologic manifestations of COVID-19 and the comprehensive expression profile of the SARS-CoV-2 receptor ACE2. We find that tongue keratinocytes and olfactory epithelial cells are likely under-appreciated targets of SARS-CoV-2 infection, correlating with reported loss of sense of taste and smell as early indicators of COVID-19 infection, including in otherwise asymptomatic patients. Airway club cells, ciliated cells and type II pneumocytes in the lung, and enterocytes of the gut also express ACE2. This study demonstrates how a holistic data science platform can leverage unprecedented quantities of structured and unstructured publicly available data to accelerate the generation of impactful biological insights and hypotheses.
Submitted 28 March, 2020;
originally announced March 2020.
-
Implementing a Concept Network Model
Authors:
Sarah H. Solomon,
John D. Medaglia,
Sharon L. Thompson-Schill
Abstract:
The same concept can mean different things or be instantiated in different forms depending on context, suggesting a degree of flexibility within the conceptual system. We propose that a compositional network model can be used to capture and predict this flexibility. We modeled individual concepts (e.g., BANANA, BOTTLE) as graph-theoretical networks, in which properties (e.g., YELLOW, SWEET) were represented as nodes and their associations as edges. In this framework, networks capture the within-concept statistics that reflect how properties correlate with each other across instances of a concept. We ran a classification analysis using graph eigendecomposition to validate these models, and find that these models can successfully discriminate between object concepts. We then computed formal measures from these concept networks and explored their relationship to conceptual structure. We find that diversity coefficients and core-periphery structure can be interpreted as network-based measures of conceptual flexibility and stability, respectively. These results support the feasibility of a concept network framework and highlight its ability to formally capture important characteristics of the conceptual system.
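The network construction described in this abstract can be sketched minimally as follows: properties become nodes, and each edge carries the correlation between two properties across instances of a concept. This assumes binary property annotations and Pearson correlation as the association measure, neither of which the abstract specifies. The instance data, the GREEN property, and the function names are hypothetical. The sketch stops at building the network and does not attempt the eigendecomposition-based classification.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def concept_network(instances, properties):
    """Return {(prop_a, prop_b): edge_weight} for one concept.

    Each instance is the set of properties observed for it; the edge weight
    is the within-concept correlation of the two properties (an assumption).
    """
    columns = {
        p: [1.0 if p in inst else 0.0 for inst in instances]
        for p in properties
    }
    return {
        (p, q): pearson(columns[p], columns[q])
        for p, q in combinations(properties, 2)
    }


# Hypothetical instances of the concept BANANA (not data from the paper).
banana_instances = [
    {"YELLOW", "SWEET"},
    {"YELLOW", "SWEET"},
    {"GREEN"},
    {"GREEN"},
]
network = concept_network(banana_instances, ["YELLOW", "SWEET", "GREEN"])
for edge, weight in network.items():
    print(edge, round(weight, 2))
```

In this toy data YELLOW and SWEET always co-occur, so their edge is strongly positive, while GREEN is negatively associated with both.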
Submitted 20 March, 2019; v1 submitted 22 February, 2018;
originally announced February 2018.