-
The human biological advantage over AI
Authors:
William Stewart
Abstract:
Recent advances in AI raise the possibility that AI systems will one day be able to do anything humans can do, only better. If artificial general intelligence (AGI) is achieved, AI systems may be able to understand, reason, problem solve, create, and evolve at a level and speed that humans will increasingly be unable to match, or even understand. These possibilities raise a natural question as to whether AI will eventually become superior to humans, a successor "digital species", with a rightful claim to assume leadership of the universe. However, a deeper consideration suggests the overlooked differentiator between human beings and AI is not the brain, but the central nervous system (CNS), providing us with an immersive integration with physical reality. It is our CNS that enables us to experience emotion including pain, joy, suffering, and love, and therefore to fully appreciate the consequences of our actions on the world around us. And that emotional understanding of the consequences of our actions is what is required to be able to develop sustainable ethical systems, and so be fully qualified to be the leaders of the universe. A CNS cannot be manufactured or simulated; it must be grown as a biological construct. And so, even the development of consciousness will not be sufficient to make AI systems superior to humans. AI systems may become more capable than humans on almost every measure and transform our society. However, the best foundation for leadership of our universe will always be DNA, not silicon.
Submitted 4 September, 2025;
originally announced September 2025.
-
Representation gaps of rigid planar diagram monoids
Authors:
Willow Stewart,
Daniel Tubbenhauer
Abstract:
We define non-pivotal analogs of the Temperley-Lieb, Motzkin, and planar rook monoids, and compute bounds for the sizes of their nontrivial simple representations. From this, we assess the two types of monoids in their relative suitability for use in cryptography by comparing their representation gaps and gap ratios. We conclude that the non-pivotal monoids are generally worse for cryptographic purposes.
Submitted 9 May, 2025;
originally announced May 2025.
-
A Time and Place to Land: Online Learning-Based Distributed MPC for Multirotor Landing on Surface Vessel in Waves
Authors:
Jess Stephenson,
William S. Stewart,
Melissa Greeff
Abstract:
Landing a multirotor unmanned aerial vehicle (UAV) on an uncrewed surface vessel (USV) extends the operational range and offers recharging capabilities for maritime and limnology applications, such as search-and-rescue and environmental monitoring. However, autonomous UAV landings on USVs are challenging due to the unpredictable tilt and motion of the vessel caused by waves. This movement introduces spatial and temporal uncertainties, complicating safe, precise landings. Existing autonomous landing techniques on unmanned ground vehicles (UGVs) rely on shared state information, often causing time delays due to communication limits. This paper introduces a learning-based distributed Model Predictive Control (MPC) framework for autonomous UAV landings on USVs in wave-like conditions. Each vehicle's MPC optimizes for an artificial goal and input, sharing only the goal with the other vehicle. These goals are penalized by coupling and platform tilt costs, learned as a Gaussian Process (GP). We validate our framework in comprehensive indoor experiments using a custom-designed platform attached to a UGV to simulate USV tilting motion. Our approach achieves a 53% increase in landing success compared to an approach that neglects the impact of tilt motion on landing.
Submitted 28 October, 2024;
originally announced October 2024.
-
Crash-perching on vertical poles with a hugging-wing robot
Authors:
Mohammad Askari,
Michele Benciolini,
Hoang-Vu Phan,
William Stewart,
Auke J. Ijspeert,
Dario Floreano
Abstract:
Perching with winged Unmanned Aerial Vehicles has often been solved by means of complex control or intricate appendages. Here, we present a simple yet novel method that relies on passive wing morphing for crash-landing on trees and other types of vertical poles. Inspired by the adaptability of animals' and bats' limbs in gripping and holding onto trees, we design dual-purpose wings that enable both aerial gliding and perching on poles. With an upturned nose design, the robot can passively reorient from horizontal flight to vertical upon a head-on crash with a pole, followed by hugging with its wings to perch. We characterize the performance of reorientation and perching in terms of impact speed and angle, pole material, and size. The robot robustly reorients at impact angles above 15° and speeds of 3 m/s to 9 m/s, and can hold onto various pole types larger than 28% of its wingspan in diameter. We demonstrate crash-perching on tree trunks with an overall success rate of 71%. The method opens up new possibilities for the use of aerial robots in applications such as inspection, maintenance, and biodiversity conservation.
Submitted 3 February, 2024;
originally announced February 2024.
-
Perching by hugging: an initial feasibility study
Authors:
William Stewart,
Mohammad Askari,
Maïk Guihard,
Dario Floreano
Abstract:
Current UAVs capable of perching require added structure and mechanisms to accomplish this. These take the form of hooks, claws, needles, etc., which add weight and usually drag. We propose in this paper the dual use of structures already on the vehicle to enable perching, thus reducing the weight and drag cost associated with perching UAVs. We propose a wing design capable of passively wrapping around a vertical pole to perch. We experimentally investigate the feasibility of the design, presenting results on minimum required perching speeds as well as the effect of weight distribution on the success rate of the wing wrapping. Finally, we comment on design requirements for holding onto the pole based on our findings.
Submitted 8 June, 2023;
originally announced June 2023.
-
Avian-Inspired Claws Enable Robot Perching or Walking
Authors:
Mohammad Askari,
Won Dong Shin,
Damian Lenherr,
William Stewart,
Dario Floreano
Abstract:
Multimodal UAVs (Unmanned Aerial Vehicles) are rarely capable of more than two modalities, i.e., flying and walking or flying and perching. However, being able to fly, perch, and walk could further improve their usefulness by expanding their operating envelope. For instance, an aerial robot could fly a long distance, perch in a high place to survey the surroundings, then walk to avoid obstacles that could potentially inhibit flight. Birds are capable of these three tasks, and so offer a practical example of how a robot might be developed to do the same. In this paper, we present a specialized avian-inspired claw design to enable UAVs to perch passively or walk. The key innovation is the combination of a Hoberman linkage leg with Fin Ray claw that uses the weight of the UAV to wrap the claw around a perch, or hyperextend it in the opposite direction to form a curved-up shape for stable terrestrial locomotion. Because the design uses the weight of the vehicle, the underactuated design is lightweight and low power. With the inclusion of talons, the 45g claws are capable of holding a 700g UAV to an almost 20-degree angle on a perch. In scenarios where cluttered environments impede flight and long mission times are required, such a combination of flying, perching, and walking is critical.
Submitted 2 February, 2024; v1 submitted 29 March, 2023;
originally announced March 2023.
-
EVA: Generating Longitudinal Electronic Health Records Using Conditional Variational Autoencoders
Authors:
Siddharth Biswal,
Soumya Ghosh,
Jon Duke,
Bradley Malin,
Walter Stewart,
Jimeng Sun
Abstract:
Researchers require timely access to real-world longitudinal electronic health records (EHR) to develop, test, validate, and implement machine learning solutions that improve the quality and efficiency of healthcare. In contrast, health systems deeply value patient privacy and data security. De-identified EHRs do not adequately address the needs of health systems, as de-identified data are susceptible to re-identification and their volume is also limited. Synthetic EHRs offer a potential solution. In this paper, we propose EHR Variational Autoencoder (EVA) for synthesizing sequences of discrete EHR encounters (e.g., clinical visits) and encounter features (e.g., diagnoses, medications, procedures). We illustrate that EVA can produce realistic EHR sequences, account for individual differences among patients, and can be conditioned on specific disease conditions, thus enabling disease-specific studies. We design efficient, accurate inference algorithms by combining stochastic gradient Markov Chain Monte Carlo with amortized variational inference. We assess the utility of the methods on large real-world EHR repositories containing over 250,000 patients. Our experiments, which include user studies with knowledgeable clinicians, indicate the generated EHR sequences are realistic. We confirmed that the performance of predictive models trained on the synthetic data is similar to that of models trained on real EHRs. Additionally, our findings indicate that augmenting real data with synthetic EHRs results in the best predictive performance, improving the best baseline by as much as 8% in top-20 recall.
Submitted 17 December, 2020;
originally announced December 2020.
-
TASTE: Temporal and Static Tensor Factorization for Phenotyping Electronic Health Records
Authors:
Ardavan Afshar,
Ioakeim Perros,
Haesun Park,
Christopher deFilippi,
Xiaowei Yan,
Walter Stewart,
Joyce Ho,
Jimeng Sun
Abstract:
Phenotyping electronic health records (EHR) focuses on defining meaningful patient groups (e.g., heart failure group and diabetes group) and identifying the temporal evolution of patients in those groups. Tensor factorization has been an effective tool for phenotyping. Most of the existing works assume either a static patient representation with aggregate data or only model temporal data. However, real EHR data contain both temporal (e.g., longitudinal clinical visits) and static information (e.g., patient demographics), which are difficult to model simultaneously. In this paper, we propose Temporal And Static TEnsor factorization (TASTE) that jointly models both static and temporal information to extract phenotypes. TASTE combines the PARAFAC2 model with non-negative matrix factorization to model a temporal and a static tensor. To fit the proposed model, we transform the original problem into simpler ones which are optimally solved in an alternating fashion. For each of the sub-problems, our proposed mathematical reformulations lead to efficient sub-problem solvers. Comprehensive experiments on large EHR data from a heart failure (HF) study confirmed that TASTE is up to 14x faster than several baselines and the resulting phenotypes were confirmed to be clinically meaningful by a cardiologist. Using 80 phenotypes extracted by TASTE, a simple logistic regression can achieve the same level of area under the curve (AUC) for HF prediction compared to a deep learning model using recurrent neural networks (RNN) with 345 features.
Submitted 13 November, 2019;
originally announced November 2019.
-
MiME: Multilevel Medical Embedding of Electronic Health Records for Predictive Healthcare
Authors:
Edward Choi,
Cao Xiao,
Walter F. Stewart,
Jimeng Sun
Abstract:
Deep learning models exhibit state-of-the-art performance for many predictive healthcare tasks using electronic health records (EHR) data, but these models typically require training data volume that exceeds the capacity of most healthcare systems. External resources such as medical ontologies are used to bridge the data volume constraint, but this approach is often not directly applicable or useful because of inconsistencies with terminology. To solve the data insufficiency challenge, we leverage the inherent multilevel structure of EHR data and, in particular, the encoded relationships among medical codes. We propose Multilevel Medical Embedding (MiME) which learns the multilevel embedding of EHR data while jointly performing auxiliary prediction tasks that rely on this inherent EHR structure without the need for external labels. We conducted two prediction tasks, heart failure prediction and sequential disease prediction, where MiME outperformed baseline methods in diverse evaluation settings. In particular, MiME consistently outperformed all baselines when predicting heart failure on datasets of different volumes, especially demonstrating the greatest performance improvement (15% relative gain in PR-AUC over the best baseline) on the smallest dataset, demonstrating its ability to effectively model the multilevel structure of EHR data.
Submitted 22 October, 2018;
originally announced October 2018.
-
SUSTain: Scalable Unsupervised Scoring for Tensors and its Application to Phenotyping
Authors:
Ioakeim Perros,
Evangelos E. Papalexakis,
Haesun Park,
Richard Vuduc,
Xiaowei Yan,
Christopher Defilippi,
Walter F. Stewart,
Jimeng Sun
Abstract:
This paper presents a new method, which we call SUSTain, that extends real-valued matrix and tensor factorizations to data where values are integers. Such data are common when the values correspond to event counts or ordinal measures. The conventional approach is to treat integer data as real, and then apply real-valued factorizations. However, doing so fails to preserve important characteristics of the original data, thereby making it hard to interpret the results. Instead, our approach extracts factor values from integer datasets as scores that are constrained to take values from a small integer set. These scores are easy to interpret: a score of zero indicates no feature contribution and higher scores indicate distinct levels of feature importance.
At its core, SUSTain relies on: a) a problem partitioning into integer-constrained subproblems, so that they can be optimally solved in an efficient manner; and b) organizing the order of the subproblems' solution, to promote reuse of shared intermediate results. We propose two variants, SUSTain_M and SUSTain_T, to handle matrix and tensor inputs, respectively. We evaluate SUSTain against several state-of-the-art baselines on both synthetic and real Electronic Health Record (EHR) datasets. Compared to those baselines, SUSTain shows either significantly better fit or orders of magnitude speedups that achieve a comparable fit (up to 425X faster). We apply SUSTain to EHR datasets to extract patient phenotypes (i.e., clinically meaningful patient clusters). Furthermore, 87% of them were validated as clinically meaningful phenotypes related to heart failure by a cardiologist.
Submitted 14 March, 2018;
originally announced March 2018.
-
Structure in scientific networks: towards predictions of research dynamism
Authors:
Benjamin W. Stewart,
Andy Rivas,
Luat T. Vuong
Abstract:
Certain areas of scientific research flourish while others lose advocates and attention. We are interested in whether structural patterns within citation networks correspond to the growth or decline of the research areas to which those networks belong. We focus on three topic areas within optical physics as a set of cases; those areas have developed along different trajectories: one continues to expand rapidly; another is on the wane after an earlier peak; the final area has re-emerged after a short waning period. These three areas have substantial overlaps in the types of equipment they use and general methodology; at the same time, their citation networks are largely independent of each other. For each of our three areas, we map the citation networks of the top-100 most-cited papers, published pre-1999. In order to quantify the structures of the selected articles' citation networks, we use a modified version of weak tie theory in tandem with entropy measures. Although the fortunes of a given research area are most obviously the result of accumulated innovations and impasses, our preliminary study provides evidence that these citation networks' emergent structures reflect those developments and may shape evolving conversations in the scholarly literature.
Submitted 13 August, 2017;
originally announced August 2017.
-
Generating Multi-label Discrete Patient Records using Generative Adversarial Networks
Authors:
Edward Choi,
Siddharth Biswal,
Bradley Malin,
Jon Duke,
Walter F. Stewart,
Jimeng Sun
Abstract:
Access to electronic health record (EHR) data has motivated computational advances in medical research. However, various concerns, particularly over privacy, can limit access to and collaborative use of EHR data. Sharing synthetic EHR data could mitigate risk. In this paper, we propose a new approach, medical Generative Adversarial Network (medGAN), to generate realistic synthetic patient records. Based on input real patient records, medGAN can generate high-dimensional discrete variables (e.g., binary and count features) via a combination of an autoencoder and generative adversarial networks. We also propose minibatch averaging to efficiently avoid mode collapse, and increase the learning efficiency with batch normalization and shortcut connections. To demonstrate feasibility, we showed that medGAN generates synthetic patient records that achieve comparable performance to real data on many experiments including distribution statistics, predictive modeling tasks and a medical expert review. We also empirically observe a limited privacy risk in both identity and attribute disclosure using medGAN.
Submitted 11 January, 2018; v1 submitted 19 March, 2017;
originally announced March 2017.
-
Causal Regularization
Authors:
Mohammad Taha Bahadori,
Krzysztof Chalupka,
Edward Choi,
Robert Chen,
Walter F. Stewart,
Jimeng Sun
Abstract:
In application domains such as healthcare, we want accurate predictive models that are also causally interpretable. In pursuit of such models, we propose a causal regularizer to steer predictive models towards causally-interpretable solutions and theoretically study its properties. In a large-scale analysis of Electronic Health Records (EHR), our causally-regularized model outperforms its L1-regularized counterpart in causal accuracy and is competitive in predictive performance. We perform non-linear causality analysis by causally regularizing a special neural network architecture. We also show that the proposed causal regularizer can be used together with neural representation learning algorithms to yield up to 20% improvement over multilayer perceptron in detecting multivariate causation, a situation common in healthcare, where many causal factors should occur simultaneously to have an effect on the target variable.
Submitted 23 February, 2017; v1 submitted 8 February, 2017;
originally announced February 2017.
-
GRAM: Graph-based Attention Model for Healthcare Representation Learning
Authors:
Edward Choi,
Mohammad Taha Bahadori,
Le Song,
Walter F. Stewart,
Jimeng Sun
Abstract:
Deep learning methods exhibit promising performance for predictive modeling in healthcare, but two important challenges remain: (1) Data insufficiency: often in healthcare predictive modeling, the sample size is insufficient for deep learning methods to achieve satisfactory results. (2) Interpretation: the representations learned by deep learning methods should align with medical knowledge. To address these challenges, we propose a GRaph-based Attention Model, GRAM, that supplements electronic health records (EHR) with hierarchical information inherent to medical ontologies. Based on the data volume and the ontology structure, GRAM represents a medical concept as a combination of its ancestors in the ontology via an attention mechanism. We compared predictive performance (i.e., accuracy, data needs, interpretability) of GRAM to various methods including the recurrent neural network (RNN) in two sequential diagnoses prediction tasks and one heart failure prediction task. Compared to the basic RNN, GRAM achieved 10% higher accuracy for predicting diseases rarely observed in the training data and 3% improved area under the ROC curve for predicting heart failure using an order of magnitude less training data. Additionally, unlike other methods, the medical concept representations learned by GRAM are well aligned with the medical ontology. Finally, GRAM exhibits intuitive attention behaviors by adaptively generalizing to higher level concepts when facing data insufficiency at the lower level concepts.
Submitted 1 April, 2017; v1 submitted 21 November, 2016;
originally announced November 2016.
-
Connecting the dots across time: Reconstruction of single cell signaling trajectories using time-stamped data
Authors:
Sayak Mukherjee,
David Stewart,
William Stewart,
Lewis L. Lanier,
Jayajit Das
Abstract:
Single cell responses are shaped by the geometry of signaling kinetic trajectories carved in a multidimensional space spanned by signaling protein abundances. It is, however, challenging to assay a large number (>3) of signaling species in live-cell imaging, which makes it difficult to probe single cell signaling kinetic trajectories in large dimensions. Flow and mass cytometry techniques can measure a large number (4 - >40) of signaling species but are unable to track single cells. Thus cytometry experiments provide detailed time-stamped snapshots of single cell signaling kinetics. Is it possible to use the time-stamped cytometry data to reconstruct single cell signaling trajectories? Borrowing concepts of conserved and slow variables from non-equilibrium statistical physics, we develop an approach to reconstruct signaling trajectories using snapshot data by creating new variables that remain invariant or vary slowly during the signaling kinetics. We apply this approach to reconstruct trajectories using snapshot data obtained from in silico simulations and live-cell imaging measurements. The use of invariants and slow variables to reconstruct trajectories provides a radically different way to track objects using snapshot data. The approach is likely to have implications for solving matching problems in a wide range of disciplines.
Submitted 20 July, 2017; v1 submitted 26 September, 2016;
originally announced September 2016.
-
RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism
Authors:
Edward Choi,
Mohammad Taha Bahadori,
Joshua A. Kulas,
Andy Schuetz,
Walter F. Stewart,
Jimeng Sun
Abstract:
Accuracy and interpretability are two dominant features of successful predictive models. Typically, a choice must be made in favor of complex black box models such as recurrent neural networks (RNN) for accuracy versus less accurate but more interpretable traditional models such as logistic regression. This tradeoff poses challenges in medicine where both accuracy and interpretability are important. We addressed this challenge by developing the REverse Time AttentIoN model (RETAIN) for application to Electronic Health Records (EHR) data. RETAIN achieves high accuracy while remaining clinically interpretable and is based on a two-level neural attention model that detects influential past visits and significant clinical variables within those visits (e.g. key diagnoses). RETAIN mimics physician practice by attending the EHR data in a reverse time order so that recent clinical visits are likely to receive higher attention. RETAIN was tested on a large health system EHR dataset with 14 million visits completed by 263K patients over an 8 year period and demonstrated predictive accuracy and computational scalability comparable to state-of-the-art methods such as RNN, and ease of interpretability comparable to traditional models.
Submitted 26 February, 2017; v1 submitted 19 August, 2016;
originally announced August 2016.
-
Medical Concept Representation Learning from Electronic Health Records and its Application on Heart Failure Prediction
Authors:
Edward Choi,
Andy Schuetz,
Walter F. Stewart,
Jimeng Sun
Abstract:
Objective: To transform heterogeneous clinical data from electronic health records into clinically meaningful constructed features using data-driven methods that rely, in part, on temporal relations among data. Materials and Methods: The clinically meaningful representations of medical concepts and patients are the key for health analytic applications. Most existing approaches directly construct features mapped to raw data (e.g., ICD or CPT codes), or utilize some ontology mapping such as SNOMED codes. However, none of the existing approaches leverage EHR data directly for learning such concept representation. We propose a new way to represent heterogeneous medical concepts (e.g., diagnoses, medications and procedures) based on co-occurrence patterns in longitudinal electronic health records. The intuition behind the method is to map medical concepts that co-occur closely in time to similar concept vectors so that their distance will be small. We also derive a simple method to construct patient vectors from the related medical concept vectors. Results: For qualitative evaluation, we study similar medical concepts across diagnosis, medication and procedure. In quantitative evaluation, our proposed representation significantly improves the predictive modeling performance for onset of heart failure (HF), where classification methods (e.g., logistic regression, neural network, support vector machine and K-nearest neighbors) achieve up to 23% improvement in area under the ROC curve (AUC) using this proposed representation. Conclusion: We proposed an effective method for patient and medical concept representation learning. The resulting representation can map relevant concepts together and also improves predictive modeling performance.
Submitted 20 June, 2017; v1 submitted 11 February, 2016;
originally announced February 2016.
-
Doctor AI: Predicting Clinical Events via Recurrent Neural Networks
Authors:
Edward Choi,
Mohammad Taha Bahadori,
Andy Schuetz,
Walter F. Stewart,
Jimeng Sun
Abstract:
Leveraging large historical data in electronic health records (EHR), we developed Doctor AI, a generic predictive model that covers observed medical conditions and medication uses. Doctor AI is a temporal model using recurrent neural networks (RNN) and was developed and applied to longitudinal time-stamped EHR data from 260K patients over 8 years. Encounter records (e.g., diagnosis codes, medication codes or procedure codes) were input to the RNN to predict (all) the diagnosis and medication categories for a subsequent visit. Doctor AI assesses the history of patients to make multilabel predictions (one label for each diagnosis or medication category). Based on separate blind test set evaluation, Doctor AI can perform differential diagnosis with up to 79% recall@30, significantly higher than several baselines. Moreover, we demonstrate great generalizability of Doctor AI by adapting the resulting models from one institution to another without losing substantial accuracy.
Submitted 28 September, 2016; v1 submitted 18 November, 2015;
originally announced November 2015.
-
Developing numerical libraries in Java
Authors:
Ronald F. Boisvert,
Jack J. Dongarra,
Roldan Pozo,
Karin Remington,
G. W. Stewart
Abstract:
The rapid and widespread adoption of Java has created a demand for reliable and reusable mathematical software components to support the growing number of compute-intensive applications now under development, particularly in science and engineering. In this paper we address practical issues of the Java language and environment which have an effect on numerical library design and development. Benchmarks which illustrate the current levels of performance of key numerical kernels on a variety of Java platforms are presented. Finally, a strategy for the development of a fundamental numerical toolkit for Java is proposed and its current status is described.
Submitted 2 September, 1998;
originally announced September 1998.