-
Needles in Needle Stacks: Meaningful Clinical Information Buried in Noisy Waveform Data
Authors:
Sujay Nagaraj,
Andrew J. Goodwin,
Dmytro Lopushanskyy,
Danny Eytan,
Robert W. Greer,
Sebastian D. Goodfellow,
Azadeh Assadi,
Anand Jayarajan,
Anna Goldenberg,
Mjaye L. Mazwi
Abstract:
Central Venous Lines (C-Lines) and Arterial Lines (A-Lines) are routinely used in the Critical Care Unit (CCU) for blood sampling, medication administration, and high-frequency blood pressure measurement. Judiciously accessing these lines is important, as over-utilization is associated with significant in-hospital morbidity and mortality. Documenting the frequency of line-access is an important step in reducing these adverse outcomes. Unfortunately, the current gold-standard for documentation is manual and subject to error, omission, and bias. The high-frequency blood pressure waveform data from sensors in these lines are often noisy and full of artifacts. Standard approaches in signal processing remove noise artifacts before meaningful analysis. However, from bedside observations, we characterized a distinct artifact that occurs during each instance of C-Line or A-Line use. These artifacts are buried amongst physiological waveform and extraneous noise. We focus on Machine Learning (ML) models that can detect these artifacts from waveform data in real-time - finding needles in needle stacks, in order to automate the documentation of line-access. We built and evaluated ML classifiers running in real-time at a major children's hospital to achieve this goal. We demonstrate the utility of these tools for reducing documentation burden, increasing available information for bedside clinicians, and informing unit-level initiatives to improve patient safety.
Submitted 18 August, 2024;
originally announced September 2024.
-
Aiming for Relevance
Authors:
Bar Eini Porat,
Danny Eytan,
Uri Shalit
Abstract:
Vital signs are crucial in intensive care units (ICUs). They are used to track the patient's state and to identify clinically significant changes. Predicting vital sign trajectories is valuable for early detection of adverse events. However, conventional machine learning metrics like RMSE often fail to capture the true clinical relevance of such predictions. We introduce novel vital sign prediction performance metrics that align with clinical contexts, focusing on deviations from clinical norms, overall trends, and trend deviations. These metrics are derived from empirical utility curves obtained in a previous study through interviews with ICU clinicians. We validate the metrics' usefulness using simulated and real clinical datasets (MIMIC and eICU). Furthermore, we employ these metrics as loss functions for neural networks, resulting in models that excel in predicting clinically significant events. This research paves the way for clinically relevant machine learning model evaluation and optimization, promising to improve ICU patient care. 10 pages, 9 figures.
Submitted 27 March, 2024;
originally announced March 2024.
-
Individualized Dosing Dynamics via Neural Eigen Decomposition
Authors:
Stav Belogolovsky,
Ido Greenberg,
Danny Eytan,
Shie Mannor
Abstract:
Dosing models often use differential equations to model biological dynamics. Neural differential equations in particular can learn to predict the derivative of a process, which permits predictions at irregular points of time. However, this temporal flexibility often comes with a high sensitivity to noise, whereas medical problems often present high noise and limited data. Moreover, medical dosing models must generalize reliably over individual patients and changing treatment policies. To address these challenges, we introduce the Neural Eigen Stochastic Differential Equation algorithm (NESDE). NESDE provides individualized modeling (using a hypernetwork over patient-level parameters); generalization to new treatment policies (using decoupled control); tunable expressiveness according to the noise level (using piecewise linearity); and fast, continuous, closed-form prediction (using spectral representation). We demonstrate the robustness of NESDE in both synthetic and real medical problems, and use the learned dynamics to publish simulated medical gym environments.
Submitted 24 June, 2023;
originally announced June 2023.
-
Building Trust: Lessons from the Technion-Rambam Machine Learning in Healthcare Datathon Event
Authors:
Jonathan A. Sobel,
Ronit Almog,
Leo Anthony Celi,
Michal Gaziel-Yablowitz,
Danny Eytan,
Joachim A. Behar
Abstract:
A datathon is a time-constrained competition involving data science applied to a specific problem. In the past decade, datathons have been shown to be a valuable bridge between fields and expertise. Biomedical data analysis represents a challenging area requiring collaboration between engineers, biologists and physicians to gain a better understanding of patient physiology and to guide decision processes for diagnosis, prognosis and therapeutic interventions to improve care practice. Here, we reflect on the outcomes of an event that we organized in Israel at the end of March 2022 between the MIT Critical Data group, Rambam Health Care Campus (Rambam) and the Technion Israel Institute of Technology (Technion) in Haifa. Participants were asked to complete a survey about their skills and interests, which enabled us to identify current needs in machine learning training for medical problem applications. This work describes opportunities and limitations in medical data science in the Israeli context.
Submitted 2 August, 2022; v1 submitted 16 July, 2022;
originally announced July 2022.
-
Machine Learning to Support Triage of Children at Risk for Epileptic Seizures in the Pediatric Intensive Care Unit
Authors:
Raphael Azriel,
Cecil D. Hahn,
Thomas De Cooman,
Sabine Van Huffel,
Eric T. Payne,
Kristin L. McBain,
Danny Eytan,
Joachim A. Behar
Abstract:
Objective: Epileptic seizures are relatively common in critically-ill children admitted to the pediatric intensive care unit (PICU) and thus serve as an important target for identification and treatment. Most of these seizures have no discernible clinical manifestation but still have a significant impact on morbidity and mortality. Children who are deemed at risk for seizures within the PICU are monitored using continuous-electroencephalogram (cEEG). The cost of cEEG monitoring is considerable and, as the number of available machines is always limited, clinicians need to resort to triaging patients according to perceived risk in order to allocate resources. This research aims to develop a computer-aided tool to improve seizure risk assessment in critically-ill children, using a ubiquitously recorded signal in the PICU, namely the electrocardiogram (ECG). Approach: A novel data-driven model was developed at the patient level, based on features extracted from the first hour of ECG recording and the clinical data of the patient. Main results: The most predictive features were the age of the patient, brain injury as the coma etiology and the QRS area. For patients without any prior clinical data, using one hour of ECG recording, the classification performance of the random forest classifier reached an area under the receiver operating characteristic curve (AUROC) score of 0.84. When combining ECG features with the patient's clinical history, the AUROC reached 0.87. Significance: Taking a real clinical scenario, we estimated that our clinical decision support triage tool can improve the positive predictive value by more than 59% over the clinical standard.
Submitted 11 May, 2022;
originally announced May 2022.
-
Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding
Authors:
Sana Tonekaboni,
Danny Eytan,
Anna Goldenberg
Abstract:
Time series are often complex and rich in information but sparsely labeled and therefore challenging to model. In this paper, we propose a self-supervised framework for learning generalizable representations for non-stationary time series. Our approach, called Temporal Neighborhood Coding (TNC), takes advantage of the local smoothness of a signal's generative process to define neighborhoods in time with stationary properties. Using a debiased contrastive objective, our framework learns time series representations by ensuring that in the encoding space, the distribution of signals from within a neighborhood is distinguishable from the distribution of non-neighboring signals. Our motivation stems from the medical field, where the ability to model the dynamic nature of time series data is especially valuable for identifying, tracking, and predicting the underlying patients' latent states in settings where labeling data is practically impossible. We compare our method to recently developed unsupervised representation learning approaches and demonstrate superior performance on clustering and classification tasks for multiple datasets.
Submitted 1 June, 2021;
originally announced June 2021.
-
About Explicit Variance Minimization: Training Neural Networks for Medical Imaging With Limited Data Annotations
Authors:
Dmitrii Shubin,
Danny Eytan,
Sebastian D. Goodfellow
Abstract:
Self-supervised learning methods for computer vision have demonstrated the effectiveness of pre-training feature representations, resulting in well-generalizing Deep Neural Networks, even if the annotated data are limited. However, representation learning techniques require a significant amount of time for model training, with most of the time spent on precise hyper-parameter optimization and selection of augmentation techniques. We hypothesized that if the annotated dataset has enough morphological diversity to capture the diversity of the general population, as is common in medical imaging due to conserved similarities of tissue morphology, the variance error of the trained model is the dominant component of the Bias-Variance Trade-off. Therefore, we proposed the Variance Aware Training (VAT) method that exploits this data property by introducing the variance error into the model loss function, thereby explicitly regularizing the model. Additionally, we provided a theoretical formulation and proof of the proposed method to aid in interpreting the approach. Our method requires selecting only one hyper-parameter and matches or improves the performance of state-of-the-art self-supervised methods while achieving an order of magnitude reduction in the GPU training time. We validated VAT on three medical imaging datasets from diverse domains and for various learning objectives. These included a Magnetic Resonance Imaging (MRI) dataset for semantic segmentation of the heart (MICCAI 2017 ACDC challenge), a fundus photography dataset for ordinary regression of diabetic retinopathy progression (Kaggle 2019 APTOS Blindness Detection challenge), and classification of histopathologic scans of lymph node sections (PatchCamelyon dataset). Our code is available at https://github.com/DmitriiShubin/Variance-Aware-Training.
Submitted 24 August, 2021; v1 submitted 28 May, 2021;
originally announced May 2021.
-
Using Deep Networks for Scientific Discovery in Physiological Signals
Authors:
Tom Beer,
Bar Eini-Porat,
Sebastian Goodfellow,
Danny Eytan,
Uri Shalit
Abstract:
Deep neural networks (DNN) have shown remarkable success in the classification of physiological signals. In this study we propose a method for examining to what extent a DNN's performance relies on rediscovering existing features of the signals, as opposed to discovering genuinely new features. Moreover, we offer a novel method of "removing" a hand-engineered feature from the network's hypothesis space, thus forcing it to try and learn representations which are different from known ones, as a method of scientific exploration. We then build on existing work in the field of interpretability, specifically class activation maps, to try and infer what new features the network has learned. We demonstrate this approach using ECG and EEG signals. With respect to ECG signals we show that for the specific task of classifying atrial fibrillation, DNNs are likely rediscovering known features. We also show how our method could be used to discover new features, by selectively removing some ECG features and "rediscovering" them. We further examine how our method could be used as a tool for examining scientific hypotheses. We simulate this scenario by looking into the importance of eye movements in classifying sleep from EEG. We show that our tool can successfully focus a researcher's attention by bringing to light patterns in the data that would be hidden otherwise.
Submitted 25 August, 2020;
originally announced August 2020.
-
Generative ODE Modeling with Known Unknowns
Authors:
Ori Linial,
Neta Ravid,
Danny Eytan,
Uri Shalit
Abstract:
In several crucial applications, domain knowledge is encoded by a system of ordinary differential equations (ODE), often stemming from underlying physical and biological processes. A motivating example is intensive care unit patients: the dynamics of vital physiological functions, such as the cardiovascular system with its associated variables (heart rate, cardiac contractility and output and vascular resistance) can be approximately described by a known system of ODEs. Typically, some of the ODE variables are directly observed (heart rate and blood pressure for example) while some are unobserved (cardiac contractility, output and vascular resistance), and in addition many other variables are observed but not modeled by the ODE, for example body temperature. Importantly, the unobserved ODE variables are known-unknowns: We know they exist and their functional dynamics, but cannot measure them directly, nor do we know the function tying them to all observed measurements. As is often the case in medicine, and specifically the cardiovascular system, estimating these known-unknowns is highly valuable and they serve as targets for therapeutic manipulations. Under this scenario we wish to learn the parameters of the ODE generating each observed time-series, and extrapolate the future of the ODE variables and the observations. We address this task with a variational autoencoder incorporating the known ODE function, called GOKU-net for Generative ODE modeling with Known Unknowns. We first validate our method on videos of single and double pendulums with unknown length or mass; we then apply it to a model of the cardiovascular system. We show that modeling the known-unknowns allows us to successfully discover clinically meaningful unobserved system parameters, leads to much better extrapolation, and enables learning using much smaller training sets.
Submitted 30 March, 2021; v1 submitted 24 March, 2020;
originally announced March 2020.