-
RMExplorer: A Visual Analytics Approach to Explore the Performance and the Fairness of Disease Risk Models on Population Subgroups
Authors:
Bum Chul Kwon,
Uri Kartoun,
Shaan Khurshid,
Mikhail Yurochkin,
Subha Maity,
Deanna G Brockman,
Amit V Khera,
Patrick T Ellinor,
Steven A Lubitz,
Kenney Ng
Abstract:
Disease risk models can identify high-risk patients and help clinicians provide more personalized care. However, risk models developed on one dataset may not generalize across diverse subpopulations of patients in different datasets and may have unexpected performance. It is challenging for clinical researchers to inspect risk models across different subgroups without any tools. Therefore, we developed an interactive visualization system called RMExplorer (Risk Model Explorer) to enable interactive risk model assessment. Specifically, the system allows users to define subgroups of patients by selecting clinical, demographic, or other characteristics, to explore the performance and fairness of risk models on the subgroups, and to understand the feature contributions to risk scores. To demonstrate the usefulness of the tool, we conduct a case study, where we use RMExplorer to explore three atrial fibrillation risk models by applying them to the UK Biobank dataset of 445,329 individuals. RMExplorer can help researchers to evaluate the performance and biases of risk models on subpopulations of interest in their data.
Submitted 13 September, 2022;
originally announced September 2022.
-
A Methodology to Generate Virtual Patient Repositories
Authors:
Uri Kartoun
Abstract:
Electronic medical records (EMRs) contain sensitive personal information. For example, they may include details about infectious diseases, such as human immunodeficiency virus (HIV), or information about a mental illness. They may also contain other sensitive information, such as medical details related to fertility treatments. Because EMRs are subject to confidentiality requirements, accessing and analyzing EMR databases is a privilege given to only a small number of individuals. Individuals who work at institutions that do not have access to EMR systems have no opportunity to gain hands-on experience with this valuable resource. Simulated medical databases are currently available; however, they are difficult to configure and are limited in their resemblance to real clinical databases. Generating highly accessible repositories of virtual patient EMRs while relying only minimally on real patient data is expected to serve as a valuable resource to a broader audience of medical personnel, including those who reside in underdeveloped countries.
Submitted 1 August, 2016;
originally announced August 2016.
-
Identifying Pairs in Simulated Bio-Medical Time-Series
Authors:
Uri Kartoun
Abstract:
The paper presents a time-series-based classification approach to identify similarities in pairs of simulated human-generated patterns. An example of a pattern is a time-series representing a heart rate during a specific time range, wherein the time-series is a sequence of data points that represent the changes in the heart rate values. A bio-medical simulator system was developed to acquire a collection of 7,871 price patterns of financial instruments. The financial instruments, traded in real time on three American stock exchanges (NASDAQ, NYSE, and AMEX), simulate bio-medical measurements. The system simulates a human in which each price pattern represents one bio-medical sensor. Data provided during trading hours from the stock exchanges allowed real-time classification. Classification is based on new machine learning techniques: self-labeling, which allows supervised learning methods to be applied to unlabeled time-series; and similarity ranking, which is applied to a decision tree learning algorithm to classify time-series regardless of type and quantity.
Submitted 12 May, 2013;
originally announced June 2013.
-
Inverse Signal Classification for Financial Instruments
Authors:
Uri Kartoun
Abstract:
The paper presents new machine learning methods: signal composition, which classifies time-series regardless of length, type, and quantity; and self-labeling, a supervised-learning enhancement. The paper further describes the implementation of the methods on a financial search engine system, using a collection of 7,881 financial instruments traded during 2011, to identify inverse behavior among the time-series.
Submitted 19 March, 2013; v1 submitted 28 February, 2013;
originally announced March 2013.
-
Bio-Signals-based Situation Comparison Approach to Predict Pain
Authors:
Uri Kartoun
Abstract:
This paper describes a time-series-based classification approach to identify similarities between bio-medical-based situations. The proposed approach allows classifying collections of time-series representing bio-medical measurements, i.e., situations, regardless of the type, length, and quantity of the time-series that a situation comprises.
Submitted 19 March, 2013; v1 submitted 28 February, 2013;
originally announced March 2013.
-
A Method for Comparing Hedge Funds
Authors:
Uri Kartoun
Abstract:
The paper presents new machine learning methods: signal composition, which classifies time-series regardless of length, type, and quantity; and self-labeling, a supervised-learning enhancement. The paper further describes the implementation of the methods on a financial search engine system to identify behavioral similarities among time-series representing monthly returns of 11,312 hedge funds operated during approximately one decade (2000-2010). The presented approach of cross-category and cross-location classification helps investors identify alternative investments.
Submitted 19 March, 2013; v1 submitted 28 February, 2013;
originally announced March 2013.