-
ICYM2I: The illusion of multimodal informativeness under missingness
Authors:
Young Sang Choi,
Vincent Jeanselme,
Pierre Elias,
Shalmali Joshi
Abstract:
Multimodal learning is of continued interest in artificial intelligence-based applications, motivated by the potential information gain from combining different types of data. However, modalities collected and curated during development may differ from the modalities available at deployment due to multiple factors including cost, hardware failure, or -- as we argue in this work -- the perceived informativeness of a given modality. Naïve estimation of the information gain associated with including an additional modality without accounting for missingness may result in improper estimates of that modality's value in downstream tasks. Our work formalizes the problem of missingness in multimodal learning and demonstrates the biases resulting from ignoring this process. To address this issue, we introduce ICYM2I (In Case You Multimodal Missed It), a framework for the evaluation of predictive performance and information gain under missingness through inverse probability weighting-based correction. We demonstrate the importance of the proposed adjustment to estimate information gain under missingness on synthetic, semi-synthetic, and real-world medical datasets.
Submitted 22 May, 2025;
originally announced May 2025.
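To illustrate the inverse probability weighting idea named in the abstract above, here is a minimal sketch. It is not the authors' ICYM2I implementation. The function name `ipw_mean`, the toy arrays, and the assumption that the observation probabilities (`propensity`) are known are all hypothetical choices made for illustration; in practice those probabilities would have to be estimated.

```python
import numpy as np


def ipw_mean(values, observed, propensity):
    """Self-normalised (Hajek) IPW estimate of the population mean of `values`.

    Only entries where `observed` is True are used; each is weighted by the
    inverse of its (assumed known) probability of being observed.
    """
    values = np.asarray(values, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    propensity = np.asarray(propensity, dtype=float)
    weights = 1.0 / propensity[observed]
    return float(np.sum(weights * values[observed]) / np.sum(weights))


# Toy population: 5 "easy" cases (prediction correct) and 5 "hard" cases
# (prediction wrong), so the true accuracy is 0.5.
correct = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
# Hypothetical missingness: the modality is observed far more often on easy cases.
propensity = np.array([0.8] * 5 + [0.2] * 5)
observed = np.array([1, 1, 1, 1, 0, 1, 0, 0, 0, 0], dtype=bool)

naive = float(correct[observed].mean())               # complete-case accuracy
corrected = ipw_mean(correct, observed, propensity)   # IPW-corrected accuracy
print(naive, corrected)
```

In this toy setting the complete-case estimate overstates accuracy because easy cases are over-represented among the observed samples, while the weighted estimate recovers the population value.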
-
FoMoH: A clinically meaningful foundation model evaluation for structured electronic health records
Authors:
Chao Pang,
Vincent Jeanselme,
Young Sang Choi,
Xinzhuo Jiang,
Zilin Jing,
Aparajita Kashyap,
Yuta Kobayashi,
Yanwei Li,
Florent Pollet,
Karthik Natarajan,
Shalmali Joshi
Abstract:
Foundation models hold significant promise in healthcare, given their capacity to extract meaningful representations independent of downstream tasks. This property has enabled state-of-the-art performance across several clinical applications trained on structured electronic health record (EHR) data, even in settings with limited labeled data, a prevalent challenge in healthcare. However, there is little consensus on these models' potential for clinical utility due to the lack of desiderata of comprehensive and meaningful tasks and sufficiently diverse evaluations to characterize the benefit over conventional supervised learning. To address this gap, we propose a suite of clinically meaningful tasks spanning patient outcomes, early prediction of acute and chronic conditions, including desiderata for robust evaluations. We evaluate state-of-the-art foundation models on EHR data consisting of 5 million patients from Columbia University Irving Medical Center (CUMC), a large urban academic medical center in New York City, across 14 clinically relevant tasks. We measure overall accuracy, calibration, and subpopulation performance to surface tradeoffs based on the choice of pre-training, tokenization, and data representation strategies. Our study aims to advance the empirical evaluation of structured EHR foundation models and guide the development of future healthcare foundation models.
Submitted 16 June, 2025; v1 submitted 22 May, 2025;
originally announced May 2025.
-
Identifying treatment response subgroups in observational time-to-event data
Authors:
Vincent Jeanselme,
Chang Ho Yoon,
Fabian Falck,
Brian Tom,
Jessica Barrett
Abstract:
Identifying patient subgroups with different treatment responses is an important task to inform medical recommendations, guidelines, and the design of future clinical trials. Existing approaches for treatment effect estimation primarily rely on Randomised Controlled Trials (RCTs), which are often limited by insufficient power, multiple comparisons, and unbalanced covariates. In addition, RCTs tend to feature more homogeneous patient groups, making them less relevant for uncovering subgroups in the population encountered in real-world clinical practice. Subgroup analyses established for RCTs suffer from significant statistical biases when applied to observational studies, which benefit from larger and more representative populations. Our work introduces a novel, outcome-guided, subgroup analysis strategy for identifying subgroups of treatment response in both RCTs and observational studies alike. It hence positions itself in-between individualised and average treatment effect estimation to uncover patient subgroups with distinct treatment responses, critical for actionable insights that may influence treatment guidelines. In experiments, our approach significantly outperforms the current state-of-the-art method for subgroup analysis in both randomised and observational treatment regimes.
Submitted 23 February, 2025; v1 submitted 6 August, 2024;
originally announced August 2024.
-
Recent Advances, Applications, and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2023 Symposium
Authors:
Hyewon Jeong,
Sarah Jabbour,
Yuzhe Yang,
Rahul Thapta,
Hussein Mozannar,
William Jongwon Han,
Nikita Mehandru,
Michael Wornow,
Vladislav Lialin,
Xin Liu,
Alejandro Lozano,
Jiacheng Zhu,
Rafal Dariusz Kocielnik,
Keith Harrigian,
Haoran Zhang,
Edward Lee,
Milos Vukadinovic,
Aparna Balagopalan,
Vincent Jeanselme,
Katherine Matton,
Ilker Demirel,
Jason Fries,
Parisa Rashidi,
Brett Beaulieu-Jones,
Xuhai Orson Xu
, et al. (18 additional authors not shown)
Abstract:
The third ML4H symposium was held in person on December 10, 2023, in New Orleans, Louisiana, USA. The symposium included research roundtable sessions to foster discussions between participants and senior researchers on timely and relevant topics for the ML4H community. Encouraged by the successful virtual roundtables in the previous year, we organized eleven in-person roundtables and four virtual roundtables at ML4H 2023. The organization of the research roundtables at the conference involved 17 Senior Chairs and 19 Junior Chairs across 11 tables. Each roundtable session included invited senior chairs (with substantial experience in the field), junior chairs (responsible for facilitating the discussion), and attendees from diverse backgrounds with interest in the session's topic. Herein we detail the organization process and compile takeaways from these roundtable discussions, including recent advances, applications, and open challenges for each topic. We conclude with a summary and lessons learned across all roundtables. This document serves as a comprehensive review paper, summarizing the recent advancements in machine learning for healthcare as contributed by foremost researchers in the field.
Submitted 5 April, 2024; v1 submitted 3 March, 2024;
originally announced March 2024.
-
Neural Fine-Gray: Monotonic neural networks for competing risks
Authors:
Vincent Jeanselme,
Chang Ho Yoon,
Brian Tom,
Jessica Barrett
Abstract:
Time-to-event modelling, known as survival analysis, differs from standard regression as it addresses censoring in patients who do not experience the event of interest. Despite competitive performances in tackling this problem, machine learning methods often ignore other competing risks that preclude the event of interest. This practice biases the survival estimation. Extensions to address this challenge often rely on parametric assumptions or numerical estimations leading to sub-optimal survival approximations. This paper leverages constrained monotonic neural networks to model each competing survival distribution. This modelling choice ensures the exact likelihood maximisation at a reduced computational cost by using automatic differentiation. The effectiveness of the solution is demonstrated on one synthetic and three medical datasets. Finally, we discuss the implications of considering competing risks when developing risk scores for medical practice.
Submitted 11 May, 2023;
originally announced May 2023.
-
Imputation Strategies Under Clinical Presence: Impact on Algorithmic Fairness
Authors:
Vincent Jeanselme,
Maria De-Arteaga,
Zhe Zhang,
Jessica Barrett,
Brian Tom
Abstract:
Machine learning risks reinforcing biases present in data and, as we argue in this work, in what is absent from data. In healthcare, societal and decision biases shape patterns in missing data, yet the algorithmic fairness implications of group-specific missingness are poorly understood. The way we address missingness in healthcare can have detrimental impacts on downstream algorithmic fairness. Our work questions current recommendations and practices aimed at handling missing data with a focus on their effect on algorithmic fairness, and offers a path forward. Specifically, we consider the theoretical underpinnings of existing recommendations as well as their empirical predictive performance and corresponding algorithmic fairness measured through subgroup performances. Our results show that current practices for handling missingness lack principled foundations, are disconnected from the realities of missingness mechanisms in healthcare, and can be counterproductive. For example, we show that favouring group-specific imputation strategy can be misguided and exacerbate prediction disparities. We then build on our findings to propose a framework for empirically guiding imputation choices, and an accompanying reporting framework. Our work constitutes an important contribution to recent efforts by regulators and practitioners to grapple with the realities of real-world data, and to foster the responsible and transparent deployment of machine learning systems. We demonstrate the practical utility of the proposed framework through experimentation on widely used datasets, where we show how the proposed framework can guide the selection of imputation strategies, allowing us to choose among strategies that yield equal overall predictive performance but present different algorithmic fairness properties.
Submitted 17 March, 2025; v1 submitted 13 August, 2022;
originally announced August 2022.
-
DeepJoint: Robust Survival Modelling Under Clinical Presence Shift
Authors:
Vincent Jeanselme,
Glen Martin,
Niels Peek,
Matthew Sperrin,
Brian Tom,
Jessica Barrett
Abstract:
Observational data in medicine arise as a result of the complex interaction between patients and the healthcare system. The sampling process is often highly irregular and itself constitutes an informative process. When using such data to develop prediction models, this phenomenon is often ignored, leading to sub-optimal performance and generalisability of models when practices evolve. We propose a multi-task recurrent neural network which models three clinical presence dimensions -- namely the longitudinal, the inter-observation and the missingness processes -- in parallel to the survival outcome. On a prediction task using MIMIC III laboratory tests, explicit modelling of these three processes showed improved performance in comparison to state-of-the-art predictive models (C-index at 1 day horizon: 0.878). More importantly, the proposed approach was more robust to change in the clinical presence setting, demonstrated by performance comparison between patients admitted on weekdays and weekends. This analysis demonstrates the importance of studying and leveraging clinical presence to improve performance and create more transportable clinical models.
Submitted 26 May, 2022;
originally announced May 2022.
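The abstract above reports performance as a C-index. For readers unfamiliar with the metric, the following is a minimal sketch of Harrell's concordance index on right-censored data. It is a simplified, unweighted variant and not necessarily the time-dependent version used in the paper; the function name `concordance_index` and the toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had an observed event strictly
    before subject j's event or censoring time. It is concordant when the
    model assigned subject i the higher risk; ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example: four patients, the third one censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.3, 0.5, 0.1]
c_index = concordance_index(times, events, risks)
print(c_index)
```

Here five pairs are comparable and four are ranked correctly, so the sketch yields 0.8; a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.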
-
Constrained Clustering and Multiple Kernel Learning without Pairwise Constraint Relaxation
Authors:
Benedikt Boecking,
Vincent Jeanselme,
Artur Dubrawski
Abstract:
Clustering under pairwise constraints is an important knowledge discovery tool that enables the learning of appropriate kernels or distance metrics to improve clustering performance. These pairwise constraints, which come in the form of must-link and cannot-link pairs, arise naturally in many applications and are intuitive for users to provide. However, the common practice of relaxing discrete constraints to a continuous domain to ease optimization when learning kernels or metrics can harm generalization, as information which only encodes linkage is transformed to informing distances. We introduce a new constrained clustering algorithm that jointly clusters data and learns a kernel in accordance with the available pairwise constraints. To generalize well, our method is designed to maximize constraint satisfaction without relaxing pairwise constraints to a continuous domain where they inform distances. We show that the proposed method outperforms existing approaches on a large number of diverse publicly available datasets, and we discuss how our method can scale to handling large data.
Submitted 23 March, 2022;
originally announced March 2022.
-
Leveraging Expert Consistency to Improve Algorithmic Decision Support
Authors:
Maria De-Arteaga,
Vincent Jeanselme,
Artur Dubrawski,
Alexandra Chouldechova
Abstract:
Machine learning (ML) is increasingly being used to support high-stakes decisions. However, there is frequently a construct gap: a gap between the construct of interest to the decision-making task and what is captured in proxies used as labels to train ML models. As a result, ML models may fail to capture important dimensions of decision criteria, hampering their utility for decision support. Thus, an essential step in the design of ML systems for decision support is selecting a target label among available proxies. In this work, we explore the use of historical expert decisions as a rich -- yet also imperfect -- source of information that can be combined with observed outcomes to narrow the construct gap. We argue that managers and system designers may be interested in learning from experts in instances where they exhibit consistency with each other, while learning from observed outcomes otherwise. We develop a methodology to enable this goal using information that is commonly available in organizational information systems. This involves two core steps. First, we propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert. Second, we introduce a label amalgamation approach that allows ML models to simultaneously learn from expert decisions and observed outcomes. Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap, yielding better predictive performance than learning from either observed outcomes or expert decisions alone.
Submitted 3 June, 2024; v1 submitted 24 January, 2021;
originally announced January 2021.