ISLES'24: Final Infarct Prediction with Multimodal Imaging and Clinical Data. Where Do We Stand?
Authors:
Ezequiel de la Rosa,
Ruisheng Su,
Mauricio Reyes,
Evamaria O. Riedel,
Hakim Baazaoui,
Roland Wiest,
Florian Kofler,
Kaiyuan Yang,
David Robben,
Mahsa Mojtahedi,
Laura van Poppel,
Lucas de Vries,
Anthony Winder,
Kimberly Amador,
Nils D. Forkert,
Gyeongyeon Hwang,
Jiwoo Song,
Dohyun Kim,
Eneko Uruñuela,
Annabella Bregazzi,
Matthias Wilms,
Hyun Yang,
Jin Tae Kwak,
Sumin Jung,
Luan Matheus Trindade Dalmazo
, et al. (15 additional authors not shown)
Abstract:
Accurate estimation of brain infarction (i.e., irreversibly damaged tissue) is critical for guiding treatment decisions in acute ischemic stroke. Reliable infarct prediction informs key clinical interventions, including the need for patient transfer to comprehensive stroke centers, the potential benefit of additional reperfusion attempts during mechanical thrombectomy, decisions regarding secondar…
▽ More
Accurate estimation of brain infarction (i.e., irreversibly damaged tissue) is critical for guiding treatment decisions in acute ischemic stroke. Reliable infarct prediction informs key clinical interventions, including the need for patient transfer to comprehensive stroke centers, the potential benefit of additional reperfusion attempts during mechanical thrombectomy, decisions regarding secondary neuroprotective treatments, and ultimately, prognosis of clinical outcomes. This work introduces the Ischemic Stroke Lesion Segmentation (ISLES) 2024 challenge, which focuses on the prediction of final infarct volumes from pre-interventional acute stroke imaging and clinical data. ISLES24 provides a comprehensive, multimodal setting where participants can leverage all clinically and practically available data, including full acute CT imaging, sub-acute follow-up MRI, and structured clinical information, across a train set of 150 cases. On the hidden test set of 98 cases, the top-performing model, a multimodal nnU-Net-based architecture, achieved a Dice score of 0.285 (+/- 0.213) and an absolute volume difference of 21.2 (+/- 37.2) mL, underlining the significant challenges posed by this task and the need for further advances in multimodal learning. This work makes two primary contributions: first, we establish a standardized, clinically realistic benchmark for post-treatment infarct prediction, enabling systematic evaluation of multimodal algorithmic strategies on a longitudinal stroke dataset; second, we analyze current methodological limitations and outline key research directions to guide the development of next-generation infarct prediction models.
△ Less
Submitted 7 July, 2025; v1 submitted 20 August, 2024;
originally announced August 2024.
Towards objective and systematic evaluation of bias in artificial intelligence for medical imaging
Authors:
Emma A. M. Stanley,
Raissa Souza,
Anthony Winder,
Vedant Gulve,
Kimberly Amador,
Matthias Wilms,
Nils D. Forkert
Abstract:
Artificial intelligence (AI) models trained using medical images for clinical tasks often exhibit bias in the form of disparities in performance between subgroups. Since not all sources of biases in real-world medical imaging data are easily identifiable, it is challenging to comprehensively assess how those biases are encoded in models, and how capable bias mitigation methods are at ameliorating…
▽ More
Artificial intelligence (AI) models trained using medical images for clinical tasks often exhibit bias in the form of disparities in performance between subgroups. Since not all sources of biases in real-world medical imaging data are easily identifiable, it is challenging to comprehensively assess how those biases are encoded in models, and how capable bias mitigation methods are at ameliorating performance disparities. In this article, we introduce a novel analysis framework for systematically and objectively investigating the impact of biases in medical images on AI models. We developed and tested this framework for conducting controlled in silico trials to assess bias in medical imaging AI using a tool for generating synthetic magnetic resonance images with known disease effects and sources of bias. The feasibility is showcased by using three counterfactual bias scenarios to measure the impact of simulated bias effects on a convolutional neural network (CNN) classifier and the efficacy of three bias mitigation strategies. The analysis revealed that the simulated biases resulted in expected subgroup performance disparities when the CNN was trained on the synthetic datasets. Moreover, reweighing was identified as the most successful bias mitigation strategy for this setup, and we demonstrated how explainable AI methods can aid in investigating the manifestation of bias in the model using this framework. Developing fair AI models is a considerable challenge given that many and often unknown sources of biases can be present in medical imaging datasets. In this work, we present a novel methodology to objectively study the impact of biases and mitigation strategies on deep learning pipelines, which can support the development of clinical AI that is robust and responsible.
△ Less
Submitted 1 July, 2024; v1 submitted 2 November, 2023;
originally announced November 2023.