Federated Learning Enables Big Data for Rare Cancer Boundary Detection
Authors:
Sarthak Pati,
Ujjwal Baid,
Brandon Edwards,
Micah Sheller,
Shih-Han Wang,
G Anthony Reina,
Patrick Foley,
Alexey Gruzdev,
Deepthi Karkada,
Christos Davatzikos,
Chiharu Sako,
Satyam Ghodasara,
Michel Bilello,
Suyash Mohan,
Philipp Vollmuth,
Gianluca Brugnara,
Chandrakanth J Preetha,
Felix Sahm,
Klaus Maier-Hein,
Maximilian Zenk,
Martin Bendszus,
Wolfgang Wick,
Evan Calabrese,
Jeffrey Rudie,
Javier Villanueva-Meyer
, et al. (254 additional authors not shown)
Abstract:
Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train acc…
▽ More
Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to-date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.
△ Less
Submitted 25 April, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
Robust Automatic Whole Brain Extraction on Magnetic Resonance Imaging of Brain Tumor Patients using Dense-Vnet
Authors:
Sara Ranjbar,
Kyle W. Singleton,
Lee Curtin,
Cassandra R. Rickertsen,
Lisa E. Paulson,
Leland S. Hu,
J. Ross Mitchell,
Kristin R. Swanson
Abstract:
Whole brain extraction, also known as skull stripping, is a process in neuroimaging in which non-brain tissue such as skull, eyeballs, skin, etc. are removed from neuroimages. Skull striping is a preliminary step in presurgical planning, cortical reconstruction, and automatic tumor segmentation. Despite a plethora of skull stripping approaches in the literature, few are sufficiently accurate for p…
▽ More
Whole brain extraction, also known as skull stripping, is a process in neuroimaging in which non-brain tissue such as skull, eyeballs, skin, etc. are removed from neuroimages. Skull striping is a preliminary step in presurgical planning, cortical reconstruction, and automatic tumor segmentation. Despite a plethora of skull stripping approaches in the literature, few are sufficiently accurate for processing pathology-presenting MRIs, especially MRIs with brain tumors. In this work we propose a deep learning approach for skull striping common MRI sequences in oncology such as T1-weighted with gadolinium contrast (T1Gd) and T2-weighted fluid attenuated inversion recovery (FLAIR) in patients with brain tumors. We automatically created gray matter, white matter, and CSF probability masks using SPM12 software and merged the masks into one for a final whole-brain mask for model training. Dice agreement, sensitivity, and specificity of the model (referred herein as DeepBrain) was tested against manual brain masks. To assess data efficiency, we retrained our models using progressively fewer training data examples and calculated average dice scores on the test set for the models trained in each round. Further, we tested our model against MRI of healthy brains from the LBP40A dataset. Overall, DeepBrain yielded an average dice score of 94.5%, sensitivity of 96.4%, and specificity of 98.5% on brain tumor data. For healthy brains, model performance improved to a dice score of 96.2%, sensitivity of 96.6% and specificity of 99.2%. The data efficiency experiment showed that, for this specific task, comparable levels of accuracy could have been achieved with as few as 50 training samples. In conclusion, this study demonstrated that a deep learning model trained on minimally processed automatically-generated labels can generate more accurate brain masks on MRI of brain tumor patients within seconds.
△ Less
Submitted 3 June, 2020;
originally announced June 2020.