An automated machine learning framework to optimize radiomics model construction validated on twelve clinical applications
Authors:
Martijn P. A. Starmans,
Sebastian R. van der Voort,
Thomas Phil,
Milea J. M. Timbergen,
Melissa Vos,
Guillaume A. Padmos,
Wouter Kessels,
David Hanff,
Dirk J. Grunhagen,
Cornelis Verhoef,
Stefan Sleijfer,
Martin J. van den Bent,
Marion Smits,
Roy S. Dwarkasing,
Christopher J. Els,
Federico Fiduzi,
Geert J. L. H. van Leenders,
Anela Blazevic,
Johannes Hofland,
Tessa Brabander,
Renza A. H. van Gils,
Gaston J. H. Franssen,
Richard A. Feelders,
Wouter W. de Herder,
Florian E. Buisman,
et al. (21 additional authors not shown)
Abstract:
Predicting clinical outcomes from medical images using quantitative features ("radiomics") requires many method design choices. Currently, in new clinical applications, finding the optimal radiomics method out of the wide range of available methods relies on a manual, heuristic trial-and-error process. We introduce a novel automated framework that optimizes radiomics workflow construction per application by standardizing the radiomics workflow in modular components, including a large collection of algorithms for each component, and formulating a combined algorithm selection and hyperparameter optimization problem. To solve it, we employ automated machine learning through two strategies (random search and Bayesian optimization) and three ensembling approaches. Results show that a medium-sized random search and straightforward ensembling perform similarly to more advanced methods while being more efficient. Validated across twelve clinical applications, our approach outperforms both a radiomics baseline and human experts. In conclusion, our framework improves and streamlines radiomics research by fully automatically optimizing radiomics workflow construction. To facilitate reproducibility, we publicly release six datasets, software of the method, and code to reproduce this study.
Submitted 10 March, 2025; v1 submitted 19 August, 2021;
originally announced August 2021.
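
The combined algorithm selection and hyperparameter optimization problem named in the abstract above can be illustrated with a minimal sketch. This is not the authors' framework: the two toy classifiers, the `SEARCH_SPACE` dictionary, the function names, and the choice to average the top-N workflows are all assumptions made for illustration, using only the Python standard library.

```python
import random

# Hypothetical search space: each "algorithm" carries its own hyperparameters.
SEARCH_SPACE = {
    "threshold": {"cut": lambda rng: rng.uniform(0.0, 1.0)},
    "knn": {"k": lambda rng: rng.choice([1, 3, 5])},
}


def sample_config(rng):
    """Draw one workflow: pick an algorithm, then its hyperparameters."""
    algo = rng.choice(sorted(SEARCH_SPACE))
    params = {name: draw(rng) for name, draw in SEARCH_SPACE[algo].items()}
    return {"algo": algo, **params}


def predict(config, train, x):
    """Return a positive-class score in [0, 1] for a single 1-D feature x."""
    if config["algo"] == "threshold":
        return 1.0 if x > config["cut"] else 0.0
    # k-nearest neighbours on the single feature
    nearest = sorted(train, key=lambda item: abs(item[0] - x))[: config["k"]]
    return sum(label for _, label in nearest) / len(nearest)


def accuracy(config, train, valid):
    """Fraction of validation samples classified correctly at a 0.5 cut-off."""
    hits = sum((predict(config, train, x) >= 0.5) == bool(y) for x, y in valid)
    return hits / len(valid)


def random_search(train, valid, n_iter, rng):
    """Evaluate n_iter randomly sampled workflows and return them best first."""
    trials = [sample_config(rng) for _ in range(n_iter)]
    scored = [(accuracy(c, train, valid), c) for c in trials]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def ensemble_predict(ranked, train, x, top_n):
    """Average the scores of the top_n workflows (an assumed ensembling rule)."""
    top = [c for _, c in ranked[:top_n]]
    return sum(predict(c, train, x) for c in top) / len(top)


def make_data(rng, n):
    """Toy data: the label is 1 when a noisy copy of the feature exceeds 0.5."""
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, int(x + rng.gauss(0, 0.1) > 0.5)))
    return data


rng = random.Random(0)
train, valid = make_data(rng, 60), make_data(rng, 40)
ranked = random_search(train, valid, n_iter=25, rng=rng)
best_score, best_config = ranked[0]
print(best_config, round(best_score, 2))
print(round(ensemble_predict(ranked, train, 0.9, top_n=5), 2))
```

The point of the sketch is that the algorithm choice is sampled as just another hyperparameter, so a single search loop covers both decisions; Bayesian optimization would replace `sample_config` with a model-guided proposal step.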
Towards Unsupervised Cancer Subtyping: Predicting Prognosis Using A Histologic Visual Dictionary
Authors:
Hassan Muhammad,
Carlie S. Sigel,
Gabriele Campanella,
Thomas Boerner,
Linda M. Pak,
Stefan Büttner,
Jan N. M. IJzermans,
Bas Groot Koerkamp,
Michael Doukas,
William R. Jarnagin,
Amber Simpson,
Thomas J. Fuchs
Abstract:
Unlike common cancers, such as those of the prostate and breast, tumor grading in rare cancers is difficult and largely undefined because of small sample sizes, the sheer volume of time needed to undertake such a task, and the inherent difficulty of extracting human-observed patterns. One of the most challenging examples is intrahepatic cholangiocarcinoma (ICC), a primary liver cancer arising from the biliary system, for which there is well-recognized tumor heterogeneity and no grading paradigm or prognostic biomarkers. In this paper, we propose a new unsupervised deep convolutional autoencoder-based clustering model that groups together cellular and structural morphologies of tumor in 246 digitized ICC whole slides, based on visual similarity. From this visual dictionary of histologic patterns, we use the clusters as covariates to train Cox proportional hazards survival models. In univariate analysis, three clusters were significantly associated with recurrence-free survival. Combinations of these clusters were significant in multivariate analysis. In a multivariate analysis of all clusters, five were significantly associated with recurrence-free survival; however, the overall model was not found to be significant. Finally, a pathologist assigned clinical terminology to the significant clusters in the visual dictionary and found evidence supporting the hypothesis that collagen-enriched fibrosis plays a role in disease severity. These results offer insight into the future of cancer subtyping and show that computational pathology can contribute to disease prognostication, especially in rare cancers.
Submitted 12 March, 2019;
originally announced March 2019.
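
One way to picture how clusters from a visual dictionary could become per-slide covariates, as the abstract above describes, is the sketch below. It rests on several assumptions that are not taken from the paper: plain k-means stands in for the autoencoder-based clustering, 2-D tuples stand in for learned tile embeddings, and each covariate is taken to be the fraction of a slide's tiles assigned to a cluster. The names `kmeans` and `slide_covariates` are hypothetical.

```python
import random
from collections import Counter


def kmeans(points, k, n_iter, rng):
    """Plain k-means on tuples of floats; returns (centroids, labels)."""
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


def slide_covariates(slide_ids, labels, k):
    """Per slide, the fraction of tiles in each cluster (assumed covariate form)."""
    counts = {}
    for sid, lab in zip(slide_ids, labels):
        counts.setdefault(sid, Counter())[lab] += 1
    return {
        sid: [c[j] / sum(c.values()) for j in range(k)]
        for sid, c in counts.items()
    }


# Toy "tile embeddings": three slides, 20 tiles each, drawn around two centres.
rng = random.Random(0)
tiles, slide_ids = [], []
for sid in ["slide_a", "slide_b", "slide_c"]:
    for _ in range(20):
        centre = rng.choice([(0.0, 0.0), (5.0, 5.0)])
        tiles.append((centre[0] + rng.gauss(0, 0.5), centre[1] + rng.gauss(0, 0.5)))
        slide_ids.append(sid)

_, labels = kmeans(tiles, k=2, n_iter=10, rng=rng)
covariates = slide_covariates(slide_ids, labels, k=2)
for sid, fractions in covariates.items():
    print(sid, [round(f, 2) for f in fractions])
```

Each resulting row could then serve as the covariate vector for one patient in a survival model; fitting the Cox model itself is left out of this sketch.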