-
A collaborative digital twin built on FAIR data and compute infrastructure
Authors:
Thomas M. Deucher,
Juan C. Verduzco,
Michael Titus,
Alejandro Strachan
Abstract:
The integration of machine learning with automated experimentation in self-driving laboratories (SDL) offers a powerful approach to accelerate discovery and optimization tasks in science and engineering applications. When supported by findable, accessible, interoperable, and reusable (FAIR) data infrastructure, SDLs with overlapping interests can collaborate more effectively. This work presents a distributed SDL implementation built on nanoHUB services for online simulation and FAIR data management. In this framework, geographically dispersed collaborators conducting independent optimization tasks contribute raw experimental data to a shared central database. These researchers can then benefit from analysis tools and machine learning models that automatically update as additional data become available. New data points are submitted through a simple web interface and automatically processed using a nanoHUB Sim2L, which extracts derived quantities and indexes all inputs and outputs in a FAIR data repository called ResultsDB. A separate nanoHUB workflow enables sequential optimization using active learning, where researchers define the optimization objective, and machine learning models are trained on the fly with all existing data, guiding the selection of future experiments. Inspired by the concept of a "frugal twin", the optimization task seeks to find the optimal recipe to combine food dyes to achieve the desired target color. With easily accessible and inexpensive materials, researchers and students can set up their own experiments, share data with collaborators, and explore the combination of FAIR data, predictive ML models, and sequential optimization. The tools introduced are generally applicable and can easily be extended to other optimization problems.
Submitted 24 June, 2025;
originally announced July 2025.
-
Artificial Intelligence-Assisted Prostate Cancer Diagnosis for Reduced Use of Immunohistochemistry
Authors:
Anders Blilie,
Nita Mulliqi,
Xiaoyi Ji,
Kelvin Szolnoky,
Sol Erika Boman,
Matteo Titus,
Geraldine Martinez Gonzalez,
José Asenjo,
Marcello Gambacorta,
Paolo Libretti,
Einar Gudlaugsson,
Svein R. Kjosavik,
Lars Egevad,
Emiel A. M. Janssen,
Martin Eklund,
Kimmo Kartasalo
Abstract:
Prostate cancer diagnosis heavily relies on histopathological evaluation, which is subject to variability. While immunohistochemical staining (IHC) assists in distinguishing benign from malignant tissue, it involves increased work, higher costs, and diagnostic delays. Artificial intelligence (AI) presents a promising solution to reduce reliance on IHC by accurately classifying atypical glands and borderline morphologies in hematoxylin & eosin (H&E) stained tissue sections. In this study, we evaluated an AI model's ability to minimize IHC use without compromising diagnostic accuracy by retrospectively analyzing prostate core needle biopsies from routine diagnostics at three different pathology sites. These cohorts were composed exclusively of difficult cases where the diagnosing pathologists required IHC to finalize the diagnosis. The AI model demonstrated area under the curve values of 0.951-0.993 for detecting cancer in routine H&E-stained slides. Applying sensitivity-prioritized diagnostic thresholds reduced the need for IHC staining by 44.4%, 42.0%, and 20.7% in the three cohorts investigated, without a single false negative prediction. This AI model shows potential for optimizing IHC use, streamlining decision-making in prostate pathology, and alleviating resource burdens.
Submitted 31 March, 2025;
originally announced April 2025.
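As a rough illustration of what a sensitivity-prioritized threshold could look like, the sketch below picks the largest cutoff that still flags every cancer case and reports the share of cases falling below it, which would be the cases spared from IHC. This is not the authors' procedure: the function name, the toy scores, and the choice to set the cutoff on the same cases it is evaluated on are all assumptions made for brevity. In practice the cutoff would be fixed on separate tuning data.

```python
def sensitivity_first_threshold(scores, labels):
    """Return (threshold, fraction_spared) for a zero-false-negative cutoff.

    scores: hypothetical AI cancer probabilities, one per biopsy.
    labels: 1 for cancer, 0 for benign (reference diagnosis).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not positives:
        raise ValueError("need at least one positive case to set a threshold")
    # The lowest-scoring cancer case defines the cutoff, so no cancer is missed.
    threshold = min(positives)
    # Cases scoring strictly below the cutoff are called benign without IHC.
    spared = sum(1 for s in scores if s < threshold)
    return threshold, spared / len(scores)


# Toy data (invented for illustration only).
scores = [0.02, 0.10, 0.35, 0.40, 0.80, 0.95]
labels = [0, 0, 1, 0, 1, 1]
threshold, fraction_spared = sensitivity_first_threshold(scores, labels)
```

With the toy data, the two lowest-scoring benign cases fall below the cutoff, so one third of the cases would skip IHC while every cancer case is still flagged.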
-
Foundation Models -- A Panacea for Artificial Intelligence in Pathology?
Authors:
Nita Mulliqi,
Anders Blilie,
Xiaoyi Ji,
Kelvin Szolnoky,
Henrik Olsson,
Sol Erika Boman,
Matteo Titus,
Geraldine Martinez Gonzalez,
Julia Anna Mielcarz,
Masi Valkonen,
Einar Gudlaugsson,
Svein R. Kjosavik,
José Asenjo,
Marcello Gambacorta,
Paolo Libretti,
Marcin Braun,
Radzislaw Kordek,
Roman Łowicki,
Kristina Hotakainen,
Päivi Väre,
Bodil Ginnerup Pedersen,
Karina Dalsgaard Sørensen,
Benedicte Parm Ulhøi,
Pekka Ruusuvuori,
Brett Delahunt
, et al. (6 additional authors not shown)
Abstract:
The role of artificial intelligence (AI) in pathology has evolved from aiding diagnostics to uncovering predictive morphological patterns in whole slide images (WSIs). Recently, foundation models (FMs) leveraging self-supervised pre-training have been widely advocated as a universal solution for diverse downstream tasks. However, open questions remain about their clinical applicability and generalization advantages over end-to-end learning using task-specific (TS) models. Here, we focused on AI with clinical-grade performance for prostate cancer diagnosis and Gleason grading. We present the largest validation of AI for this task, using over 100,000 core needle biopsies from 7,342 patients across 15 sites in 11 countries. We compared two FMs with a fully end-to-end TS model in a multiple instance learning framework. Our findings challenge assumptions that FMs universally outperform TS models. While FMs demonstrated utility in data-scarce scenarios, their performance converged with - and was in some cases surpassed by - TS models when sufficient labeled training data were available. Notably, extensive task-specific training markedly reduced clinically significant misgrading, misdiagnosis of challenging morphologies, and variability across different WSI scanners. Additionally, FMs used up to 35 times more energy than the TS model, raising concerns about their sustainability. Our results underscore that while FMs offer clear advantages for rapid prototyping and research, their role as a universal solution for clinically applicable medical AI remains uncertain. For high-stakes clinical applications, rigorous validation and consideration of task-specific training remain critically important. We advocate for integrating the strengths of FMs and end-to-end learning to achieve robust and resource-efficient AI pathology solutions fit for clinical use.
Submitted 3 March, 2025; v1 submitted 28 February, 2025;
originally announced February 2025.
-
Simulation of Fractional Brownian Surfaces via Spectral Synthesis on Manifolds
Authors:
Zachary Gelbaum,
Mathew Titus
Abstract:
Using the spectral decomposition of the Laplace-Beltrami operator we simulate fractal surfaces as random series of eigenfunctions. This approach allows us to generate random fields over smooth manifolds of arbitrary dimension, generalizing previous work with fractional Brownian motion with multi-dimensional parameter. We give examples of surfaces with and without boundary and discuss implementation.
Submitted 25 March, 2013;
originally announced March 2013.
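The idea of simulating a random field as a random series of Laplace-Beltrami eigenfunctions can be sketched on the simplest closed manifold, the unit circle, where the eigenfunctions are cos(kx) and sin(kx) with eigenvalues k^2. The sketch below is an assumption-laden illustration, not the paper's implementation: the coefficient decay lambda^(-(H + d/2)/2), the truncation at `n_modes`, and all function and parameter names are choices made here.

```python
import math
import random


def fractional_field_on_circle(n_points=256, n_modes=100, hurst=0.5, seed=0):
    """Sample a truncated random eigenfunction series on the unit circle.

    Assumed form: sum over k of lambda_k^(-(H + d/2)/2) * xi_k * phi_k(x),
    with d = 1, lambda_k = k^2, and xi_k independent standard normals.
    """
    rng = random.Random(seed)
    dim = 1
    xs = [2.0 * math.pi * i / n_points for i in range(n_points)]
    field = [0.0] * n_points
    norm = 1.0 / math.sqrt(math.pi)  # L2-normalizes cos(kx) and sin(kx)
    for k in range(1, n_modes + 1):  # constant mode (eigenvalue 0) is skipped
        eigenvalue = float(k * k)
        amplitude = eigenvalue ** (-(hurst + dim / 2.0) / 2.0)
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        for i, x in enumerate(xs):
            field[i] += amplitude * norm * (a * math.cos(k * x) + b * math.sin(k * x))
    return xs, field


xs, field = fractional_field_on_circle(hurst=0.7, seed=42)
```

Larger values of `hurst` make the coefficients decay faster and so give smoother samples. On a general manifold the closed-form eigenpairs would be replaced by numerically computed ones.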