Search | arXiv e-print repository

From Pixels to Histopathology: A Graph-Based Framework for Interpretable Whole Slide Image Analysis

Authors: Alexander Weers, Alexander H. Berger, Laurin Lux, Peter Schüffler, Daniel Rueckert, Johannes C. Paetzold

Abstract: The histopathological classification of whole-slide images (WSIs) is a fundamental task in digital pathology; yet it requires extensive time and expertise from specialists. While deep learning methods show promising results, they typically process WSIs by dividing them into artificial patches, which inherently prevents a network from learning from the entire image context, disregards natural tissu… ▽ More The histopathological classification of whole-slide images (WSIs) is a fundamental task in digital pathology; yet it requires extensive time and expertise from specialists. While deep learning methods show promising results, they typically process WSIs by dividing them into artificial patches, which inherently prevents a network from learning from the entire image context, disregards natural tissue structures and compromises interpretability. Our method overcomes this limitation through a novel graph-based framework that constructs WSI graph representations. The WSI-graph efficiently captures essential histopathological information in a compact form. We build tissue representations (nodes) that follow biological boundaries rather than arbitrary patches all while providing interpretable features for explainability. Through adaptive graph coarsening guided by learned embeddings, we progressively merge regions while maintaining discriminative local features and enabling efficient global information exchange. In our method's final step, we solve the diagnostic task through a graph attention network. We empirically demonstrate strong performance on multiple challenging tasks such as cancer stage classification and survival prediction, while also identifying predictive factors using Integrated Gradients. Our implementation is publicly available at https://github.com/HistoGraph31/pix2pathology △ Less

Submitted 14 March, 2025; originally announced March 2025.

Comments: 11 pages, 2 figures

arXiv:2503.09808 [pdf, other]

Fine-tuning Vision Language Models with Graph-based Knowledge for Explainable Medical Image Analysis

Authors: Chenjun Li, Laurin Lux, Alexander H. Berger, Martin J. Menten, Mert R. Sabuncu, Johannes C. Paetzold

Abstract: Accurate staging of Diabetic Retinopathy (DR) is essential for guiding timely interventions and preventing vision loss. However, current staging models are hardly interpretable, and most public datasets contain no clinical reasoning or interpretation beyond image-level labels. In this paper, we present a novel method that integrates graph representation learning with vision-language models (VLMs)… ▽ More Accurate staging of Diabetic Retinopathy (DR) is essential for guiding timely interventions and preventing vision loss. However, current staging models are hardly interpretable, and most public datasets contain no clinical reasoning or interpretation beyond image-level labels. In this paper, we present a novel method that integrates graph representation learning with vision-language models (VLMs) to deliver explainable DR diagnosis. Our approach leverages optical coherence tomography angiography (OCTA) images by constructing biologically informed graphs that encode key retinal vascular features such as vessel morphology and spatial connectivity. A graph neural network (GNN) then performs DR staging while integrated gradients highlight critical nodes and edges and their individual features that drive the classification decisions. We collect this graph-based knowledge which attributes the model's prediction to physiological structures and their characteristics. We then transform it into textual descriptions for VLMs. We perform instruction-tuning with these textual descriptions and the corresponding image to train a student VLM. This final agent can classify the disease and explain its decision in a human interpretable way solely based on a single image input. Experimental evaluations on both proprietary and public datasets demonstrate that our method not only improves classification accuracy but also offers more clinically interpretable results. An expert study further demonstrates that our method provides more accurate diagnostic explanations and paves the way for precise localization of pathologies in OCTA images. △ Less

Submitted 12 March, 2025; originally announced March 2025.

Comments: 11 pages, 3 figures

arXiv:2502.16697 [pdf, other]

Interpretable Retinal Disease Prediction Using Biology-Informed Heterogeneous Graph Representations

Authors: Laurin Lux, Alexander H. Berger, Maria Romeo Tricas, Alaa E. Fayed, Sobha Sivaprasada, Linus Kreitner, Jonas Weidner, Martin J. Menten, Daniel Rueckert, Johannes C. Paetzold

Abstract: Interpretability is crucial to enhance trust in machine learning models for medical diagnostics. However, most state-of-the-art image classifiers based on neural networks are not interpretable. As a result, clinicians often resort to known biomarkers for diagnosis, although biomarker-based classification typically performs worse than large neural networks. This work proposes a method that surpasse… ▽ More Interpretability is crucial to enhance trust in machine learning models for medical diagnostics. However, most state-of-the-art image classifiers based on neural networks are not interpretable. As a result, clinicians often resort to known biomarkers for diagnosis, although biomarker-based classification typically performs worse than large neural networks. This work proposes a method that surpasses the performance of established machine learning models while simultaneously improving prediction interpretability for diabetic retinopathy staging from optical coherence tomography angiography (OCTA) images. Our method is based on a novel biology-informed heterogeneous graph representation that models retinal vessel segments, intercapillary areas, and the foveal avascular zone (FAZ) in a human-interpretable way. This graph representation allows us to frame diabetic retinopathy staging as a graph-level classification task, which we solve using an efficient graph neural network. We benchmark our method against well-established baselines, including classical biomarker-based classifiers, convolutional neural networks (CNNs), and vision transformers. Our model outperforms all baselines on two datasets. Crucially, we use our biology-informed graph to provide explanations of unprecedented detail. Our approach surpasses existing methods in precisely localizing and identifying critical vessels or intercapillary areas. In addition, we give informative and human-interpretable attributions to critical characteristics. Our work contributes to the development of clinical decision-support tools in ophthalmology. △ Less

Submitted 23 February, 2025; originally announced February 2025.

arXiv:2412.14619 [pdf, other]

Pitfalls of topology-aware image segmentation

Authors: Alexander H. Berger, Laurin Lux, Alexander Weers, Martin Menten, Daniel Rueckert, Johannes C. Paetzold

Abstract: Topological correctness, i.e., the preservation of structural integrity and specific characteristics of shape, is a fundamental requirement for medical imaging tasks, such as neuron or vessel segmentation. Despite the recent surge in topology-aware methods addressing this challenge, their real-world applicability is hindered by flawed benchmarking practices. In this paper, we identify critical pit… ▽ More Topological correctness, i.e., the preservation of structural integrity and specific characteristics of shape, is a fundamental requirement for medical imaging tasks, such as neuron or vessel segmentation. Despite the recent surge in topology-aware methods addressing this challenge, their real-world applicability is hindered by flawed benchmarking practices. In this paper, we identify critical pitfalls in model evaluation that include inadequate connectivity choices, overlooked topological artifacts in ground truth annotations, and inappropriate use of evaluation metrics. Through detailed empirical analysis, we uncover these issues' profound impact on the evaluation and ranking of segmentation methods. Drawing from our findings, we propose a set of actionable recommendations to establish fair and robust evaluation standards for topology-aware medical image segmentation methods. △ Less

Submitted 19 December, 2024; originally announced December 2024.

Comments: Code is available at https://github.com/AlexanderHBerger/topo-pitfalls

arXiv:2411.03228 [pdf, other]

Topograph: An efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation

Authors: Laurin Lux, Alexander H. Berger, Alexander Weers, Nico Stucki, Daniel Rueckert, Ulrich Bauer, Johannes C. Paetzold

Abstract: Topological correctness plays a critical role in many image segmentation tasks, yet most networks are trained using pixel-wise loss functions, such as Dice, neglecting topological accuracy. Existing topology-aware methods often lack robust topological guarantees, are limited to specific use cases, or impose high computational costs. In this work, we propose a novel, graph-based framework for topol… ▽ More Topological correctness plays a critical role in many image segmentation tasks, yet most networks are trained using pixel-wise loss functions, such as Dice, neglecting topological accuracy. Existing topology-aware methods often lack robust topological guarantees, are limited to specific use cases, or impose high computational costs. In this work, we propose a novel, graph-based framework for topologically accurate image segmentation that is both computationally efficient and generally applicable. Our method constructs a component graph that fully encodes the topological information of both the prediction and ground truth, allowing us to efficiently identify topologically critical regions and aggregate a loss based on local neighborhood information. Furthermore, we introduce a strict topological metric capturing the homotopy equivalence between the union and intersection of prediction-label pairs. We formally prove the topological guarantees of our approach and empirically validate its effectiveness on binary and multi-class datasets. Our loss demonstrates state-of-the-art performance with up to fivefold faster loss computation compared to persistent homology methods. △ Less

Submitted 17 April, 2025; v1 submitted 5 November, 2024; originally announced November 2024.

arXiv:2403.11001 [pdf, other]

doi 10.1007/978-3-031-72111-3_68

Topologically Faithful Multi-class Segmentation in Medical Images

Authors: Alexander H. Berger, Nico Stucki, Laurin Lux, Vincent Buergin, Suprosanna Shit, Anna Banaszak, Daniel Rueckert, Ulrich Bauer, Johannes C. Paetzold

Abstract: Topological accuracy in medical image segmentation is a highly important property for downstream applications such as network analysis and flow modeling in vessels or cell counting. Recently, significant methodological advancements have brought well-founded concepts from algebraic topology to binary segmentation. However, these approaches have been underexplored in multi-class segmentation scenari… ▽ More Topological accuracy in medical image segmentation is a highly important property for downstream applications such as network analysis and flow modeling in vessels or cell counting. Recently, significant methodological advancements have brought well-founded concepts from algebraic topology to binary segmentation. However, these approaches have been underexplored in multi-class segmentation scenarios, where topological errors are common. We propose a general loss function for topologically faithful multi-class segmentation extending the recent Betti matching concept, which is based on induced matchings of persistence barcodes. We project the N-class segmentation problem to N single-class segmentation tasks, which allows us to use 1-parameter persistent homology, making training of neural networks computationally feasible. We validate our method on a comprehensive set of four medical datasets with highly variant topological characteristics. Our loss formulation significantly enhances topological correctness in cardiac, cell, artery-vein, and Circle of Willis segmentation. △ Less

Submitted 9 October, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

Journal ref: MICCAI 2024, Lecture Notes in Computer Science, vol. 15008, pp. 721-731, 2024

arXiv:2403.06601 [pdf, other]

Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers

Authors: Alexander H. Berger, Laurin Lux, Suprosanna Shit, Ivan Ezhov, Georgios Kaissis, Martin J. Menten, Daniel Rueckert, Johannes C. Paetzold

Abstract: Direct image-to-graph transformation is a challenging task that involves solving object detection and relationship prediction in a single model. Due to this task's complexity, large training datasets are rare in many domains, making the training of deep-learning methods challenging. This data sparsity necessitates transfer learning strategies akin to the state-of-the-art in general computer vision… ▽ More Direct image-to-graph transformation is a challenging task that involves solving object detection and relationship prediction in a single model. Due to this task's complexity, large training datasets are rare in many domains, making the training of deep-learning methods challenging. This data sparsity necessitates transfer learning strategies akin to the state-of-the-art in general computer vision. In this work, we introduce a set of methods enabling cross-domain and cross-dimension learning for image-to-graph transformers. We propose (1) a regularized edge sampling loss to effectively learn object relations in multiple domains with different numbers of edges, (2) a domain adaptation framework for image-to-graph transformers aligning image- and graph-level features from different domains, and (3) a projection function that allows using 2D data for training 3D transformers. We demonstrate our method's utility in cross-domain and cross-dimension experiments, where we utilize labeled data from 2D road networks for simultaneous learning in vastly different target domains. Our method consistently outperforms standard transfer learning and self-supervised pretraining on challenging benchmarks, such as retinal or whole-brain vessel graph extraction. △ Less

Submitted 5 December, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

arXiv:2306.00944 [pdf]

pgMAP: a pipeline to enable guide RNA read mapping from dual-targeting CRISPR screens

Authors: Phoebe C. R. Parrish, Daniel J. Groso, James D. Thomas, Robert K. Bradley, Alice H. Berger

Abstract: We developed pgMAP, an analysis pipeline to map gRNA sequencing reads from dual-targeting CRISPR screens. pgMAP output includes a dual gRNA read counts table and quality control metrics including the proportion of correctly-paired reads and CRISPR library sequencing coverage across all time points and samples. pgMAP is implemented using Snakemake and is available open-source under the MIT license… ▽ More We developed pgMAP, an analysis pipeline to map gRNA sequencing reads from dual-targeting CRISPR screens. pgMAP output includes a dual gRNA read counts table and quality control metrics including the proportion of correctly-paired reads and CRISPR library sequencing coverage across all time points and samples. pgMAP is implemented using Snakemake and is available open-source under the MIT license at https://github.com/fredhutch/pgmap_pipeline. △ Less

Submitted 1 June, 2023; originally announced June 2023.

Comments: 6 pages, 1 figure, code available at https://github.com/fredhutch/pgmap_pipeline

Showing 1–8 of 8 results for author: Berger, A H