-
Applicability of oculomics for individual risk prediction: Repeatability and robustness of retinal Fractal Dimension using DART and AutoMorph
Authors:
Justin Engelmann,
Diana Moukaddem,
Lucas Gago,
Niall Strang,
Miguel O. Bernabeu
Abstract:
Purpose: To investigate whether Fractal Dimension (FD)-based oculomics could be used for individual risk prediction by evaluating repeatability and robustness. Methods: We used two datasets: Caledonia, healthy adults imaged multiple times in quick succession for research (26 subjects, 39 eyes, 377 colour fundus images), and GRAPE, glaucoma patients with baseline and follow-up visits (106 subjects,…
▽ More
Purpose: To investigate whether Fractal Dimension (FD)-based oculomics could be used for individual risk prediction by evaluating repeatability and robustness. Methods: We used two datasets: Caledonia, healthy adults imaged multiple times in quick succession for research (26 subjects, 39 eyes, 377 colour fundus images), and GRAPE, glaucoma patients with baseline and follow-up visits (106 subjects, 196 eyes, 392 images). Mean follow-up time was 18.3 months in GRAPE, thus it provides a pessimistic lower-bound as vasculature could change. FD was computed with DART and AutoMorph. Image quality was assessed with QuickQual, but no images were initially excluded. Pearson, Spearman, and Intraclass Correlation (ICC) were used for population-level repeatability. For individual-level repeatability, we introduce measurement noise parameter λ which is within-eye Standard Deviation (SD) of FD measurements in units of between-eyes SD. Results: In Caledonia, ICC was 0.8153 for DART and 0.5779 for AutoMorph, Pearson/Spearman correlation (first and last image) 0.7857/0.7824 for DART, and 0.3933/0.6253 for AutoMorph. In GRAPE, Pearson/Spearman correlation (first and next visit) was 0.7479/0.7474 for DART, and 0.7109/0.7208 for AutoMorph (all p<0.0001). Median λ in Caledonia without exclusions was 3.55\% for DART and 12.65\% for AutoMorph, and improved to up to 1.67\% and 6.64\% with quality-based exclusions, respectively. Quality exclusions primarily mitigated large outliers. Worst quality in an eye correlated strongly with λ (Pearson 0.5350-0.7550, depending on dataset and method, all p<0.0001). Conclusions: Repeatability was sufficient for individual-level predictions in heterogeneous populations. DART performed better on all metrics and might be able to detect small, longitudinal changes, highlighting the potential of robust methods.
△ Less
Submitted 11 March, 2024;
originally announced March 2024.
-
Choroidalyzer: An open-source, end-to-end pipeline for choroidal analysis in optical coherence tomography
Authors:
Justin Engelmann,
Jamie Burke,
Charlene Hamid,
Megan Reid-Schachter,
Dan Pugh,
Neeraj Dhaun,
Diana Moukaddem,
Lyle Gray,
Niall Strang,
Paul McGraw,
Amos Storkey,
Paul J. Steptoe,
Stuart King,
Tom MacGillivray,
Miguel O. Bernabeu,
Ian J. C. MacCormick
Abstract:
Purpose: To develop Choroidalyzer, an open-source, end-to-end pipeline for segmenting the choroid region, vessels, and fovea, and deriving choroidal thickness, area, and vascular index.
Methods: We used 5,600 OCT B-scans (233 subjects, 6 systemic disease cohorts, 3 device types, 2 manufacturers). To generate region and vessel ground-truths, we used state-of-the-art automatic methods following ma…
▽ More
Purpose: To develop Choroidalyzer, an open-source, end-to-end pipeline for segmenting the choroid region, vessels, and fovea, and deriving choroidal thickness, area, and vascular index.
Methods: We used 5,600 OCT B-scans (233 subjects, 6 systemic disease cohorts, 3 device types, 2 manufacturers). To generate region and vessel ground-truths, we used state-of-the-art automatic methods following manual correction of inaccurate segmentations, with foveal positions manually annotated. We trained a U-Net deep-learning model to detect the region, vessels, and fovea to calculate choroid thickness, area, and vascular index in a fovea-centred region of interest. We analysed segmentation agreement (AUC, Dice) and choroid metrics agreement (Pearson, Spearman, mean absolute error (MAE)) in internal and external test sets. We compared Choroidalyzer to two manual graders on a small subset of external test images and examined cases of high error.
Results: Choroidalyzer took 0.299 seconds per image on a standard laptop and achieved excellent region (Dice: internal 0.9789, external 0.9749), very good vessel segmentation performance (Dice: internal 0.8817, external 0.8703) and excellent fovea location prediction (MAE: internal 3.9 pixels, external 3.4 pixels). For thickness, area, and vascular index, Pearson correlations were 0.9754, 0.9815, and 0.8285 (internal) / 0.9831, 0.9779, 0.7948 (external), respectively (all p<0.0001). Choroidalyzer's agreement with graders was comparable to the inter-grader agreement across all metrics.
Conclusions: Choroidalyzer is an open-source, end-to-end pipeline that accurately segments the choroid and reliably extracts thickness, area, and vascular index. Especially choroidal vessel segmentation is a difficult and subjective task, and fully-automatic methods like Choroidalyzer could provide objectivity and standardisation.
△ Less
Submitted 5 December, 2023;
originally announced December 2023.
-
Mixed Models with Multiple Instance Learning
Authors:
Jan P. Engelmann,
Alessandro Palma,
Jakub M. Tomczak,
Fabian J. Theis,
Francesco Paolo Casale
Abstract:
Predicting patient features from single-cell data can help identify cellular states implicated in health and disease. Linear models and average cell type expressions are typically favored for this task for their efficiency and robustness, but they overlook the rich cell heterogeneity inherent in single-cell data. To address this gap, we introduce MixMIL, a framework integrating Generalized Linear…
▽ More
Predicting patient features from single-cell data can help identify cellular states implicated in health and disease. Linear models and average cell type expressions are typically favored for this task for their efficiency and robustness, but they overlook the rich cell heterogeneity inherent in single-cell data. To address this gap, we introduce MixMIL, a framework integrating Generalized Linear Mixed Models (GLMM) and Multiple Instance Learning (MIL), upholding the advantages of linear models while modeling cell state heterogeneity. By leveraging predefined cell embeddings, MixMIL enhances computational efficiency and aligns with recent advancements in single-cell representation learning. Our empirical results reveal that MixMIL outperforms existing MIL models in single-cell datasets, uncovering new associations and elucidating biological mechanisms across different domains.
△ Less
Submitted 8 March, 2024; v1 submitted 4 November, 2023;
originally announced November 2023.
-
QuickQual: Lightweight, convenient retinal image quality scoring with off-the-shelf pretrained models
Authors:
Justin Engelmann,
Amos Storkey,
Miguel O. Bernabeu
Abstract:
Image quality remains a key problem for both traditional and deep learning (DL)-based approaches to retinal image analysis, but identifying poor quality images can be time consuming and subjective. Thus, automated methods for retinal image quality scoring (RIQS) are needed. The current state-of-the-art is MCFNet, composed of three Densenet121 backbones each operating in a different colour space. M…
▽ More
Image quality remains a key problem for both traditional and deep learning (DL)-based approaches to retinal image analysis, but identifying poor quality images can be time consuming and subjective. Thus, automated methods for retinal image quality scoring (RIQS) are needed. The current state-of-the-art is MCFNet, composed of three Densenet121 backbones each operating in a different colour space. MCFNet, and the EyeQ dataset released by the same authors, was a huge step forward for RIQS. We present QuickQual, a simple approach to RIQS, consisting of a single off-the-shelf ImageNet-pretrained Densenet121 backbone plus a Support Vector Machine (SVM). QuickQual performs very well, setting a new state-of-the-art for EyeQ (Accuracy: 88.50% vs 88.00% for MCFNet; AUC: 0.9687 vs 0.9588). This suggests that RIQS can be solved with generic perceptual features learned on natural images, as opposed to requiring DL models trained on large amounts of fundus images. Additionally, we propose a Fixed Prior linearisation scheme, that converts EyeQ from a 3-way classification to a continuous logistic regression task. For this task, we present a second model, QuickQual MEga Minified Estimator (QuickQual-MEME), that consists of only 10 parameters on top of an off-the-shelf Densenet121 and can distinguish between gradable and ungradable images with an accuracy of 89.18% (AUC: 0.9537). Code and model are available on GitHub: https://github.com/justinengelmann/QuickQual . QuickQual is so lightweight, that the entire inference code (and even the parameters for QuickQual-MEME) is already contained in this paper.
△ Less
Submitted 25 July, 2023;
originally announced July 2023.
-
An open-source deep learning algorithm for efficient and fully-automatic analysis of the choroid in optical coherence tomography
Authors:
Jamie Burke,
Justin Engelmann,
Charlene Hamid,
Megan Reid-Schachter,
Tom Pearson,
Dan Pugh,
Neeraj Dhaun,
Stuart King,
Tom MacGillivray,
Miguel O. Bernabeu,
Amos Storkey,
Ian J. C. MacCormick
Abstract:
Purpose: To develop an open-source, fully-automatic deep learning algorithm, DeepGPET, for choroid region segmentation in optical coherence tomography (OCT) data. Methods: We used a dataset of 715 OCT B-scans (82 subjects, 115 eyes) from 3 clinical studies related to systemic disease. Ground truth segmentations were generated using a clinically validated, semi-automatic choroid segmentation method…
▽ More
Purpose: To develop an open-source, fully-automatic deep learning algorithm, DeepGPET, for choroid region segmentation in optical coherence tomography (OCT) data. Methods: We used a dataset of 715 OCT B-scans (82 subjects, 115 eyes) from 3 clinical studies related to systemic disease. Ground truth segmentations were generated using a clinically validated, semi-automatic choroid segmentation method, Gaussian Process Edge Tracing (GPET). We finetuned a UNet with MobileNetV3 backbone pre-trained on ImageNet. Standard segmentation agreement metrics, as well as derived measures of choroidal thickness and area, were used to evaluate DeepGPET, alongside qualitative evaluation from a clinical ophthalmologist. Results: DeepGPET achieves excellent agreement with GPET on data from 3 clinical studies (AUC=0.9994, Dice=0.9664; Pearson correlation of 0.8908 for choroidal thickness and 0.9082 for choroidal area), while reducing the mean processing time per image on a standard laptop CPU from 34.49s ($\pm$15.09) using GPET to 1.25s ($\pm$0.10) using DeepGPET. Both methods performed similarly according to a clinical ophthalmologist, who qualitatively judged a subset of segmentations by GPET and DeepGPET, based on smoothness and accuracy of segmentations. Conclusions: DeepGPET, a fully-automatic, open-source algorithm for choroidal segmentation, will enable researchers to efficiently extract choroidal measurements, even for large datasets. As no manual interventions are required, DeepGPET is less subjective than semi-automatic methods and could be deployed in clinical practice without necessitating a trained operator.
△ Less
Submitted 29 October, 2023; v1 submitted 3 July, 2023;
originally announced July 2023.
-
Uncertainty Quantification for Atlas-Level Cell Type Transfer
Authors:
Jan Engelmann,
Leon Hetzel,
Giovanni Palla,
Lisa Sikkema,
Malte Luecken,
Fabian Theis
Abstract:
Single-cell reference atlases are large-scale, cell-level maps that capture cellular heterogeneity within an organ using single cell genomics. Given their size and cellular diversity, these atlases serve as high-quality training data for the transfer of cell type labels to new datasets. Such label transfer, however, must be robust to domain shifts in gene expression due to measurement technique, l…
▽ More
Single-cell reference atlases are large-scale, cell-level maps that capture cellular heterogeneity within an organ using single cell genomics. Given their size and cellular diversity, these atlases serve as high-quality training data for the transfer of cell type labels to new datasets. Such label transfer, however, must be robust to domain shifts in gene expression due to measurement technique, lab specifics and more general batch effects. This requires methods that provide uncertainty estimates on the cell type predictions to ensure correct interpretation. Here, for the first time, we introduce uncertainty quantification methods for cell type classification on single-cell reference atlases. We benchmark four model classes and show that currently used models lack calibration, robustness, and actionable uncertainty scores. Furthermore, we demonstrate how models that quantify uncertainty are better suited to detect unseen cell types in the setting of atlas-level cell type transfer.
△ Less
Submitted 7 November, 2022;
originally announced November 2022.
-
Robust and efficient computation of retinal fractal dimension through deep approximation
Authors:
Justin Engelmann,
Ana Villaplana-Velasco,
Amos Storkey,
Miguel O. Bernabeu
Abstract:
A retinal trait, or phenotype, summarises a specific aspect of a retinal image in a single number. This can then be used for further analyses, e.g. with statistical methods. However, reducing an aspect of a complex image to a single, meaningful number is challenging. Thus, methods for calculating retinal traits tend to be complex, multi-step pipelines that can only be applied to high quality image…
▽ More
A retinal trait, or phenotype, summarises a specific aspect of a retinal image in a single number. This can then be used for further analyses, e.g. with statistical methods. However, reducing an aspect of a complex image to a single, meaningful number is challenging. Thus, methods for calculating retinal traits tend to be complex, multi-step pipelines that can only be applied to high quality images. This means that researchers often have to discard substantial portions of the available data. We hypothesise that such pipelines can be approximated with a single, simpler step that can be made robust to common quality issues. We propose Deep Approximation of Retinal Traits (DART) where a deep neural network is used predict the output of an existing pipeline on high quality images from synthetically degraded versions of these images. We demonstrate DART on retinal Fractal Dimension (FD) calculated by VAMPIRE, using retinal images from UK Biobank that previous work identified as high quality. Our method shows very high agreement with FD VAMPIRE on unseen test images (Pearson r=0.9572). Even when those images are severely degraded, DART can still recover an FD estimate that shows good agreement with FD VAMPIRE obtained from the original images (Pearson r=0.8817). This suggests that our method could enable researchers to discard fewer images in the future. Our method can compute FD for over 1,000img/s using a single GPU. We consider these to be very encouraging initial results and hope to develop this approach into a useful tool for retinal analysis.
△ Less
Submitted 12 July, 2022;
originally announced July 2022.
-
Detection of multiple retinal diseases in ultra-widefield fundus images using deep learning: data-driven identification of relevant regions
Authors:
Justin Engelmann,
Alice D. McTrusty,
Ian J. C. MacCormick,
Emma Pead,
Amos Storkey,
Miguel O. Bernabeu
Abstract:
Ultra-widefield (UWF) imaging is a promising modality that captures a larger retinal field of view compared to traditional fundus photography. Previous studies showed that deep learning (DL) models are effective for detecting retinal disease in UWF images, but primarily considered individual diseases under less-than-realistic conditions (excluding images with other diseases, artefacts, comorbiditi…
▽ More
Ultra-widefield (UWF) imaging is a promising modality that captures a larger retinal field of view compared to traditional fundus photography. Previous studies showed that deep learning (DL) models are effective for detecting retinal disease in UWF images, but primarily considered individual diseases under less-than-realistic conditions (excluding images with other diseases, artefacts, comorbidities, or borderline cases; and balancing healthy and diseased images) and did not systematically investigate which regions of the UWF images are relevant for disease detection. We first improve on the state of the field by proposing a DL model that can recognise multiple retinal diseases under more realistic conditions. We then use global explainability methods to identify which regions of the UWF images the model generally attends to. Our model performs very well, separating between healthy and diseased retinas with an area under the curve (AUC) of 0.9206 on an internal test set, and an AUC of 0.9841 on a challenging, external test set. When diagnosing specific diseases, the model attends to regions where we would expect those diseases to occur. We further identify the posterior pole as the most important region in a purely data-driven fashion. Surprisingly, 10% of the image around the posterior pole is sufficient for achieving comparable performance to having the full images available.
△ Less
Submitted 11 March, 2022;
originally announced March 2022.