-
Seamless Interaction: Dyadic Audiovisual Motion Modeling and Large-Scale Dataset
Authors:
Vasu Agrawal,
Akinniyi Akinyemi,
Kathryn Alvero,
Morteza Behrooz,
Julia Buffalini,
Fabio Maria Carlucci,
Joy Chen,
Junming Chen,
Zhang Chen,
Shiyang Cheng,
Praveen Chowdary,
Joe Chuang,
Antony D'Avirro,
Jon Daly,
Ning Dong,
Mark Duppenthaler,
Cynthia Gao,
Jeff Girard,
Martin Gleize,
Sahir Gomez,
Hongyu Gong,
Srivathsan Govindarajan,
Brandon Han,
Sen He,
Denise Hernandez,
et al. (59 additional authors not shown)
Abstract:
Human communication involves a complex interplay of verbal and nonverbal signals, essential for conveying meaning and achieving interpersonal goals. To develop socially intelligent AI technologies, it is crucial to develop models that can both comprehend and generate dyadic behavioral dynamics. To this end, we introduce the Seamless Interaction Dataset, a large-scale collection of over 4,000 hours of face-to-face interaction footage from over 4,000 participants in diverse contexts. This dataset enables the development of AI technologies that understand dyadic embodied dynamics, unlocking breakthroughs in virtual agents, telepresence experiences, and multimodal content analysis tools. We also develop a suite of models that utilize the dataset to generate dyadic motion gestures and facial expressions aligned with human speech. These models can take as input both the speech and visual behavior of their interlocutors. We present a variant with speech from an LLM and integrations with 2D and 3D rendering methods, bringing us closer to interactive virtual agents. Additionally, we describe controllable variants of our motion models that can adapt emotional responses and expressivity levels and generate more semantically relevant gestures. Finally, we discuss methods for assessing the quality of these dyadic motion models, which demonstrate the potential for more intuitive and responsive human-AI interactions.
Submitted 30 June, 2025; v1 submitted 27 June, 2025;
originally announced June 2025.
-
AI to Identify Strain-sensitive Regions of the Optic Nerve Head Linked to Functional Loss in Glaucoma
Authors:
Thanadet Chuangsuwanich,
Monisha E. Nongpiur,
Fabian A. Braeu,
Tin A. Tun,
Alexandre Thiery,
Shamira Perera,
Ching Lin Ho,
Martin Buist,
George Barbastathis,
Tin Aung,
Michaël J. A. Girard
Abstract:
Objective: (1) To assess whether ONH biomechanics improves prediction of three progressive visual field loss patterns in glaucoma; (2) to use explainable AI to identify strain-sensitive ONH regions contributing to these predictions.
Methods: We recruited 237 glaucoma subjects. The ONH of one eye was imaged under two conditions: (1) primary gaze and (2) primary gaze with IOP elevated to ~35 mmHg via ophthalmo-dynamometry. Glaucoma experts classified the subjects into four categories based on the presence of specific visual field defects: (1) superior nasal step (N=26), (2) superior partial arcuate (N=62), (3) full superior hemifield defect (N=25), and (4) other/non-specific defects (N=124). Automatic ONH tissue segmentation and digital volume correlation were used to compute IOP-induced neural tissue and lamina cribrosa (LC) strains. Biomechanical and structural features were input to a Geometric Deep Learning model. Three classification tasks were performed to detect: (1) superior nasal step, (2) superior partial arcuate, (3) full superior hemifield defect. For each task, the data were split into 80% training and 20% testing sets. Area under the curve (AUC) was used to assess performance. Explainable AI techniques were employed to highlight the ONH regions most critical to each classification.
Results: Models achieved high AUCs of 0.77-0.88, showing that ONH strain improved VF loss prediction beyond morphology alone. The inferior and inferotemporal rim were identified as key strain-sensitive regions, contributing most to visual field loss prediction and showing progressive expansion with increasing disease severity.
Conclusion and Relevance: ONH strain enhances prediction of glaucomatous VF loss patterns. Neuroretinal rim, rather than the LC, was the most critical region contributing to model predictions.
Submitted 9 June, 2025;
originally announced June 2025.
-
3D Structural Phenotype of the Optic Nerve Head at the Intersection of Glaucoma and Myopia -- A Key to Improving Glaucoma Diagnosis in Myopic Populations
Authors:
Swati Sharma,
Fabian A. Braeu,
Thanadet Chuangsuwanich,
Tin A. Tun,
Quan V Hoang,
Rachel Chong,
Shamira Perera,
Ching-Lin Ho,
Rahat Husain,
Martin L. Buist,
Tin Aung,
Michaël J. A. Girard
Abstract:
Purpose: To characterize the 3D structural phenotypes of the optic nerve head (ONH) in patients with glaucoma, high myopia, and concurrent high myopia and glaucoma, and to evaluate their variations across these conditions. Participants: A total of 685 optical coherence tomography (OCT) scans from 754 subjects of Singapore-Chinese ethnicity, including 256 healthy (H), 94 highly myopic (HM), 227 glaucomatous (G), and 108 highly myopic with glaucoma (HMG) cases. Methods: We segmented the retinal and connective tissues from OCT volumes and their boundary edges were converted into 3D point clouds. To classify the 3D point clouds into four ONH conditions, i.e., H, HM, G, and HMG, a specialized ensemble network was developed, consisting of an encoder to transform high-dimensional input data into a compressed latent vector, a decoder to reconstruct point clouds from the latent vector, and a classifier to categorize the point clouds into the four ONH conditions. Results: The classification network achieved high accuracy, distinguishing H, HM, G, and HMG classes with a micro-average AUC of 0.92 $\pm$ 0.03 on an independent test set. The decoder effectively reconstructed point clouds, achieving a Chamfer loss of 0.013 $\pm$ 0.002. Dimensionality reduction clustered ONHs into four distinct groups, revealing structural variations such as changes in retinal and connective tissue thickness, tilting and stretching of the disc and scleral canal opening, and alterations in optic cup morphology, including shallow or deep excavation, across the four conditions. Conclusions: This study demonstrated that ONHs exhibit distinct structural signatures across H, HM, G, and HMG conditions. The findings further indicate that ONH morphology provides sufficient information for classification into distinct clusters, with principal components capturing unique structural patterns within each group.
Submitted 24 March, 2025;
originally announced March 2025.
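The abstract above reports reconstruction quality as a Chamfer loss between input and decoded point clouds. As a rough illustration only, the following sketch computes one common form of the symmetric Chamfer distance (mean squared nearest-neighbour distance in each direction, summed). The abstract does not state which variant or normalization the authors used, so the function name `chamfer_distance` and the squared-distance formulation are assumptions.

```python
from math import dist


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumed variant: mean of squared nearest-neighbour distances from
    a to b, plus the same quantity from b to a.
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def one_way(src, dst):
        # For every source point, find the squared distance to its
        # closest destination point, then average over the source set.
        return sum(min(dist(p, q) ** 2 for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


# Hypothetical toy point clouds, not data from the paper.
original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reconstructed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.1)]
loss = chamfer_distance(original, reconstructed)
```

This brute-force version is quadratic in the number of points; a spatial index would be the usual choice for clouds of realistic size.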
-
LlaMADRS: Prompting Large Language Models for Interview-Based Depression Assessment
Authors:
Gaoussou Youssouf Kebe,
Jeffrey M. Girard,
Einat Liebenthal,
Justin Baker,
Fernando De la Torre,
Louis-Philippe Morency
Abstract:
This study introduces LlaMADRS, a novel framework leveraging open-source Large Language Models (LLMs) to automate depression severity assessment using the Montgomery-Asberg Depression Rating Scale (MADRS). We employ a zero-shot prompting strategy with carefully designed cues to guide the model in interpreting and scoring transcribed clinical interviews. Our approach, tested on 236 real-world interviews from the Context-Adaptive Multimodal Informatics (CAMI) dataset, demonstrates strong correlations with clinician assessments. The Qwen 2.5--72b model achieves near-human level agreement across most MADRS items, with Intraclass Correlation Coefficients (ICC) closely approaching those between human raters. We provide a comprehensive analysis of model performance across different MADRS items, highlighting strengths and current limitations. Our findings suggest that LLMs, with appropriate prompting, can serve as efficient tools for mental health assessment, potentially increasing accessibility in resource-limited settings. However, challenges remain, particularly in assessing symptoms that rely on non-verbal cues, underscoring the need for multimodal approaches in future work.
Submitted 7 January, 2025;
originally announced January 2025.
-
CaBRNet, an open-source library for developing and evaluating Case-Based Reasoning Models
Authors:
Romain Xu-Darme,
Aymeric Varasse,
Alban Grastien,
Julien Girard,
Zakaria Chihani
Abstract:
In the field of explainable AI, a vibrant effort is dedicated to the design of self-explainable models, as a more principled alternative to post-hoc methods that attempt to explain the decisions after a model opaquely makes them. However, this productive line of research suffers from common downsides: lack of reproducibility, unfeasible comparison, diverging standards. In this paper, we propose CaBRNet, an open-source, modular, backward-compatible framework for Case-Based Reasoning Networks: https://github.com/aiser-team/cabrnet.
Submitted 25 September, 2024;
originally announced September 2024.
-
Introducing the Biomechanics-Function Relationship in Glaucoma: Improved Visual Field Loss Predictions from intraocular pressure-induced Neural Tissue Strains
Authors:
Thanadet Chuangsuwanich,
Monisha E. Nongpiur,
Fabian A. Braeu,
Tin A. Tun,
Alexandre Thiery,
Shamira Perera,
Ching Lin Ho,
Martin Buist,
George Barbastathis,
Tin Aung,
Michaël J. A. Girard
Abstract:
Objective. (1) To assess whether neural tissue structure and biomechanics could predict functional loss in glaucoma; (2) To evaluate the importance of biomechanics in making such predictions. Design, Setting and Participants. We recruited 238 glaucoma subjects. For one eye of each subject, we imaged the optic nerve head (ONH) using spectral-domain OCT under the following conditions: (1) primary gaze and (2) primary gaze with acute IOP elevation. Main Outcomes: We utilized automatic segmentation of optic nerve head (ONH) tissues and digital volume correlation (DVC) analysis to compute intraocular pressure (IOP)-induced neural tissue strains. A robust geometric deep learning approach, known as PointNet, was employed to predict the full Humphrey 24-2 pattern standard deviation (PSD) maps from ONH structural and biomechanical information. For each point in each PSD map, we predicted whether it exhibited no defect or a PSD value of less than 5%. Predictive performance was evaluated using 5-fold cross-validation and the F1-score. We compared the model's performance with and without the inclusion of IOP-induced strains to assess the impact of biomechanics on prediction accuracy. Results: Integrating biomechanical (IOP-induced neural tissue strains) and structural (tissue morphology and neural tissue thickness) information yielded a significantly better predictive model (F1-score: 0.76 $\pm$ 0.02) across validation subjects, as opposed to relying only on structural information, which resulted in a significantly lower F1-score of 0.71 $\pm$ 0.02 (p < 0.05). Conclusion: Our study has shown that the integration of biomechanical data can significantly improve the accuracy of visual field loss predictions. This highlights the importance of the biomechanics-function relationship in glaucoma, and suggests that biomechanics may serve as a crucial indicator for the development and progression of glaucoma.
Submitted 21 June, 2024;
originally announced June 2024.
-
3D Growth and Remodeling Theory Supports the Hypothesis of Staphyloma Formation from Local Scleral Weakening under Normal Intraocular Pressure
Authors:
Fabian A. Braeu,
Stéphane Avril,
Michaël J. A. Girard
Abstract:
$\bf{Purpose}$: To assess whether Growth & Remodeling (G&R) theory could explain staphyloma formation from a local scleral weakening.
$\bf{Methods}$: A finite element model of a healthy eye was reconstructed, including the following connective tissues: the lamina cribrosa, the peripapillary sclera, and the peripheral sclera. The scleral shell was modelled as a constrained mixture, consisting of an isotropic ground matrix and two collagen fiber families (circumferential and meridional). The homogenized constrained mixture model was employed to simulate the adaptation of the sclera to alterations in its biomechanical environment over a duration of 13.7 years. G&R processes were triggered by reducing the shear stiffness of the ground matrix in the peripapillary sclera and lamina cribrosa by 85%. Three distinct G&R scenarios were investigated: (1) low mass turnover rate in combination with transmural volumetric growth; (2) high mass turnover rate in combination with transmural volumetric growth; and (3) high mass turnover rate in combination with mass density growth.
$\bf{Results}$: In scenario 1, we observed a significant outpouching of the posterior pole, closely resembling the shape of a Type-III staphyloma. Additionally, we found a notable change in scleral curvature and a thinning of the peripapillary sclera by 84%. In contrast, scenarios 2 and 3 exhibited less drastic deformations, with stable posterior staphylomas after approximately 7 years.
$\bf{Conclusions}$: Our framework suggests that local scleral weakening is sufficient to trigger staphyloma formation under normal intraocular pressure. With patient-specific scleral geometries (obtainable via wide-field optical coherence tomography), our framework could aid in identifying individuals at risk of developing posterior staphylomas.
Submitted 4 April, 2024;
originally announced April 2024.
-
Deep learning-based deconvolution for interferometric radio transient reconstruction
Authors:
Benjamin Naoto Chiche,
Julien N. Girard,
Joana Frontera-Pons,
Arnaud Woiselle,
Jean-Luc Starck
Abstract:
Radio astronomy is currently thriving with new large ground-based radio telescopes coming online in preparation for the upcoming Square Kilometre Array (SKA). Facilities like LOFAR, MeerKAT/SKA, ASKAP/SKA, and the future SKA-LOW bring tremendous sensitivity in time and frequency, improved angular resolution, and also high-rate data streams that need to be processed. They enable advanced studies of radio transients, volatile by nature, that can be detected or missed in the data. These transients are markers of high-energy accelerations of electrons and manifest in a wide range of temporal scales. Usually studied with dynamic spectroscopy of time series analysis, there is a motivation to search for such sources in large interferometric datasets. This requires efficient and robust signal reconstruction algorithms. To correctly account for the temporal dependency of the data, we improve the classical image deconvolution inverse problem by adding the temporal dependency in the reconstruction problem. Then, we introduce two novel neural network architectures that can do both spatial and temporal modeling of the data and the instrumental response. Then, we simulate representative time-dependent image cubes of point source distributions and realistic telescope pointings of MeerKAT to generate toy models to build the training, validation, and test datasets. Finally, based on the test data, we evaluate the source profile reconstruction performance of the proposed methods and classical image deconvolution algorithm CLEAN applied frame-by-frame. In the presence of increasing noise level in data frame, the proposed methods display a high level of robustness compared to frame-by-frame imaging with CLEAN. The deconvolved image cubes bring a factor of 3 improvement in fidelity of the recovered temporal profiles and a factor of 2 improvement in background denoising.
Submitted 24 June, 2023;
originally announced June 2023.
-
The 3D Structural Phenotype of the Glaucomatous Optic Nerve Head and its Relationship with The Severity of Visual Field Damage
Authors:
Fabian A. Braeu,
Thanadet Chuangsuwanich,
Tin A. Tun,
Shamira A. Perera,
Rahat Husain,
Aiste Kadziauskiene,
Leopold Schmetterer,
Alexandre H. Thiéry,
George Barbastathis,
Tin Aung,
Michaël J. A. Girard
Abstract:
$\bf{Purpose}$: To describe the 3D structural changes in both connective and neural tissues of the optic nerve head (ONH) that occur concurrently at different stages of glaucoma using traditional and AI-driven approaches.
$\bf{Methods}$: We included 213 normal, 204 mild glaucoma (mean deviation [MD] $\ge$ -6.00 dB), 118 moderate glaucoma (MD of -6.01 to -12.00 dB), and 118 advanced glaucoma patients (MD < -12.00 dB). All subjects had their ONHs imaged in 3D with Spectralis optical coherence tomography. To describe the 3D structural phenotype of glaucoma as a function of severity, we used two different approaches: (1) We extracted human-defined 3D structural parameters of the ONH including retinal nerve fiber layer (RNFL) thickness, lamina cribrosa (LC) shape and depth at different stages of glaucoma; (2) we also employed a geometric deep learning method (i.e. PointNet) to identify the most important 3D structural features that differentiate ONHs from different glaucoma severity groups without any human input.
$\bf{Results}$: We observed that the majority of ONH structural changes occurred in the early glaucoma stage, followed by a plateau effect in the later stages. Using PointNet, we also found that 3D ONH structural changes were present in both neural and connective tissues. In both approaches, we observed that structural changes were more prominent in the superior and inferior quadrant of the ONH, particularly in the RNFL, the prelamina, and the LC. As the severity of glaucoma increased, these changes became more diffuse (i.e. widespread), particularly in the LC.
$\bf{Conclusions}$: In this study, we were able to uncover complex 3D structural changes of the ONH in both neural and connective tissues as a function of glaucoma severity. We hope to provide new insights into the complex pathophysiology of glaucoma that might help clinicians in their daily clinical care.
Submitted 7 January, 2023;
originally announced January 2023.
-
Are Macula or Optic Nerve Head Structures better at Diagnosing Glaucoma? An Answer using AI and Wide-Field Optical Coherence Tomography
Authors:
Charis Y. N. Chiang,
Fabian Braeu,
Thanadet Chuangsuwanich,
Royston K. Y. Tan,
Jacqueline Chua,
Leopold Schmetterer,
Alexandre Thiery,
Martin Buist,
Michaël J. A. Girard
Abstract:
Purpose: (1) To develop a deep learning algorithm to automatically segment structures of the optic nerve head (ONH) and macula in 3D wide-field optical coherence tomography (OCT) scans; (2) To assess whether 3D macula or ONH structures (or the combination of both) provide the best diagnostic power for glaucoma. Methods: A cross-sectional comparative study was performed which included wide-field swept-source OCT scans from 319 glaucoma subjects and 298 non-glaucoma subjects. All scans were compensated to improve deep-tissue visibility. We developed a deep learning algorithm to automatically label all major ONH tissue structures by using 270 manually annotated B-scans for training. The performance of our algorithm was assessed using the Dice coefficient (DC). A glaucoma classification algorithm (3D CNN) was then designed using a combination of 500 OCT volumes and their corresponding automatically segmented masks. This algorithm was trained and tested on 3 datasets: OCT scans cropped to contain the macular tissues only, those to contain the ONH tissues only, and the full wide-field OCT scans. The classification performance for each dataset was reported using the AUC. Results: Our segmentation algorithm was able to segment ONH and macular tissues with a DC of 0.94 $\pm$ 0.003. The classification algorithm was best able to diagnose glaucoma using wide-field 3D-OCT volumes with an AUC of 0.99 $\pm$ 0.01, followed by ONH volumes with an AUC of 0.93 $\pm$ 0.06, and finally macular volumes with an AUC of 0.91 $\pm$ 0.11. Conclusions: This study showed that using wide-field OCT as compared to the typical OCT images containing just the ONH or macula may allow for a significantly improved glaucoma diagnosis. This may encourage the mainstream adoption of 3D wide-field OCT scans. For clinical AI studies that use traditional machines, we would recommend the use of ONH scans as opposed to macula scans.
Submitted 12 October, 2022;
originally announced October 2022.
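The segmentation results above are reported with the Dice coefficient (DC). For readers unfamiliar with the metric, here is a minimal sketch for two binary masks given as flat sequences of 0/1 values. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions for illustration, and the abstract does not say how the authors aggregated the score across tissue classes.

```python
def dice_coefficient(pred, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of equal length; any truthy value counts
    as foreground.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")

    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 4-pixel masks, not data from the paper.
predicted = [1, 1, 0, 0]
reference = [1, 0, 0, 0]
score = dice_coefficient(predicted, reference)
```

A multi-class segmentation would typically apply this per tissue label and then average, but that aggregation step is not shown here.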
-
AI-based Clinical Assessment of Optic Nerve Head Robustness Superseding Biomechanical Testing
Authors:
Fabian A. Braeu,
Thanadet Chuangsuwanich,
Tin A. Tun,
Alexandre H. Thiery,
Tin Aung,
George Barbastathis,
Michaël J. A. Girard
Abstract:
$\mathbf{Purpose}$: To use artificial intelligence (AI) to: (1) exploit biomechanical knowledge of the optic nerve head (ONH) from a relatively large population; (2) assess ONH robustness from a single optical coherence tomography (OCT) scan of the ONH; (3) identify what critical three-dimensional (3D) structural features make a given ONH robust.
$\mathbf{Design}$: Retrospective cross-sectional study.
$\mathbf{Methods}$: 316 subjects had their ONHs imaged with OCT before and after acute intraocular pressure (IOP) elevation through ophthalmo-dynamometry. IOP-induced lamina cribrosa (LC) deformations were then mapped in 3D and used to classify ONHs. ONHs with LC deformations greater than 4% were considered fragile, while those with deformations less than 4% were considered robust. Learning from these data, we compared three AI algorithms to predict ONH robustness strictly from a baseline (undeformed) OCT volume: (1) a random forest classifier; (2) an autoencoder; and (3) a dynamic graph CNN (DGCNN). The latter algorithm also allowed us to identify what critical 3D structural features make a given ONH robust.
$\mathbf{Results}$: All 3 methods were able to predict ONH robustness from 3D structural information alone and without the need to perform biomechanical testing. The DGCNN (area under the receiver operating curve [AUC]: 0.76 $\pm$ 0.08) outperformed the autoencoder (AUC: 0.70 $\pm$ 0.07) and the random forest classifier (AUC: 0.69 $\pm$ 0.05). Interestingly, to assess ONH robustness, the DGCNN mainly used information from the scleral canal and the LC insertion sites.
$\mathbf{Conclusions}$: We propose an AI-driven approach that can assess the robustness of a given ONH solely from a single OCT scan of the ONH, and without the need to perform biomechanical testing. Longitudinal studies should establish whether ONH robustness could help us identify fast visual field loss progressors.
Submitted 9 June, 2022;
originally announced June 2022.
-
Medical Application of Geometric Deep Learning for the Diagnosis of Glaucoma
Authors:
Alexandre H. Thiery,
Fabian Braeu,
Tin A. Tun,
Tin Aung,
Michael J. A. Girard
Abstract:
Purpose: (1) To assess the performance of geometric deep learning (PointNet) in diagnosing glaucoma from a single optical coherence tomography (OCT) 3D scan of the optic nerve head (ONH); (2) To compare its performance to that obtained with a standard 3D convolutional neural network (CNN), and with a gold-standard glaucoma parameter, i.e. retinal nerve fiber layer (RNFL) thickness.
Methods: 3D raster scans of the ONH were acquired with Spectralis OCT for 477 glaucoma and 2,296 non-glaucoma subjects at the Singapore National Eye Centre. All volumes were automatically segmented using deep learning to identify 7 major neural and connective tissues including the RNFL, the prelamina, and the lamina cribrosa (LC). Each ONH was then represented as a 3D point cloud with 1,000 points chosen randomly from all tissue boundaries. To simplify the problem, all ONH point clouds were aligned with respect to the plane and center of Bruch's membrane opening. Geometric deep learning (PointNet) was then used to provide a glaucoma diagnosis from a single OCT point cloud. The performance of our approach was compared to that obtained with a 3D CNN, and with RNFL thickness.
Results: PointNet was able to provide a robust glaucoma diagnosis solely from the ONH represented as a 3D point cloud (AUC=95%). The performance of PointNet was superior to that obtained with a standard 3D CNN (AUC=87%) and with that obtained from RNFL thickness alone (AUC=80%).
Discussion: We provide a proof-of-principle for the application of geometric deep learning in the field of glaucoma. Our technique requires significantly less information as input to perform better than a 3D CNN, and with an AUC superior to that obtained from RNFL thickness alone. Geometric deep learning may have wide applicability in the field of Ophthalmology.
Submitted 14 April, 2022;
originally announced April 2022.
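The Methods above describe representing each ONH as a 3D point cloud of 1,000 points chosen randomly from tissue boundaries and aligned to the center of Bruch's membrane opening (BMO). The sketch below illustrates only the random sampling and the translation to the BMO center. The rotation that aligns the cloud with the BMO plane is omitted, and the function name `sample_and_center`, its parameters, and the fallback of sampling with replacement for small inputs are assumptions, not the authors' code.

```python
import random


def sample_and_center(points, bmo_center, n_points=1000, seed=0):
    """Randomly pick n_points boundary points and shift them so that
    the BMO center becomes the origin.

    points: list of (x, y, z) tuples from all tissue boundaries.
    bmo_center: (x, y, z) tuple, assumed to be known beforehand.
    """
    if not points:
        raise ValueError("points must be non-empty")

    rng = random.Random(seed)  # seeded for reproducible sampling
    if len(points) >= n_points:
        chosen = rng.sample(points, n_points)  # without replacement
    else:
        # Assumed fallback: sample with replacement if too few points.
        chosen = [rng.choice(points) for _ in range(n_points)]

    cx, cy, cz = bmo_center
    return [(x - cx, y - cy, z - cz) for x, y, z in chosen]


# Hypothetical boundary points along a line, not data from the paper.
boundary = [(float(i), 0.0, 0.0) for i in range(2000)]
cloud = sample_and_center(boundary, bmo_center=(10.0, 0.0, 0.0))
```

Fixing the point count gives every ONH the same input size, which is what a PointNet-style model expects.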
-
Geometric Deep Learning to Identify the Critical 3D Structural Features of the Optic Nerve Head for Glaucoma Diagnosis
Authors:
Fabian A. Braeu,
Alexandre H. Thiéry,
Tin A. Tun,
Aiste Kadziauskiene,
George Barbastathis,
Tin Aung,
Michaël J. A. Girard
Abstract:
Purpose: The optic nerve head (ONH) undergoes complex and deep 3D morphological changes during the development and progression of glaucoma. Optical coherence tomography (OCT) is the current gold standard to visualize and quantify these changes, however the resulting 3D deep-tissue information has not yet been fully exploited for the diagnosis and prognosis of glaucoma. To this end, we aimed: (1) To compare the performance of two relatively recent geometric deep learning techniques in diagnosing glaucoma from a single OCT scan of the ONH; and (2) To identify the 3D structural features of the ONH that are critical for the diagnosis of glaucoma.
Methods: In this study, we included a total of 2,247 non-glaucoma and 2,259 glaucoma scans from 1,725 subjects. All subjects had their ONHs imaged in 3D with Spectralis OCT. All OCT scans were automatically segmented using deep learning to identify major neural and connective tissues. Each ONH was then represented as a 3D point cloud. We used PointNet and dynamic graph convolutional neural network (DGCNN) to diagnose glaucoma from such 3D ONH point clouds and to identify the critical 3D structural features of the ONH for glaucoma diagnosis.
Results: Both the DGCNN (AUC: 0.97$\pm$0.01) and PointNet (AUC: 0.95$\pm$0.02) were able to accurately detect glaucoma from 3D ONH point clouds. The critical points formed an hourglass pattern with most of them located in the inferior and superior quadrant of the ONH.
Discussion: The diagnostic accuracy of both geometric deep learning approaches was excellent. Moreover, we were able to identify the critical 3D structural features of the ONH for glaucoma diagnosis that tremendously improved the transparency and interpretability of our method. Consequently, our approach may have strong potential to be used in clinical applications for the diagnosis and prognosis of a wide range of ophthalmic disorders.
Submitted 20 April, 2022; v1 submitted 14 April, 2022;
originally announced April 2022.
-
Explainable and Interpretable Diabetic Retinopathy Classification Based on Neural-Symbolic Learning
Authors:
Se-In Jang,
Michael J. A. Girard,
Alexandre H. Thiery
Abstract:
In this paper, we propose an explainable and interpretable diabetic retinopathy (ExplainDR) classification model based on neural-symbolic learning. To gain explainability, a high-level symbolic representation should be considered in decision making. Specifically, we introduce a human-readable symbolic representation, which follows a taxonomy style of diabetic retinopathy characteristics related to eye health conditions to achieve explainability. We then include human-readable features obtained from the symbolic representation in the disease prediction. Experimental results on a diabetic retinopathy classification dataset show that our proposed ExplainDR method exhibits promising performance when compared to that from state-of-the-art methods applied to the IDRiD dataset, while also providing interpretability and explainability.
Submitted 31 March, 2022;
originally announced April 2022.
-
3D Structural Analysis of the Optic Nerve Head to Robustly Discriminate Between Papilledema and Optic Disc Drusen
Authors:
Michaël J. A. Girard,
Satish K. Panda,
Tin Aung Tun,
Elisabeth A. Wibroe,
Raymond P. Najjar,
Aung Tin,
Alexandre H. Thiéry,
Steffen Hamann,
Clare Fraser,
Dan Milea
Abstract:
Purpose: (1) To develop a deep learning algorithm to identify major tissue structures of the optic nerve head (ONH) in 3D optical coherence tomography (OCT) scans; (2) to exploit such information to robustly differentiate among healthy, optic disc drusen (ODD), and papilledema ONHs.
It was a cross-sectional comparative study with confirmed ODD (105 eyes), papilledema due to high intracranial pressure (51 eyes), and healthy controls (100 eyes). 3D scans of the ONHs were acquired using OCT, then processed to improve deep-tissue visibility. At first, a deep learning algorithm was developed using 984 B-scans (from 130 eyes) in order to identify: major neural/connective tissues, and ODD regions. The performance of our algorithm was assessed using the Dice coefficient (DC). In a 2nd step, a classification algorithm (random forest) was designed using 150 OCT volumes to perform 3-class classifications (1: ODD, 2: papilledema, 3: healthy) strictly from their drusen and prelamina swelling scores (derived from the segmentations). To assess performance, we reported the area under the receiver operating characteristic curves (AUCs) for each class.
Our segmentation algorithm was able to isolate neural and connective tissues, and ODD regions whenever present. This was confirmed by an average DC of 0.93$\pm$0.03 on the test set, corresponding to good performance. Classification was achieved with high AUCs, i.e. 0.99$\pm$0.01 for the detection of ODD, 0.99$\pm$0.01 for the detection of papilledema, and 0.98$\pm$0.02 for the detection of healthy ONHs.
Our AI approach accurately discriminated ODD from papilledema, using a single OCT scan. Our classification performance was excellent, with the caveat that validation in a much larger population is warranted. Our approach may have the potential to establish OCT as the mainstay of diagnostic imaging in neuro-ophthalmology.
Submitted 18 December, 2021;
originally announced December 2021.
-
The Three-Dimensional Structural Configuration of the Central Retinal Vessel Trunk and Branches as a Glaucoma Biomarker
Authors:
Satish K. Panda,
Haris Cheong,
Tin A. Tun,
Thanadet Chuangsuwanich,
Aiste Kadziauskiene,
Vijayalakshmi Senthil,
Ramaswami Krishnadas,
Martin L. Buist,
Shamira Perera,
Ching-Yu Cheng,
Tin Aung,
Alexandre H. Thiery,
Michael J. A. Girard
Abstract:
Purpose: To assess whether the three-dimensional (3D) structural configuration of the central retinal vessel trunk and its branches (CRVT&B) could be used as a diagnostic marker for glaucoma. Method: We trained a deep learning network to automatically segment the CRVT&B from the B-scans of the optical coherence tomography (OCT) volume of the optic nerve head (ONH). Subsequently, two different approaches were used for glaucoma diagnosis using the structural configuration of the CRVT&B as extracted from the OCT volumes. In the first approach, we aimed to provide a diagnosis using only a 3D CNN and the 3D structure of the CRVT&B. For the second approach, we projected the 3D structure of the CRVT&B orthographically onto three planes to obtain 2D images, and then a 2D CNN was used for diagnosis. The segmentation accuracy was evaluated using the Dice coefficient, whereas the diagnostic accuracy was assessed using the area under the receiver operating characteristic curves (AUC). The diagnostic performance of the CRVT&B was also compared with that of retinal nerve fiber layer (RNFL) thickness. Results: Our segmentation network was able to efficiently segment retinal blood vessels from OCT scans. On a test set, we achieved a Dice coefficient of 0.81$\pm$0.07. The 3D and 2D diagnostic networks were able to differentiate glaucoma from non-glaucoma subjects with accuracies of 82.7% and 83.3%, respectively. The corresponding AUCs for CRVT&B were 0.89 and 0.90, higher than those obtained with RNFL thickness alone. Conclusions: Our work demonstrated that the diagnostic power of the CRVT&B is superior to that of a gold-standard glaucoma parameter, i.e., RNFL thickness. Our work also suggested that the major retinal blood vessels form a skeleton -- the configuration of which may be representative of major ONH structural changes as typically observed with the development and progression of glaucoma.
Submitted 8 November, 2021; v1 submitted 7 November, 2021;
originally announced November 2021.
-
To Rate or Not To Rate: Investigating Evaluation Methods for Generated Co-Speech Gestures
Authors:
Pieter Wolfert,
Jeffrey M. Girard,
Taras Kucherenko,
Tony Belpaeme
Abstract:
While automatic performance metrics are crucial for machine learning of artificial human-like behaviour, the gold standard for evaluation remains human judgement. The subjective evaluation of artificial human-like behaviour in embodied conversational agents is, however, expensive, and little is known about the quality of the data it returns. Two approaches to subjective evaluation can be largely distinguished, one relying on ratings, the other on pairwise comparisons. In this study we use co-speech gestures to compare the two against each other and answer questions about their appropriateness for evaluation of artificial behaviour. We consider their ability to rate quality, but also aspects pertaining to the effort of use and the time required to collect subjective data. We use crowdsourcing to rate the quality of co-speech gestures in avatars, assessing which method picks up more detail in subjective assessments. We compared gestures generated by three different machine learning models with various levels of behavioural quality. We found that both approaches were able to rank the videos according to quality and that the rankings were significantly correlated, showing that in terms of quality there is no preference of one method over the other. We also found that pairwise comparisons were slightly faster and came with improved inter-rater reliability, suggesting that for small-scale studies pairwise comparisons are to be favoured over ratings.
Submitted 13 August, 2021; v1 submitted 12 August, 2021;
originally announced August 2021.
-
Terminologies, modèles de données archéologiques et thésaurus documentaires
Authors:
Sébastien Durost,
Guillaume Reich,
Jean-Pierre Girard
Abstract:
The HyperThésau and Bibracte numérique projects have given rise to a collective effort centred on the use of vocabulary as a means of ensuring the interoperability of archaeological data throughout its life cycle. To this end, the use of the standardised form of the thesaurus -- via the Opentheso platform -- provides a tool that is already adapted to Linked Data. Nevertheless, its use quickly raised the question of the different paradigms presiding over the elaboration of a specific vocabulary by each (group of) scientist(s). The ISO 25964 standard -- designed for the management and interoperability of indexing languages -- is flexible enough to permit the comparison and linking of different scientific or documentary "points of view". Making them coherent through interoperability alignments nevertheless requires interfacing different semantic granularities: search reporting, the description of raw data, and a gateway or "pivot" between the two, by using a regulated cooperation methodology. The challenges that remain to be met on this path do not prevent the thesaurus tool from already being a suitable support for complete "human-to-machine-to-human" interoperability, developed within the framework of the Bibracte Ville Ouverte project and exemplified through research on the ceramics of that archaeological site.
Submitted 6 July, 2021;
originally announced July 2021.
-
Describing the Structural Phenotype of the Glaucomatous Optic Nerve Head Using Artificial Intelligence
Authors:
Satish K. Panda,
Haris Cheong,
Tin A. Tun,
Sripad K. Devella,
Ramaswami Krishnadas,
Martin L. Buist,
Shamira Perera,
Ching-Yu Cheng,
Tin Aung,
Alexandre H. Thiéry,
Michaël J. A. Girard
Abstract:
The optic nerve head (ONH) typically experiences complex neural- and connective-tissue structural changes with the development and progression of glaucoma, and monitoring these changes could be critical for improved diagnosis and prognosis in the glaucoma clinic. The gold-standard technique to assess structural changes of the ONH clinically is optical coherence tomography (OCT). However, OCT is limited to the measurement of a few hand-engineered parameters, such as the thickness of the retinal nerve fiber layer (RNFL), and has not yet been qualified as a stand-alone device for glaucoma diagnosis and prognosis applications. We argue this is because the vast amount of information available in a 3D OCT scan of the ONH has not been fully exploited. In this study we propose a deep learning approach that can: (1) fully exploit information from an OCT scan of the ONH; (2) describe the structural phenotype of the glaucomatous ONH; and (3) be used as a robust glaucoma diagnosis tool. Specifically, the structural features identified by our algorithm were found to be related to clinical observations of glaucoma. The diagnostic accuracy from these structural features was $92.0 \pm 2.3 \%$ with a sensitivity of $90.0 \pm 2.4 \%$ (at $95 \%$ specificity). By changing their magnitudes in steps, we were able to reveal how the morphology of the ONH changes as one transitions from a 'non-glaucoma' to a 'glaucoma' condition. We believe our work may have strong clinical implications for our understanding of glaucoma pathogenesis, and could be improved in the future to also predict future loss of vision.
Submitted 17 December, 2020;
originally announced December 2020.
-
OCT-GAN: Single Step Shadow and Noise Removal from Optical Coherence Tomography Images of the Human Optic Nerve Head
Authors:
Haris Cheong,
Sripad Krishna Devalla,
Thanadet Chuangsuwanich,
Tin A. Tun,
Xiaofei Wang,
Tin Aung,
Leopold Schmetterer,
Martin L. Buist,
Craig Boote,
Alexandre H. Thiéry,
Michaël J. A. Girard
Abstract:
Speckle noise and retinal shadows within OCT B-scans occlude important edges, fine textures and deep tissues, preventing accurate and robust diagnosis by algorithms and clinicians. We developed a single process that successfully removed both noise and retinal shadows from unseen single-frame B-scans within 10.4 ms. Mean average gradient magnitude (AGM) for the proposed algorithm was 57.2% higher than current state-of-the-art, while mean peak signal to noise ratio (PSNR), contrast to noise ratio (CNR), and structural similarity index metric (SSIM) increased by 11.1%, 154% and 187% respectively compared to single-frame B-scans. Mean intralayer contrast (ILC) improved for the retinal nerve fiber layer (RNFL), photoreceptor layer (PR) and retinal pigment epithelium (RPE) layers, decreasing from 0.362$\pm$0.133 to 0.142$\pm$0.102, 0.449$\pm$0.116 to 0.0904$\pm$0.0769, and 0.381$\pm$0.100 to 0.0590$\pm$0.0451 respectively. The proposed algorithm reduces the necessity for long image acquisition times, minimizes expensive hardware requirements and reduces motion artifacts in OCT images.
Submitted 6 October, 2020;
originally announced October 2020.
-
Toward Multimodal Modeling of Emotional Expressiveness
Authors:
Victoria Lin,
Jeffrey M. Girard,
Michael A. Sayette,
Louis-Philippe Morency
Abstract:
Emotional expressiveness captures the extent to which a person tends to outwardly display their emotions through behavior. Due to the close relationship between emotional expressiveness and behavioral health, as well as the crucial role that it plays in social interaction, the ability to automatically predict emotional expressiveness stands to spur advances in science, medicine, and industry. In this paper, we explore three related research questions. First, how well can emotional expressiveness be predicted from visual, linguistic, and multimodal behavioral signals? Second, which behavioral modalities are uniquely important to the prediction of emotional expressiveness? Third, which behavioral signals are reliably related to emotional expressiveness? To answer these questions, we add highly reliable transcripts and human ratings of perceived emotional expressiveness to an existing video database and use this data to train, validate, and test predictive models. Our best model shows promising predictive performance on this dataset (RMSE=0.65, R^2=0.45, r=0.74). Multimodal models tend to perform best overall, and models trained on the linguistic modality tend to outperform models trained on the visual modality. Finally, examination of our interpretable models' coefficients reveals a number of visual and linguistic behavioral signals--such as facial action unit intensity, overall word count, and use of words related to social processes--that reliably predict emotional expressiveness.
Submitted 31 August, 2020;
originally announced September 2020.
-
Towards Label-Free 3D Segmentation of Optical Coherence Tomography Images of the Optic Nerve Head Using Deep Learning
Authors:
Sripad Krishna Devalla,
Tan Hung Pham,
Satish Kumar Panda,
Liang Zhang,
Giridhar Subramanian,
Anirudh Swaminathan,
Chin Zhi Yun,
Mohan Rajan,
Sujatha Mohan,
Ramaswami Krishnadas,
Vijayalakshmi Senthil,
John Mark S. de Leon,
Tin A. Tun,
Ching-Yu Cheng,
Leopold Schmetterer,
Shamira Perera,
Tin Aung,
Alexandre H. Thiery,
Michael J. A. Girard
Abstract:
Since the introduction of optical coherence tomography (OCT), it has been possible to study the complex 3D morphological changes of the optic nerve head (ONH) tissues that occur along with the progression of glaucoma. Although several deep learning (DL) techniques have been recently proposed for the automated extraction (segmentation) and quantification of these morphological changes, their device-specific nature and the difficulty in preparing manual segmentations (training data) limit their clinical adoption. With several new manufacturers and next-generation OCT devices entering the market, the complexity in deploying DL algorithms clinically is only increasing. To address this, we propose a DL-based 3D segmentation framework that is easily translatable across OCT devices in a label-free manner (i.e. without the need to manually re-segment data for each device). Specifically, we developed 2 sets of DL networks. The first (referred to as the enhancer) was able to enhance OCT image quality from 3 OCT devices, and harmonized image characteristics across these devices. The second performed 3D segmentation of 6 important ONH tissue layers. We found that the use of the enhancer was critical for our segmentation network to achieve device independence. In other words, our 3D segmentation network trained on any of 3 devices successfully segmented ONH tissue layers from the other two devices with high performance (Dice coefficients > 0.92). With such an approach, we could automatically segment images from new OCT devices without ever needing manual segmentation data from such devices.
Submitted 22 February, 2020;
originally announced February 2020.
-
Context-Dependent Models for Predicting and Characterizing Facial Expressiveness
Authors:
Victoria Lin,
Jeffrey M. Girard,
Louis-Philippe Morency
Abstract:
In recent years, extensive research has emerged in affective computing on topics like automatic emotion recognition and determining the signals that characterize individual emotions. Much less studied, however, is expressiveness, or the extent to which someone shows any feeling or emotion. Expressiveness is related to personality and mental health and plays a crucial role in social interaction. As such, the ability to automatically detect or predict expressiveness can facilitate significant advancements in areas ranging from psychiatric care to artificial social intelligence. Motivated by these potential applications, we present an extension of the BP4D+ dataset with human ratings of expressiveness and develop methods for (1) automatically predicting expressiveness from visual data and (2) defining relationships between interpretable visual signals and expressiveness. In addition, we study the emotional context in which expressiveness occurs and hypothesize that different sets of signals are indicative of expressiveness in different contexts (e.g., in response to surprise or in response to pain). Analysis of our statistical models confirms our hypothesis. Consequently, by looking at expressiveness separately in distinct emotional contexts, our predictive models show significant improvements over baselines and achieve comparable results to human performance in terms of correlation with the ground truth.
Submitted 10 December, 2019;
originally announced December 2019.
-
DeshadowGAN: A Deep Learning Approach to Remove Shadows from Optical Coherence Tomography Images
Authors:
Haris Cheong,
Sripad Krishna Devalla,
Tan Hung Pham,
Zhang Liang,
Tin Aung Tun,
Xiaofei Wang,
Shamira Perera,
Leopold Schmetterer,
Aung Tin,
Craig Boote,
Alexandre H. Thiery,
Michael J. A. Girard
Abstract:
Purpose: To remove retinal shadows from optical coherence tomography (OCT) images of the optic nerve head (ONH).
Methods: 2,328 OCT images acquired through the center of the ONH using a Spectralis OCT machine for both eyes of 13 subjects were used to train a generative adversarial network (GAN) using a custom loss function. Image quality was assessed qualitatively (for artifacts) and quantitatively using the intralayer contrast: a measure of shadow visibility ranging from 0 (shadow-free) to 1 (strong shadow) and compared to compensated images. This was computed in the Retinal Nerve Fiber Layer (RNFL), the Inner Plexiform Layer (IPL), the Photoreceptor layer (PR) and the Retinal Pigment Epithelium (RPE) layers.
Results: Output images had improved intralayer contrast in all ONH tissue layers. On average the intralayer contrast decreased by 33.7$\pm$6.81%, 28.8$\pm$10.4%, 35.9$\pm$13.0%, and 43.0$\pm$19.5% for the RNFL, IPL, PR, and RPE layers respectively, indicating successful shadow removal across all depths. This compared to 70.3$\pm$22.7%, 33.9$\pm$11.5%, 47.0$\pm$11.2%, and 26.7$\pm$19.0% for compensation. Output images were also free from artifacts commonly observed with compensation.
Conclusions: DeshadowGAN significantly corrected blood vessel shadows in OCT images of the ONH. Our algorithm may be considered as a pre-processing step to improve the performance of a wide range of algorithms including those currently being used for OCT image segmentation, denoising, and classification.
Translational Relevance: DeshadowGAN could be integrated into existing OCT devices to improve the diagnosis and prognosis of ocular pathologies.
Submitted 7 October, 2019;
originally announced October 2019.
-
Deep Learning Algorithms to Isolate and Quantify the Structures of the Anterior Segment in Optical Coherence Tomography Images
Authors:
Tan Hung Pham,
Sripad Krishna Devalla,
Aloysius Ang,
Soh Zhi Da,
Alexandre H. Thiery,
Craig Boote,
Ching-Yu Cheng,
Victor Koh,
Michael J. A. Girard
Abstract:
Accurate isolation and quantification of intraocular dimensions in the anterior segment (AS) of the eye using optical coherence tomography (OCT) images is important in the diagnosis and treatment of many eye diseases, especially angle closure glaucoma. In this study, we developed a deep convolutional neural network (DCNN) for the localization of the scleral spur, and the segmentation of anterior segment structures (iris, corneo-sclera shell, anterior chamber). With limited training data, the DCNN was able to detect the scleral spur on unseen ASOCT images as accurately as an experienced ophthalmologist; and simultaneously isolated the anterior segment structures with a Dice coefficient of 95.7%. We then automatically extracted eight clinically relevant ASOCT parameters and proposed an automated quality check process that asserts the reliability of these parameters. When combined with an OCT machine capable of imaging multiple radial sections, the algorithms can provide a more complete objective assessment. This is an essential step toward providing a robust automated framework for reliable quantification of ASOCT scans, for applications in the diagnosis and management of angle closure glaucoma.
Submitted 1 September, 2019;
originally announced September 2019.
-
Learning Finer-class Networks for Universal Representations
Authors:
Julien Girard,
Youssef Tamaazousti,
Hervé Le Borgne,
Céline Hudelot
Abstract:
Many real-world visual recognition use-cases cannot directly benefit from state-of-the-art CNN-based approaches because of the lack of sufficient annotated data. The usual approach to deal with this is to transfer a representation pre-learned on a large annotated source-task onto a target-task of interest. This raises the question of how "universal" the original representation is, that is to say, how directly it adapts to many different target-tasks. To improve such universality, the state-of-the-art consists of training networks on a diversified source problem that is modified by adding either generic or specific categories to the initial set of categories. In this vein, we propose a method that exploits finer classes than the most specific existing ones, for which no annotation is available. We rely on unsupervised learning and a bottom-up split and merge strategy. We show that our method learns more universal representations than state-of-the-art, leading to significantly better results on 10 target-tasks from multiple domains, using several network architectures, either alone or combined with networks learned at a coarser semantic level.
Submitted 4 October, 2018;
originally announced October 2018.
-
A Deep Learning Approach to Denoise Optical Coherence Tomography Images of the Optic Nerve Head
Authors:
Sripad Krishna Devalla,
Giridhar Subramanian,
Tan Hung Pham,
Xiaofei Wang,
Shamira Perera,
Tin A. Tun,
Tin Aung,
Leopold Schmetterer,
Alexandre H. Thiery,
Michael J. A. Girard
Abstract:
Purpose: To develop a deep learning approach to de-noise optical coherence tomography (OCT) B-scans of the optic nerve head (ONH).
Methods: Volume scans consisting of 97 horizontal B-scans were acquired through the center of the ONH using a commercial OCT device (Spectralis) for both eyes of 20 subjects. For each eye, single-frame (without signal averaging) and multi-frame (75x signal averaging) volume scans were obtained. A custom deep learning network was then designed and trained with 2,328 "clean B-scans" (multi-frame B-scans) and their corresponding "noisy B-scans" (clean B-scans + Gaussian noise) to de-noise the single-frame B-scans. The performance of the de-noising algorithm was assessed qualitatively, and quantitatively on 1,552 B-scans using the signal to noise ratio (SNR), contrast to noise ratio (CNR), and mean structural similarity index metrics (MSSIM).
Results: The proposed algorithm successfully denoised unseen single-frame OCT B-scans. The denoised B-scans were qualitatively similar to their corresponding multi-frame B-scans, with enhanced visibility of the ONH tissues. The mean SNR increased from $4.02 \pm 0.68$ dB (single-frame) to $8.14 \pm 1.03$ dB (denoised). For all the ONH tissues, the mean CNR increased from $3.50 \pm 0.56$ (single-frame) to $7.63 \pm 1.81$ (denoised). The MSSIM increased from $0.13 \pm 0.02$ (single frame) to $0.65 \pm 0.03$ (denoised) when compared with the corresponding multi-frame B-scans.
Conclusions: Our deep learning algorithm can denoise a single-frame OCT B-scan of the ONH in under 20 ms, thus offering a framework to obtain superior quality OCT B-scans with reduced scanning times and minimal patient discomfort.
Submitted 27 September, 2018;
originally announced September 2018.
-
DRUNET: A Dilated-Residual U-Net Deep Learning Network to Digitally Stain Optic Nerve Head Tissues in Optical Coherence Tomography Images
Authors:
Sripad Krishna Devalla,
Prajwal K. Renukanand,
Bharathwaj K. Sreedhar,
Shamira Perera,
Jean-Martial Mari,
Khai Sing Chin,
Tin A. Tun,
Nicholas G. Strouthidis,
Tin Aung,
Alexandre H. Thiery,
Michael J. A. Girard
Abstract:
Given that the neural and connective tissues of the optic nerve head (ONH) exhibit complex morphological changes with the development and progression of glaucoma, their simultaneous isolation from optical coherence tomography (OCT) images may be of great interest for the clinical diagnosis and management of this pathology. A deep learning algorithm was designed and trained to digitally stain (i.e. highlight) 6 ONH tissue layers by capturing both the local (tissue texture) and contextual information (spatial arrangement of tissues). The overall Dice coefficient (mean of all tissues) was $0.91 \pm 0.05$ when assessed against manual segmentations performed by an expert observer. We offer here a robust segmentation framework that could be extended for the automated parametric study of the ONH tissues.
Submitted 1 March, 2018;
originally announced March 2018.
-
An Activity-Based Quality Model for Maintainability
Authors:
Florian Deissenboeck,
Stefan Wagner,
Markus Pizka,
Stefan Teuchert,
Jean-François Girard
Abstract:
Maintainability is a key quality attribute of successful software systems. However, its management in practice is still problematic. Currently, there is no comprehensive basis for assessing and improving the maintainability of software systems. Quality models have been proposed to solve this problem. Nevertheless, existing approaches do not explicitly take into account the maintenance activities that largely determine the software maintenance effort. This paper proposes a 2-dimensional model of maintainability that explicitly associates system properties with the activities carried out during maintenance. The separation of activities and properties facilitates the identification of sound quality criteria and makes it possible to reason about their interdependencies. This transforms the quality model into a structured and comprehensive quality knowledge base that is usable in industrial project environments. For example, review guidelines can be generated from it. The model is based on an explicit quality metamodel that supports its systematic construction and fosters precision as well as completeness. An industrial case study demonstrates the applicability of the model for the evaluation of the maintainability of Matlab Simulink models that are frequently used in model-based development of embedded systems.
Submitted 26 July, 2017;
originally announced July 2017.
-
A Deep Learning Approach to Digitally Stain Optical Coherence Tomography Images of the Optic Nerve Head
Authors:
Sripad Krishna Devalla,
Jean-Martial Mari,
Tin A. Tun,
Nicholas G. Strouthidis,
Tin Aung,
Alexandre H. Thiery,
Michael J. A. Girard
Abstract:
Purpose: To develop a deep learning approach to digitally-stain optical coherence tomography (OCT) images of the optic nerve head (ONH).
Methods: A horizontal B-scan was acquired through the center of the ONH using OCT (Spectralis) for 1 eye of each of 100 subjects (40 normal & 60 glaucoma). All images were enhanced using adaptive compensation. A custom deep learning network was then designed and trained with the compensated images to digitally stain (i.e. highlight) 6 tissue layers of the ONH. The accuracy of our algorithm was assessed (against manual segmentations) using the Dice coefficient, sensitivity, and specificity. We further studied how compensation and the number of training images affected the performance of our algorithm.
Results: For images it had not yet assessed, our algorithm was able to digitally stain the retinal nerve fiber layer + prelamina, the retinal pigment epithelium (RPE), all other retinal layers, the choroid, and the peripapillary sclera and lamina cribrosa. For all tissues, the mean Dice coefficient was $0.84 \pm 0.03$, the mean sensitivity $0.92 \pm 0.03$, and the mean specificity $0.99 \pm 0.00$. Our algorithm performed significantly better when compensated images were used for training. Increasing the number of images (from 10 to 40) to train our algorithm did not significantly improve performance, except for the RPE.
Conclusion: Our deep learning algorithm can simultaneously stain neural and connective tissues in ONH images. Our approach offers a framework to automatically measure multiple key structural parameters of the ONH that may be critical to improve glaucoma management.
Submitted 24 July, 2017;
originally announced July 2017.
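The abstract above reports accuracy against manual segmentations as a Dice coefficient, sensitivity, and specificity. As a rough illustration only, the sketch below computes these three metrics for a single tissue class from two flattened binary masks, using the standard textbook definitions. The function names, the toy masks, and the choice to return NaN when a denominator is zero are assumptions made for this example; they are not taken from the paper, which does not describe its evaluation code here.

```python
from typing import Sequence, Tuple


def confusion_counts(pred: Sequence[int], truth: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def _ratio(num: float, den: float) -> float:
    # Assumption for this sketch: an undefined ratio is reported as NaN.
    return num / den if den else float("nan")


def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    return _ratio(2 * tp, 2 * tp + fp + fn)


def sensitivity(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Sensitivity (true positive rate): TP / (TP + FN)."""
    tp, _, fn, _ = confusion_counts(pred, truth)
    return _ratio(tp, tp + fn)


def specificity(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Specificity (true negative rate): TN / (TN + FP)."""
    _, fp, _, tn = confusion_counts(pred, truth)
    return _ratio(tn, tn + fp)


# Hypothetical 8-pixel masks for one tissue class (1 = pixel labeled as the tissue).
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
manual = [1, 0, 0, 0, 1, 1, 0, 0]

print(f"Dice:        {dice(predicted, manual):.3f}")
print(f"Sensitivity: {sensitivity(predicted, manual):.3f}")
print(f"Specificity: {specificity(predicted, manual):.3f}")
```

For a multi-tissue segmentation, one plausible approach is to compute each metric per tissue class (one class versus the rest) and then average across classes, which would match the "mean of all tissues" wording, though the exact averaging used by the authors is not stated in the abstract.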
-
FERA 2017 - Addressing Head Pose in the Third Facial Expression Recognition and Analysis Challenge
Authors:
Michel F. Valstar,
Enrique Sánchez-Lozano,
Jeffrey F. Cohn,
László A. Jeni,
Jeffrey M. Girard,
Zheng Zhang,
Lijun Yin,
Maja Pantic
Abstract:
The field of Automatic Facial Expression Analysis has grown rapidly in recent years. However, despite progress in new approaches as well as benchmarking efforts, most evaluations still focus on either posed expressions, near-frontal recordings, or both. This makes it hard to tell how existing expression recognition approaches perform under conditions where faces appear in a wide range of poses (or camera views), displaying ecologically valid expressions. The main obstacle to assessing this is the limited availability of suitable data, and the challenge proposed here addresses this limitation. The FG 2017 Facial Expression Recognition and Analysis challenge (FERA 2017) extends FERA 2015 to the estimation of Action Unit (AU) occurrence and intensity under different camera views. In this paper we present the third challenge in automatic recognition of facial expressions, to be held in conjunction with the 12th IEEE conference on Face and Gesture Recognition, May 2017, in Washington, United States. Two sub-challenges are defined: the detection of AU occurrence, and the estimation of AU intensity. In this work we outline the evaluation protocol, the data used, and the results of a baseline method for both sub-challenges.
Submitted 14 February, 2017;
originally announced February 2017.