-
End-To-End Prediction of Knee Osteoarthritis Progression With Multi-Modal Transformers
Authors:
Egor Panfilov,
Simo Saarakkala,
Miika T. Nieminen,
Aleksei Tiulpin
Abstract:
Knee Osteoarthritis (KOA) is a highly prevalent chronic musculoskeletal condition with no currently available treatment. The manifestation of KOA is heterogeneous and prediction of its progression is challenging. Current literature suggests that the use of multi-modal data and advanced modeling methods, such as the ones based on Deep Learning, has promise in tackling this challenge. To date, however, the evidence on the efficacy of this approach is limited. In this study, we leveraged recent advances in Deep Learning and, using a Transformer approach, developed a unified framework for the multi-modal fusion of knee imaging data. Subsequently, we analyzed its performance across a range of scenarios by investigating multiple progression horizons -- from short-term to long-term. We report our findings using a large cohort (n=2421-3967) derived from the Osteoarthritis Initiative dataset. We show that structural knee MRI allows identifying radiographic KOA progressors on par with multi-modal fusion approaches, achieving an area under the ROC curve (ROC AUC) of 0.70-0.76 and Average Precision (AP) of 0.15-0.54 in 2-8 year horizons. Progression within 1 year was better predicted with a multi-modal method using X-ray, structural, and compositional MR images -- ROC AUC of 0.76(0.04), AP of 0.13(0.04) -- or via clinical data. Our follow-up analysis generally shows that prediction from the imaging data is more accurate for post-traumatic subjects, and we further investigate which subject subgroups may benefit the most. The present study provides novel insights into multi-modal imaging of KOA and brings a unified data-driven framework for studying its progression in an end-to-end manner, providing new tools for the design of more efficient clinical trials. The source code of our framework and the pre-trained models are made publicly available.
Submitted 3 July, 2023;
originally announced July 2023.
-
A Stronger Baseline For Automatic Pfirrmann Grading Of Lumbar Spine MRI Using Deep Learning
Authors:
Narasimharao Kowlagi,
Huy Hoang Nguyen,
Terence McSweeney,
Simo Saarakkala,
Juhani Määttä,
Jaro Karppinen,
Aleksei Tiulpin
Abstract:
This paper addresses the challenge of grading visual features in lumbar spine MRI using Deep Learning. Such a method is essential for the automatic quantification of structural changes in the spine, which is valuable for understanding low back pain. Multiple recent studies investigated different architecture designs, and the most recent success has been attributed to the use of transformer architectures. In this work, we argue that with a well-tuned three-stage pipeline comprising semantic segmentation, localization, and classification, convolutional networks outperform the state-of-the-art approaches. We conducted an ablation study of the existing methods in a population cohort, and report performance generalization across various subgroups. Our code is publicly available to advance research on disc degeneration and low back pain.
Submitted 26 October, 2022;
originally announced October 2022.
-
Clinically-Inspired Multi-Agent Transformers for Disease Trajectory Forecasting from Multimodal Data
Authors:
Huy Hoang Nguyen,
Matthew B. Blaschko,
Simo Saarakkala,
Aleksei Tiulpin
Abstract:
Deep neural networks are often applied to medical images to automate the problem of medical diagnosis. However, a more clinically relevant question that practitioners usually face is how to predict the future trajectory of a disease. Current methods for prognosis or disease trajectory forecasting often require domain knowledge and are complicated to apply. In this paper, we formulate the prognosis prediction problem as a one-to-many prediction problem. Inspired by a clinical decision-making process with two agents -- a radiologist and a general practitioner -- we predict prognosis with two transformer-based components that share information with each other. The first transformer in this framework aims to analyze the imaging data, and the second one leverages its internal states as inputs, also fusing them with auxiliary clinical data. The temporal nature of the problem is modeled within the transformer states, allowing us to treat the forecasting problem as a multi-task classification, for which we propose a novel loss. We show the effectiveness of our approach in predicting the development of structural knee osteoarthritis changes and forecasting Alzheimer's disease clinical status directly from raw multi-modal data. The proposed method outperforms multiple state-of-the-art baselines with respect to performance and calibration, both of which are needed for real-world applications. An open-source implementation of our method is made publicly available at https://github.com/Oulu-IMEDS/CLIMATv2.
Submitted 19 September, 2023; v1 submitted 25 October, 2022;
originally announced October 2022.
-
AdaTriplet: Adaptive Gradient Triplet Loss with Automatic Margin Learning for Forensic Medical Image Matching
Authors:
Khanh Nguyen,
Huy Hoang Nguyen,
Aleksei Tiulpin
Abstract:
This paper tackles the challenge of forensic medical image matching (FMIM) using deep neural networks (DNNs). FMIM is a particular case of content-based image retrieval (CBIR). The main challenge in FMIM, compared to the general case of CBIR, is that the subject to whom a query image belongs may be affected by aging and progressive degenerative disorders, making it difficult to match data on a subject level. CBIR with DNNs is generally solved by minimizing a ranking loss, such as Triplet loss (TL), computed on image representations extracted by a DNN from the original data. TL, in particular, operates on triplets: anchor, positive (similar to anchor) and negative (dissimilar to anchor). Although TL has been shown to perform well in many CBIR tasks, it still has limitations, which we identify and analyze in this work. In this paper, we introduce (i) the AdaTriplet loss -- an extension of TL whose gradients adapt to different difficulty levels of negative samples, and (ii) the AutoMargin method -- a technique to dynamically adjust the hyperparameters of margin-based losses such as TL and our proposed loss. Our results are evaluated on two large-scale benchmarks for FMIM based on the Osteoarthritis Initiative and Chest X-ray-14 datasets. The code allowing replication of this study has been made publicly available at https://github.com/Oulu-IMEDS/AdaTriplet.
Submitted 10 May, 2022; v1 submitted 5 May, 2022;
originally announced May 2022.
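As an illustration of the triplet formulation that the AdaTriplet abstract above builds on, here is a minimal NumPy sketch of the standard Triplet loss. It is not the AdaTriplet loss or the AutoMargin method; the function name, the L2 normalisation of the embeddings, the Euclidean distance, and the default margin are all assumptions made for the example.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Standard triplet loss on L2-normalised embeddings (illustrative only).

    Each argument is an array of shape (batch, dim). The margin value is a
    hypothetical default, not a value taken from the paper.
    """
    def normalise(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    a, p, n = normalise(anchor), normalise(positive), normalise(negative)
    d_ap = np.linalg.norm(a - p, axis=1)  # anchor-positive distance
    d_an = np.linalg.norm(a - n, axis=1)  # anchor-negative distance
    # Penalise triplets where the negative is not at least `margin`
    # farther from the anchor than the positive.
    return np.maximum(d_ap - d_an + margin, 0.0).mean()
```

A triplet whose negative is already far from the anchor contributes zero loss and therefore zero gradient, which is the kind of behaviour on easy and hard negatives that the abstract says it analyzes.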
-
Predicting Knee Osteoarthritis Progression from Structural MRI using Deep Learning
Authors:
Egor Panfilov,
Simo Saarakkala,
Miika T. Nieminen,
Aleksei Tiulpin
Abstract:
Accurate prediction of knee osteoarthritis (KOA) progression from structural MRI has the potential to enhance disease understanding and support clinical trials. Prior art focused on manually designed imaging biomarkers, which may not fully exploit all disease-related information present in an MRI scan. In contrast, our method learns relevant representations from raw data end-to-end using Deep Learning, and uses them for progression prediction. The method employs a 2D CNN to process the data slice-wise and aggregates the extracted features using a Transformer. Evaluated on a large cohort (n=4,866), the proposed method outperforms conventional 2D and 3D CNN-based models and achieves an average precision of $0.58\pm0.03$ and ROC AUC of $0.78\pm0.01$. This paper sets a baseline on end-to-end KOA progression prediction from structural MRI. Our code is publicly available at https://github.com/MIPT-Oulu/OAProgressionMR.
Submitted 26 January, 2022;
originally announced January 2022.
-
Common Limitations of Image Processing Metrics: A Picture Story
Authors:
Annika Reinke,
Minu D. Tizabi,
Carole H. Sudre,
Matthias Eisenmann,
Tim Rädsch,
Michael Baumgartner,
Laura Acion,
Michela Antonelli,
Tal Arbel,
Spyridon Bakas,
Peter Bankhead,
Arriel Benis,
Matthew Blaschko,
Florian Buettner,
M. Jorge Cardoso,
Jianxu Chen,
Veronika Cheplygina,
Evangelia Christodoulou,
Beth Cimini,
Gary S. Collins,
Sandy Engelhardt,
Keyvan Farahani,
Luciana Ferrer,
Adrian Galdran,
Bram van Ginneken
, et al. (68 additional authors not shown)
Abstract:
While the importance of automatic image analysis is continuously increasing, recent meta-research revealed major flaws with respect to algorithm validation. Performance metrics are particularly key for meaningful, objective, and transparent performance assessment and validation of the used automatic algorithms, but relatively little attention has been given to the practical pitfalls when using specific metrics for a given image analysis task. These are typically related to (1) the disregard of inherent metric properties, such as the behaviour in the presence of class imbalance or small target structures, (2) the disregard of inherent data set properties, such as the non-independence of the test cases, and (3) the disregard of the actual biomedical domain interest that the metrics should reflect. This dynamically updated living document aims to illustrate important limitations of performance metrics commonly applied in the field of image analysis. In this context, it focuses on biomedical image analysis problems that can be phrased as image-level classification, semantic segmentation, instance segmentation, or object detection tasks. The current version is based on a Delphi process on metrics conducted by an international consortium of image analysis experts from more than 60 institutions worldwide.
Submitted 6 December, 2023; v1 submitted 12 April, 2021;
originally announced April 2021.
-
Critical Evaluation of Deep Neural Networks for Wrist Fracture Detection
Authors:
Abu Mohammed Raisuddin,
Elias Vaattovaara,
Mika Nevalainen,
Marko Nikki,
Elina Järvenpää,
Kaisa Makkonen,
Pekka Pinola,
Tuula Palsio,
Arttu Niemensivu,
Osmo Tervonen,
Aleksei Tiulpin
Abstract:
Wrist fracture is the most common type of fracture and has a high incidence rate. Conventional radiography (i.e. X-ray imaging) is used for wrist fracture detection routinely, but occasionally fracture delineation poses issues and an additional confirmation by computed tomography (CT) is needed for diagnosis. Recent advances in the field of Deep Learning (DL), a subfield of Artificial Intelligence (AI), have shown that wrist fracture detection can be automated using Convolutional Neural Networks. However, previous studies did not pay close attention to the difficult cases which can only be confirmed via CT imaging. In this study, we have developed and analyzed a state-of-the-art DL-based pipeline for wrist (distal radius) fracture detection -- DeepWrist, and evaluated it on one general population test set, and one challenging test set comprising only cases requiring confirmation by CT. Our results reveal that a typical state-of-the-art approach, such as DeepWrist, while having a near-perfect performance on the general independent test set, has a substantially lower performance on the challenging test set -- average precision of 0.99 (0.99-0.99) vs 0.64 (0.46-0.83), respectively. Similarly, the area under the ROC curve was 0.99 (0.98-0.99) vs 0.84 (0.72-0.93), respectively. Our findings highlight the importance of a meticulous analysis of DL-based models before clinical use, and unearth the need for more challenging settings for testing medical AI systems.
Submitted 5 March, 2021; v1 submitted 4 December, 2020;
originally announced December 2020.
-
Semixup: In- and Out-of-Manifold Regularization for Deep Semi-Supervised Knee Osteoarthritis Severity Grading from Plain Radiographs
Authors:
Huy Hoang Nguyen,
Simo Saarakkala,
Matthew Blaschko,
Aleksei Tiulpin
Abstract:
Knee osteoarthritis (OA) is one of the leading causes of disability in the world. This musculoskeletal disorder is assessed from clinical symptoms, and typically confirmed via radiographic assessment. This visual assessment done by a radiologist requires experience, and suffers from moderate to high inter-observer variability. The recent literature has shown that deep learning methods can reliably perform the OA severity assessment according to the gold standard Kellgren-Lawrence (KL) grading system. However, these methods require large amounts of labeled data, which are costly to obtain. In this study, we propose the Semixup algorithm, a semi-supervised learning (SSL) approach to leverage unlabeled data. Semixup relies on consistency regularization using in- and out-of-manifold samples, together with interpolated consistency. On an independent test set, our method significantly outperformed other state-of-the-art SSL methods in most cases. Finally, when compared to a well-tuned fully supervised baseline that yielded a balanced accuracy (BA) of $70.9\pm0.8\%$ on the test set, Semixup had comparable performance -- BA of $71\pm0.8\%$ $(p=0.368)$ while requiring $6$ times less labeled data. These results show that our proposed SSL method allows building fully automatic OA severity assessment tools with datasets that are available outside research settings.
Submitted 12 August, 2020; v1 submitted 4 March, 2020;
originally announced March 2020.
-
Adaptive Segmentation of Knee Radiographs for Selecting the Optimal ROI in Texture Analysis
Authors:
Neslihan Bayramoglu,
Aleksei Tiulpin,
Jukka Hirvasniemi,
Miika T. Nieminen,
Simo Saarakkala
Abstract:
The purposes of this study were to investigate: 1) the effect of placement of region-of-interest (ROI) for texture analysis of subchondral bone in knee radiographs, and 2) the ability of several texture descriptors to distinguish between the knees with and without radiographic osteoarthritis (OA). Bilateral posterior-anterior knee radiographs were analyzed from the baseline of OAI and MOST datasets. A fully automatic method to locate the most informative region from subchondral bone using adaptive segmentation was developed. We used an oversegmentation strategy for partitioning knee images into compact regions that follow natural texture boundaries. LBP, Fractal Dimension (FD), Haralick features, Shannon entropy, and HOG methods were computed within the standard ROI and within the proposed adaptive ROIs. Subsequently, we built logistic regression models to identify and compare the performances of each texture descriptor and each ROI placement method using a 5-fold cross-validation setting. Importantly, we also investigated the generalizability of our approach by training the models on OAI and testing them on the MOST dataset. We used area under the receiver operating characteristic (ROC) curve (AUC) and average precision (AP) obtained from the precision-recall (PR) curve to compare the results. We found that the adaptive ROI improves the classification performance (OA vs. non-OA) over the commonly used standard ROI (up to a 9% increase in AUC). We also observed that, from all texture parameters, LBP yielded the best performance in all settings with the best AUC of 0.840 [0.825, 0.852] and associated AP of 0.804 [0.786, 0.820]. Compared to the current state-of-the-art approaches, our results suggest that the proposed adaptive ROI approach in texture analysis of subchondral bone can increase the diagnostic performance for detecting the presence of radiographic OA.
Submitted 21 August, 2019;
originally announced August 2019.
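Several abstracts in this listing, including the one above, report the area under the ROC curve (AUC) and average precision (AP). The following is a small NumPy sketch of how the two metrics can be computed for binary labels. The function names are hypothetical, the AUC uses the pairwise-ranking formulation, and the AP function assumes there are no tied scores; it does not reproduce any of the papers' evaluation code or their confidence intervals.

```python
import numpy as np


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as one half.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    diff = pos[:, None] - neg[None, :]  # all positive-negative score pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)


def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive sample.

    Assumes untied scores; with ties the result depends on the sort order.
    """
    y_true = np.asarray(y_true, dtype=bool)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = y_true[order]  # labels sorted by descending score
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return precision_at_k[hits].mean()
```

Unlike AUC, AP depends on the prevalence of the positive class, so AP values from cohorts with different progression or fracture rates are not directly comparable.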
-
Improving Robustness of Deep Learning Based Knee MRI Segmentation: Mixup and Adversarial Domain Adaptation
Authors:
Egor Panfilov,
Aleksei Tiulpin,
Stefan Klein,
Miika T. Nieminen,
Simo Saarakkala
Abstract:
Degeneration of articular cartilage (AC) is actively studied in knee osteoarthritis (OA) research via magnetic resonance imaging (MRI). Segmentation of AC tissues from MRI data is an essential step in quantification of their damage. Deep learning (DL) based methods have shown potential in this realm and are the current state of the art; however, their robustness to heterogeneity of MRI acquisition settings remains an open problem. In this study, we investigated two modern regularization techniques -- mixup and adversarial unsupervised domain adaptation (UDA) -- to improve the robustness of DL-based knee cartilage segmentation to new MRI acquisition settings. Our validation setup included two datasets produced by different MRI scanners and using distinct data acquisition protocols. We assessed the robustness of automatic segmentation by comparing mixup and UDA approaches to a strong baseline method at different OA severity stages and, additionally, in relation to anatomical locations. Our results showed that for moderate changes in knee MRI data acquisition settings both approaches may provide notable improvements in the robustness, which are consistent for all stages of the disease and affect the clinically important areas of the knee joint. However, mixup may be considered as a recommended approach, since it is more computationally efficient and does not require additional data from the target acquisition setup.
Submitted 27 October, 2019; v1 submitted 12 August, 2019;
originally announced August 2019.
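For readers unfamiliar with mixup, the regularization technique named in the abstract above, this is a minimal NumPy sketch of the generic idea: each training sample is blended with another sample from the same batch, and the targets are blended with the same coefficient. The function name, the Beta(alpha, alpha) parameter value, and the use of one-hot segmentation masks as targets are assumptions for illustration, not the configuration used in the paper.

```python
import numpy as np


def mixup_batch(images, targets, alpha=0.4, rng=None):
    """Blend each sample with a randomly chosen partner from the same batch.

    images:  array of shape (batch, channels, height, width)
    targets: array with the same leading batch axis, e.g. one-hot masks
    alpha:   Beta distribution parameter (hypothetical default)
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # one mixing coefficient per batch
    perm = rng.permutation(len(images))   # partner index for every sample
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_targets = lam * targets + (1.0 - lam) * targets[perm]
    return mixed_images, mixed_targets, lam
```

Because the blending only needs data from the training batch itself, no images from the target scanner are required, which is consistent with the practical advantage the abstract attributes to mixup.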
-
Bayesian Feature Pyramid Networks for Automatic Multi-Label Segmentation of Chest X-rays and Assessment of Cardio-Thoratic Ratio
Authors:
Roman Solovyev,
Iaroslav Melekhov,
Timo Lesonen,
Elias Vaattovaara,
Osmo Tervonen,
Aleksei Tiulpin
Abstract:
Cardiothoracic ratio (CTR) estimated from chest radiographs is a marker indicative of cardiomegaly, the presence of which is in the criteria for heart failure diagnosis. Existing methods for automatic assessment of CTR are driven by Deep Learning-based segmentation. However, these techniques produce only point estimates of CTR, whereas clinical decision making typically takes uncertainty into account. In this paper, we propose a novel method for chest X-ray segmentation and CTR assessment in an automatic manner. In contrast to the previous art, we, for the first time, propose to estimate CTR with uncertainty bounds. Our method is based on a Deep Convolutional Neural Network with a Feature Pyramid Network (FPN) decoder. We propose two modifications of FPN: replacing batch normalization with instance normalization and injecting dropout, which allows obtaining Monte Carlo estimates of the segmentation maps at test time. Finally, using the predicted segmentation mask samples, we estimate CTR with uncertainty. In our experiments we demonstrate that the proposed method generalizes well to three different test sets. Finally, we make the annotations produced by two radiologists for all our datasets publicly available.
Submitted 8 August, 2019;
originally announced August 2019.
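The last step described in the abstract above, turning sampled segmentation masks into a CTR estimate with uncertainty, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the thoracic width is approximated by the horizontal extent of the lung mask, and the uncertainty is summarised as the standard deviation over samples.

```python
import numpy as np


def horizontal_extent(mask):
    """Width in pixels between the leftmost and rightmost foreground columns."""
    cols = np.flatnonzero(np.asarray(mask).any(axis=0))
    return 0 if cols.size == 0 else int(cols[-1] - cols[0] + 1)


def ctr_with_uncertainty(heart_samples, lung_samples):
    """Mean and standard deviation of CTR over sampled binary masks.

    heart_samples, lung_samples: arrays of shape (n_samples, height, width),
    e.g. masks obtained from repeated stochastic forward passes.
    """
    ratios = []
    for heart, lungs in zip(heart_samples, lung_samples):
        thorax_width = horizontal_extent(lungs)
        if thorax_width > 0:  # skip samples where no lung area was predicted
            ratios.append(horizontal_extent(heart) / thorax_width)
    if not ratios:
        raise ValueError("no sample contained a non-empty lung mask")
    ratios = np.asarray(ratios)
    return ratios.mean(), ratios.std()
```

Other summaries of the sampled ratios, such as percentile intervals, would fit the same structure; the abstract does not specify which form of uncertainty bounds is used.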
-
Automatic Grading of Individual Knee Osteoarthritis Features in Plain Radiographs using Deep Convolutional Neural Networks
Authors:
Aleksei Tiulpin,
Simo Saarakkala
Abstract:
Knee osteoarthritis (OA) is the most common musculoskeletal disease in the world. In primary healthcare, knee OA is diagnosed using clinical examination and radiographic assessment. The Osteoarthritis Research Society International (OARSI) atlas of OA radiographic features allows independent assessment of knee osteophytes, joint space narrowing and other knee features. This provides a fine-grained OA severity assessment of the knee, compared to the gold standard and most commonly used Kellgren-Lawrence (KL) composite score. However, both OARSI and KL grading systems suffer from moderate inter-rater agreement, and therefore, the use of computer-aided methods could help to improve the reliability of the process. In this study, we developed a robust, automatic method to simultaneously predict KL and OARSI grades in knee radiographs. Our method is based on Deep Learning and leverages an ensemble of deep residual networks with 50 layers, squeeze-excitation and ResNeXt blocks. Here, we used transfer learning from ImageNet with fine-tuning on the whole Osteoarthritis Initiative (OAI) dataset. Independent testing of our model was performed on the whole Multicenter Osteoarthritis Study (MOST) dataset. Our multi-task method yielded Cohen's kappa coefficients of 0.82 for KL-grade and 0.79, 0.84, 0.94, 0.83, 0.84, 0.90 for femoral osteophytes, tibial osteophytes and joint space narrowing for lateral and medial compartments, respectively. Furthermore, our method yielded area under the ROC curve of 0.98 and average precision of 0.98 for detecting the presence of radiographic OA (KL $\geq 2$), which is better than the current state-of-the-art.
Submitted 18 July, 2019;
originally announced July 2019.
-
Deep-Learning for Tidemark Segmentation in Human Osteochondral Tissues Imaged with Micro-computed Tomography
Authors:
Aleksei Tiulpin,
Mikko Finnilä,
Petri Lehenkari,
Heikki J. Nieminen,
Simo Saarakkala
Abstract:
Three-dimensional (3D) semi-quantitative grading of pathological features in articular cartilage (AC) offers significant improvements in basic research of osteoarthritis (OA). We have earlier developed the 3D protocol for imaging of AC and its structures which includes staining of the sample with a contrast agent (phosphotungstic acid, PTA) and subsequent scanning with micro-computed tomography. Such a protocol was designed to provide X-ray attenuation contrast to visualize AC structure. However, at the same time, this protocol has one major disadvantage: the loss of contrast at the tidemark (calcified cartilage interface, CCI). An accurate segmentation of CCI can be very important for understanding the etiology of OA and ex-vivo evaluation of tidemark condition at early OA stages. In this paper, we present the first application of Deep Learning to PTA-stained osteochondral samples that allows fully automatic tidemark segmentation. Our method is based on U-Net trained using a combination of binary cross-entropy and soft Jaccard loss. On cross-validation, this approach yielded intersection over the union of 0.59, 0.70, 0.79, 0.83 and 0.86 within 15 μm, 30 μm, 45 μm, 60 μm and 75 μm padded zones around the tidemark, respectively. Our code and the dataset, which consists of 35 PTA-stained human AC samples, are made publicly available together with the segmentation masks to facilitate the development of biomedical image segmentation methods.
Submitted 11 July, 2019;
originally announced July 2019.
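The abstract above mentions training with a combination of binary cross-entropy and soft Jaccard loss. One possible way to combine the two terms is sketched below in NumPy. The weighted-sum form, the weight value, the smoothing constant and the function name are assumptions made for the example; the abstract does not state how the two terms are combined.

```python
import numpy as np


def bce_soft_jaccard_loss(probs, target, jaccard_weight=0.5, eps=1e-7):
    """Weighted sum of binary cross-entropy and (1 - soft Jaccard index).

    probs:  predicted foreground probabilities in [0, 1]
    target: binary ground-truth mask of the same shape (float array)
    """
    probs = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    bce = -np.mean(target * np.log(probs) + (1.0 - target) * np.log(1.0 - probs))
    # Soft Jaccard: intersection and union computed on probabilities,
    # which keeps the term differentiable with respect to the predictions.
    intersection = np.sum(probs * target)
    union = np.sum(probs) + np.sum(target) - intersection
    soft_jaccard = (intersection + eps) / (union + eps)
    return (1.0 - jaccard_weight) * bce + jaccard_weight * (1.0 - soft_jaccard)
```

The overlap term is added because a thin structure such as the tidemark occupies few pixels, so a pixel-wise loss alone can be dominated by the background.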
-
Breast Tumor Cellularity Assessment using Deep Neural Networks
Authors:
Alexander Rakhlin,
Aleksei Tiulpin,
Alexey A. Shvets,
Alexandr A. Kalinin,
Vladimir I. Iglovikov,
Sergey Nikolenko
Abstract:
Breast cancer is one of the main causes of death worldwide. Histopathological cellularity assessment of residual tumors in post-surgical tissues is used to analyze a tumor's response to a therapy. Correct cellularity assessment increases the chances of getting an appropriate treatment and facilitates the patient's survival. In current clinical practice, tumor cellularity is manually estimated by pathologists; this process is tedious and prone to errors or low agreement rates between assessors. In this work, we evaluated three strong novel Deep Learning-based approaches for automatic assessment of tumor cellularity from post-treated breast surgical specimens stained with hematoxylin and eosin. We validated the proposed methods on the BreastPathQ SPIE challenge dataset that consisted of 2395 image patches selected from whole slide images acquired from 64 patients. Compared to expert pathologist scoring, our best performing method yielded the Cohen's kappa coefficient of 0.70 (vs. 0.42 previously known in literature) and the intra-class correlation coefficient of 0.89 (vs. 0.83). Our results suggest that Deep Learning-based methods have a significant potential to alleviate the burden on pathologists, enhance the diagnostic workflow, and, thereby, facilitate better clinical outcomes in breast cancer treatment.
Submitted 3 September, 2019; v1 submitted 5 May, 2019;
originally announced May 2019.