-
Assessing Concordance between RNA-Seq and NanoString Technologies in Ebola-Infected Nonhuman Primates Using Machine Learning
Authors:
Mostafa Rezapour,
Aarthi Narayanan,
Wyatt H. Mowery,
Metin Nafi Gurcan
Abstract:
This study evaluates the concordance between RNA sequencing (RNA-Seq) and NanoString technologies for gene expression analysis in non-human primates (NHPs) infected with Ebola virus (EBOV). We performed a detailed comparison of both platforms, demonstrating a strong correlation between them, with Spearman coefficients for 56 out of 62 samples ranging from 0.78 to 0.88, with a mean of 0.83 and a me…
▽ More
This study evaluates the concordance between RNA sequencing (RNA-Seq) and NanoString technologies for gene expression analysis in non-human primates (NHPs) infected with Ebola virus (EBOV). We performed a detailed comparison of both platforms, demonstrating a strong correlation between them, with Spearman coefficients for 56 out of 62 samples ranging from 0.78 to 0.88, with a mean of 0.83 and a median of 0.85. Bland-Altman analysis further confirmed high consistency, with most measurements falling within 95% confidence limits. A machine learning approach, using the Supervised Magnitude-Altitude Scoring (SMAS) method trained on NanoString data, identified OAS1 as a key marker for distinguishing RT-qPCR positive from negative samples. Remarkably, when applied to RNA-Seq data, OAS1 also achieved 100% accuracy in differentiating infected from uninfected samples using logistic regression, demonstrating its robustness across platforms. Further differential expression analysis identified 12 common genes including ISG15, OAS1, IFI44, IFI27, IFIT2, IFIT3, IFI44L, MX1, MX2, OAS2, RSAD2, and OASL which demonstrated the highest levels of statistical significance and biological relevance across both platforms. Gene Ontology (GO) analysis confirmed that these genes are directly involved in key immune and viral infection pathways, reinforcing their importance in EBOV infection. In addition, RNA-Seq uniquely identified genes such as CASP5, USP18, and DDX60, which play key roles in immune regulation and antiviral defense. This finding highlights the broader detection capabilities of RNA-Seq and underscores the complementary strengths of both platforms in providing a comprehensive and accurate assessment of gene expression changes during Ebola virus infection.
△ Less
Submitted 30 October, 2024;
originally announced October 2024.
-
Computational Pathology for Accurate Prediction of Breast Cancer Recurrence: Development and Validation of a Deep Learning-based Tool
Authors:
Ziyu Su,
Yongxin Guo,
Robert Wesolowski,
Gary Tozbikian,
Nathaniel S. O'Connell,
M. Khalid Khan Niazi,
Metin N. Gurcan
Abstract:
Accurate recurrence risk stratification is crucial for optimizing treatment plans for breast cancer patients. Current prognostic tools like Oncotype DX (ODX) offer valuable genomic insights for HR+/HER2- patients but are limited by cost and accessibility, particularly in underserved populations. In this study, we present Deep-BCR-Auto, a deep learning-based computational pathology approach that pr…
▽ More
Accurate recurrence risk stratification is crucial for optimizing treatment plans for breast cancer patients. Current prognostic tools like Oncotype DX (ODX) offer valuable genomic insights for HR+/HER2- patients but are limited by cost and accessibility, particularly in underserved populations. In this study, we present Deep-BCR-Auto, a deep learning-based computational pathology approach that predicts breast cancer recurrence risk from routine H&E-stained whole slide images (WSIs). Our methodology was validated on two independent cohorts: the TCGA-BRCA dataset and an in-house dataset from The Ohio State University (OSU). Deep-BCR-Auto demonstrated robust performance in stratifying patients into low- and high-recurrence risk categories. On the TCGA-BRCA dataset, the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.827, significantly outperforming existing weakly supervised models (p=0.041). In the independent OSU dataset, Deep-BCR-Auto maintained strong generalizability, achieving an AUROC of 0.832, along with 82.0% accuracy, 85.0% specificity, and 67.7% sensitivity. These findings highlight the potential of computational pathology as a cost-effective alternative for recurrence risk assessment, broadening access to personalized treatment strategies. This study underscores the clinical utility of integrating deep learning-based computational pathology into routine pathological assessment for breast cancer prognosis across diverse clinical settings.
△ Less
Submitted 23 September, 2024;
originally announced September 2024.
-
Machine Learning-Based Analysis of Ebola Virus' Impact on Gene Expression in Nonhuman Primates
Authors:
Mostafa Rezapour,
Muhammad Khalid Khan Niazi,
Hao Lu,
Aarthi Narayanan,
Metin Nafi Gurcan
Abstract:
This study introduces the Supervised Magnitude-Altitude Scoring (SMAS) methodology, a machine learning-based approach, for analyzing gene expression data obtained from nonhuman primates (NHPs) infected with Ebola virus (EBOV). We utilize a comprehensive dataset of NanoString gene expression profiles from Ebola-infected NHPs, deploying the SMAS system for nuanced host-pathogen interaction analysis.…
▽ More
This study introduces the Supervised Magnitude-Altitude Scoring (SMAS) methodology, a machine learning-based approach, for analyzing gene expression data obtained from nonhuman primates (NHPs) infected with Ebola virus (EBOV). We utilize a comprehensive dataset of NanoString gene expression profiles from Ebola-infected NHPs, deploying the SMAS system for nuanced host-pathogen interaction analysis. SMAS effectively combines gene selection based on statistical significance and expression changes, employing linear classifiers such as logistic regression to accurately differentiate between RT-qPCR positive and negative NHP samples. A key finding of our research is the identification of IFI6 and IFI27 as critical biomarkers, demonstrating exceptional predictive performance with 100% accuracy and Area Under the Curve (AUC) metrics in classifying various stages of Ebola infection. Alongside IFI6 and IFI27, genes, including MX1, OAS1, and ISG15, were significantly upregulated, highlighting their essential roles in the immune response to EBOV. Our results underscore the efficacy of the SMAS method in revealing complex genetic interactions and response mechanisms during EBOV infection. This research provides valuable insights into EBOV pathogenesis and aids in developing more precise diagnostic tools and therapeutic strategies to address EBOV infection in particular and viral infection in general.
△ Less
Submitted 22 January, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
Machine Learning Based Analytics for the Significance of Gait Analysis in Monitoring and Managing Lower Extremity Injuries
Authors:
Mostafa Rezapour,
Rachel B. Seymour,
Stephen H. Sims,
Madhav A. Karunakar,
Nahir Habet,
Metin Nafi Gurcan
Abstract:
This study explored the potential of gait analysis as a tool for assessing post-injury complications, e.g., infection, malunion, or hardware irritation, in patients with lower extremity fractures. The research focused on the proficiency of supervised machine learning models predicting complications using consecutive gait datasets. We identified patients with lower extremity fractures at an academi…
▽ More
This study explored the potential of gait analysis as a tool for assessing post-injury complications, e.g., infection, malunion, or hardware irritation, in patients with lower extremity fractures. The research focused on the proficiency of supervised machine learning models predicting complications using consecutive gait datasets. We identified patients with lower extremity fractures at an academic center. Patients underwent gait analysis with a chest-mounted IMU device. Using software, raw gait data was preprocessed, emphasizing 12 essential gait variables. Machine learning models including XGBoost, Logistic Regression, SVM, LightGBM, and Random Forest were trained, tested, and evaluated. Attention was given to class imbalance, addressed using SMOTE. We introduced a methodology to compute the Rate of Change (ROC) for gait variables, independent of the time difference between gait analyses. XGBoost was the optimal model both before and after applying SMOTE. Prior to SMOTE, the model achieved an average test AUC of 0.90 (95% CI: [0.79, 1.00]) and test accuracy of 86% (95% CI: [75%, 97%]). Feature importance analysis attributed importance to the duration between injury and gait analysis. Data patterns showed early physiological compensations, followed by stabilization phases, emphasizing prompt gait analysis. This study underscores the potential of machine learning, particularly XGBoost, in gait analysis for orthopedic care. Predicting post-injury complications, early gait assessment becomes vital, revealing intervention points. The findings support a shift in orthopedics towards a data-informed approach, enhancing patient outcomes.
△ Less
Submitted 27 September, 2023;
originally announced September 2023.
-
Cross-attention-based saliency inference for predicting cancer metastasis on whole slide images
Authors:
Ziyu Su,
Mostafa Rezapour,
Usama Sajjad,
Shuo Niu,
Metin Nafi Gurcan,
Muhammad Khalid Khan Niazi
Abstract:
Although multiple instance learning (MIL) methods are widely used for automatic tumor detection on whole slide images (WSI), they suffer from the extreme class imbalance within the small tumor WSIs. This occurs when the tumor comprises only a few isolated cells. For early detection, it is of utmost importance that MIL algorithms can identify small tumors, even when they are less than 1% of the siz…
▽ More
Although multiple instance learning (MIL) methods are widely used for automatic tumor detection on whole slide images (WSI), they suffer from the extreme class imbalance within the small tumor WSIs. This occurs when the tumor comprises only a few isolated cells. For early detection, it is of utmost importance that MIL algorithms can identify small tumors, even when they are less than 1% of the size of the WSI. Existing studies have attempted to address this issue using attention-based architectures and instance selection-based methodologies, but have not yielded significant improvements. This paper proposes cross-attention-based salient instance inference MIL (CASiiMIL), which involves a novel saliency-informed attention mechanism, to identify breast cancer lymph node micro-metastasis on WSIs without the need for any annotations. Apart from this new attention mechanism, we introduce a negative representation learning algorithm to facilitate the learning of saliency-informed attention weights for improved sensitivity on tumor WSIs. The proposed model outperforms the state-of-the-art MIL methods on two popular tumor metastasis detection datasets, and demonstrates great cross-center generalizability. In addition, it exhibits excellent accuracy in classifying WSIs with small tumor lesions. Moreover, we show that the proposed model has excellent interpretability attributed to the saliency-informed attention weights. We strongly believe that the proposed method will pave the way for training algorithms for early tumor detection on large datasets where acquiring fine-grained annotations is practically impossible.
△ Less
Submitted 17 September, 2023;
originally announced September 2023.
-
Attention2Minority: A salient instance inference-based multiple instance learning for classifying small lesions in whole slide images
Authors:
Ziyu Su,
Mostafa Rezapour,
Usama Sajjad,
Metin Nafi Gurcan,
Muhammad Khalid Khan Niazi
Abstract:
Multiple instance learning (MIL) models have achieved remarkable success in analyzing whole slide images (WSIs) for disease classification problems. However, with regard to gigapixel WSI classification problems, current MIL models are often incapable of differentiating a WSI with extremely small tumor lesions. This minute tumor-to-normal area ratio in a MIL bag inhibits the attention mechanism fro…
▽ More
Multiple instance learning (MIL) models have achieved remarkable success in analyzing whole slide images (WSIs) for disease classification problems. However, with regard to gigapixel WSI classification problems, current MIL models are often incapable of differentiating a WSI with extremely small tumor lesions. This minute tumor-to-normal area ratio in a MIL bag inhibits the attention mechanism from properly weighting the areas corresponding to minor tumor lesions. To overcome this challenge, we propose salient instance inference MIL (SiiMIL), a weakly-supervised MIL model for WSI classification. Our method initially learns representations of normal WSIs, and it then compares the normal WSIs representations with all the input patches to infer the salient instances of the input WSI. Finally, it employs attention-based MIL to perform the slide-level classification based on the selected patches of the WSI. Our experiments imply that SiiMIL can accurately identify tumor instances, which could only take up less than 1% of a WSI, so that the ratio of tumor to normal instances within a bag can increase by two to four times. It is worth mentioning that it performs equally well for large tumor lesions. As a result, SiiMIL achieves a significant improvement in performance over the state-of-the-art MIL methods.
△ Less
Submitted 11 December, 2023; v1 submitted 18 January, 2023;
originally announced January 2023.
-
A transformer-based deep learning approach for classifying brain metastases into primary organ sites using clinical whole brain MRI
Authors:
Qing Lyu,
Sanjeev V. Namjoshi,
Emory McTyre,
Umit Topaloglu,
Richard Barcus,
Michael D. Chan,
Christina K. Cramer,
Waldemar Debinski,
Metin N. Gurcan,
Glenn J. Lesser,
Hui-Kuan Lin,
Reginald F. Munden,
Boris C. Pasche,
Kiran Kumar Solingapuram Sai,
Roy E. Strowd,
Stephen B. Tatter,
Kounosuke Watabe,
Wei Zhang,
Ge Wang,
Christopher T. Whitlow
Abstract:
Treatment decisions for brain metastatic disease rely on knowledge of the primary organ site, and currently made with biopsy and histology. Here we develop a novel deep learning approach for accurate non-invasive digital histology with whole-brain MRI data. Our IRB-approved single-site retrospective study was comprised of patients (n=1,399) referred for MRI treatment-planning and gamma knife radio…
▽ More
Treatment decisions for brain metastatic disease rely on knowledge of the primary organ site, and currently made with biopsy and histology. Here we develop a novel deep learning approach for accurate non-invasive digital histology with whole-brain MRI data. Our IRB-approved single-site retrospective study was comprised of patients (n=1,399) referred for MRI treatment-planning and gamma knife radiosurgery over 21 years. Contrast-enhanced T1-weighted and T2-weighted Fluid-Attenuated Inversion Recovery brain MRI exams (n=1,582) were preprocessed and input to the proposed deep learning workflow for tumor segmentation, modality transfer, and primary site classification into one of five classes. Ten-fold cross-validation generated overall AUC of 0.878 (95%CI:0.873,0.883), lung class AUC of 0.889 (95%CI:0.883,0.895), breast class AUC of 0.873 (95%CI:0.860,0.886), melanoma class AUC of 0.852 (95%CI:0.842,0.862), renal class AUC of 0.830 (95%CI:0.809,0.851), and other class AUC of 0.822 (95%CI:0.805,0.839). These data establish that whole-brain imaging features are discriminative to allow accurate diagnosis of the primary organ site of malignancy. Our end-to-end deep radiomic approach has great potential for classifying metastatic tumor types from whole-brain MRI images. Further refinement may offer an invaluable clinical tool to expedite primary cancer site identification for precision treatment and improved outcomes.
△ Less
Submitted 20 April, 2022; v1 submitted 7 October, 2021;
originally announced October 2021.
-
Radiopathomics: Integration of radiographic and histologic characteristics for prognostication in glioblastoma
Authors:
Saima Rathore,
Muhammad A. Iftikhar,
Metin N. Gurcan,
Zissimos Mourelatos
Abstract:
Both radiographic (Rad) imaging, such as multi-parametric magnetic resonance imaging, and digital pathology (Path) images captured from tissue samples are currently acquired as standard clinical practice for glioblastoma tumors. Both these data streams have been separately used for diagnosis and treatment planning, despite the fact that they provide complementary information. In this research work…
▽ More
Both radiographic (Rad) imaging, such as multi-parametric magnetic resonance imaging, and digital pathology (Path) images captured from tissue samples are currently acquired as standard clinical practice for glioblastoma tumors. Both these data streams have been separately used for diagnosis and treatment planning, despite the fact that they provide complementary information. In this research work, we aimed to assess the potential of both Rad and Path images in combination and comparison. An extensive set of engineered features was extracted from delineated tumor regions in Rad images, comprising T1, T1-Gd, T2, T2-FLAIR, and 100 random patches extracted from Path images. Specifically, the features comprised descriptors of intensity, histogram, and texture, mainly quantified via gray-level-co-occurrence matrix and gray-level-run-length matrices. Features extracted from images of 107 glioblastoma patients, downloaded from The Cancer Imaging Archive, were run through support vector machine for classification using leave-one-out cross-validation mechanism, and through support vector regression for prediction of continuous survival outcome. The Pearson correlation coefficient was estimated to be 0.75, 0.74, and 0.78 for Rad, Path and RadPath data. The area-under the receiver operating characteristic curve was estimated to be 0.74, 0.76 and 0.80 for Rad, Path and RadPath data, when patients were discretized into long- and short-survival groups based on average survival cutoff. Our results support the notion that synergistically using Rad and Path images may lead to better prognosis at the initial presentation of the disease, thereby facilitating the targeted enrollment of patients into clinical trials.
△ Less
Submitted 19 September, 2019; v1 submitted 17 September, 2019;
originally announced September 2019.