Multidimensional scaling informed by $F$-statistic: Visualizing grouped microbiome data with inference
Authors:
Hyungseok Kim,
Soobin Kim,
Jeffrey A. Kimbrel,
Megan M. Morris,
Xavier Mayali,
Cullen R. Buie
Abstract:
Multidimensional scaling (MDS) is a dimensionality reduction technique used in microbial ecology data analysis to represent multivariate structure while preserving pairwise distances between samples. Although refinements of MDS have improved its ability to reveal data patterns across sample groups, these MDS-based methods require prior assumptions for inference, which limits their application in general microbiome analysis. In this study, we introduce a new MDS-based ordination, $F$-informed MDS, which configures the data distribution based on the $F$-statistic, the ratio of dispersion between groups sharing common and different characteristics. Using simulated compositional datasets, we demonstrate that the proposed method is robust to hyperparameter selection while maintaining statistical significance throughout the ordination process. Various quality metrics for evaluating dimensionality reduction confirm that $F$-informed MDS is comparable to state-of-the-art methods in preserving both local and global data structures. Its application to a diatom-associated bacterial community suggests a role for the method in interpreting the community response to the host. Our approach offers a well-founded refinement of MDS that aligns with statistical test results, which can benefit broader compositional data analyses in microbiology and ecology and can be incorporated into standard microbiome data analyses.
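As a rough illustration of the kind of $F$-statistic the abstract refers to, the sketch below computes a PERMANOVA-style pseudo-$F$ directly from a pairwise distance matrix and group labels, together with a permutation p-value. Both the choice of the PERMANOVA pseudo-$F$ and the function names `pseudo_f` and `permutation_p_value` are assumptions made for illustration; this is not the authors' implementation of $F$-informed MDS.

```python
import numpy as np


def pseudo_f(dist, labels):
    """PERMANOVA-style pseudo-F from a square distance matrix (illustrative sketch)."""
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    n = dist.shape[0]
    groups = np.unique(labels)
    a = len(groups)
    sq = dist ** 2

    # Total sum of squares: squared distances over all sample pairs, divided by N.
    ss_total = sq[np.triu_indices(n, k=1)].sum() / n

    # Within-group sum of squares: the same quantity computed per group.
    ss_within = 0.0
    for g in groups:
        idx = np.flatnonzero(labels == g)
        sub = sq[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)

    # Among-group sum of squares is what remains.
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permutation_p_value(dist, labels, n_perm=999, seed=0):
    """Fraction of label permutations whose pseudo-F is at least the observed one."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = pseudo_f(dist, labels)
    hits = sum(
        pseudo_f(dist, rng.permutation(labels)) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


# Toy data (made up): six samples on a line, two well-separated groups.
x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
toy_dist = np.abs(x[:, None] - x[None, :])
toy_labels = np.array(["A", "A", "A", "B", "B", "B"])
toy_f = pseudo_f(toy_dist, toy_labels)
toy_p = permutation_p_value(toy_dist, toy_labels, n_perm=199)
```

For one-dimensional Euclidean distances such as the toy data, this pseudo-$F$ coincides with the classical one-way ANOVA $F$, which makes the function easy to sanity-check before applying it to ecological dissimilarities.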
Submitted 17 April, 2025; v1 submitted 1 August, 2023;
originally announced August 2023.