-
A Single Visualization Technique for Displaying Multiple Metabolite-Phenotype Associations
Authors:
Mir Henglin,
Teemu Niiranen,
Jeramie D. Watrous,
Kim A. Lehmann,
Joseph Antonelli,
Brian L. Claggett,
Emmanuella J. Demosthenes,
Beatrice von Jeinsen,
Olga Demler,
Ramachandran S. Vasan,
Martin G. Larson,
Mohit Jain,
Susan Cheng
Abstract:
More advanced visualization tools are needed to assist with the analyses and interpretation of human metabolomics data, which are rapidly increasing in quantity and complexity. Using a dataset of several hundred bioactive lipid metabolites profiled in a cohort of over 1400 individuals sampled from a population-based community study, we performed a comprehensive set of association analyses relating all metabolites with eight demographic and cardiometabolic traits and outcomes. We then compared existing graphical approaches with an adapted rain plot approach to display the results of these analyses. The rain plot combines the features of a raindrop plot and a parallel heatmap approach to succinctly convey, in a single visualization, the results of relating complex metabolomics data with multiple phenotypes. This approach complements existing tools, particularly by facilitating comparisons between individual metabolites and across a range of pre-specified clinical outcomes. We anticipate that this single visualization technique may be further extended and applied to alternate study designs using different types of molecular phenotyping data.
Submitted 12 October, 2017;
originally announced October 2017.
-
Quantitative Comparison of Statistical Methods for Analyzing Human Metabolomics Data
Authors:
Brian L. Claggett,
Joseph Antonelli,
Mir Henglin,
Jeramie D. Watrous,
Kim A. Lehmann,
Gabriel Musso,
Andrew Correia,
Sivani Jonnalagadda,
Olga V. Demler,
Ramachandran S. Vasan,
Martin G. Larson,
Mohit Jain,
Susan Cheng
Abstract:
Background. Emerging technologies now allow for mass spectrometry-based profiling of up to thousands of small molecule metabolites (metabolomics) in an increasing number of biosamples. While offering great promise for revealing insight into the pathogenesis of human disease, standard approaches have yet to be established for statistically analyzing increasingly complex, high-dimensional human metabolomics data in relation to clinical phenotypes including disease outcomes. To determine optimal statistical approaches for metabolomics analysis, we sought to formally compare traditional statistical methods as well as newer statistical learning methods across a range of metabolomics dataset types. Results. In simulated and experimental metabolomics data derived from large population-based human cohorts, we observed that with an increasing number of study subjects, univariate methods, compared to multivariate methods, resulted in a higher false discovery rate due to substantial correlations among metabolites. In scenarios wherein the number of assayed metabolites increases, as in the application of nontargeted versus targeted metabolomics measures, multivariate methods performed especially favorably across a range of statistical operating characteristics. In nontargeted metabolomics datasets that included thousands of metabolite measures, sparse multivariate models demonstrated greater selectivity and lower potential for spurious relationships. Conclusion. When the number of metabolites was similar to or exceeded the number of study subjects, as is common with nontargeted metabolomics analysis of relatively small cohorts, sparse multivariate models exhibited the most robust statistical power with more consistent results. These findings have important implications for the analysis of metabolomics studies of human disease.
Submitted 20 February, 2018; v1 submitted 10 October, 2017;
originally announced October 2017.
-
Statistical Methods and Workflow for Analyzing Human Metabolomics Data
Authors:
Joseph Antonelli,
Brian Claggett,
Mir Henglin,
Jeramie D. Watrous,
Kim A. Lehmann,
Pavel Hushcha,
Olga Demler,
Samia Mora,
Teemu Niiranen,
Alexandre C. Pereira,
Mohit Jain,
Susan Cheng
Abstract:
High-throughput metabolomics investigations, when conducted in large human cohorts, represent a potentially powerful tool for elucidating the biochemical diversity and mechanisms underlying human health and disease. Large-scale metabolomics data, generated using targeted or nontargeted platforms, are increasingly common. Appropriate statistical analysis of these complex high-dimensional data is critical for extracting meaningful results from such large-scale human metabolomics studies. Herein, we consider the main statistical analytical approaches that have been employed in human metabolomics studies. Based on the lessons learned and collective experience to date in the field, we propose a step-by-step framework for pursuing statistical analyses of human metabolomics data. We discuss the range of options and potential approaches that may be employed at each stage of data management, analysis, and interpretation, and offer guidance on analytical considerations that are important for implementing an analysis workflow. Certain pervasive analytical challenges facing human metabolomics warrant ongoing research. Addressing these challenges will allow for more standardization in the field and lead to analytical advances in metabolomics investigations with the potential to elucidate novel mechanisms underlying human health and disease.
Submitted 20 February, 2018; v1 submitted 10 October, 2017;
originally announced October 2017.