Bayesian Graphical Compositional Regression for Microbiome Data
An important task in microbiome studies is to test for the existence of, and to characterize, differences in microbiome composition across groups of samples. Important challenges in this problem include the large within-group heterogeneity among samples and the existence of potential confounding variables that, when ignored, increase the chance of false discoveries and reduce the power to identify true differences. We propose a probabilistic framework to overcome these issues by combining three ideas: (i) a phylogenetic tree-based decomposition of the cross-group comparison problem into a series of local tests, (ii) a graphical model that links the local tests to allow information sharing across taxa, and (iii) a Bayesian testing strategy that incorporates covariates and integrates out the within-group variation, avoiding potentially unstable point estimates. With the proposed method, we analyze the American Gut data to compare the gut microbiome composition of groups of participants with different dietary habits. Our analysis shows that (i) the frequencies of consuming fruit, seafood, vegetables, and whole grains are closely related to the gut microbiome composition and (ii) the conclusions of the analysis can change drastically when different sets of relevant covariates are adjusted for, indicating the necessity of carefully selecting and including possible confounders when comparing microbiome compositions with data from observational studies. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.