Compositionality handling in q2-longitudinal


This question is primarily directed towards @Nicholas_Bokulich, but anyone is welcome to "qiime" in. I am following a workflow similar to the Bokulich et al. 2018 q2-longitudinal paper, in which the feature-volatility function is used to find important features, and those features are then tested statistically with LME models. When the relative abundance of a taxon is used in an LME model, how is the compositionality problem addressed, in which an increase in one taxon leads to a decrease in another? I know we can't run analyses such as an ANOVA directly on relative abundances because of this, so I am curious how LMEs on relative abundance handle it.

In short: why is it okay to just use relative abundances in LME models?

Thank you very much!

Hi @Zach_Burcham,
LME is not inherently compositionally aware, so it suffers from this same flaw when dealing with relative abundances (e.g., a longitudinal change could just be a displacement effect). You could address this with an appropriate transformation (e.g., CLR) prior to LME. Nevertheless, LME without transformation is used a bit in the literature in an "exploratory" fashion for examining linear temporal changes in the relative abundances of specific taxa, as in my 2018 paper. Such an approach is more for hypothesis generation, e.g., prior to using something like qPCR to confirm changes in the absolute abundance of functionally important species (this was not done in the 2018 paper because it was a software announcement using old data for demonstration purposes, not a "real" experiment).
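To make the displacement effect concrete, here is a minimal sketch in plain Python. The counts and the `relative_abundance` helper are hypothetical, made up purely for illustration, and are not part of q2-longitudinal. Only taxon A changes in absolute terms, yet the relative abundances of B and C fall.

```python
def relative_abundance(counts):
    """Convert absolute counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical absolute abundances for taxa A, B, C at two time points.
# Only taxon A increases; B and C are unchanged in absolute terms.
t0 = [100, 100, 100]
t1 = [400, 100, 100]

rel_t0 = relative_abundance(t0)  # each taxon is 1/3 of the sample
rel_t1 = relative_abundance(t1)  # A is 4/6; B and C are 1/6 each

# B appears to decrease over time even though its absolute count is constant.
b_change = rel_t1[1] - rel_t0[1]
print(f"Apparent change in B's relative abundance: {b_change:.3f}")
```

A model fit to B's relative abundance alone would see a decline here, which is why such a result is better treated as a hypothesis to confirm than as a finding.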

The same goes for the feature-volatility action. It is an exploratory method, useful for finding interesting associations, but it does not perform a hypothesis test, and (unless you are using absolute abundances or transformed data) it will be prone to compositionality issues with relative frequency data.

So to summarize: q2-longitudinal just exposes the methods, not the transforms. Data can in theory be transformed and then plugged into q2-longitudinal to perform compositionally aware tests (e.g., LME on CLR-transformed counts). Without that, q2-longitudinal is best used for non-compositional data (e.g., diversity metrics, metadata, and other continuous data) or exploratory analysis of relative abundance data.
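As a rough illustration of the transform step mentioned above, here is a minimal CLR sketch in plain Python. The `clr` function and its `pseudocount` default are assumptions made for this example (a pseudocount is one common way to handle zeros, not the only one), and nothing here is a q2-longitudinal API. Each value is the log of a count minus the mean log across the sample, so it is expressed relative to the sample's geometric mean.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's counts.

    The pseudocount (an assumed choice) is added so that zeros can be logged.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]

# Hypothetical counts for four taxa in one sample, including a zero.
sample = [120, 30, 0, 50]
clr_values = clr(sample)

# CLR values for a sample always sum to (approximately) zero.
print(sum(clr_values))

# Without a pseudocount, CLR is unchanged by sequencing depth:
# scaling every count by 10 gives the same values.
shallow = clr([10, 20, 40], pseudocount=0)
deep = clr([100, 200, 400], pseudocount=0)
```

The transformed value for a taxon of interest could then, in principle, be used as the response variable in an LME in place of its relative abundance.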

