Running a Mantel test on each taxon

If I want to find out which gut taxa are associated with a continuous biological variable of interest (e.g. serum data), is it a proper use of the Mantel test to compute a distance matrix for each individual taxon and correlate it against the distance matrix of the response variable? Or am I better off running an all-vs-all Spearman correlation on the abundance data itself (rarefied counts or relative abundances), rather than on distance matrices, against the values of the variable of interest?

Hi @ange,

The short answer is: no and no. There are dedicated differential abundance methods for finding associations between species count data and variables of interest. See q2-aldex2, q2-songbird, q2-ancombc, and q2-composition for a few examples (the last one only for categorical variables). There is also quite a bit of literature on the need for compositionally aware methods for microbiome count data, and specifically on the problem with correlating compositional data: the measurements are not independent of one another, which breaks conventional correlation tests.
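To make the non-independence point concrete, here is a minimal sketch, assuming NumPy and purely simulated data (the two "taxa", the sample size, and the lognormal parameters are all made up for illustration). It takes the most extreme case, a community of only two taxa whose absolute abundances are drawn independently, and shows that once the data are converted to relative abundances the two taxa are forced into a perfect negative correlation, simply because each sample must sum to 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical absolute abundances for two taxa across 50 samples.
# The two columns are drawn independently, so there is no real association.
n_samples = 50
absolute = rng.lognormal(mean=5.0, sigma=1.0, size=(n_samples, 2))

# Closure: convert to relative abundances so that each sample sums to 1.
relative = absolute / absolute.sum(axis=1, keepdims=True)

# Pearson correlation between the two taxa, before and after closure.
r_absolute = np.corrcoef(absolute[:, 0], absolute[:, 1])[0, 1]
r_relative = np.corrcoef(relative[:, 0], relative[:, 1])[0, 1]

print(f"correlation of absolute abundances: {r_absolute:.3f}")
print(f"correlation of relative abundances: {r_relative:.3f}")
```

With two taxa, the relative abundance of one is exactly 1 minus the other, so the second correlation comes out at -1 no matter what the underlying biology is. Real communities have many more taxa and the distortion is less obvious, but the same constraint is still there.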

Mantel and all-vs-all Spearman correlations are not appropriate for this data type, because it violates several assumptions of those tests. (I am actually not totally sure about Mantel's assumptions, but the test is meant to operate on distance matrices, not on raw counts.)

Likewise, for building something like an association network (which essentially looks at all-vs-all correlations), there is a similarly long literature, and there are specialized methods such as SpiecEasi for correlating microbiome data. Spearman and Pearson correlation tests should not be used for this; if they are, they will have a very high false-positive rate.
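For a rough idea of what "compositionally aware" means in practice: as far as I know, several of these methods work on log-ratios rather than on raw counts or proportions, and the centered log-ratio (CLR) transform is a common starting point. Below is a minimal sketch of CLR, assuming NumPy; the function name `clr`, the toy count table, and the pseudocount of 1 are my own illustrative choices, not what any particular plugin does internally.

```python
import numpy as np

def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of a samples x taxa count table.

    The pseudocount is an assumption made here to avoid log(0);
    real tools handle zeros in their own, often more careful, ways.
    """
    counts = np.asarray(counts, dtype=float)
    logged = np.log(counts + pseudocount)
    # Subtracting the per-sample mean of the logs is the same as dividing
    # each value by the sample's geometric mean before taking the log.
    return logged - logged.mean(axis=1, keepdims=True)

# Toy table: 3 samples (rows) x 3 taxa (columns), made-up counts.
counts = np.array([
    [10, 0, 90],
    [5, 5, 40],
    [0, 20, 80],
])

clr_table = clr(counts)
print(clr_table.round(3))
```

Each row of the CLR table sums to zero, so the values describe each taxon relative to the rest of its sample rather than as a free-standing measurement. This is only the transform; it is not a substitute for the full methods mentioned above.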

I recommend reading the papers associated with the plugins listed above to learn about some of the options.