I’m used to looking at differential abundances in terms of p-values, like you would get from DESeq2 via phyloseq, or from group_significance_Type.txt in qiime 1 core diversity.
Now I’m trying to get used to ANCOM, based on the qiime 2 tutorials.
Is it possible to set a W value that is considered significant, like you would with alpha/p-value, either during the
qiime composition ancom
or within the web viewer?
If not, does qiime 2 still offer a t-test or anova-based differential abundance plugin?
Differential abundance is a super touchy topic – there are hundreds of tools out there to do this, and all of them have their own set of assumptions and weaknesses.
The threshold for the W value is determined automatically (see the ANCOM paper), so the hypothesis rejection process is a bit hidden from the user. So no, you cannot set a threshold for the W value.
We used to support a bunch of differential abundance tests, such as the t-test and anova, in group_significance.py in qiime1. I’ve been a little on edge about this, since it is really easy to come up with super simple scenarios where the false positive rate is close to 100% (see this benchmark for an example).
But it may be worthwhile to revisit this and bring this functionality back into qiime2. At the very least, it could provide a baseline for benchmarking new differential abundance techniques.
I forgot to mention: there is actually already a qiime2 plugin that offers the conventional differential abundance techniques.
That plugin can be found here. It’s a bit more robust than the techniques found in group_significance.py, since it accounts for the discretized nature of the sequencing counts using a permutation FDR approach.
Could you help clarify the W values? In my data set the null hypothesis is generally rejected, unless the W value is around/above 60. Thus, should I infer that the lower the W value, the more likely that species is not normally distributed between my categories?
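ANCOM tests, for every pair of species i and j, a null hypothesis of the form
H_{0(ij)}: mean(log(x_i / x_j)) = mean(log(y_i / y_j))
where x_i denotes the abundances of the ith species in the samples from group x, x_j denotes the abundances of the jth species in those same samples,
and y_i and y_j denote the corresponding abundances of the ith and jth species in the samples from group y.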
The W value is just a count of the number of times H_{0(ij)} is rejected for the ith species.
So if you have 1000 species and W=60 for OTU k, then H_{0(kj)} was rejected for 60 of the other species j. This basically means that the ratios between OTU k and 60 other species were detected to be significantly different across the x and y groups. In other words, the higher the W value, the more ratios involving that species differ between the groups.
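To make the counting concrete, here is a minimal sketch of how W could be computed on a toy table. It is not the q2-composition implementation: the function names (perm_pvalue, ancom_w), the toy counts, the pseudocount of 1, and the exact permutation test used for each pair are all assumptions chosen for illustration. Real ANCOM implementations typically also correct for multiple comparisons and pick the W cutoff from the empirical distribution of W, which this sketch leaves out.

```python
from itertools import combinations
from math import log


def perm_pvalue(a, b):
    """Exact two-sided permutation p-value for a difference in means.

    Assumption: this stands in for whatever per-pair test a real
    ANCOM implementation uses.
    """
    pooled = a + b
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = abs(sum(a) / n_a - sum(b) / n_b)
    hits = 0
    splits = 0
    # Enumerate every way of assigning n_a of the pooled values to group a.
    for idx in combinations(range(len(pooled)), n_a):
        s = sum(pooled[k] for k in idx)
        diff = abs(s / n_a - (total - s) / n_b)
        if diff >= observed - 1e-12:
            hits += 1
        splits += 1
    return hits / splits


def ancom_w(x, y, alpha=0.05, pseudocount=1):
    """Count, for each species, how often H_0(ij) is rejected.

    x and y are lists of samples; each sample is a list of counts,
    one per species. No multiple-testing correction is applied here.
    """
    n_species = len(x[0])
    w = [0] * n_species
    for i in range(n_species):
        for j in range(i + 1, n_species):
            # Log ratio of species i to species j in every sample.
            lx = [log((s[i] + pseudocount) / (s[j] + pseudocount)) for s in x]
            ly = [log((s[i] + pseudocount) / (s[j] + pseudocount)) for s in y]
            if perm_pvalue(lx, ly) < alpha:
                # A rejection counts toward both species in the pair.
                w[i] += 1
                w[j] += 1
    return w


# Hypothetical toy data: 4 samples per group, 4 species.
# Only species 0 changes between the groups.
x_counts = [[10, 20, 30, 40], [12, 22, 28, 41], [11, 19, 31, 39], [9, 21, 29, 42]]
y_counts = [[100, 21, 29, 40], [120, 20, 30, 41], [110, 22, 31, 38], [90, 19, 28, 43]]

w = ancom_w(x_counts, y_counts)
print(w)  # [3, 1, 1, 1]
```

In this toy table, species 0 gets W=3 because its ratio to every other species shifts between the groups, while each of the other species gets W=1 because only its ratio to species 0 shifts. That is why a high W, rather than a low one, points to the species that actually changed.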