Dear friends,
I have two questions about using QIIME 2 for diversity analysis of soil samples. I am aware that “general solutions” are rarely realistic, but any comment from this great forum would be highly appreciated!
In general, typical studies of soil microbiomes involve “noisy” situations such as (among others): very high alpha diversity (thousands of ASVs in a single 0.25 g sample), high natural beta diversity (considerable differences in microbial composition among samples of the same “treatment”, i.e. among “biological replicates”), and of course uneven (> 10x) sample sizes due to complications with PCR amplification in specific samples. Besides, experimental designs are often complex (not typical treatment vs. control situations) and might include several factors with interactions, as well as random factors (block designs). Taking all this into account, I sometimes wonder whether statistical methods that were primarily designed and tested for gut/human/clinical samples can be directly applied to environmental research. In this context:
Q1. About “normalization”: I just discovered the new q2-repeat-rarefy (GitHub - yxia0125/q2-repeat-rarefy). Thanks for the new tool! Repeating the rarefaction sounds like a conceptually good idea for soil samples, considering that many low-abundance and perhaps meaningful ASVs would disappear in a “single shot” rarefaction. Would you recommend this approach over the “traditional” one (single rarefaction)? If so, how can we feed the output of q2-repeat-rarefy into the qiime diversity core-metrics plugin? (I guess that, so far, core-metrics performs a single rarefaction, right?)
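To make Q1 concrete, here is a minimal sketch of what I understand by “repeated rarefaction”. It is plain NumPy and is not taken from q2-repeat-rarefy, so I do not claim the plugin works this way. The function names (`rarefy_once`, `repeat_rarefy`), the toy table, the depth and the number of iterations are all hypothetical, and I am assuming that the repeated tables are simply averaged.

```python
import numpy as np


def rarefy_once(counts, depth, rng):
    """Subsample one sample's ASV counts to `depth` reads without replacement."""
    counts = np.asarray(counts, dtype=np.int64)
    if counts.sum() < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    # Drawing reads without replacement is a multivariate hypergeometric draw.
    return rng.multivariate_hypergeometric(counts, depth)


def repeat_rarefy(table, depth, n_iter=100, seed=0):
    """Average `n_iter` independent rarefactions of a samples x ASVs table.

    Samples with fewer than `depth` reads are dropped, as in a single
    rarefaction. Averaging is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    table = np.asarray(table, dtype=np.int64)
    kept = table[table.sum(axis=1) >= depth]
    total = np.zeros(kept.shape, dtype=float)
    for _ in range(n_iter):
        total += np.array([rarefy_once(row, depth, rng) for row in kept])
    return total / n_iter


# Hypothetical toy table: 3 samples (rows) x 4 ASVs (columns).
toy_table = [
    [500, 300, 200, 0],  # 1000 reads
    [50, 30, 15, 5],     # 100 reads
    [10, 5, 3, 2],       # 20 reads, below the depth, so it is dropped
]
mean_table = repeat_rarefy(toy_table, depth=100, n_iter=50, seed=1)
```

In this sketch the averaged table contains fractional counts, which is part of my question: I am not sure whether such a table can (or should) be passed on to downstream diversity metrics that expect integer counts.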
Q2. About differential abundance analysis: I know this is a complicated topic with lots of discussion, but would you recommend (or directly discard) any of the available methods in the case of soil samples? If I understand correctly, both ANCOM and the more recent ANCOM-BC (GitHub - FrederickHuangLin/ANCOMBC: Differential abundance (DA) analysis for microbial absolute abundance data) might be a good choice for finding “responsive” ASVs among groups of samples (say, warmed vs. control soil samples), but neither allows interaction terms between factors (like temperature x humidity). DESeq2 allows those more complicated designs, but I read in this forum that it is no longer recommended for microbiome analysis because of its large number of false positives. I did not find any method that allows random effects.
THANKS!