Hi everyone,
Happy new year!
I have a couple of fairly general (sets of) questions or issues I’ve been mulling over. I will try to post them as separate topics to make the forum easier to search. Here is the first one:
I have recently been running qiime diversity beta-group-significance (using both the permanova and anosim methods) on a dataset and have been thinking about how to decide whether a given category is ‘important’ or has enough explanatory power to be discussed.
I also came across this thread on the Qiime 1 forum which discussed the difference between the adonis and permanova tests, both of which were included in the equivalent command in Qiime 1 (group_significance.py). At the time @jairideout confirmed that the two tests are very similar but that adonis was preferred in Qiime 1 because:
“Adonis is a more robust version of PERMANOVA because it can handle numeric variables (i.e. mapping file categories/columns) in addition to categorical variables. I’ve also found that Adonis results are easier to interpret because an R^2 value is given as part of the output, whereas with PERMANOVA you only get a pseudo-F statistic.”
These are my questions:

1. I don’t have a lot of experience in statistics, so I feel pretty out of my depth, and I’m not sure I really understand the difference between the pseudo-F and R-squared that Jai referenced in the linked thread. My understanding is that R-squared in adonis corresponds to the percentage of variation in the distance matrix explained by the variable, and that pseudo-F is something else, but I’m not sure what, so I’m not sure how to interpret my pseudo-F statistic results. Is the pseudo-F a measure of the difference in variance between the groups? If so, I agree that it seems less useful for interpretation than the R-squared value. Can we calculate the R-squared value ourselves, maybe, or could anyone explain what information the pseudo-F statistic contains, to help me better interpret my data?

2. Why, given the advantages mentioned in the linked thread, was adonis dropped in favour of permanova in Qiime 2?

3. Are we, as a community, any closer to having a consensus on what constitutes a large enough effect size to consider a variable potentially important in influencing community composition? For example, I work on oral microbiomes (dental plaque and calculus samples), and with the adonis test in Qiime 1 I would usually treat a variable that explained 5% or more of the variation as potentially interesting, because we rarely got any variables explaining more than 20-odd percent of the variation. I would love to know how others are approaching this in Qiime 2. I’m waiting until I understand the pseudo-F statistic better to decide!
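On the "can we calculate the R-squared value ourselves" part of the first question: as far as I can tell, for a single categorical variable the pseudo-F and R-squared are built from the same sums of squared distances, so one can be converted into the other if you know the number of samples and the number of groups. Here is a minimal sketch of what I mean, in plain Python. The function names and the toy distance matrix are made up by me for illustration (they are not part of Qiime or any library), and I am assuming the standard one-way PERMANOVA partitioning, so please correct me if this is wrong:

```python
from itertools import combinations


def permanova_stats(dist, groups):
    """Return (pseudo_F, r_squared) for a one-way design.

    dist:   square, symmetric list of lists of pairwise distances
    groups: group label for each sample, in the same order as dist
    (Hypothetical helper for illustration only.)
    """
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)

    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n

    # Within-group sum of squares: same idea, but only pairs inside a group,
    # each group's sum divided by that group's size.
    ss_within = 0.0
    for g in labels:
        idx = [i for i, lab in enumerate(groups) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)

    ss_among = ss_total - ss_within

    # pseudo-F: among-group variation per degree of freedom, relative to
    # within-group variation per degree of freedom.
    pseudo_f = (ss_among / (a - 1)) / (ss_within / (n - a))
    # R-squared: fraction of the total variation attributed to the grouping.
    r_squared = ss_among / ss_total
    return pseudo_f, r_squared


def r2_from_pseudo_f(pseudo_f, n_samples, n_groups):
    """Convert a one-way pseudo-F into R-squared (hypothetical helper)."""
    num = pseudo_f * (n_groups - 1)
    return num / (num + (n_samples - n_groups))


# Toy, made-up distances for 4 samples in 2 groups (not real data).
toy_dist = [
    [0.0, 0.2, 0.8, 0.9],
    [0.2, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
]
toy_groups = ["plaque", "plaque", "calculus", "calculus"]

f_stat, r2 = permanova_stats(toy_dist, toy_groups)
r2_converted = r2_from_pseudo_f(f_stat, n_samples=4, n_groups=2)
print(f_stat, r2, r2_converted)
```

If this is right, the two routes give the same R-squared, which would mean the pseudo-F reported by permanova can be turned into a "percentage of variation explained" after the fact. It also suggests the pseudo-F is a ratio of among-group to within-group variation, scaled by degrees of freedom, rather than a direct measure of explained variation.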
Sorry for the long post and all the questions! I’ve been thinking about these things a lot, and I hope that any answers might be helpful to other users as well.
Thanks!