Happy new year!
I have a couple of fairly general (sets of) questions or issues I’ve been mulling over. I will try to post them as separate topics to make the forum easier to search. Here is the first one:
I have recently been running qiime diversity beta-group-significance (using both the permanova and anosim methods) on a dataset and have been thinking about how to decide whether a given category is ‘important’ or has enough explanatory power to be discussed.
I also came across this thread on the Qiime 1 forum which discussed the difference between the adonis and permanova tests, both of which were included in the equivalent command in Qiime 1 (group_significance.py). At the time @jairideout confirmed that the two tests are very similar but that adonis was preferred in Qiime 1 because:
Adonis is a more robust version of PERMANOVA because it can handle numeric variables (i.e. mapping file categories/columns) in addition to categorical variables. I’ve also found that Adonis results are easier to interpret because an R^2 value is given as part of the output, whereas with PERMANOVA you only get a pseudo-F statistic.
These are my questions:
I don’t have a lot of experience in statistics, so I feel pretty out of my depth, and I’m not sure I really understand the difference between pseudo-F and R-squared that Jai referenced in the linked thread. My understanding is that R-squared in adonis corresponds to the proportion of variation in the distance matrix explained by the variable, and that pseudo-F is something else, but I’m not sure what, so I’m not sure how to interpret my pseudo-F results. Is the pseudo-F a measure of the difference in variance between the groups? If so, I agree that it seems less useful for interpretation than the R-squared value. Can we calculate the R-squared value ourselves, or could anyone explain what information the pseudo-F statistic carries, to help me better interpret my data?
Why, given the advantages mentioned in the linked thread, was adonis dropped in favour of permanova in Qiime 2?
Are we, as a community, any closer to having a consensus on what constitutes a large enough effect size to consider a variable potentially important in influencing community composition? For example, I work on oral microbiomes (dental plaque and calculus samples), and with the adonis test in Qiime 1 I would usually treat a variable that explained 5% or more of the variation as potentially interesting, because we rarely got any variables explaining more than 20-odd percent of variation. Would love to know how others are approaching this in Qiime 2. I’m waiting till I understand the pseudo-F statistic better to decide!
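To make the first question more concrete, here is a minimal sketch of how I think the two statistics relate. It assumes a single categorical variable and the standard one-way PERMANOVA partitioning of squared distances into among-group and within-group sums of squares; if that assumption is wrong, so is the sketch. The function names and the toy data are my own, made up for illustration, and are not part of Qiime 2.

```python
from itertools import combinations

def permanova_partition(dist, groups):
    """Return (pseudo_F, R2) for a square distance matrix and group labels.

    Assumes one categorical variable and the usual one-way partitioning
    of squared distances. Illustrative only, no permutation test here.
    """
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total SS: squared distances over all sample pairs, divided by N
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group SS: same idea, computed separately inside each group
    ss_within = 0.0
    for lab in labels:
        idx = [i for i, g in enumerate(groups) if g == lab]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    # pseudo-F: among-group SS per degree of freedom over within-group SS per df
    pseudo_f = (ss_among / (a - 1)) / (ss_within / (n - a))
    # R2: share of the total SS attributed to the grouping
    r2 = ss_among / ss_total
    return pseudo_f, r2

def r2_from_pseudo_f(pseudo_f, n_samples, n_groups):
    """Recover R2 from a reported pseudo-F (one-way design assumed)."""
    df_among = n_groups - 1
    df_within = n_samples - n_groups
    return pseudo_f * df_among / (pseudo_f * df_among + df_within)

# Toy data (invented): six samples on a line, two well-separated groups
points = [0, 1, 2, 5, 6, 7]
groups = ["A", "A", "A", "B", "B", "B"]
dist = [[abs(p - q) for q in points] for p in points]

pseudo_f, r2 = permanova_partition(dist, groups)
print(pseudo_f, r2, r2_from_pseudo_f(pseudo_f, len(points), len(set(groups))))
```

If this is right, then for a single categorical variable R-squared could be recovered from the reported pseudo-F, the sample size, and the number of groups, and pseudo-F is a ratio of among-group to within-group variation scaled by degrees of freedom rather than a difference in variance. I would welcome corrections.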
Sorry for the long post and all the questions! I’ve been thinking about these things a lot and I hope that any answers might be helpful to other users as well.