I've been reading the other posts on this topic and am still a little confused. I am looking at abundance of different OTUs at 8 different locations.
The first table shows the W values for all the OTUs that I can reject the null hypothesis for. So, if my W value is 74, does that mean that this particular OTU has significantly different abundance in 74 samples?
The second table shows percentile abundance for all the OTUs that I can reject the null hypothesis for. For one location, an OTU has 1 at the 0th percentile, 18.5 at the 25th percentile, 71 at the 50th percentile, 118 at the 75th percentile, and 124 at the 100th percentile. Does this mean that all samples had a minimum of 1 sequence and a maximum of 124 sequences that belong to this OTU?
Finally there is a volcano plot. I don't know what to make of this.
Jamie Morton has a really good explanation of how the ANCOM stats work. Does that help answer your first question about W values? Of course @mortonjt could help answer further questions.
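To make the W value concrete, here is a minimal sketch of the counting step only. It assumes the standard ANCOM definition, in which an OTU's W is the number of other OTUs whose pairwise log-ratio test against it rejected the null hypothesis, so W counts OTUs rather than samples. The OTU names and the `rejected` matrix are made up for illustration, and the pairwise tests themselves are not shown.

```python
# Hypothetical outcome of the pairwise log-ratio tests for 4 OTUs.
# rejected[i][j] is True when the test on log(OTU_i / OTU_j) across the
# groups rejected the null hypothesis. All values here are invented.
otus = ["OTU_A", "OTU_B", "OTU_C", "OTU_D"]
rejected = [
    [False, True, True, True],
    [True, False, False, False],
    [True, False, False, False],
    [True, False, False, False],
]


def w_statistics(rejected):
    """For each OTU, count the other OTUs it was found to differ against."""
    n = len(rejected)
    return [
        sum(1 for j in range(n) if j != i and rejected[i][j])
        for i in range(n)
    ]


w = dict(zip(otus, w_statistics(rejected)))
print(w)
```

Under that definition, a W of 74 would mean the OTU's log-ratio differed across groups when compared against 74 other OTUs. In this toy table OTU_A reaches the maximum possible W of 3 (every other OTU), while the rest only differ relative to OTU_A.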
For the second question - not quite. Here is an explanation of the percentiles. Note that there are pseudocounts, so all of the counts in the percentiles are off by one.
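As a rough illustration of the off-by-one effect, the sketch below adds a pseudocount of 1 to some invented raw counts for one OTU at one location and then takes percentiles. The raw counts are hypothetical, chosen only so that the output matches the numbers in the question, and the percentile rule is plain linear interpolation between sorted values (the same rule as NumPy's default). Whether the visualizer uses exactly this rule is an assumption.

```python
# Hypothetical raw counts of one OTU in the 7 samples from one location.
raw_counts = [0, 16, 19, 70, 116, 118, 123]

# A pseudocount of 1 is added to every count before summarizing.
with_pseudocount = [c + 1 for c in raw_counts]


def percentile(values, q):
    """Percentile with linear interpolation between the sorted values."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


for q in (0, 25, 50, 75, 100):
    print(q, percentile(with_pseudocount, q))
```

Here the reported minimum of 1 corresponds to a sample with 0 sequences of this OTU, and the reported maximum of 124 corresponds to 123 sequences. The values in between are percentiles across the samples in that group, so 18.5 need not be a count seen in any single sample.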
For the volcano plot, you are looking at the W statistic on the y-axis, and the F-score on the x-axis. So basically the x-axis is summarizing the effect size difference of the given species between your treatment groups, and the y-axis is the strength of the ANCOM test statistic that @colinbrislawn linked.
What you want to get out of this sort of plot are the ASVs with a high F-score and a high W-statistic, in other words the points that are close to the top right corner. These indicate that an ASV is suspected to be truly different across the groups.
Thank you. I think I understand this a little better. I had already read both the posts linked in this discussion, but the extra details clarify them for me.
I have some other volcano plots with points in the top left corner which I can reject the null hypothesis for. The origin of the x-axis for those plots is negative, rather than zero. Just to clarify, those should also be considered truly different across groups, right?
The F-statistic on the x-axis measures how different one group is from the average for a specific ASV.
Basically, the null hypothesis here is that all of the groups are the same on average. The smaller the F-statistic, the weaker the evidence against that null hypothesis, and the less likely it is to be rejected.
So the points in the top left corner indicate that those particular ASVs are distinct, but the actual changes in their proportions are not large. In terms of prioritizing findings, I'd focus on the points in the top right. The points in the top left may still be worth investigating, but that would require a different approach to examine those species.
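For intuition about the x-axis, here is a minimal sketch of a one-way ANOVA F-statistic, assuming that is the statistic being plotted. It is the ratio of between-group variance to within-group variance for one ASV. The group values below are invented stand-ins for transformed abundances, and `f_statistic` is a hypothetical helper, not a QIIME 2 function.

```python
def mean(values):
    return sum(values) / len(values)


def f_statistic(groups):
    """One-way ANOVA F: between-group over within-group mean square.

    Assumes at least two groups and some within-group variation.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups (e.g. locations)
    n = len(all_values)    # total number of samples
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical values for one ASV at three locations.
similar = [[1.0, 1.2, 0.8], [1.1, 0.9, 1.0], [1.0, 1.1, 0.9]]
different = [[1.0, 1.2, 0.8], [3.0, 3.2, 2.8], [5.0, 5.1, 4.9]]

f_similar = f_statistic(similar)
f_different = f_statistic(different)
print(f_similar, f_different)
```

Groups with nearly identical means give a small F, which would place the ASV toward the left of the plot, while well-separated group means give a large F and move it to the right.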