ANCOM: 'low W taxa identified as significant' issue's workaround, ANCOM2 code/instructions

jwdebelius · January 11, 2020, 5:52pm

Hey, so, I'm kinda late to the party, but I use a fixed threshold for my ANCOM. (I'm a bit more conservative because I like 0.8.) Typically, in ANCOM, your W value is calculated as the number of tests that are significant for a feature, and then the distribution of W values is used to calculate a threshold, based on the assumption that the distribution is bimodal.

However, you can also set significance a priori and say "X% of my tests must be different for significance". (You could always do it manually, but it's nice to see ANCOM 2 has it built in.) So, if you set a hard threshold of 0.7, then a feature is significant if 70% of its ratios are significant. If you set a hard threshold of 0.9, then 90% of the ratios must be significantly different for significance, etc.

You can get the threshold on the current plugin (but not the shiny visualization behavior) by dividing the W you get by the number of features tested minus 1 (since we do a comparison against every feature except the feature itself). Then, if W_norm ≥ threshold, you call it significant.
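Here's a minimal Python sketch of that calculation, so nobody has to do it by hand. The function names (`normalize_w`, `significant_features`) and the W values are made up for illustration, and it assumes your W table lists every feature that went into the test, so the number of rows equals the number of features tested.

```python
def normalize_w(w, n_features):
    """Divide W by the number of comparisons made per feature (n_features - 1)."""
    if n_features < 2:
        raise ValueError("need at least two features to form a ratio")
    return w / (n_features - 1)


def significant_features(w_by_feature, threshold=0.8):
    """Return {feature: True/False}, where True means W_norm >= threshold.

    Assumes w_by_feature holds one W value for every feature tested.
    """
    n_features = len(w_by_feature)
    return {
        feature: normalize_w(w, n_features) >= threshold
        for feature, w in w_by_feature.items()
    }


# Hypothetical W values for five features (each compared against the other four).
w_values = {"ASV1": 4, "ASV2": 3, "ASV3": 1, "ASV4": 0, "ASV5": 2}
calls = significant_features(w_values, threshold=0.7)
```

With a threshold of 0.7, only ASV1 (W_norm = 1.0) and ASV2 (W_norm = 0.75) get called significant here.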

I actually prefer this method, which I've been doing for a while, because it feels more like setting a p-value for an assumed distribution. Where a p-value of 0.05 says 95% of my data should be less extreme than my value (or there's a 1/20 chance I'm wrong), setting my threshold at 0.8 means my ASV is changing relative to 80% of my data, and that just gives me more confidence.

...I suppose you could also do a joint distribution, where you take the max of the threshold and the bimodal distribution, but I've never quite gotten there. (Maybe that's what the R code does, IDK.)
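If someone wants to try the joint version, it could look roughly like this. It's an untested idea rather than anything ANCOM or the R code is known to do: `joint_cutoff` is a hypothetical helper, and `bimodal_w_cutoff` is assumed to be a W cutoff you already got from the bimodal approach.

```python
def joint_cutoff(fixed_fraction, n_features, bimodal_w_cutoff):
    """Return the stricter (larger) of two W cutoffs.

    fixed_fraction: the a priori fraction of ratios, e.g. 0.8
    n_features: number of features tested
    bimodal_w_cutoff: W cutoff from the bimodal approach (supplied by you)
    """
    # Convert the fixed fraction back to the W scale (n_features - 1 comparisons).
    fixed_w_cutoff = fixed_fraction * (n_features - 1)
    return max(fixed_w_cutoff, bimodal_w_cutoff)


# Hypothetical numbers: 101 features, so each feature has 100 comparisons.
cutoff = joint_cutoff(0.8, 101, bimodal_w_cutoff=70)
```

A feature would then be significant only if its W is at least `cutoff`, so it has to pass both criteria.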

Let me know if you want help with this part. It would be nice to not make my students do the calculation by hand!

Best,
Justine