I have been trying to perform ANCOM abundance tests and I have a couple of questions.
First of all, I split my data into two sets because they come from different locations and have drastically different results. When I run the ANCOM abundance tests on each of them, I get drastically different results. In one location, many different OTUs have very high W values, but it says not to reject the null hypothesis.
Hi @nricks, @mortonjt will be able to explain this much better than me, but if I had to guess, I would think the results you are seeing in your 2nd location (with low W values) are likely a result of a large number of rare features from that site. See this explanation of how the W values are determined. With lots of rare taxa that only appear in a few samples, it's likely that ANCOM is underperforming due to violations of the recommended guideline that fewer than 25% of the features are expected to differ across groups.
I think a good first step is to reduce noise in your feature table by getting rid of rare and low-abundance features, i.e. those that only appear in a few samples or that make up a very small portion of the overall abundance.
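To make the filtering idea concrete, here is a minimal plain-Python sketch. It is only an illustration, not the QIIME 2 implementation: the function name `filter_rare_features`, the toy table, and the thresholds (`min_samples=2`, `min_total=10`) are all made up for the example, and sensible cutoffs depend on your sequencing depth and sample count. In QIIME 2 itself, the comparable step is `qiime feature-table filter-features` with `--p-min-samples` and `--p-min-frequency`.

```python
def filter_rare_features(table, min_samples=2, min_total=10):
    """Keep features seen in at least `min_samples` samples
    and with a total count of at least `min_total`.

    `table` maps a feature ID to its list of per-sample counts
    (a hypothetical stand-in for a real feature table).
    """
    kept = {}
    for feature_id, counts in table.items():
        n_present = sum(1 for c in counts if c > 0)  # samples where the feature occurs
        if n_present >= min_samples and sum(counts) >= min_total:
            kept[feature_id] = counts
    return kept


# Toy table: 4 features x 4 samples (invented numbers).
toy = {
    "otu_a": [120, 98, 143, 110],  # abundant everywhere
    "otu_b": [0, 0, 7, 0],         # present in a single sample
    "otu_c": [1, 0, 1, 0],         # in two samples, but only 2 reads in total
    "otu_d": [15, 0, 22, 9],       # moderately abundant
}

filtered = filter_rare_features(toy, min_samples=2, min_total=10)
print(sorted(filtered))
```

With these made-up thresholds, `otu_b` is dropped for appearing in only one sample and `otu_c` for its low total count. Run ANCOM on the filtered table rather than the original.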
Hi @nricks, @Mehrbod_Estaki raises many great points (thank you!), including how the W values are calculated.
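The W value for a feature is a count of how many of its pairwise comparisons against the other features come out as significantly different. Thus W will depend on the number of features, and you cannot really compare W values between ANCOM tests on two different feature tables. A different threshold W score will be set for each test.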
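If it helps, here is a rough sketch of why W scales with the number of features. It is not the ANCOM implementation: the real method uses a proper statistical test with multiple-testing correction and then picks a W cutoff from the data, whereas this toy version simply calls a log-ratio "different" when a Welch t statistic exceeds an arbitrary cutoff. The function names, `t_cutoff=4.0`, the pseudocount, and the toy counts are all assumptions made for illustration.

```python
import math
from itertools import combinations
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic for two lists of numbers."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def w_scores(table, groups, t_cutoff=4.0, pseudocount=1.0):
    """For each feature, count the other features whose log-ratio with it
    differs between the two groups (stand-in decision rule: |t| > t_cutoff)."""
    first, second = sorted(set(groups))
    w = {feature: 0 for feature in table}
    for f, g in combinations(table, 2):
        log_ratio = [
            math.log((x + pseudocount) / (y + pseudocount))
            for x, y in zip(table[f], table[g])
        ]
        a = [v for v, lab in zip(log_ratio, groups) if lab == first]
        b = [v for v, lab in zip(log_ratio, groups) if lab == second]
        if abs(welch_t(a, b)) > t_cutoff:
            w[f] += 1  # the pair counts toward both features
            w[g] += 1
    return w


# Toy data: only f1 really changes between group "a" and group "b".
table = {
    "f1": [10, 12, 11, 200, 220, 210],
    "f2": [50, 52, 49, 51, 50, 48],
    "f3": [30, 29, 31, 30, 32, 29],
    "f4": [5, 6, 5, 6, 5, 6],
}
groups = ["a", "a", "a", "b", "b", "b"]

w = w_scores(table, groups)
print(w)
```

The largest W any feature can reach is the number of features minus one (3 in this toy table, which `f1` hits). A W of 3 is the maximum here but would be tiny in a table with hundreds of features, which is why W values from two different tables are not comparable.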
So what you are seeing is not really a "problem". However, what is a problem (as @Mehrbod_Estaki points out) is that in table #2 a great many features are significantly different, which breaks the assumptions of ANCOM.
You might want to check out q2-gneiss, especially since it sounds like you have a potentially complicated experimental setup. Gneiss can handle things like multi-factorial experimental designs to determine how species balances differ between groups.