Ancom significance

nricks · February 2, 2018, 7:25pm

I have been trying to perform ANCOM abundance tests and I have a couple questions.
First of all, I split my data into two sets because they come from different locations and look drastically different. When I run the ANCOM abundance tests on each one of them, they give drastically different results. In one location, many different OTUs have very high W values, but it says not to reject the null hypothesis.

However, in the other location it says to reject the null hypothesis for every OTU, even though they have low W values.

Does anyone have any idea what the problem is?

Mehrbod_Estaki · February 5, 2018, 6:48pm

Hi @nricks,
@mortonjt will be able to explain this much better than me, but if I had to guess, I would think the results you are seeing in your 2nd location (with low W values) are likely a result of a large number of rare features from that site. See this explanation of how the W values are determined. With lots of rare taxa that only appear in a few samples, it's likely that ANCOM is underperforming due to violations of the recommended guideline that less than 25% of the features are expected to differ across groups.
I think a good first step is to reduce noise in your feature table by getting rid of rare and low-abundance features, that is, those that only appear in a few samples or that make up a very small portion of the total counts.
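
To make the filtering idea concrete, here is a minimal sketch in plain Python. The toy table, the function name `filter_rare_features`, and both thresholds are hypothetical and only illustrate the logic; in QIIME 2 itself this kind of filtering is normally done with `qiime feature-table filter-features` (for example via its `--p-min-samples` and `--p-min-frequency` options), and sensible cutoffs depend on your data.

```python
# Toy feature table: keys are features (OTUs), values are counts per sample.
# All names and thresholds below are hypothetical, for illustration only.
table = {
    "otu_a": [120, 98, 143, 110],
    "otu_b": [0, 0, 3, 0],
    "otu_c": [15, 22, 0, 18],
    "otu_d": [1, 0, 0, 0],
}


def filter_rare_features(table, min_samples=2, min_fraction=0.01):
    """Keep features present in at least `min_samples` samples that also
    account for at least `min_fraction` of all counts in the table."""
    total = sum(sum(counts) for counts in table.values())
    kept = {}
    for feature, counts in table.items():
        n_present = sum(1 for c in counts if c > 0)
        fraction = sum(counts) / total if total else 0.0
        if n_present >= min_samples and fraction >= min_fraction:
            kept[feature] = counts
    return kept


filtered = filter_rare_features(table)
print(sorted(filtered))  # -> ['otu_a', 'otu_c']
```

Here `otu_b` and `otu_d` are dropped because each appears in only one sample, which is the kind of sparse feature that tends to add noise to the pairwise comparisons.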

Nicholas_Bokulich · February 6, 2018, 4:46pm

Hi @nricks,
@Mehrbod_Estaki raises many great points (thank you!), including how the W values are calculated.

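For each feature, W counts the number of pairwise log-ratio tests against the other features in which that feature differs significantly between groups. Thus W will depend on the number of features, and you cannot really compare W values between ANCOM tests on two different feature tables. Different threshold W scores will be set for each test.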
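
A rough sketch of why W scales with the number of features, in plain Python. This is not the q2-composition implementation: the exact permutation test below is a stand-in chosen to keep the example dependency-free, there is no multiple-testing correction, and the toy counts, function names, and `alpha` are all assumptions. The point is only that each feature is compared against every other feature, so W can never exceed the number of features minus one.

```python
import math
from itertools import combinations

# Toy counts: first 4 samples are group 1, last 4 are group 2 (hypothetical data).
table = {
    "otu_a": [10, 12, 9, 11, 200, 180, 220, 190],  # shifts between groups
    "otu_b": [50, 55, 48, 52, 51, 49, 53, 50],
    "otu_c": [30, 28, 33, 31, 29, 32, 30, 31],
    "otu_d": [80, 75, 82, 79, 78, 81, 77, 80],
}


def permutation_p(values, group_size):
    """Exact two-sided permutation p-value for a difference in group means.
    The first `group_size` values are treated as group 1."""
    n = len(values)
    total = sum(values)

    def diff(idx):
        s = sum(values[i] for i in idx)
        return abs(s / group_size - (total - s) / (n - group_size))

    observed = diff(range(group_size))
    splits = list(combinations(range(n), group_size))
    hits = sum(1 for idx in splits if diff(idx) >= observed - 1e-12)
    return hits / len(splits)


def ancom_like_w(table, group_size, alpha=0.05):
    """For each feature, count how many log-ratios against other features
    differ between groups (stand-in test, no multiple-testing correction)."""
    w = {}
    for f, f_counts in table.items():
        w[f] = 0
        for g, g_counts in table.items():
            if g == f:
                continue
            # Pseudocount of 1 avoids taking the log of zero.
            log_ratios = [math.log((a + 1) / (b + 1))
                          for a, b in zip(f_counts, g_counts)]
            if permutation_p(log_ratios, group_size) < alpha:
                w[f] += 1
    return w


w = ancom_like_w(table, group_size=4)
max_w = len(table) - 1  # upper bound on W for this table
print(w, "max possible W:", max_w)
```

With four features the largest possible W is 3, whereas a table with 500 features allows W up to 499, so a "low" W in one table and a "high" W in another are not on the same scale.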

So what you are seeing is not really a "problem". However, what is a problem (as @Mehrbod_Estaki points out) is that in table #2 a very large share of the features are significantly different, which breaks the assumptions of ANCOM.

You might want to check out q2-gneiss; especially since it sounds like you have a potentially complicated experimental setup, gneiss can handle things like multi-factorial experimental designs to determine how species balances differ between groups.

I hope that helps!
