I am using the Moving Pictures tutorial as a guideline for analyzing my microbiome data. When I run the ANCOM analysis at the end, I get strange results.
When I look at the results filtered to level 2, all W values are 0 (none significantly different). I don’t necessarily expect anything to be different (the two groups have not differed in the alpha/beta analysis prior to this) but I don’t think all the W values would be 0.
When I filter at level 6, I get similarly strange results: only one genus is significant, at W=34, and the other 58 genera are not significant, with W values of 0 or 1.
I am a super beginner in all of this (microbiome analysis, the related statistics, and any sort of computer coding), so I am not sure if this is an error or if the W values I am seeing are “correct”.
I'm not exactly sure what is going on. Would you be able to attach your dataset?
There are many possibilities that could explain this, namely:
Low resolution from summarizing at a coarse taxonomic level could obscure the signal, particularly if only one species is changing
Low counts / zeros could cause false positives in ANCOM
Issues with FDR in the statistical test
A really bad fluke with the statistical test. Remember, we are just conducting statistical tests, and every statistical test has some chance of failing (even if it is small).
Having access to the underlying datasets to generate this result could help narrow down these issues.
Not to make this more complicated, but I also have ANCOM results from my full dataset. Above, I only looked at females; below are my results from males and females combined, looking at the same variable, "treatment" (morphine versus saline). The level 2 ANCOM is similarly weird (all W=0), and the level 5 ANCOM is also weird (W values of 4 and 10, with the rest 0 or 1, but all seem to be significant?). l2-ancom-Treatment.qzv (29.0 KB) l5-ancom-Treatment.qzv (33.2 KB)
My dataset is also pretty small: 24 total, 12 female and 12 male, and each sex has 6 of each treatment (morphine or saline).
Thanks @adrian. Could you also send over the metadata and the tables? That way, we can sanity check to see how many zero / low count entries there are.
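For anyone who wants to run this kind of sanity check themselves, here is a minimal sketch. It assumes the feature table has already been exported to plain counts; the toy table and the `sparsity_summary` helper are made up for illustration and are not part of QIIME 2.

```python
# Hypothetical toy table: keys are OTUs, values are raw read counts per sample.
table = {
    "OTU_1": [120, 98, 0, 143, 110, 87],
    "OTU_2": [0, 0, 0, 7, 0, 0],
    "OTU_3": [0, 3, 0, 0, 0, 0],
    "OTU_4": [15, 0, 22, 0, 9, 31],
}


def sparsity_summary(table):
    """Return per-OTU (samples_present, total_reads) and the overall zero fraction."""
    per_otu = {}
    zeros = 0
    cells = 0
    for otu, counts in table.items():
        present = sum(1 for c in counts if c > 0)  # samples where the OTU was seen
        per_otu[otu] = (present, sum(counts))
        zeros += len(counts) - present
        cells += len(counts)
    return per_otu, zeros / cells


per_otu, zero_fraction = sparsity_summary(table)
for otu, (present, total) in per_otu.items():
    print(f"{otu}: present in {present} samples, {total} reads total")
print(f"fraction of zero entries: {zero_fraction:.2f}")
```

If most OTUs show up in only one or two samples, or the zero fraction is very high, that is a hint that sparsity rather than biology is driving the ANCOM output.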
@mortonjt, we are also seeing this “lots-o-zeros” situation crop up on a dataset that we are currently analyzing. Feel free to ping me directly to coordinate, if you wish. Thanks!
This is the sort of scenario where ANCOM is expected to fail: the vast majority of your OTUs show up in very few samples. And since you need to add pseudocounts to replace the zeros, you are essentially adding a huge bias to your analysis.
Here’s my suggestion.
Definitely filter out all of the OTUs that appear in only one sample
Filter out OTUs that fall below some count threshold. We definitely filter out OTUs with fewer than 10 counts across all samples. If we don’t have 10 reads for a single OTU, then it provides very little information and is likely garbage.
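The two filters above can be sketched roughly as follows. This is only an illustration on a made-up table of plain counts; `filter_otus`, `min_samples`, and `min_total` are hypothetical names, and the thresholds simply mirror the suggestion above (seen in at least 2 samples, at least 10 reads overall).

```python
# Hypothetical toy table: keys are OTUs, values are raw read counts per sample.
table = {
    "OTU_1": [120, 98, 0, 143, 110, 87],  # abundant and widespread: kept
    "OTU_2": [0, 0, 0, 7, 0, 0],          # one sample, 7 reads: dropped
    "OTU_3": [2, 3, 0, 1, 0, 0],          # three samples but only 6 reads: dropped
    "OTU_4": [15, 0, 22, 0, 9, 31],       # kept
    "OTU_5": [0, 0, 40, 0, 0, 0],         # 40 reads but only one sample: dropped
}


def filter_otus(table, min_samples=2, min_total=10):
    """Keep OTUs seen in at least `min_samples` samples with at least `min_total` reads."""
    kept = {}
    for otu, counts in table.items():
        n_present = sum(1 for c in counts if c > 0)
        if n_present >= min_samples and sum(counts) >= min_total:
            kept[otu] = counts
    return kept


filtered = filter_otus(table)
print(sorted(filtered))
```

On a real QIIME 2 table, I believe `qiime feature-table filter-features` with its `--p-min-samples` and `--p-min-frequency` options applies the same kind of filter; check the plugin help for your version before relying on that.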