ANCOM giving strange W values

mortonjt · August 18, 2017, 5:34pm

Ok, here's a summary of your data in table.qzv

In [1]: import pandas as pd
In [2]: import qiime2
In [3]: table = qiime2.Artifact.load('table.qza').view(pd.DataFrame)
In [4]: (table>0).sum(axis=0).sort_values().value_counts()  # left: samples an OTU appears in, right: number of such OTUs
Out[4]:
1     1343
2      411
3      271
4      175
5      119
6       98
7       61
8       44
24      40
9       34
12      31
10      28
11      23
13      18
14      16
15      14
23      12
18      11
16      11
21      10
22       9
20       7
19       5
17       2

This is the sort of scenario where ANCOM is expected to fail - the vast majority of your OTUs show up in very few samples (1343 of them appear in only one sample). And since you need to add pseudocounts to replace the zeros, you are essentially adding a huge bias to your analysis.
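To make the pseudocount problem concrete, here is a minimal sketch on a made-up table (the OTU names and counts are hypothetical, not taken from your data). It assumes samples are rows and OTUs are columns, and a pseudocount of 1.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table: rows are samples, columns are OTUs.
toy = pd.DataFrame(
    {"otu_common": [40, 55, 38, 61], "otu_rare": [0, 0, 7, 0]},
    index=["s1", "s2", "s3", "s4"],
)

# Fraction of each OTU's values that are zeros, i.e. the fraction that
# will be replaced by the pseudocount rather than by real observations.
imputed_fraction = (toy == 0).mean(axis=0)
print(imputed_fraction)

# Log-ratio of the rare OTU to the common one after adding a pseudocount of 1.
# For every sample where otu_rare was 0, the numerator is just the pseudocount.
pseudo = toy + 1
log_ratio = np.log(pseudo["otu_rare"] / pseudo["otu_common"])
print(log_ratio)
```

For the rare OTU, three of the four log-ratios are set by the choice of pseudocount rather than by anything that was measured, and most of your OTUs are in that situation.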

Here's my suggestion.

Definitely filter out all of the OTUs that only appear in one sample.
Filter out OTUs that fall below some count threshold - we routinely filter out OTUs with fewer than 10 counts across all samples. If we don't have 10 reads for a single OTU, then it provides very little information and is likely garbage.
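The two filters above can be sketched in pandas roughly as follows. This is only an illustration: `filter_rare_otus` and its parameter names are hypothetical, the table is assumed to have samples as rows and OTUs as columns (as in the session above), and the toy counts are made up.

```python
import pandas as pd


def filter_rare_otus(table, min_samples=2, min_total_count=10):
    """Drop OTUs seen in too few samples or with too few total reads.

    Hypothetical helper; `table` has samples as rows and OTUs as columns.
    """
    # Number of samples in which each OTU has a nonzero count.
    n_samples_present = (table > 0).sum(axis=0)
    # Total reads for each OTU across all samples.
    total_counts = table.sum(axis=0)
    keep = (n_samples_present >= min_samples) & (total_counts >= min_total_count)
    return table.loc[:, keep]


# Made-up example table.
toy = pd.DataFrame(
    {
        "otu_a": [12, 0, 0, 0],  # only in one sample -> dropped
        "otu_b": [3, 2, 0, 1],   # 6 reads in total -> dropped
        "otu_c": [8, 5, 0, 4],   # 3 samples, 17 reads -> kept
    },
    index=["s1", "s2", "s3", "s4"],
)

filtered = filter_rare_otus(toy)
print(list(filtered.columns))  # ['otu_c']
```

If you would rather stay in QIIME 2, `qiime feature-table filter-features` with `--p-min-samples` and `--p-min-frequency` should do the equivalent filtering on the artifact directly, but check the help text for your version.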