ANCOM giving strange W values

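OK, here's a summary of your data in table.qzv. The left column is the number of samples an OTU appears in, and the right column is how many OTUs have that count.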

In [1]: import pandas as pd
In [2]: import qiime2
In [3]: table = qiime2.Artifact.load('table.qza').view(pd.DataFrame)  # rows = samples, columns = OTUs
In [4]: (table>0).sum(axis=0).sort_values().value_counts()  # samples per OTU, then tally
Out[4]: 
1     1343
2      411
3      271
4      175
5      119
6       98
7       61
8       44
24      40
9       34
12      31
10      28
11      23
13      18
14      16
15      14
23      12
18      11
16      11
21      10
22       9
20       7
19       5
17       2

This is the sort of scenario where ANCOM is expected to fail: the vast majority of your OTUs show up in very few samples (1343 of them appear in only one sample). And since you need to add pseudocounts to replace the zeros, you end up adding a pseudocount to most of the table, which introduces a huge bias into your analysis.
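To get a rough feel for how much of a table the pseudocount would touch, you can look at the fraction of cells that are zero. This is only a minimal sketch: it assumes a samples-by-OTUs pandas DataFrame like the one loaded above, `zero_fraction` is a hypothetical helper name, and the toy table is made up for illustration.

```python
import pandas as pd


def zero_fraction(table: pd.DataFrame) -> float:
    """Fraction of cells that are zero, i.e. cells that would get a pseudocount."""
    return float((table == 0).to_numpy().mean())


# Toy samples-by-OTUs table (made-up numbers, not the poster's data).
toy = pd.DataFrame(
    {
        "otu_a": [5, 0, 0, 0],     # present in one sample only
        "otu_b": [0, 0, 3, 0],     # present in one sample only
        "otu_c": [10, 12, 8, 9],   # present in every sample
    },
    index=["s1", "s2", "s3", "s4"],
)

print(zero_fraction(toy))
```

The closer this number is to 1, the more of the analysis is driven by the pseudocount rather than by observed reads.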

Here's my suggestion.

  1. Definitely filter out all of the OTUs that only appear in one sample.
  2. Filter out OTUs that fall below some count threshold. We routinely filter out OTUs with fewer than 10 counts across all samples. If we don't have 10 reads for a single OTU, it provides very little information and is likely garbage.
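
The two filters above could be sketched in pandas roughly like this. It assumes a samples-by-OTUs DataFrame like the one loaded earlier; `filter_otus` and its parameter names are hypothetical, and the defaults just mirror the thresholds suggested above (more than one sample, at least 10 reads in total).

```python
import pandas as pd


def filter_otus(table: pd.DataFrame,
                min_samples: int = 2,
                min_total_count: int = 10) -> pd.DataFrame:
    """Keep OTUs (columns) seen in enough samples with enough total reads."""
    presence = (table > 0).sum(axis=0)   # number of samples each OTU appears in
    totals = table.sum(axis=0)           # total reads per OTU across all samples
    keep = (presence >= min_samples) & (totals >= min_total_count)
    return table.loc[:, keep]


# Toy samples-by-OTUs table (made-up numbers, not the poster's data).
toy = pd.DataFrame(
    {
        "otu_a": [50, 0, 0, 0],    # only in one sample, so dropped
        "otu_b": [1, 2, 0, 0],     # only 3 reads in total, so dropped
        "otu_c": [10, 12, 8, 9],   # kept
    },
    index=["s1", "s2", "s3", "s4"],
)

filtered = filter_otus(toy)
print(list(filtered.columns))
```

If you would rather stay inside QIIME 2, `qiime feature-table filter-features` with `--p-min-samples` and `--p-min-frequency` should do the same kind of filtering directly on the .qza, so you can pass the filtered table straight to ANCOM.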