ANCOM - W values are high but null hypothesis was still rejected

ankurnaqib · January 20, 2021, 7:40pm

I ran ANCOM on my data. I used all the filters that are recommended. I removed features that only appear in 1 sample and removed features whose overall frequency was less than 20. In my output (attachment), I see a lot of taxa with high W values but still tagged "FALSE" for the null hypothesis. Can someone help me understand what this means? Thanks a lot. Here is a snapshot from my results

ancom (4).tsv (11.7 KB)

ankurnaqib · January 20, 2021, 8:07pm

Hi. Has anyone else seen high W values that still come out as "FALSE" under the Reject Null Hypothesis column? I am attaching a result that shows high W values but only one "TRUE".
ancom (4).tsv (11.7 KB)

jwdebelius · January 20, 2021, 8:11pm

Hi @ankurnaqib,

How many features are in your table prior to ANCOM testing and how many samples do you have?

Best,
Justine

ankurnaqib · January 20, 2021, 8:32pm

Hi Justine,

Thank you for responding. I am comparing two groups of samples (T_Control and cB_Control), with 5 samples in each group. I am also attaching the script I used.
ANCOM_script.txt (1.3 KB)

My overall feature table has 728 features, but I applied these filters:

Filter out samples with fewer than 22000 reads.
Remove features that only appear in 2 samples.
Remove features whose overall frequency is less than 20.

I then collapsed the filtered table to the genus level.
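For readers who want to see the logic of these three filters outside of QIIME 2, here is a minimal pandas sketch. It is not the attached script: the function name `filter_table`, the toy table, and the exact threshold semantics (for example, `min_samples=2` meaning "keep features seen in at least 2 samples") are assumptions made for illustration.

```python
import pandas as pd


def filter_table(table, min_depth=22000, min_samples=2, min_frequency=20):
    """Apply three filters in order. Rows are samples, columns are features.

    All thresholds are illustrative defaults, not values read from the
    attached script.
    """
    # 1. Drop samples with fewer than min_depth total reads.
    table = table.loc[table.sum(axis=1) >= min_depth]
    # 2. Drop features observed in fewer than min_samples samples.
    prevalence = (table > 0).sum(axis=0)
    table = table.loc[:, prevalence >= min_samples]
    # 3. Drop features whose total count is below min_frequency.
    table = table.loc[:, table.sum(axis=0) >= min_frequency]
    return table


# Hypothetical toy table: 3 samples, 4 features.
toy = pd.DataFrame(
    {
        "f1": [15000, 14000, 10],
        "f2": [9000, 0, 5],
        "f3": [5, 9000, 0],
        "f4": [3, 4, 0],
    },
    index=["s1", "s2", "s3"],
)

filtered = filter_table(toy)
print(filtered)
```

Because the filters run in sequence, a feature's prevalence and total frequency are computed only over the samples that survived the depth filter, so the order of the steps can change the result.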

jwdebelius · January 20, 2021, 9:04pm

Hi @ankurnaqib,

I suspect the answer is more aggressive filtering.

How many genera do you have? What portion of the genera is 109? I'm guessing it's 50% or fewer?

Best,
Justine

ankurnaqib · January 20, 2021, 10:36pm

Hi @jwdebelius

That is what baffled me as well. The total number of genera in my collapsed file is only 114. I have attached it for you to see.
collapsed_table.tsv (22.5 KB)

Thank you.

jwdebelius · January 21, 2021, 4:33pm

Hi @ankurnaqib,

Thanks for being patient with me, I'm going in circles a little bit. The file you sent gives a significant signal at W=109 (0.95), but not at W=96 (0.84), with a total of 10 samples. ANCOM tends to be quite conservative, so the fact that you're getting a signal at all with 10 samples impresses me.

ANCOM I calculates the "significant" results using a bimodal approximation of a null distribution, essentially fitting the data to estimate that null. So the results make sense to me here.
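The proportions quoted next to the W values can be reproduced with a short calculation. This sketch assumes the proportion is simply W divided by the 114 genera in the collapsed table, which is inferred from the numbers in this thread rather than taken from the ANCOM code; the helper name `w_fraction` is hypothetical.

```python
def w_fraction(w, n_features):
    """Fraction of features that a W value corresponds to (assumed definition)."""
    return w / n_features


N_GENERA = 114  # number of genera in the collapsed table, from the thread

fractions = {w: w_fraction(w, N_GENERA) for w in (109, 96)}
for w, frac in fractions.items():
    print(f"W={w}: {frac:.3f}")
```

Under that assumption, W=109 corresponds to roughly 0.956 and W=96 to roughly 0.842, matching the 0.95 and 0.84 quoted above. A W value that looks high in absolute terms can therefore still sit below whatever cutoff ANCOM estimates from the data.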

Best,
Justine

ankurnaqib · January 26, 2021, 4:21pm

Thank you @jwdebelius.