Another topic of ANCOM - W of 0 - all significant

JeremyTournayre · January 10, 2022, 9:20am

Hello,

I just want a confirmation of my "strange" results.

I understand that we need to consider the W parameter and not just the "reject null hypothesis".

I have run two ANCOM analyses, one at level 7 (species) and one at level 6 (genus).
Here are the two results:
genus

Species

At level 7 (species), every species is reported as significantly different, while at level 6 (genus) no genus is significantly different. I have seen in other topics that we must take the W parameter into account: if it is zero, we can say that there is no difference.
For many species, the level 7 (species) analysis has the same data as the level 6 (genus) analysis; I highlighted an example in green. Why do these genera not appear significant at level 6 (genus)?

I looked at the data for d__Eukaryota;p__Ciliophora;c__Intramacronucleata;o__Litostomatea;f__Haptoria;; at level 7 (species) in the taxa-bar-plots.qzv file.
I know that ANCOM normalizes the data on the assumption that 25% of the features do not differ in abundance, so the values I see in the taxa-bar-plots.qzv file are not the ones used in the statistical analysis. How can I see the normalized data?
Nevertheless, in the taxa-bar-plots.qzv file only one sample has a count (120) for Haptoria. It is strange that the statistical analysis gives it a W of 1.

I think I have to apply two manual filters: (1) if there is a "1" at percentile "0", this is strange, so I can remove that result, and (2) remove all results with W = 0.
Am I right?

Thanks for the QIIME2 support. I have posted so many questions since November :qiime2:.

jwdebelius · January 10, 2022, 5:05pm

Hi @JeremyTournayre,

I agree your results are strange. A lot of this comes down to the way ANCOM calculates its W statistic: it first tests whether there is a significant difference for each pairwise set of features, performs a multiple hypothesis correction, and then calculates W for a feature as the number of pairwise tests including that feature that were significantly different.
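To make the counting concrete, here is a minimal sketch of that idea in plain Python. It is not the QIIME 2 or scikit-bio implementation: the exact permutation test, the simple Bonferroni correction across each feature's pairs, the function names and the toy counts are all assumptions chosen to keep the example self-contained. Counts are assumed to be strictly positive (i.e. a pseudocount has already been added).

```python
from itertools import combinations
from math import log


def perm_pvalue(a, b):
    """Exact two-sided permutation p-value for a difference in group means.

    Stand-in test for this sketch only; real ANCOM implementations use
    other tests on the log ratios.
    """
    pooled = a + b
    n_a, n_b = len(a), len(b)
    total = sum(pooled)

    def abs_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_diff(sum(a))
    hits = n_splits = 0
    # Enumerate every way of assigning n_a of the values to the first group.
    for idx in combinations(range(len(pooled)), n_a):
        n_splits += 1
        if abs_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            hits += 1
    return hits / n_splits


def ancom_w(counts, groups, alpha=0.05):
    """Return {feature: W} for a two-group comparison.

    counts: {feature: [strictly positive count per sample]}
    groups: one group label per sample (exactly two distinct labels)
    """
    first, second = sorted(set(groups))
    features = list(counts)
    n_other = len(features) - 1  # pairwise tests each feature takes part in
    w = {f: 0 for f in features}
    for f, g in combinations(features, 2):
        # Test the log ratio of the two features between the groups.
        ratios = [log(x / y) for x, y in zip(counts[f], counts[g])]
        a = [r for r, lab in zip(ratios, groups) if lab == first]
        b = [r for r, lab in zip(ratios, groups) if lab == second]
        # Bonferroni correction (an assumption; other corrections are possible).
        if min(1.0, perm_pvalue(a, b) * n_other) < alpha:
            w[f] += 1
            w[g] += 1
    return w


# Toy data (made up): f0 changes between groups, f1..f3 stay stable.
groups = ["A"] * 5 + ["B"] * 5
counts = {
    "f0": [5, 6, 4, 5, 7, 50, 60, 45, 55, 65],
    "f1": [10, 12, 11, 13, 9, 11, 10, 12, 9, 13],
    "f2": [20, 22, 19, 21, 23, 21, 20, 23, 19, 22],
    "f3": [30, 28, 33, 31, 29, 29, 32, 30, 28, 33],
}
w_values = ancom_w(counts, groups)
print(w_values)
```

In this toy table the feature that really changes is significant against all three others, while each stable feature only picks up the one test it shares with the changed feature. It also shows why the number of features matters: W can never exceed the number of features minus one, and the correction gets stricter as more pairs are tested.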

One difference between your genus and species data is the number of tests you're performing and correcting for. It's possible that you exceeded the test threshold at the species level, based on signal strength, but not at the genus level.

I'm not sure I understand your manual filtering suggestion.

I think, instead, I would start by checking my filtering and whether my database supports species-level testing. Then, I would calculate my own normalized W value based on the number of features you're testing, and use that with a threshold you set a priori, rather than the built-in W value. I'm guessing that you don't have a lot of samples and that you haven't pre-filtered your data to remove features present in only one sample, which might explain your weird results.
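As a rough sketch of what I mean, assuming you have exported your counts and W values into plain Python dictionaries: the function names, the `min_samples=2` cutoff, the 0.7 threshold and the example values below are placeholders I made up, not anything built into QIIME 2. The filtering would normally happen before running ANCOM; it is shown here only to illustrate the idea.

```python
def filter_by_prevalence(table, min_samples=2):
    """Keep features observed (count > 0) in at least `min_samples` samples."""
    return {
        feature: counts
        for feature, counts in table.items()
        if sum(1 for c in counts if c > 0) >= min_samples
    }


def normalized_w(w, n_features):
    """Scale W by the number of pairwise tests per feature (n_features - 1)."""
    return {feature: value / (n_features - 1) for feature, value in w.items()}


def call_significant(w, n_features, threshold=0.7):
    """Features whose normalized W meets a threshold chosen in advance."""
    scaled = normalized_w(w, n_features)
    return sorted(f for f, value in scaled.items() if value >= threshold)


# Hypothetical raw counts: one feature is seen in a single sample only.
table = {
    "taxon_a": [12, 30, 25, 8, 40, 19],
    "taxon_b": [3, 0, 7, 5, 0, 2],
    "single_sample_taxon": [0, 0, 120, 0, 0, 0],
}
kept = filter_by_prevalence(table, min_samples=2)
print(sorted(kept))

# Hypothetical W values from an ANCOM run on 11 features.
w = {"taxon_a": 9, "taxon_b": 1, "taxon_c": 0}
print(normalized_w(w, n_features=11))
print(call_significant(w, n_features=11, threshold=0.7))
```

A W of 1 out of 10 possible tests is a very different result from a W of 1 when only 2 features were tested, which is why I would look at the normalized value rather than the raw one.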

Best,
Justine
