Interpreting qiime dsfdr output

vheidrich · March 29, 2020, 11:58am

Hi everyone,

I am trying to identify differentially abundant features using the dsFDR method, but I am having trouble interpreting the output of the 'qiime dsfdr' command.
As far as I understood from @serenejiang's comment on this post, all features in the CSV output file have statistically significant abundance differences across groups, i.e., they have passed the alpha threshold of the dsFDR method, although the p-values in this file come from the original tests (not FDR-corrected). First of all, am I missing something here?
I am asking because I have too many features in the CSV output (probably all of them) and many p-values are much greater than 0.05, so how can all of them be interesting? Also, the file has a 'Reject' column with a 'False' entry for every feature. What does it mean? As far as I understand, for these features we cannot reject the null hypothesis (Reject = False), and therefore there is no difference in their abundance across groups. But how should @serenejiang's comment be interpreted, then?

Last but not least, if I want to plot the abundance of a specific feature across groups, is it okay to report the unadjusted p-value as the statistical support for the difference? Since this is the only p-value provided, I do not see any alternative.

dsfdr.csv (1.9 KB)

Thank you very much

serenejiang · March 29, 2020, 10:33pm

Hi, Victor. Thank you for your interest in using dsfdr. To clarify, the dsfdr output file includes all the features, but not all of them are statistically significant according to dsfdr. Only those with "Reject" equal to TRUE are significant. The test statistics and raw p-values are the ones from before multiple-testing correction. So in your case, dsfdr found no feature to be statistically different.
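For anyone reading the file programmatically, the rule above can be sketched in a few lines of Python. This is only a minimal illustration: the sample rows are made up, and every column name other than "Reject" (here `feature_id`, `Statistic`, `raw pvalue`) is an assumption, so check the header of your own dsfdr.csv before relying on it.

```python
import csv
import io

# Hypothetical stand-in for dsfdr.csv. The values are invented for
# illustration, and the column names other than "Reject" are assumptions.
SAMPLE_CSV = """feature_id,Statistic,raw pvalue,Reject
featA,12.5,0.001,True
featB,3.1,0.20,False
featC,0.4,0.76,False
"""


def significant_features(handle, reject_column="Reject"):
    """Return only the rows whose Reject flag is true (case-insensitive)."""
    reader = csv.DictReader(handle)
    return [
        row for row in reader
        if row[reject_column].strip().lower() == "true"
    ]


hits = significant_features(io.StringIO(SAMPLE_CSV))
print([row["feature_id"] for row in hits])
```

If every row has Reject set to False, as in the attached file, the function returns an empty list, which matches the conclusion that no feature was found significant. To run it on a real file, pass `open("dsfdr.csv", newline="")` instead of the in-memory sample.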

In fact, if you are interested in differential abundance testing for microbiome data, you can use more recent tools such as ANCOM or Songbird.

Hope this helps!

vheidrich · March 30, 2020, 1:01pm

Hi,
Thank you for clarifying that.
Of note, I had already tried ANCOM, but I was afraid I was missing something, since it uses very stringent criteria to identify significant features. Since ANCOM and dsfdr mostly agree, this is probably just the reality of the data.

Thanks again!
