Gneiss downloaded p-value .csv disagrees with interactive summary

Hi All,

I am looking through results of OLS-regression as part of the gneiss tutorial pipeline and am finding discrepancies between the p-values displayed in the interactive summary .qzv file and those output by downloading the .csv of all the p-values.

For instance, node y10 has a low p-value for the AFDW variable displayed on the interactive plot:

However, when I compare this to the fdr-corrected p-value .csv output sorted by lowest value, it does not match either the corrected or uncorrected values from the summary:

I'm not quite sure what to make of this, but am curious if anyone else has experienced the same. I'll be happy to provide more details if anyone is interested in troubleshooting this with me.

All of the values in the fdr-corrected .csv output seem to lie between the corrected and uncorrected values shown on the interactive plot.

Thank you!

Hi @Seth

The entry that you highlighted has an extremely small p-value, so it is quite possible that the discrepancy stems from a rounding error. Do you notice this with p-values closer to 0.05?

Hi Jamie,

Thanks for the reply. I spot checked some more values closer to 0.05 and see a similar trend of the .csv exported values being between the corrected and uncorrected p-values in the interactive plot.

AFDW at node y1 for example:
image

csv file:
image

I should also mention that this is not the case with the .csv download of the uncorrected p-values. Those do seem to agree with the interactive summary.

Hi @Seth, it is a bit difficult for me to provide feedback without seeing the commands or the artifacts that were used to generate this output. Could you provide those details?

Hi @mortonjt,
Certainly! The code that I used to generate the regression summary is the following:

## Remove features with a total frequency below 100
qiime feature-table filter-features \
  --i-table combined-table-nmc-table.qza \
  --p-min-frequency 100 \
  --o-filtered-table combined-min100feature-filtered-table.qza

qiime gneiss correlation-clustering \
  --i-table combined-min100feature-filtered-table.qza \
  --o-clustering combined-min100-filtered-hierarchy.qza

qiime gneiss ilr-hierarchical \
  --i-table combined-min100feature-filtered-table.qza \
  --i-tree combined-min100-filtered-hierarchy.qza \
  --o-balances combined-min100-balances.qza

qiime gneiss ols-regression \
  --p-formula "DAI+AFDW+biomass_prod+DO+precip_roll3+wind_dir+temp+wind_speed_roll3" \
  --i-table combined-min100-balances.qza \
  --i-tree combined-min100-filtered-hierarchy.qza \
  --m-metadata-file combined_meta_010719_blanks.txt \
  --o-visualization combined_min100_norun_regression_summary.qzv \
  --verbose

I’ve uploaded the two artifacts and metadata file used for the final ols-regression step to the dropbox link here: https://www.dropbox.com/sh/s32c0o59vjsqz91/AACgsOtZeksnmCIYEaUfsjTEa?dl=0

I’ve included the qzv summary file that I am comparing to the csv output from the same in there as well. Thanks so much for your help.

Seth

Ok @Seth, that is odd – I’ve raised a bug here.

In the meantime, I would not trust the corrected p-values, and would instead manually perform FDR correction on the uncorrected p-values.

That being said, the regression analysis tools in gneiss are all on track for deprecation, to be superseded by rank-based multinomial regression models here that can actually handle zeros and support a rank-based interpretation (instead of balances). I will make another post once that tool is in the qiime2 library.
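
The manual FDR correction suggested above could be sketched roughly as follows. This is only an illustration: the thread does not say which correction gneiss is meant to apply, so the Benjamini-Hochberg procedure is an assumption here, and the function name `benjamini_hochberg` and the example p-values are hypothetical. Whether to correct each covariate separately or all tests together is also a choice the thread does not settle.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is the intended FDR procedure; the thread does not confirm this.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # taking a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical uncorrected p-values, e.g. one covariate's column
# read from the downloaded uncorrected .csv.
example = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(example)
```

If statsmodels is available, `statsmodels.stats.multitest.multipletests(pvals, method="fdr_bh")` performs the same procedure and may be preferable to a hand-written version.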


Thanks @mortonjt, I’m glad to learn I wasn’t just going crazy at least.

I’ll dive into your link on implementing rank-based multinomial regression ASAP. I hope it isn’t too ignorant to ask for a link to some more reference material on this method in general? In any case, I will see what I can accomplish through your and the Knight lab’s repos on the subject.

The method itself isn’t published yet (I will provide a link once it’s out), but under the hood it’s just multinomial regression with L2 regularization, which is well known in statistics and ML.

That being said, the q2 interface isn’t polished yet. More to come in the next few weeks…
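
For readers unfamiliar with the model named above, here is a minimal sketch of the objective that a generic multinomial regression with an L2 penalty minimizes. It is not the unpublished tool's implementation: the function names, the argument layout, and the penalty weight `l2` are all hypothetical, and the constant multinomial coefficient is dropped because it does not depend on the coefficients.

```python
import math


def softmax(scores):
    """Convert real-valued scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def penalized_nll(counts, covariates, coefs, l2):
    """Negative multinomial log-likelihood plus an L2 penalty.

    counts:     n samples x D features (zero counts are allowed)
    covariates: n samples x p covariates
    coefs:      p covariates x D features (hypothetical layout)
    l2:         penalty weight (hypothetical parameter name)
    """
    nll = 0.0
    for y, x in zip(counts, covariates):
        # Linear score for each feature given this sample's covariates.
        scores = [
            sum(x[k] * coefs[k][j] for k in range(len(x)))
            for j in range(len(y))
        ]
        probs = softmax(scores)
        # Zero counts contribute nothing, so no pseudocount is needed.
        nll -= sum(c * math.log(p) for c, p in zip(y, probs) if c > 0)
    penalty = l2 * sum(b * b for row in coefs for b in row)
    return nll + penalty
```

Fitting the model means finding the coefficients that minimize this value, for example by gradient descent. The coefficients for each covariate can then be ranked across features, which is one reading of the rank-based interpretation mentioned earlier in the thread.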
