Applications and proper usage of gneiss

wipperman · December 12, 2017, 8:07pm

Hello,

I have gone through the Qiime2 tutorials, and while I am still learning the ins and outs of Qiime2, I wanted to share my experience with gneiss and ask whether I am using it properly. I have two metadata variables that I know to be correlated with microbiome composition (i.e., they correlate positively and negatively with the abundance of several groups of OTUs/genera).

To be clear, since I haven't yet used the Qiime2 pipeline to preprocess my raw data, I am making my own .qza files from a biom table, taxa data, and metadata (see below). Then I am reproducing the gneiss tutorial on the 88soils samples but with this data. The outputs I get do not look quite right, and so my question is more about the process/application of this technique for non-Qiime2 generated data.

I begin with an hdf5.biom file (hdf5.biom.qza (106.0 KB)) and a taxonomy.tsv file (taxa.qza (23.9 KB)) that I made separately, as well as a metadata file (sample.metadata.txt (572 Bytes)).

The data in the hdf5.biom file were pre-normalized with DESeq("poscounts") to account for discrepancies in library size, and I did this such that all of the counts are > 0 (numbers range from 0.6663828 to 14.6254360).

I then ran the following Qiime2/gneiss commands to reproduce the Qiime portion of the results:

qiime feature-table filter-features --i-table hdf5.biom.qza --o-filtered-table filt.biom.qza --p-min-frequency 3

qiime gneiss gradient-clustering --i-table filt.biom.qza --m-gradient-file sample.metadata.txt --m-gradient-category variable_1 --o-clustering tree.nwk.qza --p-weighted

qiime gneiss dendrogram-heatmap --i-table filt.biom.qza --i-tree tree.nwk.qza --m-metadata-file sample.metadata.txt --m-metadata-category "variable_1" --o-visualization "heatmap" --p-ndim 10 --verbose

qiime gneiss ilr-transform --i-table filt.pseudo.biom.qza --i-tree tree.nwk.qza --o-balances balances.qza

qiime gneiss lme-regression --p-formula "variable_1" --i-table lme_balances.qza --i-tree tree.nwk.qza --m-metadata-file sample.AGE.txt --o-visualization model --p-groups "sample"

qiime gneiss ols-regression --p-formula "variable_1 + variable_2" --i-table ols_balances.qza --i-tree tree.nwk.qza --m-metadata-file sample.metadata.txt --o-visualization regression_summary.qzv

qiime gneiss balance-taxonomy --i-table filt.biom.qza --i-tree tree.nwk.qza --i-taxonomy taxa.qza --p-taxa-level 2 --p-balance-name 'y0' --m-metadata-file sample.metadata.txt --m-metadata-category variable_1 --o-visualization y0_taxa_summary.qzv

So I am able to generate balances and heatmaps and perform regression okay, but the heatmap (and the gradient prediction that I get) always has a large gap in it. The final file (after running the python portion of the tutorial) is here:

My question is, am I not filtering out the OTUs properly? Is the clustering method that I am using incorrect? And finally, is there a way to account for more than one metadata variable in a method like this (i.e., when generating the predictions)? Please let me know if there is anything I may have forgotten to include--I think these tutorials are helpful and have given me some good ideas, I am just concerned that I am not using the method properly! Thanks so much.

--Matt

mortonjt · December 13, 2017, 8:15pm

Hi @wipperman, I'm looking through the biom table. While I'm not very familiar with the DESeq normalization, you have over 400 features with less than 1e-30 variance. To me, this suggests that those features are not providing much additional information to the analysis, likely because they are only present in a few samples.

I would try filtering those out by removing small-variance features, features that only appear in 1 sample, or features that have fewer than 10 counts total (before the DESeq normalization).
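As a rough illustration of those three filters, here is a minimal Python sketch using only the standard library. The function name `filter_features`, the dict-of-lists table layout, and the toy counts are all hypothetical and not part of gneiss or QIIME 2. It also applies all three checks together, whereas any one of them might be enough in practice, and it assumes the checks are run on the raw counts.

```python
from statistics import pvariance


def filter_features(table, min_variance=1e-30, min_samples=2, min_total=10):
    """Return only the features that pass all three checks.

    `table` maps a feature ID to a list of raw counts, one per sample
    (a hypothetical layout chosen for this sketch).
    """
    kept = {}
    for feature_id, counts in table.items():
        # Number of samples in which the feature was observed at all.
        n_present = sum(1 for c in counts if c > 0)
        if pvariance(counts) < min_variance:
            continue  # (near-)constant across samples
        if n_present < min_samples:
            continue  # appears in only one sample (or none)
        if sum(counts) < min_total:
            continue  # too few counts overall
        kept[feature_id] = counts
    return kept


# Toy raw-count table (made-up numbers): 4 features x 4 samples.
raw_counts = {
    "OTU_1": [12, 0, 30, 7],
    "OTU_2": [0, 0, 25, 0],  # present in a single sample
    "OTU_3": [3, 3, 3, 3],   # zero variance
    "OTU_4": [1, 2, 0, 1],   # fewer than 10 counts in total
}

filtered = filter_features(raw_counts)
print(sorted(filtered))
```

Within QIIME 2 itself, the total-count and per-sample thresholds correspond roughly to the `--p-min-frequency` and `--p-min-samples` options of `qiime feature-table filter-features`; the variance check would need to be done separately.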

wipperman · December 15, 2017, 5:06pm

Hey @mortonjt, Thanks so much--that seems to have done the trick! I will make attempts to analyze this data with dada2 in Qiime2 and generate new biom/taxonomy files with a different pipeline, but I think the tools in Qiime2, particularly gneiss, were useful enough for me to just take the data I have and directly import them into .qza format so that I could give the tools a shot! Thank you again for the help. --Matt
