Some clarifications on gneiss correlation-clustering

Hello, I'm running a QIIME2 differential abundance analysis on the results of a 16S metagenomics study. I tried to stick to this tutorial.
However, I think I need some clarification about exactly how it works, because the results I obtained from the gneiss plugin confused me.

I used a hierarchical logic and decided to fit the regression on three different variables (Replicate, Soil and Site) specified in my metadata file.
Finally, I built a heatmap sorted by Site, in which the ratios y0 to y9 are shown.

As suggested in the tutorial, I ran the plugins in the following order:

qiime gneiss correlation-clustering --i-table table-no-clo-mit.qza --o-clustering gneiss/hierarchy.qza

qiime gneiss ilr-hierarchical --i-table table-no-clo-mit.qza --i-tree gneiss/hierarchy.qza --o-balances gneiss/balances.qza

qiime gneiss ols-regression --p-formula "Replicate+Site+Soil" --i-table gneiss/balances.qza --i-tree gneiss/hierarchy.qza --m-metadata-file 16S_metadata.tsv --o-visualization gneiss/regression_summary.qzv

qiime gneiss dendrogram-heatmap --i-table table-no-clo-mit.qza --i-tree gneiss/hierarchy.qza --m-metadata-file 16S_metadata.tsv --m-metadata-column Site --p-color-map seismic --o-visualization gneiss/site_heatmap.qzv

This is what the heatmap looks like:

Finally, based on the heatmap, I decided to use the balance 'y0' to summarize my results, because it apparently covers most of my diversity....
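For readers unsure what a balance such as 'y0' represents, here is a rough sketch (not gneiss code). An ILR balance at a tree node is the scaled log ratio between the geometric means of the two groups of features below that node. The proportions are made up for illustration, and zeros are assumed to have been replaced beforehand.

```python
import math

def balance(numerator, denominator):
    """ILR balance between two groups of strictly positive proportions."""
    r, s = len(numerator), len(denominator)
    # Log of the geometric mean of each group.
    log_gm_num = sum(math.log(x) for x in numerator) / r
    log_gm_den = sum(math.log(x) for x in denominator) / s
    # sqrt(r*s / (r+s)) is the normalisation constant of the ILR transform.
    return math.sqrt(r * s / (r + s)) * (log_gm_num - log_gm_den)

# Hypothetical proportions of four features in one sample.
y0 = balance([0.4, 0.3], [0.2, 0.1])
print(round(y0, 4))
```

A positive value means the numerator group is, on average, more abundant than the denominator group in that sample; a negative value means the opposite.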

And here come my issues. I wanted to make comparisons for the variable 'Site' and I used...

--p-taxa-level 5

which should be the FAMILY level...

1 stands for Kingdom, 2 for Phylum, 3 for Class, 4 for Order, 5 for Family, 6 for Genus and 7 for Species, right?

qiime gneiss balance-taxonomy --i-table table-no-clo-mit.qza --i-tree gneiss/hierarchy.qza --i-taxonomy taxonomy26.qza --p-taxa-level 5 --p-balance-name 'y0' --m-metadata-file 16S_metadata.tsv --m-metadata-column Site --o-visualization gneiss/y0_family_site_summary_test_forforum.qzv

And here is what two graphs look like...


  • First of all, since I specified level 5, why are genera shown here (with the exception of Bradyrhizobiaceae)? This is the result I expected for level 6!
    I can fix it easily by choosing the next level up (4) for families, but I'd like to understand what's behind this...
  • Why are there some redundant taxa in the Proportion plot? Just look at how many times g__DA101 is repeated, in both the under- and over-represented parts of the graph... What does that mean? Did I make a mistake in my previous analysis? Apparently, the Balance Taxonomy plot is fine and shows no redundancy.

Thank you for the clarifications, maybe I missed something about the logic of this approach!


Judging from your results, it looks like gneiss probably defines 0 as kingdom... 5 as genus (not sure, not a gneiss developer, just speculating)
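To make the suspected off-by-one concrete, here is a small sketch that indexes the same lineage both ways. The lineage string is a hypothetical Greengenes-style example, and the 0-based numbering is only the speculation above, not confirmed gneiss behaviour.

```python
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]

# Hypothetical Greengenes-style lineage, for illustration only.
lineage = ("k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; "
           "o__Rhizobiales; f__Bradyrhizobiaceae; g__Bradyrhizobium; s__")
labels = [part.strip() for part in lineage.split(";")]

def label_one_based(level):
    """Numbering where 1 = kingdom (what the poster expected)."""
    return labels[level - 1]

def label_zero_based(level):
    """Numbering where 0 = kingdom (the speculation in this reply)."""
    return labels[level]

for zero_based, (rank, label) in enumerate(zip(RANKS, labels)):
    print(zero_based, zero_based + 1, rank, label)

print(label_one_based(5))   # f__Bradyrhizobiaceae
print(label_zero_based(5))  # g__Bradyrhizobium
```

Under the 0-based reading, level 5 lands on the genus and level 4 on the family, which matches what the plots show.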

Because this is not collapsing your ASVs by taxonomy; it is showing you the taxonomic ID of the ASVs associated with each balance. So these are different ASVs with the same genus annotation.
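A toy illustration of why the same genus name shows up several times: every ASV keeps its own row, and several rows can carry the same genus annotation. The ASV IDs and lineages below are hypothetical.

```python
from collections import Counter

# Hypothetical ASV IDs (real ones are usually long hashes) and lineages.
taxonomy = {
    "asv_1": "k__Bacteria; p__Verrucomicrobia; c__[Spartobacteria]; o__[Chthoniobacterales]; f__[Chthoniobacteraceae]; g__DA101",
    "asv_2": "k__Bacteria; p__Verrucomicrobia; c__[Spartobacteria]; o__[Chthoniobacterales]; f__[Chthoniobacteraceae]; g__DA101",
    "asv_3": "k__Bacteria; p__Verrucomicrobia; c__[Spartobacteria]; o__[Chthoniobacterales]; f__[Chthoniobacteraceae]; g__DA101",
    "asv_4": "k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; o__Rhizobiales; f__Bradyrhizobiaceae; g__Bradyrhizobium",
}

# Genus is the sixth field of the lineage (index 5 when counting from 0).
genus_by_asv = {asv: lin.split(";")[5].strip() for asv, lin in taxonomy.items()}

# Four ASVs, but only two distinct genus labels.
label_counts = Counter(genus_by_asv.values())
print(label_counts)
```

Each of the three g__DA101 entries is a different sequence variant, so each one can sit on either side of a balance.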

I hope that helps!


Judging from your results, it looks like gneiss probably defines 0 as kingdom… 5 as genus (not sure, not a gneiss developer, just speculating)

Yes, apparently the numbering is shifted by one position! I solved the issue this way. I had assumed the numbering was the same as in qiime taxa collapse, but apparently that isn't the case.
However, I still can't understand why, when investigating a given taxonomic level (genus in the example I provided), the first row of the Proportion plot is a family...

not a gneiss developer

Do you know who the developer/s is/are? And if I can contact them on GitHub?

Because this is not collapsing your ASVs by taxonomy; it is showing you the taxonomic ID of the ASVs associated with each balance. So these are different ASVs with the same genus annotation.

I guessed something similar! But is there any way to collapse them, for the sake of a clearer visualization?
I have tried to re-run the complete gneiss pipeline as follows:

  • Collapse the input table at the phylum level (in this example)

      qiime taxa collapse --i-table table-no-clo-mit.qza --i-taxonomy taxonomy26.qza --p-level 2 --o-collapsed-table gneiss_test/collassata_phyla.qza
    
  • Run each of the following steps on the collapsed table and the newly generated files (apparently, not all of them are necessary to run gneiss correlation-clustering).

      qiime gneiss correlation-clustering --i-table gneiss_test/collassata_phyla.qza --o-clustering gneiss_test/hierarchy_phyla.qza
    
      qiime gneiss ilr-hierarchical --i-table gneiss_test/collassata_phyla.qza --i-tree gneiss_test/hierarchy_phyla.qza --o-balances gneiss_test/balances_phyla.qza
    
      qiime gneiss ols-regression --p-formula "Replicate+Site+Soil" --i-table gneiss_test/balances_phyla.qza --i-tree gneiss_test/hierarchy_phyla.qza --m-metadata-file 16S_metadata.tsv --o-visualization gneiss_test/regression_summary_phyla.qzv
    
      qiime gneiss dendrogram-heatmap --i-table gneiss_test/collassata_phyla.qza --i-tree gneiss_test/hierarchy_phyla.qza --m-metadata-file 16S_metadata.tsv --m-metadata-column Site --p-color-map seismic --o-visualization gneiss_test/site_heatmap_phyla.qzv
    
  • But once I get to gneiss balance-taxonomy again, this is what I get:

      qiime gneiss balance-taxonomy --i-table gneiss_test/collassata_phyla.qza --i-tree gneiss_test/hierarchy_phyla.qza --i-taxonomy taxonomy26.qza --p-taxa-level 1 --p-balance-name 'y0' --m-metadata-file 16S_metadata.tsv --m-metadata-column Site --o-visualization gneiss_test/y0_phyla_site_summary.qzv
    

Plugin error from gneiss:

  "None of [Index(['k__Bacteria;p__Firmicutes', 'k__Bacteria;p__TM7',\n 'k__Bacteria;p__Chloroflexi', 'k__Bacteria;p__Actinobacteria',\n 'k__Bacteria;p__Gemmatimonadetes'],\n dtype='object', name='Feature ID')] are in the [index]"

Debug info has been saved to /tmp/qiime2-q2cli-err-pizcrryq.log

I'm also attaching the debug info.

Any idea what could be wrong? In theory it sounded fine, but apparently I must have cut out some relevant information!

Thanks in advance!

Hi again @Sparkle,
The visualizers you are using in gneiss are actually deprecated... when running them or reading the help docs you should see this note:

This command is deprecated and will be removed in a future version of this plugin.

So just beware that you are using a visualizer that the developers plan to remove in the next release of gneiss. The developer of gneiss, @mortonjt, now favors his newer method, songbird, so you may want to try that out instead:

Because that ASV is only classified to the family level, so gneiss is being nice and showing you the terminal taxonomic label.
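One way to picture this "terminal label" behaviour is the sketch below. It is an assumption about the logic, not the actual gneiss implementation; the function name is hypothetical, levels are counted from 0 = kingdom, and the lineages are made up.

```python
def terminal_label(lineage, level):
    """Label at `level` (0 = kingdom), or the deepest named rank above it."""
    parts = [p.strip() for p in lineage.split(";")]
    # If the lineage is shorter than requested, start from its last field.
    idx = min(level, len(parts) - 1)
    # Step back towards the root while the field is an empty placeholder ("g__").
    while idx > 0 and parts[idx].endswith("__"):
        idx -= 1
    return parts[idx]

# Hypothetical ASVs: one classified to genus, one only to family.
to_genus = ("k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; "
            "o__Rhizobiales; f__Bradyrhizobiaceae; g__Bradyrhizobium; s__")
to_family = ("k__Bacteria; p__Proteobacteria; c__Alphaproteobacteria; "
             "o__Rhizobiales; f__Bradyrhizobiaceae; g__; s__")

print(terminal_label(to_genus, 5))   # g__Bradyrhizobium
print(terminal_label(to_family, 5))  # f__Bradyrhizobiaceae
```

So a family name in a genus-level plot would simply mean that the classifier could not assign that ASV any deeper.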

Not in an easy way. You could use Python to customize the labels easily enough, but there is no way to modify these labels in QIIME 2. Collapsing will NOT work.
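As a hint of what such label customization might look like, here is a minimal sketch that numbers repeated labels so each ASV stays distinguishable. It does not hook into the gneiss visualizer; it assumes you have exported the ASV-to-label mapping yourself and are drawing your own plot. The function name and the IDs are hypothetical.

```python
from collections import defaultdict

def unique_labels(labels_by_asv):
    """Append a running number to every label that occurs more than once."""
    totals = defaultdict(int)
    for label in labels_by_asv.values():
        totals[label] += 1

    seen = defaultdict(int)
    result = {}
    for asv, label in labels_by_asv.items():
        if totals[label] > 1:
            seen[label] += 1
            result[asv] = f"{label} #{seen[label]}"
        else:
            result[asv] = label
    return result

# Hypothetical mapping from ASV ID to the label shown in the plot.
plot_labels = unique_labels({
    "asv_1": "g__DA101",
    "asv_2": "g__DA101",
    "asv_3": "g__Bradyrhizobium",
})
print(plot_labels)
```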

Yep, collapsing creates a new feature table where the IDs = the shared taxonomy. So you lose the ASV-level information for running gneiss (bad) and cannot relate that information back to the original gneiss you ran (also bad).
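The ID mismatch behind the "None of [...] are in the [index]" error can be reproduced with a toy example. This is only a sketch with made-up counts and lineages; `collapse` is a hypothetical stand-in for what collapsing does conceptually, not the QIIME 2 code.

```python
# Hypothetical ASV-level taxonomy and counts.
taxonomy = {
    "asv_1": "k__Bacteria; p__Firmicutes; c__Bacilli",
    "asv_2": "k__Bacteria; p__Firmicutes; c__Clostridia",
    "asv_3": "k__Bacteria; p__TM7; c__TM7-1",
}
counts = {"asv_1": 10, "asv_2": 5, "asv_3": 2}

def collapse(counts, taxonomy, level):
    """Sum counts over ASVs that share the first `level` ranks (1 = kingdom)."""
    collapsed = {}
    for asv, n in counts.items():
        ranks = [p.strip() for p in taxonomy[asv].split(";")][:level]
        key = ";".join(ranks)  # the lineage string becomes the new feature ID
        collapsed[key] = collapsed.get(key, 0) + n
    return collapsed

phyla = collapse(counts, taxonomy, 2)
print(phyla)

# The taxonomy is still keyed by ASV ID, so none of the new IDs can be looked up.
missing = [feature_id for feature_id in phyla if feature_id not in taxonomy]
print(missing)
```

The collapsed table's feature IDs are lineage strings such as 'k__Bacteria;p__Firmicutes', while the taxonomy artifact is still indexed by the original ASV IDs, so a lookup by feature ID finds nothing.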

So I'm sorry I can't give a better answer! It really boils down to this: you want a customized output and are running up against the technical capabilities of gneiss and QIIME 2... you need to work out a custom solution to an individualized need

This could be called a feature request too... but gneiss visualizers have been deprecated so you should probably start switching over to songbird instead.


Hi @Nicholas_Bokulich
Thanks for your answer! Yes, I got exactly that warning (in yellow) about qiime gneiss balance-taxonomy being deprecated while running it!
I'm trying songbird already, but it's already giving me some compatibility issues (apparently involving the feature table), for which I'm going to open another thread.

Because that ASV is only classified to the family level, so gneiss is being nice and showing you the terminal taxonomic label.

I see!

Not in an easy way. You could use Python to customize the labels easily enough, but there is no way to modify these labels in QIIME 2. Collapsing will NOT work.

This sounds demanding... since I'm far from being a skilled programmer... I guess I'll give songbird a go before venturing into Python scripting, also to see what the results look like.

