issues moving between greengenes classification and diversity metrics

Hello all! I'm having some issues with my code, and I'm using greengenes 13_8 for my reference data, which I realize is highly supported in QIIME2 so I'm sure this is a user-end ignorance issue, but any help is appreciated.

I'm trying to use both the diversity core-metrics-phylogenetic and the gemelli plugin to generate and compare different diversity metrics. I have used the feature classifier to generate my taxonomy and used feature-table group to group my feature columns. Output below:

I am attempting to use the tree provided by greengenes instead of building one out myself, as tree building is not my area of expertise and it isn't the point of my project. However, when I attempt to use core-metrics-phylogenetic I run into the following issue:

I have not had any issues with my metadata up to this point. I have verified that my frequency table is the table corresponding to the previous photo, and I have checked that my tree upload (rooted) makes sense given my view of it in FigTree.

Attempts to Fix

  1. Filter the frequency data using the tree with phylogeny filter-table. This runs with no errors, but my output is devoid of all features.

  2. Try to identify features by OTU ID instead of taxonomic labels, since I can see in FigTree that the tips of gg's tree are OTU IDs, and my frequency table has no such labels. Running feature-table rename-ids does not work, however, and I suspect that it's because the OTUs are stored in the "Feature ID" column, which is not a MetadataColumn[Categorical] object.

  3. I consulted this forum topic to further investigate the status of my classification results, but my taxonomy barplot looks fine.

  4. Finally, I tried using taxa collapse as outlined in this post because I can see categorical text at nodes in the tree. This yields the following results

This is about as much as I've done so far. I apologize for the essay, but if anyone has seen this before and has feedback I'd be really grateful.

Hello!

I don't know your goals for the analysis, but I am curious why you grouped your samples before calculating diversity metrics. Was it your intention to group some samples? I am asking because if you later want to compare those groups, you will not have replicates for statistical analyses (for the column by which samples are grouped).

The issue here is that the IDs from your feature table and the tip labels in the tree do not correspond to each other. The solution is to build your own tree in QIIME 2 from your own files, so that the IDs will be the same.

Filtering with phylogeny filter-table failed for the same reason: the IDs are different, so all features were filtered out.
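As a rough illustration of the mismatch (plain Python, not a QIIME 2 command), the sketch below compares feature IDs against the tip labels of a tree. Everything in it is an assumption made for the example: the tiny Newick string, the numeric OTU IDs, and the taxonomy strings are made up, and the helper names `newick_tip_labels` and `id_overlap` are hypothetical. The simple regex also ignores quoted Newick labels, so treat it as a sketch rather than a real parser.

```python
import re


def newick_tip_labels(newick: str) -> set[str]:
    """Return the tip (leaf) labels of a simple Newick string.

    A tip label is a name that directly follows '(' or ','. Internal node
    labels follow ')' and are skipped. Quoted labels are not handled.
    """
    return set(re.findall(r"[(,]\s*([^\s():,;]+)", newick))


def id_overlap(table_ids, tip_labels):
    """Split feature IDs into those found in the tree and those missing."""
    table_ids = set(table_ids)
    return table_ids & tip_labels, table_ids - tip_labels


# Made-up tree: tips are numeric OTU IDs, taxonomy appears only at nodes.
tree = "((4479944:0.1,228054:0.2)g__Bacteroides:0.05,1110988:0.3)k__Bacteria;"

# Made-up feature IDs as they might look after collapsing by taxonomy.
collapsed_ids = [
    "k__Bacteria;p__Bacteroidetes;c__Bacteroidia;o__Bacteroidales;"
    "f__Bacteroidaceae;g__Bacteroides;s__",
    "k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;"
    "f__Lachnospiraceae;g__;s__",
]

tips = newick_tip_labels(tree)
shared, missing = id_overlap(collapsed_ids, tips)

print(f"{len(shared)} feature IDs found in the tree, {len(missing)} missing")
```

With no shared IDs, a filter that keeps only features present in the tree has nothing left to keep, which matches the empty output described above.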

Regarding the mismatching labels: is there a way to change the tip labels in my reference tree without starting from scratch? I have a reference taxonomy and tree from greengenes that correspond to each other, and my taxonomic assignments were based on that reference source. I'm confused as to why building my own tree would be needed if I'm not attempting to dispute or alter the relationships among taxa, especially when I trust that the tree generated for the greengenes reference was better assembled than one I would make myself.

If the issue is that my features contain k_ p_ c_ o_ f_ g_ s_ labels, the ref taxonomy has a #categorical k_ p_ c_ o_ f_ g_ s_ and #q2types OTU column, and the ref tree uses OTU labels, is there an alternative way to simply rename my features with OTU labels?

I appreciate your insights, thank you

Now I understand your question. In that case, these tutorials may help you:

  1. Tutorial on how to build a tree in QIIME 2
  2. Tutorial on how to use a precomputed tree

The first is for building your own tree in QIIME 2.
The second is more suitable for your case, since it inserts the sequences from your table into a precomputed tree, so you will be able to use the tree from Greengenes.
