Feature ids not present as tip names in phylogeny?


Trying to perform the core diversity phylogenetic analyses using the 99% OTU greengenes tree (13_8) I have used for multiple other datasets and am having issues with this particular dataset:

All feature_ids must be present as tip names in phylogeny. feature_ids not corresponding to tip names (n=3482): 8c6b63509ec84004d5ebc0456cb19ed3 addd58502b84482b6b99785541ef6b23 9362d478e1aede04a1417d0d2a9e366a 5c64ee251da17f4e4312c928b4a3eef0 …

I have not done anything to this tree since I last used it (I even made sure that I didn’t somehow copy an incorrect tree, and I am using the same file I had used previously), so I don’t think it is an issue with my tree. All I have done with my data is import them, run dada2, classify with the naive Bayes classifier I downloaded directly from the qiime2 site (gg-13-8-99-515-806-nb-classifier.qza), and then filter unwanted taxa (mitochondria, Archaea, etc.). I am now trying to run the following command:

qiime diversity core-metrics-phylogenetic --i-table dada2_278_table.qza --i-phylogeny 99_greengenes_tree_CEO.qza --p-sampling-depth 76819 --m-metadata-file FIV_ging_mapping.txt --p-n-jobs 8 --output-dir core_div_11132018

Any suggestions on where I might have gone wrong?
I’m guessing I can just make a tree from my data and perform the phylogenetic analyses with that (any comments on why to do this or not do this?), but I would like to figure out what went wrong so I can avoid this in the future.


Hi @c.older,
Your dada2 table does not use taxonomy assignments or Greengenes IDs as feature IDs; instead it has hashed IDs such as those you see in the error (8c6b63509ec84004d5ebc0456cb19ed3, etc.). Those IDs do not match the tip names in the Greengenes tree, which is why the command fails. If you want to use this table, you’ll need to create your own tree using the rep-seqs file you have and use that tree instead of the 99_greengenes_tree_CEO.qza file.
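To make the mismatch concrete, here is a minimal Python sketch of the check that is failing. It assumes the hashed IDs are MD5 digests of each sequence (that is what q2-dada2 does by default, as far as I know). The sequences and the numeric Greengenes-style tip names below are made up for illustration, and the helper names are hypothetical, not part of QIIME 2.

```python
import hashlib


def hashed_feature_id(sequence: str) -> str:
    """Return the MD5 hex digest of a sequence (assumed ID scheme)."""
    return hashlib.md5(sequence.encode("utf-8")).hexdigest()


def ids_missing_from_tree(feature_ids, tip_names):
    """Return the feature IDs that are not tip names in the tree, sorted."""
    return sorted(set(feature_ids) - set(tip_names))


# Hypothetical toy data: two short "representative sequences".
rep_seqs = ["ACGTACGT", "TTGGCCAA"]
feature_ids = [hashed_feature_id(seq) for seq in rep_seqs]

# Made-up numeric tip names in the style of a reference tree.
reference_style_tips = ["1000001", "1000002", "1000003"]

missing = ids_missing_from_tree(feature_ids, reference_style_tips)
print(f"{len(missing)} of {len(feature_ids)} feature IDs are not tips in the tree")
```

Every hashed ID ends up in the "missing" list because a 32-character hash can never equal a reference OTU ID, no matter which sequences are in the table.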
See this section of the Moving Pictures tutorial for one approach to building a tree. I’ve personally moved on to using fragment-insertion for my tree-building needs, which as of the 2018.11 release is part of the core plugins.
Your other (less accurate) option is to assign taxonomy using your classifier trained on greengenes, then use the taxa collapse action to assign taxonomic names to your ASV table. Then the original command you used should work. Keep in mind that with this approach, if there are any unassigned taxa or taxa that can’t be classified against your reference database, you may get an error (or maybe they’ll be ignored and it will be fine, not sure).
Hope that helps!


Thanks for your response!
I’m not super well-versed in taxonomy/trees. Is there any issue with making a tree from my own rep seqs vs. using a pretty well established one like greengenes? I’m not sure whether taxa that might be missing from my sequences would still allow for the most accurate tree.
Why do you use fragment-insertion over the method outlined in the Moving Pictures tutorial?

Hi @c.older,
No problem, you’ll get caught up in no time! To that end, check out this tutorial, which might be of some help as it does a good job of explaining the whys and hows of various phylogeny analyses in qiime2. Either way, you should make your own tree based on your rep-seqs. Which approach you choose is now the question, and your question above is basically asking to compare the reference-based approach vs. a de novo approach, the latter being the focus of the link above. The usefulness of fragment-insertion is nicely demonstrated here, and you can also read the paper cited there for more details. Fragment-insertion does actually use a reference database (by default greengenes 13_8, but it can be replaced with any other) and inserts your fragments into that reference tree. If a fragment is too far away from its nearest reference, though, it may be dropped altogether. So if you have a rare environment that may not be well represented in greengenes, you should consider either using a different reference database or building a de novo tree.
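One practical consequence of fragments being dropped: any feature that did not make it into the tree will trigger the same "feature_ids not corresponding to tip names" error, so the table has to be restricted to features that are tips. The Python sketch below only illustrates that logic on a toy table; the feature names, counts, and the split_table_by_tree helper are hypothetical. In QIIME 2 itself I believe the fragment-insertion plugin has a filter-features action for this purpose, but check the docs for your release.

```python
def split_table_by_tree(table, tip_names):
    """Split a {feature_id: counts} table into features kept (present
    as tree tips) and features removed (absent from the tree)."""
    tips = set(tip_names)
    kept = {fid: counts for fid, counts in table.items() if fid in tips}
    removed = {fid: counts for fid, counts in table.items() if fid not in tips}
    return kept, removed


# Hypothetical toy table: per-sample counts for three features.
table = {
    "feat_a": [10, 0, 3],
    "feat_b": [0, 5, 1],   # pretend this fragment was dropped from the tree
    "feat_c": [2, 2, 0],
}

# Tips of the insertion tree: inserted fragments plus reference tips.
tree_tips = ["feat_a", "feat_c", "ref_1", "ref_2"]

kept, removed = split_table_by_tree(table, tree_tips)
print("kept:", sorted(kept))
print("removed:", sorted(removed))
```

It is worth looking at what lands in the removed table: if a lot of reads go with it, that is a hint the reference is a poor fit for your environment.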

