Plugin error from diversity-lib: The table does not appear to be completely represented by the phylogeny

Hello Q2 team,

I am working with public OTU tables and am trying to run some analyses. I took a subset of Human Microbiome Project samples whose features are Greengenes feature IDs and collapsed it with taxa collapse, so I now have a collapsed feature table whose feature IDs are taxon names: hmp_genus.qza (912.8 KB). So far so good!

Before collapsing, I could use a metric such as weighted UniFrac, because the feature IDs matched the tip names in the un-annotated Greengenes reference tree. After collapsing, however, the feature IDs are full taxonomic names (such as 'k__Bacteria; p__Firmicutes; c__Bacilli; o__Bacillales; f__Bacillaceae; g__Anoxybacillus; s__kestanbolensino'), and these do not appear to match the convention in the annotated Greengenes tree file, 99_otus_tree.qza (1.8 MB), where node names are truncated (e.g. 'g__Methanopyrus; s__kandleri' for a leaf). I assume this is why I receive an error whenever I use a method involving phylogeny: 'The table does not appear to be completely represented by the phylogeny' in the case of UniFrac, and "'NoneType' object has no attribute 'remove'" in the case of gneiss ilr-phylogenetic.
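To illustrate the mismatch I suspect, here is a minimal sketch in plain Python. It assumes the UniFrac error is raised when at least one feature ID in the table has no identically named tip in the tree; this is my reading of the message, not the plugin's actual code. The helper `missing_from_tree` and all IDs and labels below are made up for illustration.

```python
def missing_from_tree(feature_ids, tip_names):
    """Return the feature IDs that have no identically named tip in the tree."""
    tips = set(tip_names)
    return sorted(fid for fid in feature_ids if fid not in tips)


# Hypothetical data: numeric OTU IDs, as in an un-collapsed Greengenes table
otu_ids = ["1000001", "1000002"]
unannotated_tips = ["1000001", "1000002", "1000003"]

# Hypothetical data: a collapsed feature ID (full lineage) versus the
# truncated labels I see in the annotated tree
collapsed_ids = [
    "k__Bacteria; p__Firmicutes; c__Bacilli; o__Bacillales; "
    "f__Bacillaceae; g__Anoxybacillus"
]
annotated_labels = ["g__Methanopyrus; s__kandleri"]

# Before collapsing: every feature ID is a tip, so nothing is missing
print(missing_from_tree(otu_ids, unannotated_tips))

# After collapsing: the full lineage string matches neither kind of label
print(missing_from_tree(collapsed_ids, unannotated_tips))
print(missing_from_tree(collapsed_ids, annotated_labels))
```

If this reading is right, an exact string comparison fails for every collapsed feature, whichever of the two trees I use.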

In theory I could map the taxonomic names back to feature IDs, but I was wondering whether this is a sign of some other issue, or whether I am actually doing something incorrectly (I would really like to avoid switching identifiers more than once along the pipeline).

I was using qiime2-2021.4.

Thank you very much in advance!

Hi @guyshur!

This is a great question. While I do think that replacing the taxonomic names with feature IDs could be an option, I share your concern that this may be caused by a deeper issue. I'm going to loop in a couple of folks who may be able to provide some further insight on this!

@Mehrbod_Estaki @Nicholas_Bokulich have either of you seen this issue before, where the node names in the annotated tree are truncated and so don't match the full taxonomic names in the collapsed table?