Qiime2R error: taxa names do not match

Hi,

I'm trying to create a phyloseq object from some QIIME 2 artifacts. I have a problem similar to later posts in this question.

I have imported the data from QIIME 2 as follows:

> #Importing metadata
> HGS_metadata<-read_tsv("HGS_mapping_file.txt")
> HGS_metadata
> #Looks good
> 
> #Importing feature table
> HGS_features<-read_qza("HGS_merged_table_filtered3.qza")
> HGS_features$data[1:5,1:5]
> 
> #Importing taxonomy
> HGS_taxonomy<-read_qza("silva_HGS_taxonomy.qza")
> #Converting to taxtable
> HGS_taxtable<-HGS_taxonomy$data %>% as.tibble() %>% separate(Taxon, sep = ";", c("Kingdom","Phylum","Class","Order","Family","Genus","Species"))
> HGS_taxtable
> #Looks good I think
> 
> HGS_tree<-read_qza("insertion-tree.qza")
> HGS_tree$data

This was the command I used to try to create a phyloseq object:

HGS_phy<-phyloseq(otu_table(HGS_features$data, taxa_are_rows=T), phy_tree(HGS_tree$data), tax_table(as.data.frame(HGS_taxtable%>% select(-Confidence) %>% column_to_rownames("Feature.ID") %>% as.matrix()), sample_data(HGS_metadata %>% as.data.frame() %>% column_to_rownames("SampleID"))))

This is the error I got:

Error in validObject(.Object) : invalid class “phyloseq” object: 
 Component taxa/OTU names do not match.
 Taxa indices are critical to analysis.
 Try taxa_names()
In addition: Warning message:
In .local(object) : Coercing from data.frame class to character matrix 
prior to building taxonomyTable. 
This could introduce artifacts. 
Check your taxonomyTable, or coerce to matrix manually.

I tried making a Venn diagram of IDs shared between the feature table and taxonomy table with gplots as recommended in the linked post:
> gplots::venn(list(taxonomy=rownames(HGS_taxtable), featuretable=colnames(HGS_features)))

When I view the HGS_features data and HGS_taxtable, I can see that feature IDs are definitely shared between the two even though the program doesn't seem to be picking them up. I'm wondering if this is a problem with how I'm importing the feature table (which has no header names, just all the sample names listed across the top and feature IDs down the side) ... but not sure how to fix this.

Thank you for any help!

Could you post the previews of your objects so I can take a look and try to figure out the issue?

Hi Jordan,

Thanks for agreeing to take a look. I’m not sure exactly what you’re asking for – is a preview the output of a specific command, or are you looking for a screenshot or copy-paste of the objects … ?

I’ve pasted the first few lines of the taxtable and feature table in the meantime in case that’s what you wanted?

> head(HGS_taxtable)
# A tibble: 6 x 9
  Feature.ID      Kingdom   Phylum    Class     Order     Family     Genus   Species  Confidence
  <fct>           <chr>     <chr>     <chr>     <chr>     <chr>      <chr>   <chr>         <dbl>
1 000071683f5ef8… D_0__Bac… D_1__Pro… D_2__Alp… D_3__Cau… D_4__Caul… NA      NA            1.000
2 000641f2935341… D_0__Bac… D_1__Pro… D_2__Del… D_3__Oli… D_4__0319… NA      NA            1.000
3 0008fceb4dfefc… D_0__Bac… D_1__Arm… D_2__unc… D_3__met… D_4__meta… D_5__m… D_6__me…      1.000
4 00093c7b060f8f… D_0__Bac… D_1__Pla… D_2__Pla… D_3__Gem… D_4__Gemm… D_5__u… NA            0.999
5 000b2eeefd90bd… D_0__Bac… D_1__Aci… D_2__The… D_3__The… D_4__Ther… D_5__S… NA            1.000
6 000c1c5797d21f… D_0__Bac… D_1__Bac… D_2__Bac… D_3__Cyt… D_4__Micr… D_5__u… D_6__me…      0.987
> HGS_features$data[1:5, 1:5]
                                 PB11A1.PlayfordHGS.CAS.2016 PB11A2.PlayfordHGS.CAS.2016
0008fceb4dfefccceae9bc9980f70df8                           0                           0
000b2eeefd90bdf6052dc1287b0615cb                           0                           0
000c1c5797d21feba7165b842eef553a                           0                           0
000c729b6daaebda387c5ba21ca7b560                           0                           0
000cac2874f76f8eaf4d1fa39bddfaff                           0                           0
                                 PB11AC.PlayfordHGS.CAS.2016 PB11AC2.PlayfordHGS.CAS.2016
0008fceb4dfefccceae9bc9980f70df8                           0                            0
000b2eeefd90bdf6052dc1287b0615cb                           0                            0
000c1c5797d21feba7165b842eef553a                           0                            0
000c729b6daaebda387c5ba21ca7b560                           0                            0
000cac2874f76f8eaf4d1fa39bddfaff                           0                            0
                                 PB11N1.PlayfordHGS.CAS.2016
0008fceb4dfefccceae9bc9980f70df8                           0
000b2eeefd90bdf6052dc1287b0615cb                           0
000c1c5797d21feba7165b842eef553a                           0
000c729b6daaebda387c5ba21ca7b560                           0
000cac2874f76f8eaf4d1fa39bddfaff                           0

Thanks,
Matilda

Hi @Matilda_H-D! This is a bit of a “just passing through, don’t mind me” type of comment, but your Venn diagram looks pretty weird to me: why are there zero column names in the feature table? Also, why are you using the column names for that comparison, since your feature table has features as rows, not columns? Just some food for thought.
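A minimal sketch of the row-based comparison suggested above, using toy data rather than the real artifacts. The objects `feature_counts` and `taxonomy` are hypothetical stand-ins for `HGS_features$data` and `HGS_taxtable`. It assumes the feature IDs are the row names of the count matrix and sit in a `Feature.ID` column of the taxonomy table, as in the previews posted earlier.

```r
# Toy stand-in for HGS_features$data: feature IDs are the ROW names.
feature_counts <- matrix(
  0L, nrow = 3, ncol = 2,
  dimnames = list(c("id_a", "id_b", "id_c"), c("sample1", "sample2"))
)

# Toy stand-in for HGS_taxtable: the IDs live in a column, not in row names.
taxonomy <- data.frame(
  Feature.ID = c("id_a", "id_b", "id_d"),
  Kingdom = c("Bacteria", "Bacteria", "Bacteria"),
  stringsAsFactors = FALSE
)

# Compare like with like: row names of the counts vs. the Feature.ID column.
feature_ids  <- rownames(feature_counts)
taxonomy_ids <- as.character(taxonomy$Feature.ID)  # as.character() guards against factors

shared        <- intersect(feature_ids, taxonomy_ids)
only_features <- setdiff(feature_ids, taxonomy_ids)
only_taxonomy <- setdiff(taxonomy_ids, feature_ids)

print(shared)
print(only_features)
print(only_taxonomy)
```

With the real objects, the same idea would be to compare `rownames(HGS_features$data)` against `HGS_taxtable$Feature.ID`. Calling `colnames()` on the list returned by `read_qza()` gives `NULL`, and `rownames()` on a tibble gives only row numbers, which would explain an empty overlap in the plot.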

Sorry for not seeing this earlier, Mathew is correct re the issue with the venn plot. In your code:
as.data.frame(HGS_taxtable%>% select(-Confidence) %>% column_to_rownames("Feature.ID") %>% as.matrix()
I have a sneaking suspicion that you are losing your row names in the conversion when you make your phyloseq object. Perhaps this would fix your issue:
(HGS_taxtable %>% select(-Confidence) %>% as.data.frame() %>% rownames_to_column("Feature.ID"))
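As a hedged illustration of keeping the row names through the conversion, here is a base-R sketch that builds the taxonomy as a character matrix with the feature IDs set explicitly as row names. The `taxonomy` data frame is a hypothetical toy stand-in for `HGS_taxtable`, and this is one possible approach, not necessarily the one intended in the reply above.

```r
# Toy stand-in for HGS_taxtable (hypothetical values).
taxonomy <- data.frame(
  Feature.ID = c("id_a", "id_b"),
  Kingdom    = c("D_0__Bacteria", "D_0__Bacteria"),
  Phylum     = c("D_1__Proteobacteria", "D_1__Acidobacteria"),
  Confidence = c(1.000, 0.987),
  stringsAsFactors = FALSE
)

# Keep only the rank columns; the ID and Confidence columns are not taxonomy ranks.
rank_columns <- setdiff(colnames(taxonomy), c("Feature.ID", "Confidence"))

# Build a character matrix and attach the feature IDs as row names explicitly,
# so nothing depends on how an intermediate conversion treats row names.
tax_matrix <- as.matrix(taxonomy[, rank_columns, drop = FALSE])
rownames(tax_matrix) <- as.character(taxonomy$Feature.ID)

print(tax_matrix)
```

A matrix like this could then be passed to `tax_table()`, which also avoids the "Coercing from data.frame class to character matrix" warning shown earlier. In the tibble helpers, `column_to_rownames()` moves a column into the row names, while `rownames_to_column()` does the opposite.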


Hi all,

I am having a similar issue with the same error message, although I think mine is caused by my phylogenetic tree.

Previously I used the exact same files and script to successfully create a phyloseq object that included a phylogenetic tree. Using this phyloseq object I was able to carry out downstream analyses such as calculating phylogenetic diversity (e.g. Faith's PD) with no issues.

I have returned to this project to rerun some analysis and am using the exact same files and script. However I now get the following error when creating the phyloseq object – (only when I include the tree):

Error in validObject(.Object) : invalid class “phyloseq” object:
Component taxa/OTU names do not match.
Taxa indices are critical to analysis.
Try taxa_names()

I created the tree in QIIME 2 using SEPP.

Previously I used the command below to ensure that the names of the tree matched my OTU table:
setequal(taxa_names(OTU), TREE$tip.label)

or

setequal(taxa_names(OTU), taxa_names(TREE))

These commands now return “FALSE”.

I wonder if there is a compatibility issue between the new phyloseq package and my phylogenetic tree?

Any suggestions on how to resolve this would be hugely appreciated.

Thank you!

Kath
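A small sketch of how the mismatch between tree tips and OTU names might be narrowed down. `setequal()` returns FALSE whenever either side has even one extra name, so it does not say which names differ. The vectors below are hypothetical toy values standing in for `taxa_names(OTU)` and `TREE$tip.label`. The assumption that the tree carries extra reference tips is only one possible cause, though an insertion tree built with SEPP may include them.

```r
# Toy stand-ins (hypothetical): OTU table IDs and tree tip labels.
otu_ids    <- c("id_a", "id_b", "id_c")
tip_labels <- c("id_a", "id_b", "id_c", "ref_1", "ref_2")

# setequal() is FALSE here even though every OTU is present in the tree.
same_sets <- setequal(otu_ids, tip_labels)

# Look at each direction separately to see where the difference comes from.
missing_from_tree <- setdiff(otu_ids, tip_labels)  # OTUs with no matching tip
extra_in_tree     <- setdiff(tip_labels, otu_ids)  # tips with no matching OTU
all_otus_in_tree  <- all(otu_ids %in% tip_labels)

print(same_sets)
print(missing_from_tree)
print(extra_in_tree)
print(all_otus_in_tree)

# If only extra tips are the problem, one option (an assumption, not tested
# against these files) would be to prune the tree before building the object,
# e.g. with ape::keep.tip(TREE, otu_ids).
```

If `missing_from_tree` is non-empty with the real data, the labels themselves differ (for example through quoting or renaming) and pruning alone would not help.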