QIIME2R, silva, and phyloseq

Hey again!
I have been trying out the new SILVA 138 release (SILVA 138 Classifiers). I used both methods to create phyloseq objects for SILVA 132 and for SILVA 138. When I compare the two methods with all.equal(), the outcomes for SILVA 132 match. For 138, however, the outcomes differ in the number of NAs in tax_table. After checking the NAs in the SILVA 138 tax table, I found that the object built with phyloseq() was correct, so something is happening with qiime2R. I am happy with the objects I created with phyloseq(), but I thought it would be good to let you know about this. :slight_smile:

Interesting, could you possibly send me the files or share lines from the files where the import differs?

Hi @jbisanz, this is what I did: Supportcode.txt (4.8 KB). I will also send you the .qza in a message.

@Cotissima: I’m sorry if this is a little off topic.

I’ve been trying to create a phyloseq object with the new SILVA 138 using qiime2R, but I’m stuck.

This is the error message:
df must be a data frame without row names in column_to_rownames().
This is the code that involves the taxonomy:
error_making_phyloseq.txt (3.3 KB)

I have already tried the method in this tutorial:

[Tutorial: Integrating QIIME2 and R for data visualization and analysis using qiime2R]
But I am still stuck.

Could you share the workflow code you use to import the basic QIIME 2 files and successfully build a phyloseq object?

Thank you, and sorry for going off topic.

Hi @didietkeren,
I actually had the same problem as you. I ran the tax_table() command line by line to see what was wrong. I realised that when you use as.data.frame() at the beginning, the structure of your data changes to Classes ‘tbl_df’, ‘tbl’ and ‘data.frame’ rather than just data.frame. I think that is what causes the issue.
It worked once I left out the as.data.frame() step:

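                # these two lines are arguments inside the phyloseq() call
                tax_table(taxtable138 %>%
                             select(-Confidence) %>%
                             column_to_rownames(var = "Feature.ID") %>%
                             as.matrix()),  # tax_table() expects a matrix
                sample_data(metadata %>% 
                             remove_rownames() %>%
                             column_to_rownames(var = "SampleID"))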

Hope this works!
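
As a side note on the error message itself, here is a minimal sketch of one situation that triggers it, assuming a data frame that already carries row names. The toy table and its IDs are made up for illustration, and this may or may not be the exact cause in the files above.

```r
library(tibble)

# Toy taxonomy table with made-up feature IDs; row names are already set
tax <- data.frame(
  Feature.ID = c("f1", "f2"),
  Kingdom    = c("d__Bacteria", "d__Bacteria"),
  row.names  = c("a", "b")
)

has_rownames(tax)  # TRUE, so column_to_rownames() refuses to overwrite them

# Capture the error message instead of stopping the script
err <- tryCatch(
  {
    column_to_rownames(tax, var = "Feature.ID")
    NULL
  },
  error = function(e) conditionMessage(e)
)
print(err)

# Dropping the existing row names first lets the conversion go through
fixed <- column_to_rownames(remove_rownames(tax), var = "Feature.ID")
print(rownames(fixed))
```

This is why the remove_rownames() step before column_to_rownames() in the sample_data() call above is a safe habit.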


I’ve looked into this and both methods appear to give the same import when I adjust the string splitting in the separate() call below.


library(tidyverse); library(qiime2R); library(phyloseq)

# Manual import. `tax_manual` is a reconstructed name: the original assignment was lost from the post.
tax_manual <- read_qza("taxonomy_silva138.qza")$data %>% 
  as_tibble() %>% 
  separate(Taxon, sep="; ", c("Kingdom","Phylum","Class","Order","Family","Genus","Species")) %>% # need to change sep to "; " from ";" to remove leading space in SILVA taxonomy strings
  as.data.frame() %>%
  column_to_rownames("Feature.ID") %>%
  select(-Confidence) # keep only the rank columns so both tables share the same columns

tax_phy<-qza_to_phyloseq(features="merged-table.qza",tree = "rooted-tree.qza", taxonomy="taxonomy_silva138.qza", metadata = "JFFS_metadata_qiime2.tsv") %>% tax_table() %>% as.data.frame()

# Long format (one row per feature and rank), joined for comparison; the join step is reconstructed
merger <- tax_phy %>% as.data.frame() %>% rownames_to_column("SV") %>% gather(-SV, key="Level", value="Phy_Assignment") %>%
  full_join(
    tax_manual %>% as.data.frame() %>% rownames_to_column("SV") %>% gather(-SV, key="Level", value="Manual_Assignment"),
    by = c("SV", "Level"))
merger %>% filter(Phy_Assignment!=Manual_Assignment) # nothing
merger %>% filter(is.na(Phy_Assignment) | is.na(Manual_Assignment)) #nothing with discordant NAs
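
To make the effect of the separator concrete, here is a small base R sketch on a shortened, illustrative taxonomy string. It only shows how the split differs; it is not the code either import method actually runs.

```r
# Shortened SILVA-style taxonomy string, for illustration only
taxon <- "d__Bacteria; p__Firmicutes; c__Clostridia"

# Splitting on ";" alone keeps the space at the start of every later rank
split_semicolon <- strsplit(taxon, ";", fixed = TRUE)[[1]]
print(split_semicolon)

# Splitting on "; " removes it
split_semicolon_space <- strsplit(taxon, "; ", fixed = TRUE)[[1]]
print(split_semicolon_space)

# The leading space is enough to make otherwise identical labels compare unequal
print(split_semicolon[2] == split_semicolon_space[2])
```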

I think the issue you are having is coming from lines like this:

036a90f816e0e135614f344bdf517e5d d__Bacteria; p__Firmicutes; c__Clostridia; o__Oscillospirales; f__Oscillospiraceae; g__NK4A214_group;Ambiguous_taxa

This is something I have not seen before in SILVA and it needs to be accounted for. Could you please look into the taxonomic assignment of this feature in your 132 dataset and/or send me the artifact? I will take a closer look and think about the best way to catch and remove these cases.
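
A quick base R check on that exact string shows what a strict "; " split does with it. This is only a sketch of the string behaviour, not of what either import method runs internally.

```r
# The taxonomy string from the feature quoted above
taxon <- paste0(
  "d__Bacteria; p__Firmicutes; c__Clostridia; o__Oscillospirales; ",
  "f__Oscillospiraceae; g__NK4A214_group;Ambiguous_taxa"
)

pieces <- strsplit(taxon, "; ", fixed = TRUE)[[1]]

# Only six pieces come back for seven ranks ...
print(length(pieces))

# ... because the last separator has no space, so genus and species stay fused
print(pieces[6])
```

With seven rank names, tidyr's separate() would pad the missing seventh piece with NA (and warn by default), so Species becomes NA and Genus carries the fused value. A parser that splits on ";" alone would instead keep "Ambiguous_taxa" as a separate field, which is one plausible way the NA counts can diverge between methods.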

@jbisanz that makes sense. If I understand correctly, the problem is caused by the missing space after the “;” before the species-level field? I have sent you the .qza taxonomy file for SILVA 132.
I am a bit confused now. If the problem is in the taxonomy file itself, why do I only see it in one of the import methods?

Thanks so much for looking into this.

The two methods parse the taxonomy strings slightly differently. I am going to write an additional function for parsing taxonomy strings to help streamline this in the future.
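
For anyone who needs a stopgap in the meantime, here is a rough base R sketch of a more tolerant parser. The function name parse_taxon_sketch is hypothetical and this is not the qiime2R implementation. It assumes that surrounding whitespace can be trimmed and that "Ambiguous_taxa" and empty fields should be treated as missing, which may not suit every analysis.

```r
# Hypothetical helper, not part of qiime2R: split one taxonomy string into ranks
parse_taxon_sketch <- function(taxon,
                               ranks = c("Kingdom", "Phylum", "Class", "Order",
                                         "Family", "Genus", "Species")) {
  # Split on ";" and trim whitespace, so both "; " and ";" separators work
  parts <- trimws(strsplit(taxon, ";", fixed = TRUE)[[1]])
  # Assumption: ambiguous or empty labels count as missing
  parts[parts %in% c("Ambiguous_taxa", "")] <- NA_character_
  # Pad with NA (or truncate) so there is exactly one value per rank
  length(parts) <- length(ranks)
  setNames(parts, ranks)
}

mixed <- paste0(
  "d__Bacteria; p__Firmicutes; c__Clostridia; o__Oscillospirales; ",
  "f__Oscillospiraceae; g__NK4A214_group;Ambiguous_taxa"
)
parsed <- parse_taxon_sketch(mixed)
print(parsed)
```

On the string quoted earlier in the thread, this keeps the genus clean and leaves Species as NA regardless of whether the final separator has a space.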