Silva Taxonomic Clean Up in qiime2r

mradz · March 6, 2020, 9:59am

Hi Jordan, I have also used SILVA and am having issues with the way the taxonomy is structured. How can I edit the imported taxonomy so that each row shows only the genus, with everything else removed from the taxon string?

For example,

Change:

D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Pasteurellales;D_4__Pasteurellaceae;D_5__Haemophilus;D_6__Pasteurellaceae bacterium canine oral taxon 272

To:

Haemophilus

SoilRotifer · March 6, 2020, 3:12pm

Hi @mradz,

One solution to the problem is outlined here. A slightly different solution is outlined below:

library(qiime2R)   # qza_to_phyloseq()
library(phyloseq)  # tax_table()
library(tidyr)     # separate() and the %>% pipe

phy <- qza_to_phyloseq(features="feature-table.qza", tree="tree-mproot.qza", taxonomy="taxonomy.qza", metadata="mapping.txt")

tax <- as.data.frame(as(tax_table(phy), "matrix"))
ntax <- tax %>% separate("Kingdom",
    c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"), sep=";")
tax_table(phy) <- tax_table(as.matrix(ntax))

Or, you can try using one of the newly formatted SILVA (v138) references for your taxonomy. These import w/o issue using qiime2R (I use it quite often).

-Mike

mradz · March 9, 2020, 3:44am

Hi Mike,

Thanks for that. When running the following lines:

ntax <- tax %>% separate(“Kingdom”,
c(“Kingdom”, “Phylum”, “Class”, “Order”, “Family”, “Genus”, “Species”), sep=";")

I get the following error:

Error: unexpected input in "ntax <- tax %>% separate(�"

SoilRotifer · March 9, 2020, 1:24pm

Sorry, I forgot the gsub part. Hopefully, this will work:

tax <- as.data.frame(as(tax_table(phy), "matrix"))
tax$Kingdom <- gsub("D_\\d+__", "", tax$Kingdom)  # strip the D_<n>__ rank prefixes; the backslash must be doubled inside an R string
ntax <- tax %>% separate("Kingdom", c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"), sep=";")
tax_table(phy) <- tax_table(as.matrix(ntax))

-Mike
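
For anyone who only needs the genus, as in the original question, the prefix-stripping and splitting steps above can also be sketched in base R on plain strings, without a phyloseq object. This is a minimal illustration, not the method from the thread: the helper name `silva_genus` and the second example row are hypothetical, and it assumes the genus is always the sixth semicolon-separated rank.

```r
# Taxonomy strings in the older SILVA style. The first is the example from
# the question; the second is a hypothetical row that stops before genus.
silva <- c(
  "D_0__Bacteria;D_1__Proteobacteria;D_2__Gammaproteobacteria;D_3__Pasteurellales;D_4__Pasteurellaceae;D_5__Haemophilus;D_6__Pasteurellaceae bacterium canine oral taxon 272",
  "D_0__Bacteria;D_1__Firmicutes"
)

# Hypothetical helper: return the genus (6th rank), or NA when the string
# has fewer than six ranks.
silva_genus <- function(x) {
  cleaned <- gsub("D_[0-9]+__", "", x)            # drop the D_<n>__ prefixes
  ranks <- strsplit(cleaned, ";", fixed = TRUE)   # one character vector per row
  vapply(
    ranks,
    function(r) if (length(r) >= 6) r[6] else NA_character_,
    character(1)
  )
}

genus <- silva_genus(silva)
print(genus)  # first element is "Haemophilus", second is NA
```

If pasted code fails with "unexpected input", as in the post above, it is worth checking that the quotation marks are plain ASCII `"`; R does not accept curly quotes as string delimiters.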