How to remove taxonomic rank prefixes in R?

Hey Jordan (and all),

First, thank you @jbisanz for the tutorial; as @Negin said, it is very useful.

This is not a direct reply to the previous discussion, but I have a question about qiime2R and phyloseq, so I’ll just post it here.

I managed to import data from qiime2 into R and phyloseq without any major problems (if my lack of experience doesn’t count as a problem :wink: ).

I have been able to run different statistical tests and make taxonomic bar plots, but I still have the taxonomic rank prefixes (k__, p__, etc.) in front of my taxon names.

In phyloseq there is a function to parse these out if you are using qiime txt files and not qza files. If I understood correctly, it is applied during the import step in phyloseq (parseFunction=parse_taxonomy_greengenes).

Simple question: how do I get rid of the prefixes?

As I said, I’m not so comfortable with R yet, so there is probably a simple explanation for how this can be done. Please correct me if I’m mixing things up!


In the future, please open a new topic.

The easiest way to rid yourself of these forever is to remove them from the reference taxonomy files and then train your own taxonomy classifier with the modified files. You can do this in all sorts of ways. If you are uncomfortable with R, bash, Python, or other programming languages, you can even just open the file in a text editor and find/replace all.
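
If you would rather script the find/replace, here is a minimal Python sketch. It assumes a Greengenes-style taxonomy file where each line is a feature ID, a tab, and a taxonomy string such as `k__Bacteria; p__Firmicutes; ...`. The function names and the set of prefix letters (d, k, p, c, o, f, g, s) are my own choices for illustration, so adjust them to whatever your reference file actually uses.

```python
import re

# Assumption: rank prefixes look like "k__", "p__", ... and appear either at
# the start of the taxonomy string or right after a ";" separator.
RANK_PREFIX = re.compile(r"(^|;\s*)[dkpcofgs]__")


def strip_rank_prefixes(taxonomy):
    """Remove rank prefixes, keeping the separators between ranks."""
    return RANK_PREFIX.sub(r"\1", taxonomy)


def clean_taxonomy_lines(lines):
    """Clean the taxonomy column of tab-separated 'id<TAB>taxonomy' lines."""
    cleaned = []
    for line in lines:
        line = line.rstrip("\n")
        feature_id, sep, taxonomy = line.partition("\t")
        if not sep:
            # No tab on this line (e.g. a blank line): leave it untouched.
            cleaned.append(line)
        else:
            cleaned.append(feature_id + "\t" + strip_rank_prefixes(taxonomy))
    return cleaned


# Made-up example line, not taken from a real reference file.
example = ["seq1\tk__Bacteria; p__Firmicutes; g__Bacillus; s__\n"]
print(clean_taxonomy_lines(example)[0])
```

Note that unassigned ranks (a bare `s__`) end up as empty fields, so you may want to decide how to handle those before training the classifier.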

Thank you for your help!

Sorry for the inconvenient way of posting!
Next time I’ll open a new topic instead of commenting on old ones!