How to remove the D_0__ prefix from taxon names (D_0__Bacteria) in a biom file created with QIIME 2

Hello
After running I created a biom file and converted it to a TSV file to build a Krona chart. On the chart, the prefixes D_0__, D_1__, D_2__, D_3__ and those of the other levels appear (“D_0__Bacteria;D_1__Cyanobacteria;D_2__Oxyphotobacteria;D_3__Synechococcales;D_4__Cyanobiaceae;D_5__Synechococcus”). I want to get rid of these D_ prefixes in front of the organism names, but I don’t know how to do it. I would be glad of any help.
best wishes
arne

Hi @arne,

Redo taxonomy option
One option is to redo the taxonomy assignment using these prototype SILVA 138 classifiers:

Note, you may have to retrain the classifiers yourself, using the raw files provided.

There will be more news on the SILVA taxonomy soon. So, keep your :eyes: peeled!

Find / replace option
Otherwise, you can use any text editor that supports regular expressions to find D_[0-9][0-9]*__ and replace it with nothing (an empty string).

Alternatively, if you do not mind the unix / linux command line, you can use sed as follows:
sed 's/D_[0-9][0-9]*__//g' < input_taxonomy.txt > output_fixed_taxonomy.txt

I am sure others will double-check my sed command. But that should work.
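
As a quick sanity check, here is a minimal sketch that runs the same sed expression on the example lineage from the question. The function name strip_rank_prefixes is just a hypothetical wrapper for illustration; it is not part of QIIME 2 or Krona.

```shell
# Hypothetical helper: wraps the sed expression from this post.
# It reads from stdin and deletes every "D_<digits>__" rank prefix.
strip_rank_prefixes() {
  sed 's/D_[0-9][0-9]*__//g'
}

# Sample lineage copied from the question.
lineage='D_0__Bacteria;D_1__Cyanobacteria;D_2__Oxyphotobacteria;D_3__Synechococcales;D_4__Cyanobiaceae;D_5__Synechococcus'

cleaned=$(printf '%s\n' "$lineage" | strip_rank_prefixes)
printf '%s\n' "$cleaned"
# prints: Bacteria;Cyanobacteria;Oxyphotobacteria;Synechococcales;Cyanobiaceae;Synechococcus
```

Keep in mind that the pattern removes every D_<digits>__ match anywhere in the file, not only in the taxonomy column, so it is worth comparing the input and output files before building the chart.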

-Best wishes!
-Mike


I just wanted to refer the folks in this thread over to the tutorial linked below.

-Best wishes!
-Mike

