How to filter or reassign taxonomic names in QIIME 2?

I am fairly new to Galaxy and have been trying to find my way around it. I am working with the V3-V4 region of the microbiome of certain samples. After generating the taxa bar plot, the results clearly indicate contamination with E. coli. I have tried many ways to filter out the chloroplasts, mitochondria and E. coli, and I get one of two outcomes: either they are not filtered at all, or they are filtered but I lose the names and only have IDs. I trained my own classifier and have tried re-training it after filtering the sequences, but that has not been possible. Is it possible to reassign the taxa IDs in my QIIME taxa bar plots to names on the server? What other ways can I filter the sequences so that it doesn't affect the alpha and beta diversity calculations?

Hi @Ivyrose_Mnialoh,

Can you please provide all the commands you have been running that result in the outcomes you are observing? You can attach the QZV files of your barplots too, this way we can look through your provenance.

Galaxy326-[qiime2 taxa barplot on data 309_ visualization.qzv] (4).qzv (1.3 MB)

Attached is the QZV file after filtering and losing my taxonomic names. Is there a simple way to reassign the names? I am also using this on the Galaxy platform. https://usegalaxy.eu/api/datasets/4838ba20a6d86765d65eb89cb7ebdd20/display?to_ext=qza is the link to my command line, though I cannot share all of it, or maybe I don't know how to show it here.

Hi @Ivyrose_Mnialoh,

Based on the provenance information, I think the issue is an incorrect qiime taxa filter-table command. That is, you have the general form:

qiime taxa filter-table \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-mode contains \
  --p-exclude 'd__Bacteria;p__Cyanobacteria;c__Cyanobacteriia;o__Chloroplast;f__Chloroplast;g__Chloroplast;__d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Escherichia-Shigella d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rickettsiales;f__Mitochondria;g__Mitochondria' \
  --o-filtered-table filtered-table.qza

You need a , to delimit the taxonomic strings you'd like to search for and filter. Also, I noticed that the search string contains a stray __ before d__ in: ...Chloroplast;__d__Bacteria...

I added , delimiters between the taxonomy strings and removed the stray __ noted above, so your new command should be:

qiime taxa filter-table \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-mode contains \
  --p-exclude 'd__Bacteria;p__Cyanobacteria;c__Cyanobacteriia;o__Chloroplast;f__Chloroplast;g__Chloroplast,d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Escherichia-Shigella,d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rickettsiales;f__Mitochondria;g__Mitochondria' \
  --o-filtered-table filtered-table.qza

Note the , between the taxa strings. A good minimal example is provided here.

@Ivyrose_Mnialoh

I should also note that you do not need to use the entire taxonomy string when using --p-mode contains. You can condense it as follows:

qiime taxa filter-table \
  --i-table table.qza \
  --i-taxonomy taxonomy.qza \
  --p-mode contains \
  --p-exclude 'Chloroplast,Escherichia-Shigella,Mitochondria' \
  --o-filtered-table filtered-table.qza

Thus, any taxonomy string that contains one of the items in your list will be removed.

You can use whatever substring of the overall taxonomy you'd like to remove. For example, you could also do:

...
--p-exclude 'o__Chloroplast,g__Escherichia-Shigella,f__Mitochondria'
...
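If it helps to see why the delimiter matters, here is a rough Python sketch of how comma-delimited terms behave in contains mode. It is only a simplified model and not QIIME 2's actual implementation: the function name excluded_ids, the feature IDs asv1 to asv4, and the plain case-sensitive substring matching are all assumptions made for illustration.

```python
# Simplified model of `--p-mode contains` with `--p-exclude`; NOT QIIME 2's code.
# Assumptions: plain case-sensitive substring matching, "," as the delimiter.

def excluded_ids(taxonomy, exclude, delimiter=","):
    """Return the feature IDs whose taxon string contains any search term."""
    terms = [t for t in exclude.split(delimiter) if t]
    return {fid for fid, taxon in taxonomy.items()
            if any(term in taxon for term in terms)}

# Hypothetical feature IDs paired with the taxon strings from this thread.
taxonomy = {
    "asv1": "d__Bacteria;p__Cyanobacteria;c__Cyanobacteriia;o__Chloroplast;f__Chloroplast;g__Chloroplast",
    "asv2": "d__Bacteria;p__Proteobacteria;c__Gammaproteobacteria;o__Enterobacterales;f__Enterobacteriaceae;g__Escherichia-Shigella",
    "asv3": "d__Bacteria;p__Proteobacteria;c__Alphaproteobacteria;o__Rickettsiales;f__Mitochondria;g__Mitochondria",
    "asv4": "d__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Lactobacillus",
}

# Without commas, the whole value is one long term that no single taxon contains.
undelimited = taxonomy["asv1"] + ";__" + taxonomy["asv2"] + " " + taxonomy["asv3"]
no_comma = excluded_ids(taxonomy, undelimited)

# With commas, each short term is searched for independently.
with_commas = excluded_ids(taxonomy, "Chloroplast,Escherichia-Shigella,Mitochondria")
rank_prefixed = excluded_ids(
    taxonomy, "o__Chloroplast,g__Escherichia-Shigella,f__Mitochondria")

print(sorted(no_comma))
print(sorted(with_commas))
print(sorted(rank_prefixed))
```

In this toy model the undelimited string matches nothing, which would be consistent with the "not filtered at all" outcome, while both comma-delimited forms flag the chloroplast, Escherichia-Shigella and mitochondria features and leave the fourth one alone.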
