--p-mode exact unable to remove specific taxa in qiime taxa filter-table

I am analyzing a dataset using qiime2-2020.2 where I need to remove all sequences under the domain Eukaryota and the phylum Cyanobacteria. I successfully removed these taxa with --p-mode contains using the following command:

qiime taxa filter-table \
--i-table table.qza \
--i-taxonomy taxonomy.qza \
--p-mode contains \
--p-exclude "D_0__Eukaryota","D_0__Bacteria;D_1__Cyanobacteria" \
--o-filtered-table filtered_table_no_eukaryota_cyanobacteria.qza

Next, I want to remove one specific taxon, using the output of the first filtering step as the input for the second. Here is the command I tried:

qiime taxa filter-table \
--i-table filtered_table_no_eukaryota_cyanobacteria.qza \
--i-taxonomy taxonomy.qza \
--p-mode exact \
--p-exclude "D_0__Bacteria;D_1__Proteobacteria;D_2__Alphaproteobacteria;D_3__Sphingomonadales;D_4__Sphingomonadaceae;D_5__Sphingomonas;__" \
--o-filtered-table contaminants_removed.qza

When I check to see if this taxon has been removed, it still remains in my dataset.

How can I remove this specific taxon?

Thanks,

Ryan

Hello Ryan,

As the setting --p-mode exact implies, this filter is looking for taxonomy annotations that exactly match the full string. Taxa names that are slightly different will not be removed.
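To make the difference between the two modes concrete, here is a rough analogy in plain shell. It uses grep on a small, made-up list of annotations (the file name and its contents are hypothetical, not Ryan's data) and does not reproduce how QIIME 2 implements the filter; it only shows whole-string versus substring matching.

```shell
# Hypothetical taxonomy annotations, one per line (not the real dataset).
cat > example_taxa.txt <<'EOF'
D_0__Bacteria;D_1__Proteobacteria;D_2__Alphaproteobacteria;D_3__Sphingomonadales;D_4__Sphingomonadaceae;D_5__Sphingomonas
D_0__Bacteria;D_1__Proteobacteria;D_2__Alphaproteobacteria;D_3__Sphingomonadales;D_4__Sphingomonadaceae;D_5__Novosphingobium
D_0__Bacteria;D_1__Firmicutes
EOF

# A family-level query: shorter than any full annotation in the file.
query='D_0__Bacteria;D_1__Proteobacteria;D_2__Alphaproteobacteria;D_3__Sphingomonadales;D_4__Sphingomonadaceae'

# "exact"-style match: the whole annotation must equal the query (-x).
exact_hits=$(grep -c -F -x -- "$query" example_taxa.txt || true)

# "contains"-style match: the query only has to appear somewhere in the annotation.
contains_hits=$(grep -c -F -- "$query" example_taxa.txt || true)

echo "exact: $exact_hits  contains: $contains_hits"
```

The whole-string match finds nothing because no annotation is identical to the query, while the substring match finds both Sphingomonadaceae lines.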

"When I check to see if this taxon has been removed, it still remains in my dataset."

Would you be willing to post the taxonomy that you still see? Then we can compare it to the string you posted and look for differences! 🕵️ 🔎

Colin

Hi Colin,

Thanks for the reply. Here is the taxon I am trying to remove.

That sure looks like what you entered. Strange! 🤔

You could try --p-mode contains --p-exclude "D_5__Sphingomonas", which should also remove that taxon (and has fewer letters to check for typos).

Does anyone else have a clue why Ryan’s text string did not match? It’s valid unicode and there are no issues with fancy quotes or strange characters.

If you look in the taxonomy assignments you will not find this string, that is why your "exact" mode is failing to match.
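One way to look at the annotations as they are actually stored is to export the taxonomy artifact and search the resulting table. This is a sketch that assumes the export produces exported_taxonomy/taxonomy.tsv with the annotation in the second ("Taxon") column; the helper name list_taxa and the output directory name are made up for illustration.

```shell
# Print the distinct annotations in a taxonomy TSV that contain a substring.
# Assumes the annotation is in column 2; adjust if your file differs.
list_taxa() {
  cut -f 2 "$1" | grep -F -- "$2" | sort -u
}

# Runs only when QIIME 2 is active and taxonomy.qza is in the current directory.
if command -v qiime >/dev/null 2>&1 && [ -f taxonomy.qza ]; then
  qiime tools export \
    --input-path taxonomy.qza \
    --output-path exported_taxonomy
  list_taxa exported_taxonomy/taxonomy.tsv "D_5__Sphingomonas"
fi
```

Whatever this prints is the text that --p-mode exact has to match character for character.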

The ";__" at the end is tacked on in the barplot visualization that you shared, because you are specifying that you want the level 7 taxonomy to be displayed. That specific annotation has no 7th level, so an empty annotation is added to the end.

This is the exact mode string that you want to filter:

D_0__Bacteria;D_1__Proteobacteria;D_2__Alphaproteobacteria;D_3__Sphingomonadales;D_4__Sphingomonadaceae;D_5__Sphingomonas
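Putting this together with the command from the original post, a sketch of the second filtering step could look like the following. It starts from the label as copied from the barplot and strips any trailing ";__" padding before passing it to --p-exclude. The variable names are illustrative, and the stripping assumes that a trailing ";__" is only barplot padding, which matches this case but may not hold for every taxonomy format.

```shell
# Label as copied from the level-7 barplot, including the padded empty level.
barplot_label='D_0__Bacteria;D_1__Proteobacteria;D_2__Alphaproteobacteria;D_3__Sphingomonadales;D_4__Sphingomonadaceae;D_5__Sphingomonas;__'

# Remove every trailing ";__" to recover the annotation without padding.
taxon="$barplot_label"
while [ "${taxon%;__}" != "$taxon" ]; do
  taxon="${taxon%;__}"
done
echo "$taxon"

# Runs only when QIIME 2 is active and both input artifacts are present.
if command -v qiime >/dev/null 2>&1 \
   && [ -f filtered_table_no_eukaryota_cyanobacteria.qza ] \
   && [ -f taxonomy.qza ]; then
  qiime taxa filter-table \
    --i-table filtered_table_no_eukaryota_cyanobacteria.qza \
    --i-taxonomy taxonomy.qza \
    --p-mode exact \
    --p-exclude "$taxon" \
    --o-filtered-table contaminants_removed.qza
fi
```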

Good luck!

Hi Nicholas,

That string worked! Thank you for your input.

Ryan
