Question about filtering data

I find that if I want to get the taxonomy result without mitochondria, I need to use ‘qiime taxa filter-seqs’ to filter the sequences and then use ‘qiime feature-classifier classify-sklearn’ to classify the remaining sequences again. Can’t we just filter the taxonomy.qza directly and get a file named taxonomy_without_mitochondria.qza? I don’t want to run the taxonomy classification again because it needs a lot of RAM…

Hi @laoshiren,

Have you had a chance to look through the filtering tutorial? This section of it has exactly what you are looking for; in fact, the example used there also filters out mitochondria.
Have a look there first and let us know if you run into any problems.


Of course I have seen this before. But I think you didn’t understand what I said. The filtering tutorial can only filter sequences and feature tables, not the taxonomy result (taxonomy.qza), right? Of course, the filtered sequences don’t contain mitochondria anymore, but the taxonomy.qza still does… It can only give us the filtered sequences, but I want the filtered taxonomy… If I want to get a taxonomy.qza without mitochondria, I need to classify the filtered sequences again. I mean, can’t we just filter the taxonomy.qza at the same time as we filter the sequences and feature tables? I didn’t see an option named ‘--o-filtered-taxonomy’ or something like that… I don’t know if I am being clear…


Hi @laoshiren,

Gotcha! Sorry, I must not have read your original question carefully.
The short answer is no: there is currently no way of filtering the FeatureData[Taxonomy], though an issue has been raised for future implementation. I’m guessing it is not a development priority, though, because extra taxa in those artifacts don’t affect downstream use. For example, if you are assigning taxonomy to your feature table, the extra taxa would simply not be used and no error would be raised.
May I ask why you need those filtered from your taxonomy artifact? You can always filter these outside of QIIME 2 if you really need to; the post here has some suggestions as far as doing this.
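For anyone who does want to filter outside of QIIME 2, here is a minimal sketch in plain Python. It assumes the taxonomy has already been exported to a tab-separated file with `Feature ID`, `Taxon` and `Confidence` columns; the function name `filter_taxonomy` and the example rows are made up for illustration and are not part of QIIME 2.

```python
import csv
import io

# Terms to drop; matched case-insensitively against the Taxon column.
EXCLUDE = ("mitochondria", "chloroplast")


def filter_taxonomy(src, dst, exclude=EXCLUDE):
    """Copy a taxonomy TSV from src to dst, skipping rows whose Taxon matches.

    Hypothetical helper (not a QIIME 2 function). Returns the number of
    rows that were kept.
    """
    reader = csv.DictReader(src, delimiter="\t")
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        taxon = row["Taxon"].lower()
        if any(term in taxon for term in exclude):
            continue  # drop mitochondria / chloroplast assignments
        writer.writerow(row)
        kept += 1
    return kept


# Tiny made-up table standing in for an exported taxonomy file.
example = (
    "Feature ID\tTaxon\tConfidence\n"
    "f1\tk__Bacteria; p__Firmicutes\t0.99\n"
    "f2\tk__Bacteria; p__Proteobacteria; f__mitochondria\t0.98\n"
    "f3\tk__Bacteria; p__Cyanobacteria; c__Chloroplast\t0.97\n"
)
out = io.StringIO()
n_kept = filter_taxonomy(io.StringIO(example), out)
print(n_kept)
print(out.getvalue())
```

With real data you would pass open file handles instead of the `io.StringIO` objects. The substring match is deliberately simple, so check that the terms fit the labels used by your reference database.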


@laoshiren another possibility for filtering out these sequences before taxonomy classification is to use exclude-seqs to filter out any sequences that share at least some % similarity with a database of mitochondrial sequences.


Thank you very much. I just want to see the structure of the bacterial community: I want to know the percentage of each taxon, excluding mitochondria and chloroplast, since those two are not bacteria. I know QIIME 1 can do this kind of thing with ‘filter_taxa_from_otu_table.py’ and ‘summarize_taxa_through_plots.py’. But still, thank you a lot.


filter_taxa_from_otu_table.py does not filter taxa from the taxonomy assignment data or from sequences. qiime taxa filter-table is the exact QIIME 2 analog of filter_taxa_from_otu_table.py.

summarize_taxa_through_plots.py does not perform any filtering; it collapses by taxonomy and generates barplots. See this command for the exact QIIME 2 analog.

There is really no need to reclassify the filtered sequences, since having extra features in the taxonomy.qza artifact does not impact anything downstream in QIIME 2. QIIME 2 will always just operate on the features found in the feature table. Just classify once, filter mitochondria and chloroplast from your feature table, and that is all you need to do.

That is why there is no filter-taxonomy command: it would be useless in QIIME 2.
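To illustrate the point about extra taxonomy entries being harmless, here is a small sketch in plain Python. It is not how QIIME 2 is implemented; the dictionaries, feature IDs and the helper names `filter_table` and `percent_by_taxon` are all hypothetical. It only mimics the idea of filtering the feature table by taxonomy and then summarizing percentages per taxon.

```python
from collections import defaultdict

# Hypothetical taxonomy assignments; f4 and f5 are "extra" entries that
# do not appear in the feature table below.
taxonomy = {
    "f1": "k__Bacteria; p__Firmicutes",
    "f2": "k__Bacteria; p__Proteobacteria; f__mitochondria",
    "f3": "k__Bacteria; p__Bacteroidetes",
    "f4": "k__Bacteria; p__Cyanobacteria; c__Chloroplast",
    "f5": "k__Bacteria; p__Actinobacteria",
}

# Hypothetical feature table for one sample: feature ID -> read count.
table = {"f1": 60, "f2": 25, "f3": 20}

EXCLUDE = ("mitochondria", "chloroplast")


def filter_table(table, taxonomy, exclude=EXCLUDE):
    """Keep only features whose taxon contains none of the excluded terms."""
    return {
        fid: count
        for fid, count in table.items()
        if not any(term in taxonomy[fid].lower() for term in exclude)
    }


def percent_by_taxon(table, taxonomy):
    """Sum counts per taxon and convert them to percentages of the total."""
    totals = defaultdict(int)
    for fid, count in table.items():
        totals[taxonomy[fid]] += count
    grand_total = sum(totals.values())
    return {taxon: 100 * n / grand_total for taxon, n in totals.items()}


filtered = filter_table(table, taxonomy)
percentages = percent_by_taxon(filtered, taxonomy)
for taxon, pct in percentages.items():
    print(f"{taxon}: {pct:.1f}%")
```

The `taxonomy` dictionary is never modified and still holds the mitochondria, chloroplast and unused entries, but only the feature IDs present in the filtered table are ever looked up, so those extra entries have no effect on the percentages.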

