Hello bioinformatic folks!
I’m having an issue in my QIIME 2 pipeline that you might be able to help with. I have a couple of ASVs that were assigned as unknown Streptophyta, but when I BLASTed the sequences the hits were fungal. I would like to remove them, but I can’t get the command quite right. I can remove other ASVs that were Chlorophyta or misidentified fungi, but not the unknown Streptophyta ones. Here is a screenshot of how the taxonomy is worded:
Here is the first version of the code I used:
>qiime taxa filter-table --i-table reduced-table.qza --i-taxonomy taxonomy.qza --p-exclude unknown,Unknown,unassigned,Unassigned,Chlorophyta,Cycadales,Streptophyta;NA;__;__;__;__,Streptophyta;__;__;__;__;__ --o-filtered-table filtered-table.qza
When I ran it, it treated the semicolons like space between commands and gave me the errors: “NA command is unfound” and “__ command is unfound”. I have tried several other versions of the command where I put single quotes around parts of it or backslashes before the semicolons. That lets the command run and removes the Chlorophyta and Cycadales, but it leaves the unknown Streptophyta.
I am running qiime2-2024.10.1 that was installed with conda.
I can’t seem to find a forum topic that matches my problem.
Any help is much appreciated!
Hello @bryan.fisher1,
> When I ran it, it treated the semicolons like space between commands and gave me the errors: “NA command is unfound” and “__ command is unfound”.
I think this is simply because your shell is trying to interpret your taxonomic string as a series of commands. I would put them in quotes "" and try again.
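To make the quoting point concrete, here is a minimal sketch that runs without QIIME 2. The `show_args` helper is hypothetical and only prints what the shell passes along; the taxonomy string is the one from the question. With quotes, the semicolons stay inside a single argument instead of ending the command.

```shell
# Hypothetical helper: print how many arguments arrived, then each one in brackets.
show_args() {
  printf '%d argument(s)\n' "$#"
  printf '  [%s]\n' "$@"
}

# Double quotes: the semicolons are literal characters inside one argument.
show_args --p-exclude "Chlorophyta,Streptophyta;NA;__;__;__;__"

# Single quotes behave the same way for this string.
show_args --p-exclude 'Chlorophyta,Streptophyta;NA;__;__;__;__'
```

Without the quotes, the shell ends the command at the first `;` and then tries to run `NA` and `__` as separate commands, which matches the errors reported above.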
Hi @bryan.fisher1,
I'd like to point out that the taxonomy strings as presented in the visualizer are not actually how they are stored in the taxonomy.qza file. That is, the taxonomy string is padded with ;__ as you unveil more taxonomic ranks within the visualizer.
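One way to see the stored strings is to export the taxonomy artifact and look at the table directly. This is a sketch under assumptions: the output directory name `exported-taxonomy` and the `list_taxa` helper are made up for illustration, and the second column of the exported `taxonomy.tsv` is assumed to be the Taxon column.

```shell
# Hypothetical helper: list the distinct taxonomy strings in a TSV that mention a term.
# $1 = search term, $2 = path to the TSV; column 2 is assumed to hold the taxon string.
list_taxa() {
  cut -f 2 "$2" | grep -i -- "$1" | sort -u
}

# Only attempt the export where QIIME 2 and the artifact are actually available.
if command -v qiime >/dev/null 2>&1 && [ -f taxonomy.qza ]; then
  qiime tools export --input-path taxonomy.qza --output-path exported-taxonomy
  list_taxa Streptophyta exported-taxonomy/taxonomy.tsv
fi
```

Whatever this prints is the exact text that the filter terms are compared against.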
I am not sure if this is intended to be a microbial survey, or not. If it is, then you'll want to remove Streptophyta too, as those are plants.
So to extend the prior suggestion you should be able to run:
...
--p-query-delimiter ',' \
--p-exclude "unknown,Unknown,unassigned,Unassigned,Chlorophyta,Cycadales,Streptophyta,Chloroplast,Mitochondria" \
--p-mode contains \
...
Hi Mike,
I am looking at plant ITS2.
To filter it, I split the filtering into two commands, one using the default “contains” mode and one using “exact” mode.
qiime taxa filter-table \
  --i-table sample-filtered-table.qza \
  --i-taxonomy taxonomy.qza \
  --p-exclude unknown,Unknown,unassigned,Unassigned,Chlorophyta,Cycadales \
  --o-filtered-table filtered-table1.qza
qiime taxa filter-table \
  --i-table filtered-table1.qza \
  --i-taxonomy taxonomy.qza \
  --p-mode exact \
  --p-exclude "Streptophyta;NA,Streptophyta" \
  --o-filtered-table filtered-table2.qza
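For anyone reading later, here is a rough illustration of why the two modes behave differently. It uses `grep` as a stand-in and is not how QIIME 2 does the matching internally (case handling, for example, is ignored here). The taxonomy strings and the two helper functions are made up for the example.

```shell
# Hypothetical taxonomy strings, one per line (not taken from the real taxonomy.qza).
taxa='Streptophyta
Streptophyta;NA
Streptophyta;Poales;Poaceae
Chlorophyta;Chlorophyceae'

# "contains"-style match: any string that has the term as a substring.
contains_match() { printf '%s\n' "$taxa" | grep -F -- "$1"; }

# "exact"-style match: only strings identical to the whole term.
exact_match() { printf '%s\n' "$taxa" | grep -F -x -- "$1"; }

echo "contains Streptophyta:"; contains_match 'Streptophyta'
echo "exact Streptophyta:";    exact_match 'Streptophyta'
echo "exact Streptophyta;NA:"; exact_match 'Streptophyta;NA'
```

In this toy data the contains-style match hits every Streptophyta entry, including the one resolved down to Poaceae, while the exact-style matches hit only the unresolved entries. That is the reason for keeping Streptophyta out of the first command and handling it in exact mode.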

Okay, that is what I was going to suggest if this was plant and not microbial data.
So, I assume you got things working now?