I’m at the point in my pipeline where I assign taxonomy to my ITS sequences and then filter them. I used a UNITE classifier generously provided by Colin Brislawn, and would now like to keep only the sequences assigned to k__Fungi with a confidence of at least 90%. I know how to filter based on k__Fungi, but cannot find the syntax to filter based on the confidence of the assignment. Is there a way to do this after classifying, using “qiime taxa filter-table” or “qiime taxa filter-seqs”? Or do I need to train my own UNITE classifier and specify “--p-confidence 0.9” in the training?
In case it’s helpful to know: I’m working with soil samples, so we expect to find a number of non-fungal organisms; I’m mainly working on my university’s cluster system (which doesn’t have RESCRIPt in its QIIME 2 installation); and filtering to assignments with 90% confidence will still keep 5,700 assigned sequences out of 8,800 (~65% of the total).
(There is also a similar threshold when building the database, but it relates to something different.)
There should be a confidence column in the taxonomy.qza files you have already made, so filtering afterward should work. But I'm not sure how best to do that from within QIIME 2.
Here is the .tsv file inside a taxonomy.qza file, and you can see the confidence column right there!
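If filtering from within QIIME 2 turns out to be awkward, one option is to do it on the exported .tsv directly. Below is a minimal sketch in plain Python, assuming the taxonomy has been exported (for example with "qiime tools export") to a tab-separated file whose columns are named "Feature ID", "Taxon" and "Confidence". The function name confident_fungi and the four example rows are made up for illustration; they are not from your data.

```python
import csv
import io


def confident_fungi(tsv_handle, kingdom="k__Fungi", min_confidence=0.9):
    """Return feature IDs assigned to `kingdom` with Confidence >= min_confidence.

    Assumes a tab-separated file with the columns
    'Feature ID', 'Taxon' and 'Confidence' (an assumption about the export).
    """
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    kept = []
    for row in reader:
        # Skip an optional '#q2:types' style directive row, if one is present.
        if row["Feature ID"].startswith("#"):
            continue
        is_kingdom = row["Taxon"].startswith(kingdom)
        is_confident = float(row["Confidence"]) >= min_confidence
        if is_kingdom and is_confident:
            kept.append(row["Feature ID"])
    return kept


# Made-up example rows, only to show the expected file layout.
EXAMPLE_TSV = (
    "Feature ID\tTaxon\tConfidence\n"
    "seq1\tk__Fungi;p__Ascomycota\t0.98\n"
    "seq2\tk__Fungi;p__Basidiomycota\t0.72\n"
    "seq3\tk__Viridiplantae\t0.99\n"
    "seq4\tk__Fungi\t0.90\n"
)

kept_ids = confident_fungi(io.StringIO(EXAMPLE_TSV))
print(kept_ids)
```

With a real file you would pass open("taxonomy.tsv", newline="") instead of the in-memory example. The kept IDs could then be written to a one-column file with a "Feature ID" header and, I believe, supplied as the metadata file to "qiime feature-table filter-features" or "qiime feature-table filter-seqs" to keep only those features, though I have not tested that on your cluster's version.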