Feature-classifier classify-consensus-blast should have a cutoff for query cover

Without such a cutoff, BLAST can select short (~20 bp) alignments with high percent identity rather than leaving the feature Unassigned.
For example, here are a taxonomy.qzv and a rep_seqs.qzv:
https://drive.google.com/open?id=11jvC1flUSAgLOe7rSTIP_qw9MgKOHYOE
Feature 10e2e7ff15ba43c2bf6af8f038d608d3 is annotated as Agalychnis saltator; however, a web BLAST shows that every hit has less than 15% query coverage but 95%+ identity.

There should be a --p-query-coverage option that behaves similarly to --p-perc-identity.
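
To illustrate the kind of filter being requested, here is a minimal Python sketch. It is not code from q2-feature-classifier: the `BlastHit` class, the `query_coverage` and `filter_hits` functions, and the 0.8 thresholds are all hypothetical and chosen only for illustration. It also assumes coverage is computed per alignment from the query start and end coordinates, whereas web BLAST reports query cover merged across all alignments to a subject.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """One BLAST alignment (hypothetical container, not a QIIME 2 type)."""
    subject_id: str
    perc_identity: float  # fraction in [0, 1]
    query_start: int      # 1-based, inclusive
    query_end: int        # 1-based, inclusive
    query_length: int     # full length of the query sequence


def query_coverage(hit: BlastHit) -> float:
    """Fraction of the query sequence spanned by this alignment."""
    aligned = abs(hit.query_end - hit.query_start) + 1
    return aligned / hit.query_length


def filter_hits(hits, min_identity=0.8, min_coverage=0.8):
    """Keep hits that pass both the identity and the coverage cutoff.

    The default thresholds are illustrative assumptions.
    """
    return [
        h for h in hits
        if h.perc_identity >= min_identity
        and query_coverage(h) >= min_coverage
    ]


hits = [
    # A ~20 bp alignment on a 150 bp query: high identity, low coverage.
    BlastHit("short_hit", 0.95, 1, 20, 150),
    # A full-length alignment: high identity and full coverage.
    BlastHit("full_hit", 0.97, 1, 150, 150),
]

kept = filter_hits(hits)
print([h.subject_id for h in kept])
```

With an identity cutoff alone, both hits would pass; adding the coverage cutoff drops the short alignment, so a feature whose only hits look like `short_hit` would end up with no passing hits and could be left Unassigned.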


Thanks @dwt, this is a great suggestion, and I have raised an issue to track it.

We already have a query coverage parameter in q2-quality-control exclude-seqs, so it should be straightforward to implement in classify-consensus-blast. In the meantime, you can use that method to filter out sequences with low coverage before or after classification.

