Nextera adapters trouble – how short is too short a read

Nicholas_Bokulich · May 21, 2021, 10:16am

Exactly. By default, the classifiers in q2-feature-classifier only classify to the depth at which they are confident about the result (based on the reference data). You can adjust this by changing the confidence threshold (classify-sklearn) or the consensus threshold (classify-consensus-* methods), e.g., lowering it to allow less confident classification (not necessarily a good thing!).
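To make the effect of the threshold concrete, here is a toy Python sketch. It is not the q2-feature-classifier code: the function name, the per-rank confidence values, and the threshold values are all made up for illustration. It only shows the general idea that a taxonomy string gets cut at the first rank whose confidence falls below the threshold.

```python
def truncate_by_confidence(labels, confidences, threshold):
    """Keep leading ranks while confidence stays at or above the threshold.

    Hypothetical helper for illustration only; the real classifiers
    compute their confidences from the reference data.
    """
    kept = []
    for label, conf in zip(labels, confidences):
        if conf < threshold:
            break  # stop at the first rank we are not confident about
        kept.append(label)
    return "; ".join(kept) if kept else "Unassigned"


# Made-up example: confidence drops as we go deeper in the taxonomy.
labels = [
    "k__Bacteria", "p__Firmicutes", "c__Bacilli",
    "o__Lactobacillales", "f__Lactobacillaceae", "g__Lactobacillus",
]
confidences = [1.0, 0.99, 0.97, 0.92, 0.81, 0.55]

strict = truncate_by_confidence(labels, confidences, threshold=0.7)
lenient = truncate_by_confidence(labels, confidences, threshold=0.5)

print(strict)   # stops at family
print(lenient)  # reaches genus, but with a less trustworthy label
```

With the stricter threshold the genus label is dropped; with the lower one it is kept, but only because you agreed to accept a less confident call.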

There is a QIIME 2 plugin, RESCRIPt, which can create a nice visualization of taxonomic resolution, e.g., the number of unclassified taxa at each rank. See here for details:

The degree of resolution for any given taxon is going to depend on the length of the sequences and the marker-gene region. I suggest you just let yourself be guided by the results. Unclassified sequences at phylum level are a sign of technical issues, but at genus level a lack of classification is usually just a lack of resolving power, about which there is little you can do (except choose a different marker gene or a longer sequencing read length!). So there is no satisfactory % cutoff for accepting/rejecting a result; the real choice is whether to accept/reject the technology (and that threshold is decided by you).
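If you want a quick look at where your classifications stop before reaching for a plugin, something like the following Python sketch would do. It is not RESCRIPt and not part of QIIME 2: the function name, the rank list, and the example taxonomy strings are assumptions, and it assumes semicolon-delimited labels where a bare prefix such as "g__" means no label at that rank.

```python
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def unclassified_fraction_by_rank(taxonomy_strings, ranks=RANKS):
    """Return {rank: fraction of sequences with no label at that rank}."""
    total = len(taxonomy_strings)
    if total == 0:
        raise ValueError("need at least one taxonomy string")
    fractions = {}
    for i, rank in enumerate(ranks):
        missing = 0
        for tax in taxonomy_strings:
            parts = [p.strip() for p in tax.split(";")]
            label = parts[i] if i < len(parts) else ""
            # Empty, bare prefix ("g__"), or "Unassigned" all count as unclassified.
            if label == "" or label.endswith("__") or label == "Unassigned":
                missing += 1
        fractions[rank] = missing / total
    return fractions


# Made-up taxonomy strings, purely for illustration.
taxa = [
    "k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus",
    "k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacterales; f__Enterobacteriaceae",
    "k__Bacteria; p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__; g__",
    "Unassigned",
]

fractions = unclassified_fraction_by_rank(taxa)
for rank in RANKS:
    print(f"{rank:8s} {fractions[rank]:.2f}")
```

A high unclassified fraction at the phylum row would point to technical problems, while a fraction that only climbs at genus and species is the expected loss of resolving power.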

I hope that helps!