Classifier trained on NCBI RefSeq 16S rRNA data gives weird results

SoilRotifer · July 29, 2022, 7:53pm

I often see results like this when the query sequences are not in the same orientation as the reference database. This has been encountered before here and here.

Perhaps try running qiime rescript orient-seqs ... on your sequences. The scikit-learn classifier tries to handle orientation itself (see one of the threads above), but this may not always work. Also keep in mind that NCBI RefSeq is not as expansive as SILVA, so some issues, such as failing to detect the correct sequence orientation, may be more pronounced.
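
To make the orientation issue concrete, here is a rough Python sketch of the general idea: compare a read and its reverse complement against a reference and keep whichever shares more k-mers. This is only an illustration, not how RESCRIPt or the scikit-learn classifier actually does it. The function names, the k-mer size of 8, and the toy reference string are all made up for the example.

```python
# Illustrative sketch only: NOT the RESCRIPt or QIIME 2 implementation.
# All names, the k-mer size, and the toy sequences are hypothetical.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq: str, k: int = 8) -> set:
    """Return the set of all k-mers in a sequence (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def orient(query: str, reference: str, k: int = 8) -> tuple:
    """Pick the orientation of `query` that shares more k-mers with `reference`.

    Returns (sequence, status), where status is "forward",
    "reverse-complemented", or "unmatched".
    """
    ref_kmers = kmers(reference, k)
    rc = reverse_complement(query)
    forward_hits = len(kmers(query, k) & ref_kmers)
    reverse_hits = len(kmers(rc, k) & ref_kmers)

    if forward_hits == 0 and reverse_hits == 0:
        # Neither orientation resembles the reference at all.
        return query, "unmatched"
    if reverse_hits > forward_hits:
        return rc, "reverse-complemented"
    return query, "forward"


# Toy reference fragment (assumed for illustration, not taken from RefSeq).
reference = "AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCTTAACACATGCAAGTCGAACG"

# Simulate a read that was sequenced in the opposite orientation.
read = reverse_complement(reference[10:50])

oriented, status = orient(read, reference)
print(status)
```

A smaller reference set gives fewer k-mers to match against, which is one way to picture why orientation detection could fail more often with a less expansive database.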