>90% unassigned sequences when using feature-classifier classify-sklearn

Hi all,

I am looking for help with a strange problem that I have run into for the first time. I used the following command to process my raw sequencing data, which were generated with the 16S primers 799F and 1193R (the amplicon is about 400 bp).

qiime dada2 denoise-paired --i-demultiplexed-seqs Paired-end-demux.qza --p-trunc-len-f 280 --p-trunc-len-r 215 --p-trim-left-f 19 --p-trim-left-r 18 --o-representative-sequences rep-seqs.qza --o-table table.qza --o-denoising-stats stats-dada2.qza --p-n-threads 16 --verbose

Overall, I got good sequences after this step. I also BLASTed a few of the top sequences against NCBI and got bacterial hits with no problem. But when I used the following command to run the taxonomy step in QIIME 2, more than 90% of the results in the taxonomy file were unassigned (txt file attached).

qiime feature-classifier classify-sklearn --i-classifier StinglessBee_16S_124/Analysis/silva-132-99-nb-classifier.qza --i-reads Pollination/Bacterial_analysis/seqs_01 --o-classification taxonomy_03.qza

taxonomy.tsv (1.6 MB)

I really don't know what is causing this. Does anyone have any idea how to solve it? Thank you in advance.

Hi @hongwei2017,

If you search the forum you'll find that this is likely due to your reads being in a different orientation than the reference database. You can confirm this by looking at the start and stop positions of the query and hit sequences in your BLAST result (remember that BLAST searches your query in both directions, i.e. also as its reverse complement, to find a hit). If the hit coordinates run in the opposite direction to the query coordinates (reported as Plus/Minus strand), your reads are reverse-complemented relative to the reference. Here is a thread to get you started:
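For anyone landing here with the same symptom, here is a minimal Python sketch of what "fixing the orientation" means at the sequence level: each read is replaced by its reverse complement so that it matches the strand of the reference. It is only an illustration, not the method used in this thread. The function names `reverse_complement` and `reverse_complement_fasta` are hypothetical, and it assumes the representative sequences have already been exported to a plain FASTA file.

```python
# Complement table for DNA, including common IUPAC ambiguity codes.
# S and W are their own complements, so they are left untranslated.
_COMPLEMENT = str.maketrans(
    "ACGTRYKMBDHVNacgtrykmbdhvn",
    "TGCAYRMKVHDBNtgcayrmkvhdbn",
)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def reverse_complement_fasta(lines):
    """Yield FASTA lines with every record reverse-complemented.

    `lines` is any iterable of strings (for example an open file).
    Multi-line sequences are joined into a single line per record.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header
                yield reverse_complement("".join(chunks))
            header, chunks = line, []
        elif line:
            chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header
        yield reverse_complement("".join(chunks))


if __name__ == "__main__":
    # Tiny in-memory example; a real run would iterate over an open FASTA file.
    example = [">read1", "AACGT"]
    print("\n".join(reverse_complement_fasta(example)))
```

The re-oriented FASTA would then need to be imported back into QIIME 2 before running the classifier again. If only some reads are in the wrong orientation, flipping every record as done here is not appropriate, and each read's strand should be checked first.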



Thank you Mike! It worked after solving the orientation issues of my sequences. Really appreciate your kind reply!

