Sequence ordering can lead to inconsistent annotation results

Hi everyone,
Recently, I was working on a project related to fungi. After denoising, I obtained a set of ASVs and used them for taxonomic annotation with the UNITE 8.0 database. The strange thing is that the first annotation result was very good, but when I annotated the sequences again after sorting them from long to short, many ASVs that had previously been annotated to the species level were annotated only to the kingdom level. This confuses me. Does the order of the sequences affect the annotation results?

Here is the command I used:

qiime feature-classifier classify-sklearn \
  --i-classifier unite8.0_its_fungi_classifier.qza \
  --i-reads ASV_reps.qza \
  --o-classification taxonomy.qza \
  --p-confidence 0.7 \
  --p-n-jobs 6

The following are the sequence files I used for the two runs. The old file was obtained by sorting the new file from long to short.
ASV_reps_old.qza (650.8 KB) ASV_reps_new.qza (532.5 KB)

Thank you for any suggestions.

Welcome to the forum @zhigang !

No, it will not.

How did you sort the sequences? You might have introduced an error while sorting, because this result is a common issue when classifying sequences that are in mixed or incorrect orientations.

By any chance did you also have other sequences that had improved annotations after sorting? (e.g., went from kingdom to species-level annotation). It is very possible that your sequences are in mixed orientations. The classify-sklearn method infers the read orientation from the first 100 sequences, and requires that all reads are in the same orientation. So re-sorting the reads will lead to a different result if the reads are in mixed orientations.
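
To make that concrete, here is a toy sketch of why sort order can matter when orientation is inferred from only the first few reads. This is not the actual q2-feature-classifier code: the k-mer overlap score is a stand-in for classifier confidence, and the reference sequence, the reads, and the function names are all made up for illustration.

```python
# Toy illustration only: NOT the q2-feature-classifier implementation.
# All names, sequences, and thresholds below are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k=4):
    """Return the set of k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def score(seq, ref_kmers, k=4):
    """Fraction of the read's k-mers found in the reference (stand-in for confidence)."""
    ks = kmers(seq, k)
    return len(ks & ref_kmers) / len(ks) if ks else 0.0


def infer_orientation(reads, ref_kmers, n_probe=100):
    """Pick ONE orientation for the whole file by voting over the first n_probe reads."""
    probe = reads[:n_probe]
    same = sum(
        score(r, ref_kmers) >= score(reverse_complement(r), ref_kmers)
        for r in probe
    )
    return "same" if same * 2 >= len(probe) else "reverse-complement"


def count_matched(reads, ref_kmers, n_probe=100):
    """Apply the inferred orientation to every read and count well-matched reads."""
    orientation = infer_orientation(reads, ref_kmers, n_probe)
    if orientation == "reverse-complement":
        reads = [reverse_complement(r) for r in reads]
    return orientation, sum(score(r, ref_kmers) > 0.5 for r in reads)


# Hypothetical reference made of A/C only, so its reverse complement (G/T only)
# shares no k-mers with it. This keeps the toy unambiguous.
ref = "AACACCCAACCACAAACCCACACAACC"
ref_kmers = kmers(ref)

forward = [ref[:10], ref[5:15], ref[10:20]]                            # short reads
reverse = [reverse_complement(ref[:20]), reverse_complement(ref[3:25])]  # long reads
reads = forward + reverse                                              # mixed orientations

# n_probe=2 stands in for "the first 100 sequences" in this tiny example.
print(count_matched(reads, ref_kmers, n_probe=2))
print(count_matched(sorted(reads, key=len, reverse=True), ref_kmers, n_probe=2))
```

In this toy, the original order puts forward reads first, so only the forward reads match well; after sorting from long to short, the reverse reads come first, the inferred orientation flips, and a different subset of reads matches. If all reads were in one orientation, both orders would give the same answer. As far as I recall, classify-sklearn also has a --p-read-orientation parameter for setting the orientation explicitly, but check the --help output for your version.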

You can use the RESCRIPt plugin to harmonize your read orientation prior to classifying.

Give that a try and let us know what you find.

Good luck!