Issue with feature-classifier (single sample vs multiple samples)

Thanks for sharing the files!

I looked at the reads inside rep-seqs_R1_R2_R3.qza and ran them through vsearch dereplicate.
As expected, all reads were unique.

Then I ran vsearch dereplicate again with --strand both, which checks for identical reads in both directions (forward and reverse complement).
This second run found lots of hits on the reverse strand!
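
To make the idea concrete, here is a toy Python sketch of what a both-strand check does. It is only an illustration, not how vsearch is implemented, and the function names and example reads are made up for this post.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(complement)[::-1]


def dereplicate(reads, both_strands=False):
    """Return the set of unique reads.

    With both_strands=True, a read also counts as a duplicate when its
    reverse complement has already been seen (hypothetical helper that
    mimics the idea behind --strand both).
    """
    unique = set()
    for read in reads:
        read = read.upper()
        if read in unique:
            continue
        if both_strands and reverse_complement(read) in unique:
            continue
        unique.add(read)
    return unique


# Made-up reads: the second is the reverse complement of the first.
reads = ["AAACCG", "CGGTTT", "GATTACA"]

forward_only = dereplicate(reads)
both = dereplicate(reads, both_strands=True)
print(len(forward_only), len(both))
```

If the both-strand count is lower than the forward-only count, some reads are reverse complements of others, which is the sign of mixed orientation.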

This means your input reads were in a mixed orientation.
classify-sklearn assumes that all reads are in the same orientation.

To fix this issue, use the RESCRIPt action orient-seqs to put all reads in the same orientation before running classify-sklearn.
