95% of sequences returned as "unidentified" - ITS2 (reads in reverse orientation)


I just processed my sequences with classify-sklearn and the Unite-ver8 classifier created by SydneyMorgan. The code ran, but >90% of my sequences came back as unidentified. Fungal-taxonomy-paired-end.qzv (1.5 MB)

I am working with the Taylor et al. 2016 primers (ITS2), so my reads are in reverse orientation (forward files). As I was previously told (but forgot when I ran the code), this orientation does not match the Unite database.

Could you please let me know what I can do to fix this problem?

When I was having issues with Cutadapt, @Nicholas_Bokulich mentioned that I could re-import the files in the correct order to test out q2-itsxpress (analysis still pending).

Given my current problem, do I have to re-import the files as suggested before, or is there a better way to fix this from my current position in the pipeline?

UPDATE: I did not want to delete this post, as it might be helpful for someone else, but I just saw another post about this and learned that I should add --p-read-orientation, which can be set to 'same' or 'reverse-complement'. I am hopeful that this will fix my problem, but please let me know if I should delete this post, given that there is already a post with a similar problem. (Sorry I did not see it earlier.)


Running the code again with --p-read-orientation reverse-complement worked, and now I have great data :smiley:
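
For anyone who lands here with the same problem, here is a small Python sketch of what "reverse orientation" means in practice. The sequences are made up (not real ITS2 data), and `reverse_complement` is just a helper name I chose for illustration. This is not how QIIME 2 handles the option internally; it only shows why a read from the opposite strand does not match the reference until it is reverse-complemented.

```python
# Illustrative only: made-up sequences, not real ITS2 data or QIIME 2 internals.

# Translation table mapping each base to its complement (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical fragment as it might be stored in a reference database.
reference = "ATGCGTTAGC"

# The same fragment as it would appear if sequenced from the other strand.
read = reverse_complement(reference)

print(read)                                   # GCTAACGCAT
print(read == reference)                      # False: no direct match
print(reverse_complement(read) == reference)  # True: matches after flipping
```

As far as I understand it, this is the mismatch that setting --p-read-orientation to reverse-complement accounts for when the reads are classified.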

