During dereplication with VSEARCH (vsearch dereplicate-sequences), the feature IDs are replaced with hashes of the sequences. I’m using Q2 to work with Sanger sequences. It would be very useful if that relabeling could be turned off selectively, so I don’t have to reconnect my sequences to their original IDs later on.


Thanks for the suggestion @sformel! I have opened an issue to get that feature added some time in the future. Contributions are always very welcome if you want to take a swing at it :wink::cricket_bat_and_ball:

While q2-vsearch changes the feature IDs, running vsearch directly does not. Have you considered simply running vsearch directly?

vsearch --derep_fulllength INPUT_FILENAME --output OUTPUT_FILENAME --sizeout
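
If you do end up with hashed feature IDs, here is a rough sketch of how you might reconnect them to the original IDs. It assumes the hash is a SHA-1 digest of the upper-cased sequence, which I have not confirmed for your version, so check one or two digests against your actual feature IDs first (the `algorithm` argument can be switched to `"md5"` if they don't match). The function names are made up for this example and are not part of QIIME 2 or vsearch.

```python
import hashlib
from collections import defaultdict


def read_fasta(lines):
    """Yield (record_id, sequence) pairs from an iterable of FASTA lines."""
    record_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if record_id is not None:
                yield record_id, "".join(chunks)
            # Keep only the first whitespace-delimited token of the header.
            header = line[1:].split()
            record_id, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if record_id is not None:
        yield record_id, "".join(chunks)


def map_hashes_to_ids(lines, algorithm="sha1"):
    """Map each sequence digest to the list of original IDs sharing that sequence.

    Assumption: the feature ID is the hex digest of the upper-cased sequence.
    """
    mapping = defaultdict(list)
    for record_id, seq in read_fasta(lines):
        digest = hashlib.new(algorithm, seq.upper().encode("ascii")).hexdigest()
        mapping[digest].append(record_id)
    return dict(mapping)
```

You would call it on the FASTA file you fed into dereplication, e.g. `with open("seqs.fasta") as fh: mapping = map_hashes_to_ids(fh)` (the file name is a placeholder), and then look up each hashed feature ID in `mapping`.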

Let me know if that’s helpful.


It hadn’t occurred to me, although that makes a lot of sense. I’ll give it a shot, thanks!

