Dereplication - turn off replacement of feature IDs


During dereplication with VSEARCH (vsearch dereplicate-sequences), the feature IDs are replaced with hashes of the sequences. I’m using Q2 to work with Sanger sequences. It would be very useful if that replacement could be turned off selectively, so I don’t have to reconnect my sequences to their original IDs later on.


(Nicholas Bokulich) #5

Thanks for the suggestion @sformel! I have opened an issue to get that feature added some time in the future. Contributions are always very welcome if you want to take a swing at it :wink::cricket_bat_and_ball:

(Colin Brislawn) #7

While q2-vsearch changes the feature IDs, vsearch itself does not. Have you considered simply running vsearch directly?

vsearch --derep_fulllength INPUT.fasta --output OUTPUT.fasta --sizeout
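
If you would rather stay inside Q2, here is a rough sketch of rebuilding the link to the original IDs afterwards. It assumes the hashed feature IDs are SHA-1 digests of the upper-cased sequence text, which I have not confirmed here, so please check one ID against your own data before relying on it. The function names read_fasta and hash_to_original_ids are made up for illustration and are not part of Q2 or vsearch.

```python
import hashlib
from collections import defaultdict


def read_fasta(lines):
    """Yield (record_id, sequence) pairs from an iterable of FASTA lines."""
    record_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if record_id is not None:
                yield record_id, "".join(chunks)
            # Keep only the ID: the header text up to the first whitespace.
            parts = line[1:].split()
            record_id, chunks = (parts[0] if parts else ""), []
        else:
            chunks.append(line)
    if record_id is not None:
        yield record_id, "".join(chunks)


def hash_to_original_ids(lines):
    """Map the SHA-1 hex digest of each sequence to the original IDs carrying it."""
    mapping = defaultdict(list)
    for record_id, seq in read_fasta(lines):
        # Assumption: the hash is computed over the upper-cased sequence.
        digest = hashlib.sha1(seq.upper().encode("ascii")).hexdigest()
        mapping[digest].append(record_id)
    return dict(mapping)


# Toy input; in practice pass an open file handle of the original FASTA.
example = [">sanger_01 plate A", "ACGT", "acgt", ">sanger_02", "ACGTACGT"]
for digest, ids in hash_to_original_ids(example).items():
    print(digest, ",".join(ids))
```

Looking up a hashed feature ID in the returned dictionary gives every original ID that shares that sequence, so identical Sanger reads end up listed together under one hash.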

Let me know if that’s helpful.



It hadn’t occurred to me, although that makes a lot of sense. I’ll give it a shot, thanks!