vsearch de novo output --uc option

devonorourke · October 5, 2020, 2:05pm

Glad you were able to get what you needed. I bet there are folks here who can provide better insight into the reproducibility of your results; maybe @colinbrislawn?
I believe vsearch has a --randseed option that might help results line up between trials, but that argument doesn't appear to be exposed in the QIIME version.

My hunch is that the QIIME and VSEARCH methods should produce the same results, and that centroids are chosen based on sequence abundances. In the QIIME version, this information comes from the feature-table.qza file you supply. With standalone VSEARCH, those abundances aren't available if you simply export the fasta file from the .qza file. I believe you should get the same centroids provided that you gather the abundance value for each representative sequence, include it in the fasta header (vsearch reads it as a ;size=N annotation), and then run vsearch with the --sizein argument.
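In case it helps, here is a minimal sketch of the header-annotation step, not a tested pipeline. It assumes you have already summed each feature's counts across samples from the exported feature table; the function name, the feature IDs, and the counts are all made up for illustration. The only thing taken from vsearch itself is the ;size=N header convention that --sizein reads.

```python
def add_size_annotations(fasta_lines, abundances):
    """Yield FASTA lines with ';size=N' appended to each header.

    fasta_lines: iterable of lines from the exported FASTA file.
    abundances:  dict mapping feature ID -> total count (assumed to be
                 summed across samples from the feature table).
    """
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Treat the first whitespace-delimited token as the feature ID.
            feature_id = line[1:].split()[0]
            count = abundances[feature_id]  # KeyError if the ID is missing
            yield f">{feature_id};size={count}"
        elif line:
            # Sequence lines pass through unchanged; blank lines are dropped.
            yield line


# Hypothetical example data, not taken from any real dataset.
fasta = [">featA", "ACGTACGT", ">featB", "TTGACCAA"]
totals = {"featA": 120, "featB": 7}

annotated = list(add_size_annotations(fasta, totals))
print("\n".join(annotated))
```

If you write the annotated lines to a file (say annotated.fasta, a placeholder name), something like `vsearch --cluster_size annotated.fasta --id 0.97 --sizein --centroids centroids.fasta --uc clusters.uc` should then take the abundances into account. The 0.97 identity is only an example value; use whatever threshold you gave QIIME.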

Then again, you might not care about it if the differences are trivial!

Cheers