Extract sequenced region also for classify-consensus-vsearch?

I wonder whether it is advisable (or even valid) to first extract only the sequenced region from the reference sequences before alignment with consensus methods such as classify-consensus-vsearch, similar to what we do before training the Naive Bayes classifier?

Using the Silva-99-reference trimmed to, e.g., the V3-V4 region, classification with classify-consensus-vsearch takes only a few minutes.


Hi @Kristian_Holm,
Great question!

It is not necessary, but it is not a bad idea, either! In fact, aligning against trimmed reference sequences will speed up vsearch classification considerably. This is balanced by the extra time it takes to trim the reference sequences… but if you are going to use those trimmed sequences repeatedly (e.g., for future projects), then it will save you some time overall.
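To make "trimming the reference to the sequenced region" concrete, here is a rough Python sketch of the idea: find the forward primer and the reverse complement of the reverse primer in a reference sequence, and keep only what lies between them. This is only an illustration, not how QIIME 2 does it internally. The function names, the toy primers, and the toy reference are all made up for the example, and it assumes exact primer matches (apart from IUPAC degenerate bases), which a real extraction tool would relax.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a DNA string, including IUPAC codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_to_regex(primer):
    """Turn a (possibly degenerate) primer into a regex pattern."""
    return "".join(IUPAC[base] for base in primer.upper())


def extract_region(reference, fwd_primer, rev_primer):
    """Return the segment between the two primers (primers excluded).

    Hypothetical helper: returns None if either primer is not found.
    """
    reference = reference.upper()
    fwd = re.search(primer_to_regex(fwd_primer), reference)
    if fwd is None:
        return None
    downstream = reference[fwd.end():]
    rev = re.search(primer_to_regex(reverse_complement(rev_primer)), downstream)
    if rev is None:
        return None
    return downstream[:rev.start()]


# Toy example (made-up primers and reference, not real 16S data).
toy_reference = "TTTTACGTACGGGCCCAAATGATCCAAAA"
print(extract_region(toy_reference, "ACGTRC", "GGATCA"))  # prints GGGCCCAAA
```

The shorter each reference sequence is, the less there is to align per query, which is where the speed-up comes from. References in which the primers are not found are dropped here (`None`); whether that is what you want for your database is a judgment call.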

There could also be benefits for classification accuracy, but I have not tested that…

Yep! It really does speed things up!

I hope that helps!

