Vsearch clustering

c.older · June 28, 2018, 4:20pm

Hi,

I am following the workflow described in this previous post (Importing and Demultiplex process for 4 Fastq Files: R1, R2, Index1 and Index2 - #5 by thermokarst) due to the format of the data I have received.

My question relates to using vsearch to perform OTU clustering: should I be using a version of the Greengenes OTUs trimmed to the region that my primers targeted? I know that once I get to classification I should use a classifier that has been trimmed down, but I wanted to check whether the same applies to clustering. I'm thinking it could only be better, since it would be more specific and it's probably unnecessary to compare against a bunch of extra bases, but I thought I'd check in case there's something I'm missing in my chain of thought.

Nicholas_Bokulich · June 28, 2018, 4:24pm

For clustering: no, you don't need to. But you are right, it is probably better (faster, though maybe not more accurate).

For classification: you don't need to either (you've probably seen the notes in the training tutorial), but it does increase accuracy slightly for 16S reads.

It can't be worse! (Unless it's trimmed too far; e.g., make sure primers are trimmed from your query sequences if you are going to trim the reference too.)
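To make the trimming idea concrete, here is a minimal Python sketch of what "trimming the reference to the primer region" and "trimming primers from the query" mean. It is only an illustration under stated assumptions: the function names (`trim_to_amplicon`, `strip_forward_primer`) and the toy primers are hypothetical, matching is exact apart from IUPAC degenerate bases, and it is not how QIIME 2 or vsearch implement this. In practice you would use the tools' own commands (for example `qiime feature-classifier extract-reads` for the reference).

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a DNA sequence (IUPAC codes allowed)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a (possibly degenerate) primer into a regex pattern."""
    return "".join(IUPAC[base] for base in primer.upper())


def trim_to_amplicon(reference, fwd_primer, rev_primer):
    """Return the reference region between the primers (primers excluded).

    Returns None when either primer site is not found, so the caller
    can decide whether to drop or keep that reference sequence.
    """
    pattern = re.compile(
        primer_regex(fwd_primer)
        + "(.+?)"
        + primer_regex(reverse_complement(rev_primer))
    )
    match = pattern.search(reference.upper())
    return match.group(1) if match else None


def strip_forward_primer(read, fwd_primer):
    """Remove a leading forward primer from a query read, if present."""
    match = re.match(primer_regex(fwd_primer), read.upper())
    return read[match.end():] if match else read


# Toy primers and sequences, made up for illustration only.
FWD = "GTGYCAG"
REV = "GGACTAC"
reference = "AAAA" + "GTGCCAG" + "TTTTACGT" + "GTAGTCC" + "CCCC"

trimmed_reference = trim_to_amplicon(reference, FWD, REV)
trimmed_read = strip_forward_primer("GTGTCAGTTTTACGT", FWD)
print(trimmed_reference)
print(trimmed_read)
```

The point of the sketch is the consistency check mentioned above: the trimmed reference no longer contains the primer sites, so query reads that still start with a primer would overhang the reference. Stripping the primer from the reads first keeps the two comparable.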

I hope that helps!

c.older · June 28, 2018, 4:34pm

Thank you for answering all of this so clearly!!
