vsearch clustering and feature classifiers

the_dummy · September 26, 2019, 12:21pm

Hello,

I'm a bit confused about how to use vsearch clustering and feature classifiers. Can I use the table generated by open- or closed-reference clustering to draw taxa bar plots? Is it reliable to do it that way? Or should I skip clustering and go straight to feature classifiers? I see that clustering methods were not used in the Moving Pictures and Parkinson's Mouse tutorials.

I know that I should go for closed reference with a known environment, but I don't know whether I should use the taxonomy and OTU files I got from, let's say, NCBI, or generate a classifier for the analysis.

These questions might sound irrelevant, but the more I search and read about them, the more lost I get. A little help would be much appreciated.

Thank you...

Nicholas_Bokulich · September 26, 2019, 2:25pm

Hi,

It sounds like you are a little confused about the differences between OTU clustering, denoising methods (which have more or less supplanted OTU clustering methods), and taxonomy classification.

OTU clustering and denoising can be viewed as two different solutions (the first crude, the second more sophisticated) to the same problem: removing sequence errors (and also dereplicating redundant sequence information). Read the overview tutorial for some more discussion.
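To make "dereplicating" concrete, here is a toy sketch in Python. It is not how vsearch or any denoiser works internally; the reads are made up, and the only point is that identical reads collapse into unique sequences with counts.

```python
from collections import Counter

# Hypothetical reads. The last one differs from the first by a single base,
# as a sequencing error might.
reads = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGTAC",
    "TTGACCGTAA",
    "TTGACCGTAA",
    "ACGTACGTAT",
]

# Dereplication: collapse identical reads into unique sequences with counts.
derep = Counter(reads)

for seq, count in derep.most_common():
    print(seq, count)
```

Dereplication alone keeps the singleton as its own feature. Deciding whether a rare near-identical sequence like that is a real variant or an error is the part that clustering and denoising each try to handle.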

Feature classification is a different problem — you are attempting to determine the identity of the sequences you have, and this should only be performed after sequences have been dereplicated and denoised (or clustered).

We recommend denoising methods instead of OTU clustering; check out the literature comparing these methods for our motivations.

You definitely cannot use open-ref clustering and go straight to bar plots, because the de novo clustered sequences will not have been classified! You can do this with closed-ref clustering, but in my opinion it is not super reliable, since closed-ref is just aligning your short fragmentary sequence to the closest match, when many matches may exist to that short fragment. So while this is a valid approach, I am suspicious of the taxonomy classifications that you retrieve with that approach.
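A toy illustration of the "many matches" problem, in Python. The reference sequences, taxon labels, and the simple ungapped identity score are all invented for this sketch; real closed-reference clustering uses a proper alignment search, but the tie it shows is the same kind of ambiguity.

```python
# Hypothetical reference database: longer sequences with made-up taxon labels.
references = {
    "Genus_A species_1": "ACGTTGCAAGGCTTACCGATGCA",
    "Genus_A species_2": "ACGTTGCAAGGCTTACCGTTGCA",
    "Genus_B species_1": "ACGTTGCAAGGCTTAGGGATGCA",
}


def identity(query, ref):
    """Best fraction of matching positions over all ungapped placements of query on ref."""
    best = 0.0
    for start in range(len(ref) - len(query) + 1):
        window = ref[start:start + len(query)]
        matches = sum(q == r for q, r in zip(query, window))
        best = max(best, matches / len(query))
    return best


# Short fragment taken from a region that all three references share.
query = "TTGCAAGGCTTA"

scores = {taxon: identity(query, ref) for taxon, ref in references.items()}
top = max(scores.values())
tied = sorted(taxon for taxon, score in scores.items() if score == top)

print(top, tied)
```

All three references match the fragment perfectly, so a method that simply takes one closest hit has to pick a label that the data cannot actually distinguish from the others.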

So all in all I would recommend denoising your reads then using feature-classifier to taxonomically classify, as shown in the tutorials. Let us know if you have questions!
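Roughly, that workflow is three commands: denoise, classify, then plot. The sketch below only assembles and prints them (a dry run) so you can see the order. The file names, the pre-trained classifier artifact, and the truncation length are placeholders I am assuming for illustration; choose your own based on your data and the tutorials.

```python
# Placeholder file names and the truncation length are assumptions for illustration.
steps = [
    # 1. Denoise demultiplexed reads into a feature table and representative sequences.
    ["qiime", "dada2", "denoise-single",
     "--i-demultiplexed-seqs", "demux.qza",
     "--p-trim-left", "0",
     "--p-trunc-len", "120",
     "--o-representative-sequences", "rep-seqs.qza",
     "--o-table", "table.qza",
     "--o-denoising-stats", "denoising-stats.qza"],
    # 2. Classify the representative sequences with a trained classifier.
    ["qiime", "feature-classifier", "classify-sklearn",
     "--i-classifier", "classifier.qza",
     "--i-reads", "rep-seqs.qza",
     "--o-classification", "taxonomy.qza"],
    # 3. Combine the table and taxonomy into taxa bar plots.
    ["qiime", "taxa", "barplot",
     "--i-table", "table.qza",
     "--i-taxonomy", "taxonomy.qza",
     "--m-metadata-file", "sample-metadata.tsv",
     "--o-visualization", "taxa-bar-plots.qzv"],
]

# Dry run: print each command instead of executing it.
for cmd in steps:
    print(" ".join(cmd))
```

The outputs of the denoising step (`rep-seqs.qza` and `table.qza`) feed the classifier and the bar plot, which is why classification comes after denoising and not before.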

the_dummy · September 27, 2019, 6:47am

Oh, this is such a relief. Thank you very very much for this explanation, you have cleared my mind.
