denoising vs dereplication

llenzi · March 26, 2021, 11:46am

Both deblur and dada2 should return error-free sequences, which helps the taxonomy assignment step. You also end up with a smaller data set to handle after denoising, because sequences with errors are not discarded but corrected back to their original state. If you only dereplicate, you could compensate by filtering out the low-abundance OTUs, but any filtering by abundance carries the risk of discarding real low-abundance clusters as well.
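To make the abundance-filtering risk concrete, here is a minimal toy sketch in plain Python. It is not what vsearch, deblur, or dada2 actually do internally; the reads, the function names, and the `min_count` threshold are all made up for illustration.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical reads into unique sequences with their counts."""
    return Counter(reads)


def filter_by_abundance(counts, min_count):
    """Keep only sequences observed at least min_count times."""
    return {seq: n for seq, n in counts.items() if n >= min_count}


# Hypothetical toy reads: one abundant sequence, one copy of it carrying
# a sequencing error, and one genuinely rare sequence.
reads = ["ACGTACGT"] * 50 + ["ACGTACGA"] * 1 + ["TTGCAAGC"] * 2

counts = dereplicate(reads)

# The threshold is an arbitrary choice for this example.
kept = filter_by_abundance(counts, min_count=3)

print(len(counts), "unique sequences after dereplication")
print(len(kept), "sequences left after abundance filtering")
```

In this toy case dereplication keeps the erroneous read as its own unique sequence, and the abundance filter removes it together with the real rare sequence, which is the trade-off described above. A denoiser would instead try to assign the erroneous read back to its parent sequence.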

There are a few interesting discussions comparing OTUs (clusters obtained with vsearch, as in the old qiime1) and ASVs/ESVs (amplicon sequence variants obtained with denoisers such as dada2 or deblur). One is:

Another good one is:

Which reminds me of the chimera filtering steps embedded in deblur and dada2, which seem to be missing from your pipeline, judging from the step you mention.

One question: what are you going to do next with the dereplicated sequences?

Keep in mind that working with ASVs and working with OTUs are both accepted ways to get your results, so as long as you do it correctly they are both valid! It is mostly a matter of preference and of knowing what you are working with!

Luca