Long tail of rare ASVs after dada2 denoise-paired

Hi @Future_Microbiome,

Sorry, this turned into a long answer!

Basically, when there is too much noise in the data (i.e. bad quality data), dada2 starts throwing away reads because they are too noisy. When we truncate, we are typically cutting out the noisier part of the data (quality usually decreases as sequence length increases), allowing more reads to make it through dada2's denoising steps. :tada:

Your 200 trunc run looks the best to me. You are keeping the most sequences through filtering and merging with that trunc length. Having said that, this is your data so you should choose the run that you think makes the most sense for your data.

Unfortunately, it looks like no matter what you do, you are losing a lot of reads to chimera filtering. There are some parameters that you can tweak to try to get more reads past this step. I am always a little wary :fearful: when messing with the default chimera filtering parameters because we don't want to relax our thresholds too much and end up with chimeras in our data :-1: .

Here are some good chimeric filtering forum posts, if you want to test out tweaking those parameters: high chimera rate in dada2 - #4 by Nicholas_Bokulich :smile:

My rule of thumb is that if all samples lose more than 50% of their reads in any step (filtering, denoising, merging, or chimera detection), I typically look at changing parameters to try to get better results. For your data, the chimera detection step is losing ~50% in each sample. But sometimes that's just the data; sometimes it's not perfect :person_shrugging:
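If it helps, here is a rough sketch of how that rule of thumb could be checked against the per-sample read counts from the denoising stats. The sample names and numbers are made up for illustration, the step names are only my assumption of what your stats columns are called, and I am assuming the loss is measured relative to the previous step (not relative to the raw input):

```python
# Steps in the order reads flow through dada2 (column names are an assumption).
STEPS = ["input", "filtered", "denoised", "merged", "non-chimeric"]


def fraction_lost_per_step(counts):
    """Return {step: fraction of reads lost relative to the previous step}."""
    lost = {}
    for prev, cur in zip(STEPS, STEPS[1:]):
        before = counts[prev]
        lost[cur] = 0.0 if before == 0 else 1 - counts[cur] / before
    return lost


def steps_to_revisit(stats, threshold=0.5):
    """List the steps where ALL samples lose more than `threshold` of their reads."""
    if not stats:
        return []
    per_sample = [fraction_lost_per_step(counts) for counts in stats.values()]
    return [
        step
        for step in STEPS[1:]
        if all(lost[step] > threshold for lost in per_sample)
    ]


# Hypothetical read counts, NOT taken from your run.
example_stats = {
    "sample-A": {"input": 10000, "filtered": 8500, "denoised": 8200,
                 "merged": 7000, "non-chimeric": 3300},
    "sample-B": {"input": 12000, "filtered": 10000, "denoised": 9700,
                 "merged": 8000, "non-chimeric": 3900},
}

flagged = steps_to_revisit(example_stats)
print(flagged)
```

With these made-up numbers only the chimera step gets flagged, which is roughly the pattern I am describing for your data.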

Circling back to your original question :question: :

I think that you are using dada2 correctly and that unfortunately the quality of the sequences is leading to less than stellar results from dada2. For continuing on with this data, I would recommend looking at setting a prevalence threshold.
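To make the idea of a prevalence threshold concrete, here is a minimal sketch in plain Python: keep only the ASVs that show up in at least some minimum number of samples. The table, the ASV and sample names, and the `min_samples=2` cutoff are all hypothetical, so pick a cutoff that makes sense for your study. In QIIME 2 itself this kind of filter is usually done with `qiime feature-table filter-features` and its `--p-min-samples` option rather than by hand.

```python
def filter_by_prevalence(table, min_samples=2):
    """Keep features (ASVs) observed in at least `min_samples` samples.

    `table` maps feature ID -> {sample ID: read count}.
    """
    kept = {}
    for feature, counts in table.items():
        # A feature is "present" in a sample if it has at least one read there.
        n_present = sum(1 for count in counts.values() if count > 0)
        if n_present >= min_samples:
            kept[feature] = counts
    return kept


# Hypothetical feature table, NOT real data.
example_table = {
    "ASV_1": {"sample-A": 120, "sample-B": 95, "sample-C": 210},
    "ASV_2": {"sample-A": 0, "sample-B": 3, "sample-C": 0},   # rare: one sample only
    "ASV_3": {"sample-A": 14, "sample-B": 0, "sample-C": 22},
}

filtered_table = filter_by_prevalence(example_table, min_samples=2)
print(sorted(filtered_table))
```

Here the ASV that appears in a single sample is dropped, which is how a prevalence filter trims the long tail of rare ASVs.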

I hope all this helps! Let me know if you have any more questions or need clarifications on anything!

:turtle:
