DADA2 and Deblur: cutoffs and comparison

For paired-end sequencing data, DADA2 requires forward and reverse truncation-length parameters. We are using MiSeq PE250 reads; I do inspect the demux quality plots, but I mostly use fairly aggressive truncation cutoffs of forward 180 and reverse 120 (usually placed at a slight quality drop).
I heard from a colleague that you recommend cutting all positions after the median quality drops below Q25. While the differences seem small when changing these parameters, in my experience they still change the results, mostly with respect to resolving similar sequences. I am also asking because I am planning to build a pipeline that automates as many parameters as possible, and while a value such as Q25 seems rather low to me, it would be easy to implement.
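To illustrate, a minimal sketch of the rule I have in mind, assuming per-read quality arrays have already been parsed from a FASTQ subsample; the function name, threshold, and toy data are purely illustrative, not from any official pipeline:

```python
import numpy as np

def pick_trunc_len(per_read_quals, min_median_q=25):
    """Return the length to keep: positions up to (but not including)
    the first one whose median quality falls below min_median_q."""
    quals = np.asarray(per_read_quals)        # shape: (n_reads, read_len)
    medians = np.median(quals, axis=0)        # per-position median Q score
    below = np.nonzero(medians < min_median_q)[0]
    return int(below[0]) if below.size else quals.shape[1]

# toy example: three reads of length 6 with a quality drop near the end
reads = [[38, 37, 36, 30, 24, 20],
         [38, 38, 35, 31, 23, 19],
         [37, 36, 34, 29, 25, 18]]
print(pick_trunc_len(reads))  # -> 4 (truncate after position 4)
```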

Deblur requires a trim-length parameter. What value would you recommend for merged paired-end reads, e.g. for the common MiSeq primer pair 515f/806r? After trimming primers, sequences of around 250 to 252 bp are expected here. If we choose 250 bp, some merged sequences are truncated to 250 bp (we lose the information in the last base pairs) and all sequences shorter than 250 bp are discarded, which might make specific taxa with amplicons <250 bp undetectable. I am afraid of excluding too many taxa with too high a cutoff, but I also don't want to lose too much information with too low a cutoff.
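To make the trade-off concrete, a small sketch that tallies, for a few candidate trim lengths, how many merged reads would be discarded versus truncated (the lengths here are toy values; in practice they would come from our merged reads):

```python
lengths = [248, 250, 250, 251, 251, 252, 252, 252, 253]  # toy merged-read lengths

for cutoff in (248, 250, 252):
    discarded = sum(1 for n in lengths if n < cutoff)   # dropped entirely
    truncated = sum(1 for n in lengths if n > cutoff)   # lose trailing bases
    print(f"trim {cutoff}: {discarded} discarded, {truncated} truncated "
          f"of {len(lengths)} reads")
```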

Which denoising method (DADA2 or Deblur) do you prefer, and why? In my experience, Deblur is less sensitive than DADA2, especially for low-abundance strains (more false negatives), but more precise (fewer false positives). The ease of analyzing data from several sequencing runs in one go with Deblur is convenient. However, the loss of short amplicons and the truncation of long amplicons are rather severe disadvantages of Deblur in my opinion.

Hi @DaS,
Excellent questions all around!
Unfortunately, as you may have suspected, there are no clear-cut answers to any of your questions, only some recommendations and guidelines. Let's jump in.

As long as you allow enough overlapping base pairs for proper merging, I'm all for aggressive truncation parameters, since they remove the most error-prone positions. Quality over quantity. It also reduces processing time. Proper merging with DADA2 requires a minimum of about 20 bp of overlap, but consider natural length variability: depending on your target region, you may want to leave more if you can afford it.
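As a quick sanity check, here is the arithmetic for your setup, assuming a ~253 bp 515f/806r amplicon after primer removal (the exact length varies a little by taxon):

```python
amplicon_len = 253           # assumed post-primer amplicon length (varies a bit)
trunc_f, trunc_r = 180, 120  # your truncation cutoffs

overlap = trunc_f + trunc_r - amplicon_len
print(f"overlap: {overlap} bp")  # -> 47 bp, comfortably above a 20 bp minimum
```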

I have seen median scores of 20 and 25 recommended as a starting point. The higher the better, obviously, but not every dataset has the luxury of a high cutoff, whether because of low read counts or because longer reads are needed for proper merging. With a short, fully overlapping region like yours this is much easier, especially since the overlap can significantly reduce errors.

Of course we would expect changes, since we are dealing with different ASVs and lengths; in my experience these changes are more likely to affect alpha diversity measures than beta diversity, and phylogeny-based indices are also more resistant to these parameter changes. Could you provide a bit more detail about what changes between your results in these scenarios? Another way to resolve parameter effects is to collapse your features to, say, genus level and perform your analysis there (a sketch follows below). It's hard to argue that one parameter setting is more correct than another; rather, they are rooted in different amounts of information.
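For reference, a hedged sketch of that genus-level collapse with the QIIME 2 Artifact API (q2-taxa's collapse action; level 6 is genus in the usual 7-rank taxonomy strings; the file names are placeholders):

```python
from qiime2 import Artifact
from qiime2.plugins import taxa

table = Artifact.load('table.qza')        # FeatureTable[Frequency]
taxonomy = Artifact.load('taxonomy.qza')  # FeatureData[Taxonomy]

# collapse ASV counts to genus level (level=6 in 7-rank taxonomy strings)
genus_table, = taxa.actions.collapse(table=table, taxonomy=taxonomy, level=6)
genus_table.save('table-genus.qza')
```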

Automating these processes is a very interesting and ambitious idea, something I've thought about a lot myself but haven't found a convincing way of doing yet, especially with PE data, since some key decision-making steps require user input. This is easier with single-end data, or with PE data that overlaps almost completely, so once you figure out the nature of your sample type, perhaps you can do this with your primer sets. And a Q score of 25 is not that low in my opinion: since the per-base error probability is p = 10^(-Q/10), Q25 corresponds to an error rate of about 0.3%. Also consider that overlapping regions can reduce this significantly through consensus. But yes, the higher the better...

Good question, and again this depends on what you are asking of your data. Personally, I would go for the shortest amplicon length that seems reasonable in a length-distribution plot. For example, you expect amplicon lengths of 250-252 bp, but what about one that is 240 bp? It is likely a true feature that is naturally shorter. If we are too conservative with these lengths, we introduce length bias and prevent ourselves from real discovery. Since Deblur uses a positive filter anyway, I would just stick with a reasonable minimum length and not get too greedy. As for Deblur, there is a secondary motive to use shorter lengths anyway (explained below).
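If it helps, a hedged sketch of the corresponding Deblur call through the Artifact API (q2-deblur's denoise-16S action; the input would be your quality-filtered, merged reads, and the file name is a placeholder):

```python
from qiime2 import Artifact
from qiime2.plugins import deblur

seqs = Artifact.load('filtered-seqs.qza')  # SampleData[SequencesWithQuality]

# reads shorter than trim_length are discarded, longer ones truncated
table, rep_seqs, stats = deblur.actions.denoise_16S(
    demultiplexed_seqs=seqs,
    trim_length=250,
    sample_stats=True,
)
```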

Unfortunately, this kind of decision making is still necessary for these analyses, and it is one of the reasons I hesitate with automation. Finding a compromise between resolution, discovery, and depth depends very much on your experiment's question. If you expect a large effect size between your groups, then all of this will likely not matter much; if you are looking for discovery and sensitivity in your data, then some fine-tuning needs to be done.

This topic has been covered quite a few times on the forum already, so it might be worth doing some searching. Your observations are valid and on par with my experience as well. I don't think one method is superior to the other in all cases; each has its own strengths and weaknesses, but both perform very well in most cases. For a more thorough comparison, check out this recent paper: Denoising the denoisers. But here are a few thoughts that may help with the decision making.
As you already mentioned, Deblur is very convenient for analyzing multiple projects; in fact, I believe that was one of the key factors driving its design. DADA2 can of course also be used for this purpose, but it requires that equal parameters be used across the studies, or ultimately that the final merged tables be collapsed to a common level (e.g. genus), which is less informative than ASVs (a merging sketch follows below).
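A hedged sketch of that merging step, assuming the runs were denoised with identical DADA2 parameters (q2-feature-table's merge and merge-seqs actions; the paths are placeholders):

```python
from qiime2 import Artifact
from qiime2.plugins import feature_table

tables = [Artifact.load(p) for p in ('run1-table.qza', 'run2-table.qza')]
seqs = [Artifact.load(p) for p in ('run1-rep-seqs.qza', 'run2-rep-seqs.qza')]

merged_table, = feature_table.actions.merge(tables=tables)
merged_seqs, = feature_table.actions.merge_seqs(data=seqs)
merged_table.save('merged-table.qza')
merged_seqs.save('merged-rep-seqs.qza')
```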
DADA2 can work with variable-length amplicons, which resolves the decision making that Deblur requires: in your case, instead of deciding where to trim ALL your sequences for Deblur, you can include them all using DADA2.
One key difference between the two is the error model used for denoising. DADA2 trains its own model for each run, and the algorithm can be applied across sequencing platforms, whereas Deblur uses a pre-packaged model specific to Illumina machines. As you can imagine, this training step adds processing time, so expect longer runtimes with DADA2, and the gap widens as the data get bigger.
Your comment about Deblur producing fewer false positives becomes more true as amplicon lengths increase compared to DADA2, but likely at the price of being too conservative. Check out this post by one of the Deblur developers about the calculation behind Deblur's expected error rate and how significantly length can affect it (a worked example follows below). With DADA2 this is less of an issue, since the error model is run-specific, so it MAY be more sensitive if you have good-quality data.
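To see why length matters so much: expected errors per read are just the per-base error probabilities summed along the read. A flat Q25 profile is assumed here purely for simplicity:

```python
q = 25
p = 10 ** (-q / 10)   # per-base error probability, ~0.0032 at Q25

for length in (100, 150, 250):
    print(f"{length} bp at Q{q}: ~{length * p:.2f} expected errors per read")
# 100 bp -> ~0.32, 150 bp -> ~0.47, 250 bp -> ~0.79
```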
Finally, if you are experiencing too many false positives from DADA2 (confirm by BLAST), you could always try applying a positive filter to your feature table the way Deblur does and see if that helps; a sketch follows below.
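A hedged sketch of what that positive filter could look like with the Artifact API: keep only ASVs that hit a reference set via q2-quality-control's exclude-seqs, then subset the table accordingly. The reference, identity threshold, and paths are placeholders, not Deblur's exact settings:

```python
from qiime2 import Artifact, Metadata
from qiime2.plugins import quality_control, feature_table

rep_seqs = Artifact.load('dada2-rep-seqs.qza')   # FeatureData[Sequence]
reference = Artifact.load('reference-seqs.qza')  # e.g. a 16S reference set

# split ASVs into reference hits and misses
hits, misses = quality_control.actions.exclude_seqs(
    query_sequences=rep_seqs,
    reference_sequences=reference,
    method='vsearch',
    perc_identity=0.65,  # deliberately permissive, in the spirit of Deblur
)
# keep only table features that hit the reference
kept_table, = feature_table.actions.filter_features(
    table=Artifact.load('dada2-table.qza'),
    metadata=hits.view(Metadata),
)
kept_table.save('dada2-table-positive-filtered.qza')
```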

Hope this helps a bit.

Hi @Mehrbod_Estaki,

Thanks a lot for your detailed response!

About automating DADA2 cutoffs: the idea is to let people reach a first result with minimal input parameters and optimize later. We have the computational resources to afford repeated trials. The aim is also to empower collaboration partners to run analyses on our infrastructure themselves and play with the parameters. No idea if that will work out…
I regarded Q25 as low because the sequencing quality I am working with is rather high, and a Q25 cutoff removes only approximately the last 10/30 nucleotides of my PE250 MiSeq runs.

About changing DADA2 parameters: I observe minor differences, such as resolving similar sequences or increases/decreases in the number of output features. With a mock dataset (where the expected sequences are known) the parameters can be optimized, but I can't see how that could be done for a set of real samples.
Might it be worth sequencing a mock community on each run, optimizing the parameters for the mock sample, and then using those parameters for all samples (a scoring sketch follows below)? But the mock community would need to resemble the sample communities as closely as possible, which does not seem feasible.
Despite varying the parameters, the collapsed tables and the phylogenetic, quantitative indices are almost identical. But what would we be if we didn't attempt to optimize everything?
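For what it's worth, the scoring step I have in mind looks roughly like this; one would loop it over a grid of truncation values. The sequences here are toy placeholders:

```python
def score_against_mock(observed, expected):
    """Recall/precision of one denoising run against known mock ASVs."""
    tp = len(observed & expected)
    recall = tp / len(expected)                        # expected ASVs recovered
    precision = tp / len(observed) if observed else 0  # reported ASVs that are real
    return recall, precision

expected = {"ACGTA", "ACGTC", "ACGTG"}  # known mock sequences (toy)
observed = {"ACGTA", "ACGTC", "ACGTT"}  # one parameter set's output (toy)
print(score_against_mock(observed, expected))  # -> roughly (0.667, 0.667)
```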

About DADA2 vs. Deblur: we seem to have similar impressions here. And yes, in my experience (with mock datasets) Deblur is too conservative: almost no false positives, but it also misses a good fraction of the expected sequences.
I’ll have a look at the paper/forum post you suggested, thanks!

That helped a lot!

Great! Accessibility is much desired in the field, so hopefully you'll hit your stride somewhere. That being said, I do have an underlying fear that, without proper training in and understanding of the methods, users might pick and choose whichever outcome they prefer if they assume all the methods are correct; so training and educating them should also be emphasized, rather than just comparing all the different results. Just a, perhaps irrational, thought.

Now you're asking the right questions! Though a topic for another discussion, this is an excellent point, and one that has been discussed on the forum before. Personally, I think all runs should dedicate a few samples to mock communities and negative controls! Have a look through q2-quality-control and q2-perc-norm for some options in this regard, and keep an eye out for future implementations that deal with batch effects.
Good luck with your quest!
