Feedback on choice of dada2 trimming parameters


I see in previous posts that the choice of the trim and truncate parameters for dada2 is very subjective. I have 3 16S MiSeq runs (515F and 806R primers) and I am hoping to set the same dada2 parameter values for each run. Please see the quality plots below; I would appreciate your input on:

  • Do --p-trim-left-f 20 --p-trim-left-r 20 --p-trunc-len-f 230 sound appropriate for this data?

  • Would you suggest using 160 or 180 for the --p-trunc-len-r parameter?

  • Considering that the reverse read quality is not that great (which may be expected), is it worth using the reverse reads at all? I'm guessing it would affect the taxonomic assignment, but I'm wondering about the tradeoff between using the reverse reads to get ~100 bp of overlap (F&R) and using only the forward reads.



Hi @Richard_Rodrigues1,
Am I understanding correctly that the end goal is to combine these 3 runs after dada2? If not, there’s no need to use the same denoising parameters; they only need to be the same if the runs are to be combined.

So as you mentioned the choice of dada2 parameters is rather subjective, but it doesn’t mean it can’t follow some go-to logical approaches.
In your case the reverse reads are unfortunately rather poor, in my experience poorer than average, so rather than risk losing too many reads I would personally just stick with the forward reads. Since the region you are targeting is very short to begin with (~290 bp), the loss of your reverse reads will not be that detrimental to your resolution.
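For reference, a forward-only run could look roughly like the sketch below. It only builds and prints the `qiime dada2 denoise-single` command for each run so that all three use identical parameters. The file names (`run1-demux.qza` and so on) and the output prefixes are placeholders I made up, and the trim/truncation values are just the ones proposed in the question, not a recommendation.

```python
import shlex


def denoise_single_command(demux_qza, prefix, trim_left=20, trunc_len=230):
    """Build (but do not run) a forward-read-only DADA2 command.

    demux_qza and prefix are hypothetical file names; the defaults are the
    values proposed in the question.
    """
    return [
        "qiime", "dada2", "denoise-single",
        "--i-demultiplexed-seqs", demux_qza,
        "--p-trim-left", str(trim_left),
        "--p-trunc-len", str(trunc_len),
        "--o-table", f"{prefix}-table.qza",
        "--o-representative-sequences", f"{prefix}-rep-seqs.qza",
        "--o-denoising-stats", f"{prefix}-stats.qza",
    ]


# Same parameters for every run, so the outputs can be combined afterwards.
for run in ("run1", "run2", "run3"):
    cmd = denoise_single_command(f"{run}-demux.qza", prefix=run)
    print(shlex.join(cmd))
```

Keeping the parameters in one function makes it harder to accidentally denoise one run with different settings from the others.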

I can’t tell from the images, but as long as position 230 comes before a big dip in your quality scores I think this is a sensible starting point. If you find that you are losing too many reads after this, you could try truncating a bit more, say down to 200-210, and that should retain more reads.
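One way to judge "losing too many reads" is to compare the fraction of input reads that survive denoising at each truncation length, using the per-sample counts from the denoising stats. The sketch below shows the comparison with invented counts; the numbers are purely illustrative and are not from your data.

```python
def fraction_retained(n_input, n_output):
    """Fraction of input reads that survived denoising for one sample."""
    if n_input <= 0:
        raise ValueError("n_input must be positive")
    return n_output / n_input


# Hypothetical (input, surviving) read counts for one sample at two
# truncation lengths. Made-up numbers, for illustration only.
trials = {
    230: (50_000, 30_000),
    210: (50_000, 38_000),
}

for trunc_len, (n_in, n_out) in trials.items():
    kept = fraction_retained(n_in, n_out)
    print(f"trunc_len={trunc_len}: {kept:.0%} of reads retained")
```

If the shorter truncation keeps clearly more reads across most samples, that is a reasonable argument for using it on all three runs.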

The main tradeoff is that with those poor-quality reverse reads included, DADA2 is more likely to filter them out, and you would lose the forward reads paired with them, meaning fewer reads per sample in general. But the pairs that survive would give slightly higher resolution, since they will be longer, merged reads. Depending on how many reads per sample you have, you can decide which is more important for your goal. Personally, I would stick with forward reads though.
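If you do try the paired-end route, the expected overlap for a given pair of truncation lengths is simple arithmetic. The sketch below assumes the amplicon is ~290 bp and that both reads start at the ends of the amplicon (i.e. the primers are still in the reads); under that assumption trimming from the left does not change where the reads end, so it is left out. As far as I know, DADA2's merging step requires at least 12 nt of overlap by default, so either option leaves plenty of margin.

```python
def expected_overlap(trunc_len_f, trunc_len_r, amplicon_len):
    """Approximate overlap (bp) between truncated forward and reverse reads.

    Assumes both reads begin at opposite ends of the amplicon. A value of
    zero or less means the reads would not meet at all.
    """
    return trunc_len_f + trunc_len_r - amplicon_len


AMPLICON_LEN = 290  # approximate length mentioned in this thread

for trunc_r in (160, 180):
    overlap = expected_overlap(230, trunc_r, AMPLICON_LEN)
    print(f"trunc-len-f=230, trunc-len-r={trunc_r}: ~{overlap} bp overlap")
```

With these numbers, 160 gives about 100 bp of overlap and 180 about 120 bp, so the shorter reverse truncation already leaves more than enough for merging.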

