Does Trimming PE Data More Aggressively Mean Shorter DADA2 Analysis Time?



I was wondering: does DADA2 trim the sequences before or after denoising?

And if it trims before, does trimming more aggressively (e.g. truncating the sequences so they are shorter) mean that the analysis takes less time?

I was wondering this as I wait (for what feels like forever! :rofl: ) for my paired-end sequences to be processed. My sequences are approximately 331 bp long and I am using a 2x300 bp kit, so there is a lot of overlap. Would I be better off trimming to around 380 bp instead of the 450 bp or so I am currently using?



Trimming will indeed decrease your analysis time.

In addition, it will increase the sensitivity of your analysis when you are trimming off the low-quality tails, as those tails degrade the ability of the method to detect low-frequency variants.

I haven’t seen your quality profiles, but generally the tails of the reverse reads from 2x300 runs are quite poor, and you have a large amount of overlap. I would highly recommend applying --p-trunc-len-f and --p-trunc-len-r as guided by your read quality profiles, for both speed and accuracy at low frequencies.
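
For anyone weighing how far they can truncate, here is a rough sketch of the overlap arithmetic. It assumes the ~331 bp length from the question is the full amplicon length, and that merging needs at least about 12 overlapping bases (my understanding of the DADA2 default, so treat it as an assumption). The forward/reverse splits of 450 and 380 are made up for illustration; pick yours from the quality plots.

```python
# Rough overlap check for paired-end truncation lengths.
# Assumptions (not from the thread): the 331 bp figure is the full
# amplicon length, and ~12 overlapping bases are needed to merge.

def merge_overlap(trunc_len_f: int, trunc_len_r: int, amplicon_len: int) -> int:
    """Bases shared by the forward and reverse reads after truncation."""
    return trunc_len_f + trunc_len_r - amplicon_len


def can_merge(trunc_len_f: int, trunc_len_r: int, amplicon_len: int,
              min_overlap: int = 12) -> bool:
    """True if the truncated reads still overlap by at least min_overlap."""
    return merge_overlap(trunc_len_f, trunc_len_r, amplicon_len) >= min_overlap


AMPLICON_LEN = 331  # approximate length quoted in the question

# Hypothetical splits of the ~450 and ~380 totals mentioned above.
for trunc_f, trunc_r in [(250, 200), (220, 160)]:
    overlap = merge_overlap(trunc_f, trunc_r, AMPLICON_LEN)
    ok = can_merge(trunc_f, trunc_r, AMPLICON_LEN)
    print(f"f={trunc_f} r={trunc_r} total={trunc_f + trunc_r} "
          f"overlap={overlap} mergeable={ok}")
```

Both totals leave plenty of overlap under these assumptions, so the shorter one mostly costs you low-quality bases. Leave some margin for natural length variation, and remember that reads shorter than the truncation length are discarded.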


Thank you very much for your response!

Good to hear. I’m still honing my pipeline and don’t have an adequate computer for analysis, so running DADA2 takes a week! I will try running again with more aggressive trimming to compare results and post them here for other people’s reference.



@Micro_Biologist have you tried running DADA2 with multiple threads? That’ll speed things up by running things in parallel. Check out the --p-n-threads option on dada2 denoise-single or dada2 denoise-paired! :tada:
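
In case it helps, here is a minimal sketch of how the threaded call could be put together. It only builds and prints the command rather than running it. The file names (demux.qza and the outputs) and the truncation values are placeholders I made up, so swap in your own.

```python
# Build (but do not run) a qiime dada2 denoise-paired command line.
# File names and truncation values are placeholders, not recommendations.
import shlex


def denoise_paired_cmd(trunc_len_f: int, trunc_len_r: int, n_threads: int,
                       demux: str = "demux.qza") -> list:
    """Return the argument list for a multi-threaded denoise-paired run."""
    return [
        "qiime", "dada2", "denoise-paired",
        "--i-demultiplexed-seqs", demux,
        "--p-trunc-len-f", str(trunc_len_f),
        "--p-trunc-len-r", str(trunc_len_r),
        "--p-n-threads", str(n_threads),
        "--o-table", "table.qza",
        "--o-representative-sequences", "rep-seqs.qza",
        "--o-denoising-stats", "denoising-stats.qza",
    ]


cmd = denoise_paired_cmd(trunc_len_f=220, trunc_len_r=160, n_threads=4)
print(shlex.join(cmd))  # copy the printed line into a terminal to run it
```

Setting the thread count to the number of physical cores is a reasonable starting point; going beyond that may not help much.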

Yes, unfortunately it is slow even running with 6 threads (on an i7, so technically still 4 cores). Thankfully we have got approval to buy something a bit beefier (hopefully a 32-core EPYC server chip), so it won’t be an issue for long! Having said that, I was also just curious, and if more aggressive trimming also improves quality, meaning more retained reads, then all the better!

