Long DADA2 run time?


I am re-running my samples in DADA2 (denoise-paired run; ~11 million reads). Last time, with QIIME2-2019.1, the run took around two days to finish. This time, with QIIME2-2019.4, it has been running for over four days.

qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trim-left-f 7 \
  --p-trim-left-r 7 \
  --p-trunc-len-f 298 \
  --p-trunc-len-r 256 \
  --o-table 20190608_table.qza \
  --o-representative-sequences 20190608_rep-seq.qza \
  --o-denoising-stats 20190608_denosing_stats \
  --p-min-fold-parent-over-abundance 3 \
  --p-n-threads 12

I added the parameter "--p-min-fold-parent-over-abundance 3" in this run because many reads were being discarded as chimeras before. Is it taking longer because of this parameter, because of the new release, or because of some other issue?

I am running it on Ubuntu on my university's server.

Thanks in advance,

Hey @Shruthi,

That is very strange, because the 2019.4 release is actually an order of magnitude faster: bioconda was able to rebuild DADA2 with modern compilers (and normal optimizations), letting us get at the real speed of DADA2.

I also don’t see how min-fold-parent-over-abundance would change anything: the calculation happens no matter what, and you are only changing a threshold for filtering (which isn’t a bad idea if you have a lot of chimeras).

Are you certain that you submitted both jobs the same way? On institutional machines, these kinds of jobs usually go into a queueing system. Perhaps something changed there, or you got particularly lucky with the server’s utilization last time?

I guess I don’t have any good recommendations at this point. Hopefully it finishes soon (assuming it is actually running at all)!
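
If you want to check whether the job is actually running, here is a minimal sketch, assuming a Linux machine with the procps version of ps (as on Ubuntu) and that the job runs on the node you are logged into. The helper name job_status is hypothetical and not part of QIIME 2. It lists your own processes whose command line contains a pattern, along with elapsed time, CPU usage, and thread count.

```shell
# Hypothetical helper (not part of QIIME 2): show your own processes whose
# command line contains PATTERN, with elapsed time, CPU %, and thread count.
# Assumes the procps `ps` found on most Linux distributions.
job_status() {
    pattern="$1"
    # Take the snapshot first so the awk filter below cannot match itself.
    # etime = wall-clock time since start, pcpu = CPU usage in percent,
    # nlwp  = number of threads (light-weight processes).
    snapshot=$(ps -u "$(id -un)" -o pid,etime,pcpu,nlwp,args)
    # Keep the header line plus every line containing the pattern.
    printf '%s\n' "$snapshot" | awk -v pat="$pattern" 'NR == 1 || index($0, pat) > 0'
}
```

For example, `job_status qiime` or `job_status R` should list the DADA2 processes if they are alive. A %CPU value well above 100 would suggest that several threads are busy, while a value near 0 over a long elapsed time would suggest the job is stalled or waiting. If nothing shows up, the job may still be sitting in the scheduler's queue or running on a different node, in which case the scheduler's own status command (for example squeue on Slurm or qstat on PBS/SGE) is the place to look.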


Hey @ebolyen,

No difference in the jobs, but it finally finished running after 6 days.

