DADA2 denoise: number of cores

Hello!

I have a question about running DADA2 denoise-single. I am using QIIME 2 version 2019.7 and would like to know how much memory and how many cores are sufficient to run this step smoothly and quickly. (Previously I used 8 GB of memory and cpus-per-task=1, but with those settings it ran for 2 weeks and then crashed due to a network error.) My goal is to make it run more smoothly, but I do not know how to estimate these parameters (I am running it on a server). My input .qza file is 195 GB, and in total I have 1700 samples.

Hi @kreetelyll!

Holy moly! Just to confirm, is this amplicon data? Also, I suspect this was multiple runs?

Yeah, I know it is a lot.

Yes, amplicon data.

Also yes, it was done in 3 different runs.

@ebolyen might still have something to say here about this, but I will briefly “QIIME” in! If you have multiple runs, you will need to denoise them separately, then merge the individual run results. Check out the Atacama tutorial for an example of that! The reason is that the error model is built on a per-run basis, so combining runs might be a bit confusing for DADA2. Processing the runs one at a time also has the added benefit of requiring less RAM. Hope that helps! :t_rex:
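To make the "denoise each run, then merge" idea concrete, here is a minimal sketch in Python that only builds and prints the QIIME 2 commands, so nothing is executed until you copy them into your job script. The file names (`run1-demux.qza` and so on), the truncation length of 120, and the thread count of 8 are placeholders I made up for illustration, not recommendations for your data. Pick the truncation length from your own quality plots and use the same value for every run, otherwise the merged features will not be comparable.

```python
import shlex


def denoise_cmd(run, trunc_len, n_threads):
    """Build the denoise-single command for one sequencing run.

    File names follow a hypothetical '<run>-demux.qza' naming scheme.
    """
    return [
        "qiime", "dada2", "denoise-single",
        "--i-demultiplexed-seqs", f"{run}-demux.qza",
        "--p-trunc-len", str(trunc_len),   # must be identical across runs
        "--p-n-threads", str(n_threads),   # match the cores you request
        "--o-table", f"{run}-table.qza",
        "--o-representative-sequences", f"{run}-rep-seqs.qza",
        "--o-denoising-stats", f"{run}-stats.qza",
    ]


def merge_cmds(runs):
    """Build the two merge commands that combine the per-run outputs."""
    table = ["qiime", "feature-table", "merge"]
    seqs = ["qiime", "feature-table", "merge-seqs"]
    for run in runs:
        table += ["--i-tables", f"{run}-table.qza"]
        seqs += ["--i-data", f"{run}-rep-seqs.qza"]
    table += ["--o-merged-table", "table.qza"]
    seqs += ["--o-merged-data", "rep-seqs.qza"]
    return [table, seqs]


def build_pipeline(runs, trunc_len=120, n_threads=8):
    """Return one denoise command per run, followed by the merge commands."""
    denoise = [denoise_cmd(run, trunc_len, n_threads) for run in runs]
    return denoise + merge_cmds(runs)


if __name__ == "__main__":
    # Print the commands as shell lines; run them yourself once they look right.
    for cmd in build_pipeline(["run1", "run2", "run3"]):
        print(shlex.join(cmd))
```

The per-run denoise commands do not depend on each other, so on a cluster they can be submitted as separate jobs, each with as many cores as you pass to `--p-n-threads` (passing 0 tells DADA2 to use all available cores). Only the two merge commands need to wait for all runs to finish.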


@thermokarst Is this the right tutorial for denoising runs separately? I cannot find any indication that there are multiple runs in this tutorial. It covers multiplexed paired-end data, but says nothing about denoising multiple runs separately and then merging them.
Thanks!

I already found the right tutorial (FMT tutorial https://docs.qiime2.org/2019.7/tutorials/fmt/)

Oops, so sorry, you are right, the FMT tutorial is the one to check. Sorry about that!