I’m running QIIME2 on my University’s HPC cluster, and unfortunately I’m experiencing some problems with the command in the subject line.
The HPC uses SLURM as its job scheduler, and you have to specify a time limit for each job you submit. For my account, the maximum time limit is 12 hours.
First, I import the data as a QIIME2 artifact, producing the output demux-paired-end.qza,
and then to denoise the data (trimming and quality filtering) I use DADA2 with the following command line:
With a dataset of 25 samples, it doesn’t finish within 12 hours and is always cancelled, no matter how much memory or how many CPUs I request.
My questions are:
Do you know if this QIIME2 command supports n-tasks in SLURM, so it can run on multiple cores? Or does it support parallelisation across multiple nodes?
Is it possible to chunk this command to make it smaller and faster? For example, running the command for each sample separately instead of running it on one big file?
DADA2 will often take more than 12 hours for a single job to run. You may want to discuss with your admin whether the maximum time limit can be increased, at least temporarily...
See the --p-n-threads parameter for this method.
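To illustrate where that parameter fits, a typical denoise-paired invocation might look like the sketch below. The input filename matches the artifact mentioned above, but the trim/truncation values and thread count are illustrative assumptions, not the poster's actual settings:

```shell
# Hypothetical DADA2 call -- choose trim/trunc values from your own quality plots
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 240 \
  --p-trunc-len-r 200 \
  --p-n-threads 8 \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza
```

Passing `--p-n-threads 0` tells the plugin to use all available cores.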
In theory, yes, you could break up the samples, but this is probably not a good idea, since it will impact the error model and alter denoising/chimera checking.
Look at multithreading to make it faster, and see if your system admin will give you a longer time limit...
Does the --p-n-threads option correspond to the number of tasks, is that right?
That would mean I should specify the same number in --ntasks when submitting the SLURM job.
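For reference, here is a minimal sbatch sketch under the assumption that DADA2's multithreading stays within a single process on one node: in that case the SLURM knob that usually matches --p-n-threads is --cpus-per-task (with --ntasks=1), rather than multiple tasks. All resource values and trim/trunc settings below are illustrative:

```shell
#!/bin/bash
#SBATCH --job-name=dada2-denoise
#SBATCH --time=12:00:00        # account maximum mentioned above
#SBATCH --ntasks=1             # one process; this step is not MPI-parallel
#SBATCH --cpus-per-task=8      # match this to --p-n-threads
#SBATCH --mem=32G              # illustrative value

# $SLURM_CPUS_PER_TASK keeps the thread count in sync with the allocation
qiime dada2 denoise-paired \
  --i-demultiplexed-seqs demux-paired-end.qza \
  --p-trim-left-f 0 \
  --p-trim-left-r 0 \
  --p-trunc-len-f 240 \
  --p-trunc-len-r 200 \
  --p-n-threads "$SLURM_CPUS_PER_TASK" \
  --o-table table.qza \
  --o-representative-sequences rep-seqs.qza \
  --o-denoising-stats denoising-stats.qza
```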