Working in parallel on a cluster

I am working on an SGE cluster. How do I set things up to work in parallel? I see that some scripts have an option to set the number of threads, but I am not sure how to submit the jobs to a queue.

Hi @nir! This is a great question! Right now QIIME 2 doesn’t have any specific knowledge about cluster schedulers (such as SGE). As you mentioned, some QIIME 2 Methods do support specifying multiple threads, but only on a per-action basis. We have generally found this strategy to work well enough for now, but we have had several discussions about QIIME 2 being able to communicate directly with a scheduler (stay tuned!). In the meantime, I would suggest you coordinate with your cluster’s sysadmin to figure out the best way to submit your jobs to your particular cluster (we can’t really comment on specifics here because they vary from institution to institution). Good luck, keep us posted, and feel free to ping us if you have any questions. Thanks!

I know how to submit jobs to the cluster (generally with qsub). If I understand correctly, I currently can’t break a command into multiple jobs that could be sent to the cluster independently; I can only run it as one command with multi-threading support. Right?

Hi @nir, yes, I think we are on the same page! Just to be clear: while splitting jobs and running them in parallel was generally required for many larger datasets in QIIME 1, we are finding that this generally isn’t the case in QIIME 2, for a variety of reasons. The commands that support specifying the number of threads (e.g. dada2 denoise-*) are the major performance bottlenecks we have identified to date, so those are the steps you can run in parallel. Keep us posted!
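To make the "one command, many threads" idea concrete, here is a rough sketch of how a single multi-threaded QIIME 2 step could be wrapped in an SGE job script. Treat it as an illustration and not a recommended setup: the helper name `build_sge_job`, the parallel environment name `smp`, the file names, and the truncation length are all placeholders I made up, and parallel environment names in particular differ from cluster to cluster. Please check the flags against `qiime dada2 denoise-single --help` for your QIIME 2 version. The script requests a number of slots from SGE and passes the `NSLOTS` variable that SGE sets for the job to `--p-n-threads`, so the thread count should match what the scheduler granted.

```python
def build_sge_job(n_threads, pe_name="smp", job_name="dada2_denoise"):
    """Return the text of an SGE job script for one multi-threaded QIIME 2 step.

    pe_name is an assumption: parallel environment names are site-specific,
    so ask your sysadmin which one to use.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",              # job name shown by qstat
        "#$ -cwd",                        # run from the submission directory
        f"#$ -pe {pe_name} {n_threads}",  # request slots on a single node
        "",
        "qiime dada2 denoise-single \\",
        "  --i-demultiplexed-seqs demux.qza \\",  # placeholder input file
        "  --p-trunc-len 120 \\",                 # placeholder value
        '  --p-n-threads "$NSLOTS" \\',           # NSLOTS is set by SGE
        "  --output-dir dada2-out",               # placeholder output directory
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the script so it can be redirected to a file and submitted.
    print(build_sge_job(8), end="")
```

If you save this as, say, `make_job.py`, something like `python make_job.py > denoise.sh` followed by `qsub denoise.sh` would submit the step as one job. The whole command still runs as a single job on one node; only the threading inside that step is parallel.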

