parallelize qiime rescript evaluate-fit-classifier on hpc

brittair · October 27, 2023, 3:00pm

qiime2-amplicon-2023.9, miniconda3, macM2

command:
qiime rescript evaluate-fit-classifier \
  --i-sequences V1_V2_27f_338r/silva-138.1-ssu-nr99-seqs-27f-338r-derep.qza \
  --i-taxonomy V1_V2_27f_338r/silva-138.1-ssu-nr99-tax-27f-338r-derep.qza \
  --p-n-jobs -2 \
  --o-classifier V1_V2_27f_338r/silva-138.1-ssu-nr99-27f-338r-classifier.qza \
  --o-observed-taxonomy V1_V2_27f_338r/silva-138.1-ssu-nr99-27f-338r_predicted_taxonomy.qza \
  --o-evaluation V1_V2_27f_338r/silva-138.1-ssu-nr99-27f-338r_classifier_eval.qzv

question: I read that this step requires a lot of memory and time (e.g., v3-4 takes >35 hours with 12 cores according to https://github-wiki-see.page/m/shenjean/diversity/wiki/Step-6:-Taxonomy-assignment). I've read the community plugin support post from 2021 (Is there a way to parallelize evaluate-fit-classifier?), but other than flagging --p-n-jobs I'm not sure how to best configure this on an hpc.
Any advice about how to parallelize this on an hpc to speed things up (with fairly detailed explanations/instructions for a newbie)? I've visited the help documentation and see options for --parallel-config and --parallel, but am not experienced enough to know how to make a parallel configuration and set reasonable parameters, or how many cpus, tasks, time, etc. should be allocated for the job. Or perhaps I am doomed to pay the price of 35 hours. Any experienced advice will be much appreciated.

Nicholas_Bokulich · October 28, 2023, 7:12am

Hi @brittair ,

Unfortunately, yes. Because as mentioned on that other topic:

But the good news is that you only need to fit the classifier once (hopefully), and then you can re-use it in your environment many times, no need to repeat this step.

You can also just fit the classifier with q2-feature-classifier if you do not want to perform the evaluation steps. This will save you some hours of additional time. The evaluation step is the one that is parallelized by the --p-n-jobs option, though, so you can leverage your hpc to speed up that step.
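
As a rough sketch of how both options could be submitted on a cluster: the script below assumes a SLURM scheduler (the thread does not say which scheduler is in use), and the job name, CPU count, memory, and time limit are placeholders rather than measured requirements. It ties --p-n-jobs to the number of CPUs SLURM actually grants instead of -2, and a hypothetical MODE variable switches between the full evaluate-fit-classifier run and the fit-only route with q2-feature-classifier. It only prints the command by default; set DRY_RUN=0 to execute it.

```shell
#!/bin/bash
#SBATCH --job-name=fit-classifier   # hypothetical job name
#SBATCH --ntasks=1                  # a single process; parallelism comes from --p-n-jobs
#SBATCH --cpus-per-task=12          # placeholder: adjust to your nodes
#SBATCH --mem=64G                   # placeholder: not a measured requirement
#SBATCH --time=48:00:00             # placeholder: leave headroom over the expected runtime

# Use the CPUs SLURM allocated; fall back to 1 when run outside SLURM.
N_JOBS="${SLURM_CPUS_PER_TASK:-1}"

DIR="V1_V2_27f_338r"
PREFIX="silva-138.1-ssu-nr99"
MODE="${MODE:-evaluate}"            # hypothetical switch: "evaluate" or "fit"

# Print the command when DRY_RUN=1 (the default here), otherwise run it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

if [ "$MODE" = "fit" ]; then
  # Fit only, skipping the evaluation steps.
  run qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads "${DIR}/${PREFIX}-seqs-27f-338r-derep.qza" \
    --i-reference-taxonomy "${DIR}/${PREFIX}-tax-27f-338r-derep.qza" \
    --o-classifier "${DIR}/${PREFIX}-27f-338r-classifier.qza"
else
  # Fit and evaluate, as in the original command.
  run qiime rescript evaluate-fit-classifier \
    --i-sequences "${DIR}/${PREFIX}-seqs-27f-338r-derep.qza" \
    --i-taxonomy "${DIR}/${PREFIX}-tax-27f-338r-derep.qza" \
    --p-n-jobs "${N_JOBS}" \
    --o-classifier "${DIR}/${PREFIX}-27f-338r-classifier.qza" \
    --o-observed-taxonomy "${DIR}/${PREFIX}-27f-338r_predicted_taxonomy.qza" \
    --o-evaluation "${DIR}/${PREFIX}-27f-338r_classifier_eval.qzv"
fi
```

Saved as, say, fit_classifier.sh (a hypothetical file name), this would be submitted with `sbatch --export=ALL,DRY_RUN=0 fit_classifier.sh` from an environment where qiime2-amplicon-2023.9 is activated. Running it directly with bash first is a cheap way to check the paths before queueing a long job.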

Sorry I can't give a more satisfying answer!

brittair · October 28, 2023, 7:29am

I suspected that might be the case from the 2021 post, but wanted to ask just in case things have changed. Thank you kindly for the clarification!