qiime feature-classifier classify-sklearn: which are the best parameter values?

sleva88 · February 9, 2021, 4:13pm

Hi everyone! I use QIIME 2 a lot for taxonomic classification of various NGS datasets. I have noticed that sometimes, depending on the size of the dataset, the qiime feature-classifier classify-sklearn plugin crashes with SIGSEGV -11 (segmentation fault).
Does anybody know, or has anybody found from experience, which values work best for these parameters:
--p-reads-per-batch
--p-n-jobs
so as to reduce the likelihood of this type of error?

Also, what exactly do those parameters mean? I have read the explanation on the plugin page of the QIIME 2 website, but I didn't find it very exhaustive.

Thanks to everyone who can share some information!
Sleva.

Nicholas_Bokulich · February 10, 2021, 7:41am

--p-reads-per-batch splits the query sequences into smaller batches so that only a few are read into memory at one time. A smaller number leads to less RAM used, at the cost of a little more time.

--p-n-jobs is the number of jobs to run in parallel. More jobs = more RAM but less time.

There is no "best", since it depends on system specs, datasets, etc., but if you are getting memory errors, use 1 job and a smaller batch size (2000?) and just be prepared to wait a bit longer for the job to complete.
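
As a rough sketch of those low-memory settings: the file names below (classifier.qza, rep-seqs.qza, taxonomy.qza) are placeholders that do not come from this thread, and 2000 is only the tentative starting point mentioned above, not a tuned value.

```shell
# Placeholder artifact names (assumptions); substitute your own files.
CLASSIFIER=classifier.qza
READS=rep-seqs.qza
OUTPUT=taxonomy.qza

# Low-memory settings: a single job and a small batch size.
N_JOBS=1
READS_PER_BATCH=2000

# Run the classifier only if the qiime command is available in this shell.
if command -v qiime >/dev/null 2>&1; then
  qiime feature-classifier classify-sklearn \
    --i-classifier "$CLASSIFIER" \
    --i-reads "$READS" \
    --p-n-jobs "$N_JOBS" \
    --p-reads-per-batch "$READS_PER_BATCH" \
    --o-classification "$OUTPUT"
else
  echo "qiime not found; activate your QIIME 2 environment first" >&2
fi
```

If this still runs out of memory, lowering READS_PER_BATCH further is the next thing to try; if it completes comfortably, the batch size or the number of jobs can be raised step by step to recover some speed.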

Good luck!
