Memory Error using the pretrained silva 119 classifier

jairideout · February 12, 2018, 5:17pm

The short answer is that 4 GB of RAM doesn't appear to be enough memory for the command to complete, either in single-job or multi-job mode. My guess is that you're running the command in single-job mode (which is the default behavior), so setting a different chunk size / reads-per-batch won't have any effect. I think you'll have better luck with the 16 GB RAM computer. Let us know how it goes!

See below for specific answers to your questions -- apologies that my previous post wasn't clear about single vs multiple jobs and chunk size (looks like I led you on a bit of a wild goose chase!).

Note: I was using the word "single-threaded" in my previous post, but classify-sklearn actually uses multi-processing instead of multi-threading. There are technical differences between the two modes of parallelism, but the idea is roughly the same.

Sorry about that! It looks like --p-chunk-size was renamed to --p-reads-per-batch in the QIIME 2 2017.9 release (see the changelog notes).

You're running the command in single-job mode unless you pass --p-n-jobs with a value other than 1; if you leave --p-n-jobs out of the command, single-job mode is the default. With a limited amount of memory, single-job mode is what you want, so you can omit the --p-n-jobs option altogether.

Single-job mode means that the command will run on a single CPU instead of splitting the data across parallel jobs. You can speed up the runtime of the command by setting --p-n-jobs to -1 to use all CPUs, or to a value greater than 1 to use that number of CPUs. However, that will use up more memory than running in single-job mode, so it won't help reduce memory requirements.
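To make the --p-n-jobs values a bit more concrete, here's a rough Python sketch of how a job count maps to worker processes. It assumes the option follows the joblib convention that scikit-learn uses (where -1 means "all CPUs"); it is not the plugin's actual code, and effective_jobs is a hypothetical helper made up for illustration.

```python
import os
from typing import Optional


def effective_jobs(n_jobs: int = 1, cpu_count: Optional[int] = None) -> int:
    """Hypothetical helper: number of worker processes for a given n_jobs.

    Assumes the joblib convention: positive values are used as given,
    -1 means all CPUs, -2 means all CPUs but one, and so on.
    """
    if n_jobs == 0:
        raise ValueError("n_jobs must not be 0")
    # Fall back to the machine's CPU count when none is supplied.
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    if n_jobs < 0:
        # Negative values count back from the total, never dropping below 1.
        return max(cpus + 1 + n_jobs, 1)
    return n_jobs


print(effective_jobs())                 # 1 (the default: single-job mode)
print(effective_jobs(-1, cpu_count=4))  # 4 (all CPUs on a 4-CPU machine)
print(effective_jobs(2, cpu_count=4))   # 2
```

Each worker is a separate process, so a higher number here generally means more memory in use at once, which is why single-job mode is the safer choice on a 4 GB machine.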

Let me know if you have any other questions about this, and sorry again for the confusion!