The short answer is that 4GB RAM doesn't appear to be enough memory for the command to complete, either in single-job or multi-job mode. My guess is that you're running the command without using multiple jobs (which is the default behavior), so setting a different chunk size / reads-per-batch won't have any effect. I think you'll have better luck with the 16GB RAM computer. Let us know how it goes!
See below for specific answers to your questions -- apologies that my previous post wasn't clear about single vs multiple jobs and chunk size (looks like I led you on a bit of a wild goose chase!).
Note: I was using the word "single-threaded" in my previous post, but classify-sklearn actually uses multi-processing instead of multi-threading. There are technical differences between the two modes of parallelism, but the idea is roughly the same.
Sorry about that. It looks like --p-chunk-size was renamed to --p-reads-per-batch in the QIIME 2 2017.9 release (see the changelog notes).
You're likely running the command in single-job mode unless you're using --p-n-jobs with a value other than 1. If you're not including --p-n-jobs in the command, it will run in single-job mode by default. If you have a limited amount of memory, you'll want to run the command in single-job mode (in other words, you can omit the --p-n-jobs option altogether).
Single-job mode means that the command will run on a single CPU instead of processing the data in smaller parallel jobs. You can speed up the command by setting --p-n-jobs to -1 to use all CPUs, or to a value greater than 1 to use that number of CPUs. However, that will use more memory than running in single-job mode, so it won't help reduce memory requirements.
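To make the two modes concrete, here is a rough sketch of what each command line could look like. The file names (classifier.qza, rep-seqs.qza, taxonomy.qza) are placeholders I made up, so swap in your own artifacts. The snippet only prints the commands so you can check them before running anything.

```shell
#!/bin/sh
# Placeholder file names (assumptions): replace with your own artifacts.
CLASSIFIER="classifier.qza"
READS="rep-seqs.qza"
OUTPUT="taxonomy.qza"

# Single-job mode: --p-n-jobs is left out, so the default of 1 applies.
# This is the lower-memory option.
SINGLE_JOB_CMD="qiime feature-classifier classify-sklearn --i-classifier $CLASSIFIER --i-reads $READS --o-classification $OUTPUT"

# Multi-job mode: -1 asks for all CPUs. Faster, but uses more memory,
# so it only makes sense on a machine with RAM to spare.
MULTI_JOB_CMD="$SINGLE_JOB_CMD --p-n-jobs -1"

# Print both commands for review instead of running them.
echo "$SINGLE_JOB_CMD"
echo "$MULTI_JOB_CMD"
```

On a memory-limited machine, the first printed line is the one to copy and run. The second is only worth trying on the computer with more RAM.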
Let me know if you have any other questions about this, and sorry again for the confusion!