Estimating how much memory you will need is hard because it depends on the size and complexity of the reference database. My first thought is to make sure your computer or VM has as much memory as possible; 23 GB is a good amount, so it sounds like you have already done this. Keep in mind that this is not a setting you pass to the script; you increase available memory in your VM settings, by closing other programs, or by running on a dedicated node.
Decreasing the chunk size is the next option, and I'm glad you tried that. Finally, try a very small chunk size, like `--p-classify--chunk-size 100`, just to make sure the script can run at all.
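For reference, the invocation would look something like this. This is only a sketch: the action, input, and output names are placeholders for whatever command you are already running, and the chunk-size flag is the only part that matters here.

```
# Placeholder artifact names; substitute your own files.
# A small chunk size lowers peak memory use at the cost of speed.
qiime feature-classifier classify \
  --i-reads rep-seqs.qza \
  --i-classifier classifier.qza \
  --p-classify--chunk-size 100 \
  --o-classification taxonomy.qza
```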
Thanks for the quick reply! I did set the memory in the VM settings after closing other programs, but is there a way to make sure the program can really get that much memory?
Another question: what does the flag `--p-memory TEXT` mean? I could only find its default setting, not a description of what it does.
After increasing the memory given to the VM, you can open Ubuntu's System Monitor to check how much memory is available. You should see all 23 GB inside the VM.
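If you prefer the command line, the same check from inside the VM (nothing QIIME-specific, it just reads the kernel's memory stats):

```
# Report memory in human-readable units; the "total" column
# should show roughly 23G if the VM allocation took effect.
free -h
```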
Yeah, that flag isn't documented yet; QIIME 2 is a work in progress. I bet it's the maximum memory used for each block, but the QIIME devs can confirm that for us. Once we know, I can add it to the documentation.
I would follow Colin's advice and try a very small `--p-classify--chunk-size` to start with. Please let us know how you go with that.
In answer to the question regarding `--p-memory`: it is actually not related to this problem at all, sorry, but is an artefact of how we register scikit-learn classifiers for use in the plugin. I have created an issue suggesting that it be removed from the interface.
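In the meantime, the CLI help is the place to see every parameter the interface currently exposes, `--p-memory` included. The action name below is a guess, so use whichever one you are running:

```
# Lists all registered options and their defaults for this action,
# including --p-memory until it is removed from the interface.
qiime feature-classifier classify --help
```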