SILVA classifier runs out of memory

Good morning, I’m a new user and I am having problems with taxonomic assignment. I’m running QIIME 2 installed with VirtualBox. The sequences that I imported were already demultiplexed, and I used DADA2 for denoising and joining.

My question is how much RAM taxonomic assignment requires. I have tried to use the SILVA classifier found on the QIIME 2 website:
Silva 138 99% OTUs full-length sequences (MD5: fddefff8bfa2bbfa08b9cad36bcdf709 )

I have just 12 joined and demultiplexed samples and I have given the VirtualBox machine 10 GB of RAM, but when I start the taxonomy assignment it fails with an error saying that it needs more memory.
I was wondering if there is a minimum amount of RAM for this classifier to work properly, because in the future I will need to run the process with more samples and I can’t even get it to work with 12.
Any help would be welcome.
Thank you so much!

Hi @SaraGajas,

It depends on the size of the database and on several parameters (n_jobs, reads_per_batch; search the forum for more details on how these affect memory use), so it can be difficult to predict.

SILVA full-length with 1 job running should be possible to run with 10 GB RAM, but your mileage may vary (other users have reported needing up to 32 GB RAM with similar classifiers, though again this depends on other parameters as well). Search the forum for info on reducing memory use, e.g., with the reads-per-batch parameter.

The number of samples does not really matter. That is what the reads-per-batch parameter is for: it feeds in a subset of the query sequences at a time to reduce memory load, so most of the memory demand comes from the classifier held in memory. You could have 500 trillion samples or sequences and they would still be fed in batches. In other words, your sample count will not affect memory use if you use reads-per-batch correctly, but it will mean longer runtimes.
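
To make the batching idea concrete, here is a toy Python sketch. It is not QIIME 2's actual code: `batches`, `classify_in_batches`, and `toy_classifier` are made-up names for illustration, and the "classifier" just labels each read by its length. The point is only that the number of reads held in memory at once is capped by the batch size, however many reads there are in total. On the command line, the corresponding options on `qiime feature-classifier classify-sklearn` are `--p-reads-per-batch` and `--p-n-jobs`.

```python
from itertools import islice


def batches(reads, reads_per_batch):
    """Yield successive lists of at most reads_per_batch reads."""
    it = iter(reads)
    while True:
        batch = list(islice(it, reads_per_batch))
        if not batch:
            return
        yield batch


def classify_in_batches(reads, classify_batch, reads_per_batch):
    """Classify reads one batch at a time.

    Returns the labels plus the size of the largest batch that was
    held in memory at any one time.
    """
    labels = []
    peak = 0
    for batch in batches(reads, reads_per_batch):
        peak = max(peak, len(batch))
        labels.extend(classify_batch(batch))
    return labels, peak


def toy_classifier(batch):
    # Hypothetical stand-in for the real classifier: label by read length.
    return [f"len={len(seq)}" for seq in batch]


# A generator, so the 10,000 query reads are never all in memory together.
reads = ("ACGT" * (i % 3 + 1) for i in range(10_000))
labels, peak = classify_in_batches(reads, toy_classifier, reads_per_batch=500)
print(len(labels), peak)
```

All the reads get a label, but no more than 500 are ever loaded at the same time. Making the batch smaller lowers that ceiling further and costs more passes through the loop, which is where the longer runtime comes from.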

Good luck!
