MemoryError when training classifier with SILVA

Hi @Nastassia_Patin,
Thank you for clarifying!

It looks like others have successfully used the SILVA classifier with around 20-32 GB RAM, with the exception of this post where the user has a very large data set. Given the size of your data set, I’d expect a lower memory need.

So I should ask: how much RAM do you have? The SILVA classifier does eat up a lot of memory and a standard laptop (e.g., around 8 GB) is probably not enough. It looks like the AWS free tier only has 1 GB of RAM (though perhaps I’m reading the wrong info), which would be woefully inadequate.
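If you are not sure how much RAM the machine has, here is a minimal sketch for checking it from Python. It assumes a Linux or macOS system (it relies on POSIX `sysconf`, so it will not work on Windows). The function names are made up for illustration, and the 20 GB threshold is just the low end of the range others have reported above, not a hard requirement.

```python
import os


def total_ram_gb():
    """Return total physical RAM in GiB using POSIX sysconf (Linux/macOS)."""
    page_size = os.sysconf("SC_PAGE_SIZE")   # bytes per memory page
    n_pages = os.sysconf("SC_PHYS_PAGES")    # number of physical pages
    return page_size * n_pages / 1024**3


def enough_for_silva(ram_gb, needed_gb=20):
    """Rough check against an assumed threshold (hypothetical helper).

    needed_gb defaults to 20, the low end of the 20-32 GB range that
    other users have reported; treat it as a rule of thumb only.
    """
    return ram_gb >= needed_gb


if __name__ == "__main__":
    ram = total_ram_gb()
    print(f"Total RAM: {ram:.1f} GiB")
    if enough_for_silva(ram):
        print("Probably enough for the SILVA classifier.")
    else:
        print("Probably not enough; consider one of the alternatives.")
```

Note that this reports total installed memory, not what is free at the moment, so other running programs will reduce what the classifier can actually use.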

If you are limited by memory availability, I can suggest some alternatives:

  1. Use the Greengenes pre-trained classifiers instead. They are smaller and less memory-intensive (they should work fine on most standard laptops).
  2. Use the BLAST or VSEARCH consensus classifiers available in q2-feature-classifier instead. They do not perform quite as well as the naive Bayes classifier used in classify-sklearn, but they still perform very well with appropriate parameter settings, and they should require less memory (I’m not 100% positive, but it is worth a try).

I hope that helps!
