Hello everyone! I am a new QIIME 2 user, and I ran into memory issues when I tried doing taxonomy annotation with the SILVA classifier. I am running qiime2-2020.2 in VirtualBox with a dynamically allocated VDI, 6000 MB of assigned base memory, and 3 assigned CPUs. My host computer has 213 GB of storage and 4 CPUs in total. Below is the code that I ran:
After I ran this code, my VDI kept expanding until it reached 48.3 GB and filled up all the storage on my host computer, and VirtualBox was forced to pause itself because there was no storage left (see screenshot). I tried deleting some old files from my host computer, but every time I freed up more storage and tried again, QIIME 2 filled it just as fast. Does anyone have ideas on how to fix this? Is it typical for annotation with the SILVA classifier to use up that much storage?
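For context, the traceback below shows the failing step is classify-sklearn. The original command was not included in the post; a representative invocation, with placeholder file names, looks like this:

```shell
# Representative classify-sklearn command (file names are placeholders;
# the poster's actual command was not captured in the thread).
qiime feature-classifier classify-sklearn \
  --i-classifier silva-132-99-nb-classifier.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy.qza
```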
Unable to allocate 6.19 GiB for an array with shape (830193664,) and data type float64
Debug info has been saved to /tmp/qiime2-q2cli-err-3m89vxud.log
Traceback (most recent call last):
File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/q2cli/commands.py", line 328, in call
results = action(**arguments)
File "</home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/decorator.py:decorator-gen-343>", line 2, in classify_sklearn
File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/qiime2/sdk/action.py", line 234, in bound_callable
File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/qiime2/sdk/result.py", line 289, in _view
result = transformation(self._archiver.data_dir)
File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/qiime2/core/transform.py", line 70, in transformation
new_view = transformer(view)
File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/q2_feature_classifier/_taxonomic_classifier.py", line 72, in _1
pipeline = joblib.load(os.path.join(dirname, 'sklearn_pipeline.pkl'))
File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 605, in load
obj = _unpickle(fobj, filename, mmap_mode)
File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 529, in _unpickle
obj = unpickler.load()
File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/pickle.py", line 1050, in load
File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 355, in load_build
File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 198, in read
array = self.read_array(unpickler)
File "/home/qiime2/miniconda/envs/qiime2-2020.2/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 144, in read_array
array = unpickler.np.empty(count, dtype=self.dtype)
MemoryError: Unable to allocate 6.19 GiB for an array with shape (830193664,) and data type float64
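As a sanity check, the 6.19 GiB figure in the traceback follows directly from the array shape and dtype (a float64 takes 8 bytes per element):

```python
# Size of a float64 array with 830,193,664 elements, in GiB.
n_elements = 830_193_664
bytes_per_float64 = 8
size_gib = n_elements * bytes_per_float64 / 2**30
print(f"{size_gib:.2f} GiB")  # -> 6.19 GiB
```

That single allocation alone already exceeds the 6000 MB of RAM assigned to the VM, before counting the rest of the pipeline.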
This is a common point of confusion, but memory and disk space are two separate things: memory refers to RAM, while disk space is the capacity of your hard drive or other storage.
The message in your screenshot is referring to disk space (file storage): your host machine has run out of disk space, which also means that your guest machine is out of disk space.
This is a configuration setting in VirtualBox: a disk can be fixed-size or dynamically allocated, and it sounds like you went with the latter.
You might just need more disk space, unfortunately. One option is to rent an AWS instance.
Yes, and it uses up lots of memory as well (see my discussion above).
This is a memory (RAM) error, separate from the storage errors above. Like storage, RAM is a resource your computer has a fixed amount of. You might be able to assign more RAM to your VirtualBox machine, or see my link above about AWS.
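As a sketch of the first suggestion: the VM's RAM can be raised from the host with VBoxManage while the VM is powered off (the VM name here is a placeholder; check yours with `VBoxManage list vms`):

```shell
# Assign 8 GB of RAM to the VM (it must be powered off first).
# "qiime2-2020.2" is a placeholder for your actual VM name.
VBoxManage modifyvm "qiime2-2020.2" --memory 8192
```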
Good luck and keep us posted!
(Matthew Ryan Dillon)
Looks like a RAM issue to me. Have you tried the prototype Silva 138 classifiers referenced below?
They do have a smaller memory footprint. If you want to save even more drive space and memory, use the classifier without the species labels. Note that you may have to retrain the classifiers yourself; in that case, make use of the provided sequence and taxonomy QZA files.
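Retraining from the provided reference files would look roughly like this (the file names are placeholders for the SILVA 138 sequence and taxonomy artifacts mentioned above):

```shell
# Train a Naive Bayes classifier from reference reads and taxonomy.
# Substitute the SILVA 138 artifact file names you actually downloaded.
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads silva-138-seqs.qza \
  --i-reference-taxonomy silva-138-tax.qza \
  --o-classifier silva-138-nb-classifier.qza
```

Note that training itself is memory-hungry, so this step may also need a machine with more RAM than the VM described above.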