Silva memory error

Yin_Hui_Cheok · December 4, 2019, 2:02pm

Hi guys, I ran into errors while trying to run taxonomic classification with the SILVA classifier. I have 48 16S rDNA samples, and the total size of my sequence data is around 1.2 GB.

I did some digging in the forum for the same issue and, by trial and error based on those earlier threads, tried the following:

4 DEC 8.18pm
$ qiime feature-classifier classify-sklearn --i-classifier silva-132-99-515-806-nb-classifier.qza --i-reads rep-seqs.qza --p-chunk-size 20000 --o-classification taxonomy.qza

4 DEC 8.30pm
$ qiime feature-classifier classify-sklearn --i-classifier silva-132-99-515-806-nb-classifier.qza --i-reads rep-seqs.qza --p-reads-per-batch 20000 --o-classification taxonomy.qza

4 DEC 8.45pm
$ qiime feature-classifier classify-sklearn --i-classifier silva-132-99-515-806-nb-classifier.qza --i-reads rep-seqs.qza --p-n-jobs 20 --p-reads-per-batch 1000 --o-classification taxonomy.qza

4 DEC 9pm
$ qiime feature-classifier classify-sklearn --i-classifier silva-132-99-515-806-nb-classifier.qza --i-reads rep-seqs.qza --p-n-jobs 1 --p-reads-per-batch 2000 --o-classification taxonomy.qza

4 DEC 9.15pm [success]
$ qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-515-806-nb-classifier.qza --i-reads rep-seqs.qza --o-classification taxonomy.qza

4 DEC 9.30pm
$ qiime feature-classifier classify-sklearn --i-classifier silva-132-99-515-806-nb-classifier.qza --i-reads rep-seqs.qza --p-reads-per-batch 1000 --o-classification taxonomy.qza

4 DEC 9.45pm
$ qiime feature-classifier classify-sklearn --i-classifier silva-132-99-515-806-nb-classifier.qza --i-reads rep-seqs.qza --p-reads-per-batch 1000 --p-pre-dispatch 1 --p-n-jobs 20 --o-classification taxonomy.qza

All of the commands above failed except the one using the Greengenes classifier. The same error keeps killing the analysis:

Plugin error from feature-classifier:

Unable to allocate array with shape (796852224,) and data type float64

Debug info has been saved to /tmp/qiime2-q2cli-err-t6driasm.log

I am currently dual-booting my laptop, with QIIME 2 installed natively on Ubuntu. My laptop has 8 GB of RAM, and I allocated 300 GB to the Ubuntu installation.
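
For scale, here is a rough size check of the allocation that fails. It assumes the shape and dtype in the error message describe a single NumPy array of float64 values (8 bytes each); the variable names are only illustrative.

```python
# Shape and dtype are taken from the error message:
# "Unable to allocate array with shape (796852224,) and data type float64"
N_ELEMENTS = 796_852_224
BYTES_PER_FLOAT64 = 8  # a float64 value occupies 8 bytes


def array_bytes(n_elements, itemsize=BYTES_PER_FLOAT64):
    """Return the memory needed for a 1-D array of n_elements items."""
    return n_elements * itemsize


needed = array_bytes(N_ELEMENTS)
print(f"{needed} bytes = {needed / 1e9:.2f} GB = {needed / 2**30:.2f} GiB")
# prints: 6374817792 bytes = 6.37 GB = 5.94 GiB
```

So this one array alone would take roughly 6 of the laptop's 8 GB, before counting the operating system or anything else in the classifier.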

So, here are my questions:

1. How reliable is Greengenes, given that it has not been updated in years?
2. Is there any way to run the analysis with the SILVA classifier on my current laptop?

Thank you in advance

Yin_Hui_Cheok · December 4, 2019, 2:54pm

Updated error output from my latest attempt:

$ qiime feature-classifier classify-sklearn --i-classifier silva-132-99-515-806-nb-classifier.qza --i-reads rep-seqs.qza --p-reads-per-batch 500 --p-pre-dispatch 1 --o-classification taxonomy.qza --verbose
Traceback (most recent call last):
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2cli/commands.py", line 328, in __call__
    results = action(**arguments)
  File "</home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/decorator.py:decorator-gen-347>", line 2, in classify_sklearn
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 229, in bound_callable
    spec.view_type, recorder)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/result.py", line 289, in _view
    result = transformation(self._archiver.data_dir)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/core/transform.py", line 70, in transformation
    new_view = transformer(view)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_feature_classifier/_taxonomic_classifier.py", line 72, in _1
    pipeline = joblib.load(os.path.join(dirname, 'sklearn_pipeline.pkl'))
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 605, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 529, in _unpickle
    obj = unpickler.load()
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/pickle.py", line 1050, in load
    dispatch[key[0]](self)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 355, in load_build
    self.stack.append(array_wrapper.read(self))
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 198, in read
    array = self.read_array(unpickler)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 144, in read_array
    array = unpickler.np.empty(count, dtype=self.dtype)
MemoryError: Unable to allocate array with shape (796852224,) and data type float64

Plugin error from feature-classifier:

Unable to allocate array with shape (796852224,) and data type float64

See above for debug info.

timanix · December 4, 2019, 3:11pm

Hi! I am afraid you need at least 16 GB of RAM to run classification with this classifier. It is very memory-hungry.
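
On Ubuntu, one way to see how much memory is actually free before starting a run is the MemAvailable line in /proc/meminfo. The sketch below parses text in that format and compares it with the size of the failing allocation. The helper names and the sample numbers are hypothetical, and it only checks the one array named in the error message, not the classifier's full footprint.

```python
# Size of the array from the error message: 796852224 float64 values, 8 bytes each.
NEEDED_BYTES = 796_852_224 * 8


def mem_available_bytes(meminfo_text):
    """Return MemAvailable from /proc/meminfo-style text in bytes, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Line format: "MemAvailable:    5000000 kB" (units of 1024 bytes)
            return int(line.split()[1]) * 1024
    return None


def fits(needed_bytes, meminfo_text):
    """True only if MemAvailable is reported and covers needed_bytes."""
    available = mem_available_bytes(meminfo_text)
    return available is not None and needed_bytes <= available


# Hypothetical sample values, not measured on the laptop in this thread.
SAMPLE_MEMINFO = (
    "MemTotal:        8000000 kB\n"
    "MemFree:         1000000 kB\n"
    "MemAvailable:    5000000 kB\n"
)

print(fits(NEEDED_BYTES, SAMPLE_MEMINFO))  # prints: False
```

On a real Linux system, the text would come from reading the file, for example `open("/proc/meminfo").read()`.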

Yin_Hui_Cheok · December 4, 2019, 3:18pm

Hi @timanix. Are there no other options? It would cost me a lot to upgrade the RAM.

Nicholas_Bokulich · December 4, 2019, 3:27pm

There is another option: see the --p-reads-per-batch parameter. Set it to something low, like 1000 or 2000. It will cost you more time but reduce memory demand.

timanix · December 4, 2019, 3:36pm

If @Nicholas_Bokulich's suggestion does not work for you, maybe you can borrow a more powerful machine just to run the analyses that require a lot of RAM.

Yin_Hui_Cheok · December 4, 2019, 4:10pm

Yup, I tried many --p-reads-per-batch values, ranging from 50 to 20000. All of them give the same outcome.

Traceback (most recent call last):
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2cli/commands.py", line 328, in __call__
    results = action(**arguments)
  File "</home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/decorator.py:decorator-gen-347>", line 2, in classify_sklearn
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/action.py", line 229, in bound_callable
    spec.view_type, recorder)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/sdk/result.py", line 289, in _view
    result = transformation(self._archiver.data_dir)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/qiime2/core/transform.py", line 70, in transformation
    new_view = transformer(view)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/q2_feature_classifier/_taxonomic_classifier.py", line 72, in _1
    pipeline = joblib.load(os.path.join(dirname, 'sklearn_pipeline.pkl'))
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 605, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 529, in _unpickle
    obj = unpickler.load()
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/pickle.py", line 1050, in load
    dispatch[key[0]](self)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 355, in load_build
    self.stack.append(array_wrapper.read(self))
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 198, in read
    array = self.read_array(unpickler)
  File "/home/cheok/miniconda3/envs/qiime2-2019.10/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 144, in read_array
    array = unpickler.np.empty(count, dtype=self.dtype)
MemoryError: Unable to allocate array with shape (796852224,) and data type float64
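
A note on why the batch size makes no difference here: as far as the traceback shows, the failure happens inside joblib.load, while QIIME 2 is turning the classifier artifact into a scikit-learn pipeline, which is before any reads are split into batches. The toy model below only illustrates that order of operations. It is not QIIME 2 code, the function names are made up, and the 4 GiB of free memory is an assumed figure.

```python
MODEL_BYTES = 796_852_224 * 8   # the array from the error message
AVAILABLE_BYTES = 4 * 2**30     # assumption: about 4 GiB free on an 8 GB laptop


def load_classifier(available_bytes, model_bytes):
    """Stand-in for the load step: the whole model must fit in memory."""
    if model_bytes > available_bytes:
        raise MemoryError("model does not fit in available memory")
    return "classifier"


def classify(reads, classifier, reads_per_batch):
    """Stand-in for batched classification: yields one batch at a time."""
    for start in range(0, len(reads), reads_per_batch):
        yield reads[start:start + reads_per_batch]


def run(reads, reads_per_batch):
    classifier = load_classifier(AVAILABLE_BYTES, MODEL_BYTES)  # step 1: load
    return list(classify(reads, classifier, reads_per_batch))   # step 2: batch


outcomes = {}
for batch_size in (50, 1000, 20000):
    try:
        run(list(range(100)), batch_size)
        outcomes[batch_size] = "ok"
    except MemoryError:
        outcomes[batch_size] = "MemoryError"

print(outcomes)
# prints: {50: 'MemoryError', 1000: 'MemoryError', 20000: 'MemoryError'}
```

In this model the batch size only affects step 2, so every value fails the same way at step 1, which matches what I am seeing.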

Nicholas_Bokulich · December 4, 2019, 4:13pm

Sounds like you are left with one of two options:

1. Use a less memory-intensive classifier, like Greengenes.
2. As @timanix advises, borrow a more powerful machine for the steps that need a lot of RAM.

timanix · December 4, 2019, 4:27pm

If you get completely desperate and can send me all the necessary files and commands, I can run it for you.
