Hi - I’m trying to use the latest SILVA database to classify my microbiome reads. Previously I’ve used Greengenes, and trained my own classifier following the QIIME2 “training feature classifiers” tutorial.
For SILVA, I downloaded the classifier directly from here: Silva 132 classifiers (the 515-806 one, which matches the primers I’ve used).
I then assumed that I could simply use that file directly to generate taxonomy from my rep-seqs.qza file, without any further manipulation of the SILVA classifier file. However, when I run the command below I get “Plugin error from feature-classifier.”
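The command is essentially the following (the classifier filename here is just a placeholder for whatever the downloaded SILVA 132 515-806 file is named):

```bash
qiime feature-classifier classify-sklearn \
  --i-classifier silva-132-99-515-806-nb-classifier.qza \
  --i-reads rep-seqs.qza \
  --o-classification taxonomy.qza
```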
Hi @mihcir,
What does the error message say? It should report a log file that you can read, which will contain the full error message. Please share that.
My guess is that this is a memory error; if so, you will see MemoryError at the bottom of the error message. See here for some tips on solving this issue.
Here’s the error log; it’s quite long, but you’re right that it ends with MemoryError:
```
Traceback (most recent call last):
  File "/home/mihcir/miniconda3/envs/qiime2-2018.6/lib/python3.5/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "<decorator-gen-...>", line 2, in classify_sklearn
  File "/home/mihcir/miniconda3/envs/qiime2-2018.6/lib/python3.5/site-packages/qiime2/sdk/action.py", line 226, in bound_callable
    spec.view_type, recorder)
  File "/home/mihcir/miniconda3/envs/qiime2-2018.6/lib/python3.5/site-packages/qiime2/sdk/result.py", line 266, in _view
    result = transformation(self._archiver.data_dir)
  File "/home/mihcir/miniconda3/envs/qiime2-2018.6/lib/python3.5/site-packages/qiime2/core/transform.py", line 70, in transformation
    new_view = transformer(view)
  File "/home/mihcir/miniconda3/envs/qiime2-2018.6/lib/python3.5/site-packages/q2_feature_classifier/taxonomic_classifier.py", line 72, in _1
    pipeline = joblib.load(os.path.join(dirname, 'sklearn_pipeline.pkl'))
  File "/home/mihcir/miniconda3/envs/qiime2-2018.6/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 578, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/mihcir/miniconda3/envs/qiime2-2018.6/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 508, in _unpickle
    obj = unpickler.load()
  File "/home/mihcir/miniconda3/envs/qiime2-2018.6/lib/python3.5/pickle.py", line 1043, in load
    dispatch[key[0]](self)
  File "/home/mihcir/miniconda3/envs/qiime2-2018.6/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 341, in load_build
    self.stack.append(array_wrapper.read(self))
  File "/home/mihcir/miniconda3/envs/qiime2-2018.6/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 184, in read
    array = self.read_array(unpickler)
  File "/home/mihcir/miniconda3/envs/qiime2-2018.6/lib/python3.5/site-packages/sklearn/externals/joblib/numpy_pickle.py", line 130, in read_array
    array = unpickler.np.empty(count, dtype=self.dtype)
MemoryError
```
I checked the link you mentioned, but I already have --p-n-jobs at its minimum setting of 1. Is there a way to reduce the memory requirement further, or does the full error log suggest I have a different problem?
Using Greengenes instead of SILVA should help a lot, since the Greengenes classifier is much smaller. You can also lower the --p-reads-per-batch parameter (e.g. try 2000) for a longer-running but lower-memory job.
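For example, something along these lines should work (the classifier filename is again assumed to match whatever you downloaded):

```bash
qiime feature-classifier classify-sklearn \
  --i-classifier silva-132-99-515-806-nb-classifier.qza \
  --i-reads rep-seqs.qza \
  --p-reads-per-batch 2000 \
  --p-n-jobs 1 \
  --o-classification taxonomy.qza
# smaller reads-per-batch values trade a longer runtime for lower peak memory
```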