Dear all,
I have been struggling for a while with a memory error from the feature classifier. I trained the classifier myself using QIIME 2 2019.1, and it recently worked fine with a smaller dataset.
Here is the command I am using:
!qiime feature-classifier classify-sklearn \
--i-classifier training-feature-classifiers/classifier.qza \
--i-reads rep-seqs.qza \
--o-classification taxonomy.qza
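For reference, one variant I am considering (based on my reading of the classify-sklearn help text) is to process reads in smaller batches to lower peak memory; the batch size below is just a guess, not something I have confirmed works:

```shell
# Sketch only: --p-reads-per-batch limits how many reads are classified at
# once, which should reduce peak memory (the value 1000 is an arbitrary guess).
qiime feature-classifier classify-sklearn \
  --i-classifier training-feature-classifiers/classifier.qza \
  --i-reads rep-seqs.qza \
  --p-reads-per-batch 1000 \
  --o-classification taxonomy.qza
```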
And I am receiving this error:
Plugin error from feature-classifier:
Debug info has been saved to /home/bio/anaconda2/tempfiles/qiime2-q2cli-err-x96sso5f.log
Here is the log:
Traceback (most recent call last):
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2cli/commands.py", line 274, in __call__
    results = action(**arguments)
  File "</home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/decorator.py:decorator-gen-338>", line 2, in classify_sklearn
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/qiime2/sdk/action.py", line 365, in callable_executor
    output_views = self._callable(**view_args)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/classifier.py", line 215, in classify_sklearn
    confidence=confidence)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_skl.py", line 45, in predict
    for chunk in _chunks(reads, chunk_size)) for m in c)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 917, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 759, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 716, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 182, in apply_async
    result = ImmediateResult(func)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 549, in __init__
    self.results = batch()
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 225, in __call__
    for func, args, kwargs in self.items]
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/sklearn/externals/joblib/parallel.py", line 225, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_skl.py", line 52, in _predict_chunk
    return _predict_chunk_with_conf(pipeline, separator, confidence, chunk)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/q2_feature_classifier/_skl.py", line 66, in _predict_chunk_with_conf
    prob_pos = pipeline.predict_proba(X)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/sklearn/utils/metaestimators.py", line 118, in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/sklearn/pipeline.py", line 382, in predict_proba
    return self.steps[-1][-1].predict_proba(Xt)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/sklearn/naive_bayes.py", line 104, in predict_proba
    return np.exp(self.predict_log_proba(X))
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/sklearn/naive_bayes.py", line 86, in predict_log_proba
    log_prob_x = logsumexp(jll, axis=1)
  File "/home/bio/anaconda2/envs/qiime2-2019.1/lib/python3.6/site-packages/scipy/special/_logsumexp.py", line 112, in logsumexp
    tmp = np.exp(a - a_max)
MemoryError
I suspected that I had too little space on my system partition, so I pointed TMPDIR at another partition with plenty of space, but I am still getting the same error. I have about 128 GB of RAM and 600 GB of free disk space, and no heavy processes are running in parallel. My current dataset has about 410 samples. I was running the command from Jupyter Lab rather than a regular terminal, and everything worked with a smaller dataset.
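For completeness, this is roughly how I redirected the temporary directory before re-running the command (the target path below is a placeholder, not the actual partition on my machine):

```shell
# Point TMPDIR at a location with plenty of free space
# (the path here is only an example; I used a separate partition).
export TMPDIR="$HOME/qiime2-tmp"
mkdir -p "$TMPDIR"

# Sanity check: the directory exists and is writable.
[ -d "$TMPDIR" ] && [ -w "$TMPDIR" ] && echo "TMPDIR ready: $TMPDIR"
```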
Now I am trying to repeat the command from a terminal, though I don't think that is the issue.
Please help me figure out what I need to do to solve this.
Many thanks.
Update: Got the same error from the terminal.