I am trying to train a SILVA 128 classifier for the 341F-806R primer pair for my current data set. The command is as follows:
qiime feature-classifier fit-classifier-naive-bayes --i-reference-reads ref-seqs_silva.qza --i-reference-taxonomy 99_otu_taxonomy_silva.qza --o-classifier silva128_341_806_classifier --verbose
The command fails with the following output:
/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/q2_feature_classifier-2017.2.0-py3.5.egg/q2_feature_classifier/classifier.py:94: UserWarning: The TaxonomicClassifier artifact that results from this method was trained using scikit-learn version 0.18.1. It cannot be used with other versions of scikit-learn. (While the classifier may complete successfully, the results will be unreliable.)
Traceback (most recent call last):
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/q2cli-2017.2.0-py3.5.egg/q2cli/commands.py", line 217, in __call__
results = action(**arguments)
File "", line 2, in fit_classifier_naive_bayes
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/qiime2-2017.2.0-py3.5.egg/qiime2/sdk/action.py", line 171, in callable_wrapper
output_types, provenance)
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/qiime2-2017.2.0-py3.5.egg/qiime2/sdk/action.py", line 248, in callable_executor
output_views = callable(**view_args)
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/q2_feature_classifier-2017.2.0-py3.5.egg/q2_feature_classifier/classifier.py", line 191, in generic_fitter
pipeline)
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/q2_feature_classifier-2017.2.0-py3.5.egg/q2_feature_classifier/_skl.py", line 31, in fit_pipeline
pipeline.fit(X, y)
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/sklearn/pipeline.py", line 270, in fit
self._final_estimator.fit(Xt, y, **fit_params)
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/q2_feature_classifier-2017.2.0-py3.5.egg/q2_feature_classifier/custom.py", line 25, in fit
return super().fit(X, y, sample_weight=sample_weight)
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/sklearn/naive_bayes.py", line 566, in fit
Y = labelbin.fit_transform(y)
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/sklearn/base.py", line 494, in fit_transform
return self.fit(X, **fit_params).transform(X)
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/sklearn/preprocessing/label.py", line 335, in transform
sparse_output=self.sparse_output)
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/sklearn/preprocessing/label.py", line 520, in label_binarize
Y = Y.toarray()
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/scipy/sparse/compressed.py", line 920, in toarray
return self.tocoo(copy=False).toarray(order=order, out=out)
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/scipy/sparse/coo.py", line 252, in toarray
B = self._process_toarray_args(order, out)
File "/home/fgplab/miniconda3/envs/qiime2-2017.2/lib/python3.5/site-packages/scipy/sparse/base.py", line 1009, in _process_toarray_args
return np.zeros(self.shape, dtype=self.dtype, order=order)
MemoryError
The workstation I am using has 125 GB of RAM available. Watching system activity while the training runs, the process fails as its memory usage approaches 8 GB. I don't understand why a MemoryError is being thrown when so much memory is still free.
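For reference, here is a rough sketch of how large the allocation in the last frame of the traceback (the np.zeros(self.shape, ...) call reached from label_binarize) could be. The sequence and taxon counts are placeholders, not numbers measured from my reference files, and the 8-byte element size is an assumption about the array dtype. As far as I understand, np.zeros raises MemoryError when the allocation request itself is refused, so the memory might never show up as "used" in a system monitor.

```python
def dense_array_bytes(n_rows, n_cols, itemsize=8):
    """Bytes needed for a dense n_rows x n_cols array.

    itemsize=8 assumes 8-byte elements (e.g. int64); this is an assumption,
    not something confirmed from the traceback.
    """
    return n_rows * n_cols * itemsize


# Hypothetical counts, NOT taken from ref-seqs_silva.qza or the taxonomy file.
n_reference_sequences = 300000   # rows: one per reference read
n_unique_taxa = 50000            # columns: one per distinct taxonomy label

needed = dense_array_bytes(n_reference_sequences, n_unique_taxa)
print("dense label matrix: {:.1f} GB".format(needed / 1e9))
```

If the real counts are anywhere near these, a single request of that size could exceed the available RAM even though only a few GB are in use at the moment of failure.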
Any help is appreciated!