I'm trying to train a classifier using the following command:
qiime feature-classifier fit-classifier-naive-bayes \
  --i-reference-reads silva-138-trunc178-ref-seqs.qza \
  --i-reference-taxonomy ../silva-138-99-tax.qza \
  --p-classify--chunk-size 250 \
  --o-classifier trunc178-classifier.qza \
  --verbose
My computer has 16 GB of RAM, 12 GB of which are allocated to my virtual machine. Initially the run failed because there was not enough RAM to allocate the array, so I searched the forum and found the tip about changing the chunk size. I tried several values. From 20000 down to 5000, the error message still said there wasn't enough RAM. From 2500 down to 1000, it simply printed "Killed". For 500 and 250, it went back to saying that it can't allocate 4.47 GiB. I'm confused, because there should be 12 GB available. Any suggestions? Here is the full output from the last chunk size I tried, 250, as shown above:
/home/qiime2/miniconda/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_feature_classifier/classifier.py:102: UserWarning: The TaxonomicClassifier artifact that results from this method was trained using scikit-learn version 0.24.1. It cannot be used with other versions of scikit-learn. (While the classifier may complete successfully, the results will be unreliable.)
warnings.warn(warning, UserWarning)
Traceback (most recent call last):
File "/home/qiime2/miniconda/envs/qiime2-2021.11/lib/python3.8/site-packages/q2cli/commands.py", line 339, in __call__
results = action(**arguments)
File "", line 2, in fit_classifier_naive_bayes
File "/home/qiime2/miniconda/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/action.py", line 245, in bound_callable
outputs = self.callable_executor(scope, callable_args,
File "/home/qiime2/miniconda/envs/qiime2-2021.11/lib/python3.8/site-packages/qiime2/sdk/action.py", line 391, in callable_executor
output_views = self._callable(**view_args)
File "/home/qiime2/miniconda/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_feature_classifier/classifier.py", line 330, in generic_fitter
pipeline = fit_pipeline(reference_reads, reference_taxonomy,
File "/home/qiime2/miniconda/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_feature_classifier/_skl.py", line 32, in fit_pipeline
pipeline.fit(X, y)
File "/home/qiime2/miniconda/envs/qiime2-2021.11/lib/python3.8/site-packages/sklearn/pipeline.py", line 346, in fit
self._final_estimator.fit(Xt, y, **fit_params_last_step)
File "/home/qiime2/miniconda/envs/qiime2-2021.11/lib/python3.8/site-packages/q2_feature_classifier/custom.py", line 40, in fit
self.partial_fit(cX, cy, sample_weight=csample_weight,
File "/home/qiime2/miniconda/envs/qiime2-2021.11/lib/python3.8/site-packages/sklearn/naive_bayes.py", line 589, in partial_fit
self._update_feature_log_prob(alpha)
File "/home/qiime2/miniconda/envs/qiime2-2021.11/lib/python3.8/site-packages/sklearn/naive_bayes.py", line 777, in _update_feature_log_prob
smoothed_fc = self.feature_count_ + alpha
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 4.47 GiB for an array with shape (73259, 8192) and data type float64
Plugin error from feature-classifier:
Unable to allocate 4.47 GiB for an array with shape (73259, 8192) and data type float64
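
For reference, the 4.47 GiB figure follows directly from the shape and data type in the error message. Here is a minimal sketch of that arithmetic; the helper name array_gib is made up for illustration, and the only inputs are the numbers printed above (float64 takes 8 bytes per element). The failing line, smoothed_fc = feature_count + alpha, builds a new array of the same shape as the existing one, so the sketch also prints the size of the pair. As far as I can tell, neither dimension of that shape is the chunk size, which might explain why the requested amount stays the same at 500 and 250, but that part is my assumption.

```python
import math

# Shape and dtype copied from the error message above.
SHAPE = (73259, 8192)
FLOAT64_BYTES = 8  # a float64 element occupies 8 bytes


def array_gib(shape, itemsize):
    """Size in GiB of a dense array with the given shape and element size.

    Hypothetical helper for illustration; not part of QIIME 2 or scikit-learn.
    """
    return math.prod(shape) * itemsize / 2**30


one_array = array_gib(SHAPE, FLOAT64_BYTES)
print(f"one array:  {one_array:.2f} GiB")      # one array:  4.47 GiB

# The count array and its smoothed copy exist at the same time.
print(f"two arrays: {2 * one_array:.2f} GiB")  # two arrays: 8.94 GiB
```

So a single array of this shape fits in 12 GB, but two of them plus everything else the process and the virtual machine already hold leave much less headroom than the 12 GB figure suggests.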